- 07 Apr, 2016 8 commits
-
-
Morris Jette authored
-
Morris Jette authored
Document and log cases where max jobs per user or partition is equal to or greater than the max jobs test. In that case, a single user can easily stop all backfill scheduling.
-
Danny Auble authored
-
Brian Christiansen authored
doing a ntasks_per_core=1
-
Danny Auble authored
-
Sami Ilvonen authored
-
Morris Jette authored
-
Morris Jette authored
Fix for job "--contiguous" option that could cause job allocation/launch failure or slurmctld crash. bug 2573
-
- 06 Apr, 2016 16 commits
-
-
Morris Jette authored
Conflicts: META NEWS
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
constraints mattered in a job. Details include: A job doesn't request memory but the system is running with CR_*MEMORY with no default memory limit and the job requests nodes with features of different sizes. Previously the order of constraints mattered where the smaller memory node would need to be requested first or the job would fail. Bug 2608
-
Danny Auble authored
-
Danny Auble authored
This reverts commit f559a55c.
-
Danny Auble authored
constraints mattered in a job. Details include: A job doesn't request memory but the system is running with CR_*MEMORY with no default memory limit and the job requests nodes with features of different sizes. Previously the order of constraints mattered where the smaller memory node would need to be requested first or the job would fail. Bug 2608
-
Morris Jette authored
-
Danny Auble authored
-
Morris Jette authored
Previous logic would get an account and/or QOS time limit and use that value to overwrite the incoming RPC's NO_VAL value, which would change a job's time limit when changing an unrelated field (e.g. priority or QOS). bug 2610
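The behavior described in this commit message can be illustrated with a minimal sketch. This is not the slurmctld code: the struct names, the `sketch_apply_update` function, and the `SKETCH_NO_VAL` sentinel are hypothetical, and capping the requested limit at the association limit is an assumption made only to keep the example short. The point it shows is that a field left at the "not set" sentinel in the update request is skipped rather than filled in from the account or QOS limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "this field was not set in the update request". */
#define SKETCH_NO_VAL 0xfffffffeu

/* Hypothetical, heavily reduced job record and update request. */
struct sketch_job {
    uint32_t time_limit;   /* minutes */
    uint32_t priority;
};

struct sketch_update {
    uint32_t time_limit;   /* SKETCH_NO_VAL if the caller did not ask to change it */
    uint32_t priority;     /* SKETCH_NO_VAL if the caller did not ask to change it */
};

/*
 * Apply an update request to a job. The association (account/QOS) time
 * limit is consulted only when the request explicitly sets a time limit;
 * it is never used to replace an unset request field.
 */
static void sketch_apply_update(struct sketch_job *job,
                                const struct sketch_update *req,
                                uint32_t assoc_time_limit)
{
    assert(job != NULL && req != NULL);

    if (req->priority != SKETCH_NO_VAL)
        job->priority = req->priority;

    if (req->time_limit != SKETCH_NO_VAL) {
        uint32_t limit = req->time_limit;
        /* Assumption for illustration: cap at the association limit. */
        if (assoc_time_limit != SKETCH_NO_VAL && limit > assoc_time_limit)
            limit = assoc_time_limit;
        job->time_limit = limit;
    }
    /* Otherwise the job's existing time limit is left untouched. */
}
```

With this shape, a request that only changes the priority leaves the job's time limit as it was, even when an association limit is configured.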
-
Danny Auble authored
-
Morris Jette authored
Prevent use of NULL pointer and SEGV when changing a job's QOS when the slurmdbd is not configured.
-
Morris Jette authored
bug 2609
-
Morris Jette authored
These tests failed with MinJobAge=3 because the job records had already been purged by the time the tests looked for completed jobs. Log this configuration as a possible reason for failure.
-
Tim Wickberg authored
-
- 05 Apr, 2016 8 commits
-
-
Janne Blomqvist authored
-
Morris Jette authored
Conflicts: src/plugins/sched/backfill/backfill.c
-
Morris Jette authored
Fix backfill scheduler race condition that could cause invalid pointer in select/cons_res plugin. Bug introduced in 15.08.9, commit: efd9d35e
The scenario is as follows:
1. Backfill scheduler is running, then releases locks
2. Main scheduling loop starts a job "A"
3. Backfill scheduler resumes, finds job "A" in its queue and resets its partition pointer.
4. Job "A" completes and tries to remove resource allocation record from select/cons_res data structure, but fails to find it because it is looking in the table for the wrong partition.
5. Job "A" record gets purged from slurmctld
6. Select/cons_res plugin attempts to operate on resource allocation data structure, finds pointer into the now purged data structure of job "A" and aborts or gets SEGV
Bug 2603
-
Danny Auble authored
misleading.
-
Danny Auble authored
-
Danny Auble authored
instead of ID to make things easier to read.
-
Danny Auble authored
-
Danny Auble authored
# Conflicts: # src/common/gres.c
-
- 04 Apr, 2016 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
canceled while launching.
-
Morris Jette authored
-
- 02 Apr, 2016 3 commits
-
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
- 01 Apr, 2016 1 commit
-
-
Morris Jette authored
-