- 03 Oct, 2018 3 commits
-
-
Michael Hinton authored
Bug 5231
-
Danny Auble authored
-
Morris Jette authored
-
- 02 Oct, 2018 5 commits
-
-
Danny Auble authored
-
Marshall Garey authored
Bug 5708
-
Jason Booth authored
This matches the functionality in code. Bug 5726
-
Marshall Garey authored
Bug 5795
-
Jason Booth authored
Continuation of 06582da8 (17.11.9). Poll was timing out too quickly due to an incorrect conversion of MessageTimeout. Added a multiplier so timeout reflects the correct millisecond value. Bug 5553
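
The unit mismatch described above can be sketched as follows. This is a minimal illustration, not the actual patch: it assumes the configured timeout is expressed in seconds, while poll() takes its timeout in milliseconds. The helper names timeout_sec_to_msec() and wait_readable() are hypothetical.

```c
#include <poll.h>

/* Hypothetical helper: convert a timeout given in seconds into the
 * millisecond value that poll() expects. */
int timeout_sec_to_msec(int timeout_sec)
{
	return timeout_sec * 1000;
}

/* Hypothetical helper: wait until fd is readable, for at most
 * timeout_sec seconds. Returns poll()'s result: > 0 if ready,
 * 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_sec)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

	/* Passing timeout_sec directly here would make poll() give up
	 * after timeout_sec milliseconds, which is 1000 times too early. */
	return poll(&pfd, 1, timeout_sec_to_msec(timeout_sec));
}
```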
-
- 29 Sep, 2018 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
Coverity CID 188467
-
- 28 Sep, 2018 21 commits
-
-
Michael Hinton authored
"set errno to indicate error" made some think that the function had to explicitly set the errno before returning, when it is often done by sub functions. Bug 5376
-
Michael Hinton authored
Replace these with either SLURM_ERROR or SLURM_SUCCESS. Bug 5392
-
Danny Auble authored
Conflicts: src/common/gres.c
-
Danny Auble authored
Conflicts: src/common/slurm_protocol_pack.c, src/slurmctld/controller.c
-
Morris Jette authored
This adds the ability for a step to explicitly clear a job's gres counter. For example, the job requests --gpus-per-task=2 and the step requests --gpus-per-task=0.
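
One way to picture the difference between a step that says nothing and a step that explicitly asks for zero is a sentinel value for "unset". The sketch below is only an illustration under that assumption; GPUS_UNSET and step_gpus_per_task() are hypothetical names, and the actual change may represent this differently.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "the step did not specify a value". */
#define GPUS_UNSET UINT64_MAX

/* Return the gpus-per-task value a step should use: the step's own
 * value when it gave one (including an explicit 0), otherwise the
 * value inherited from the job. */
uint64_t step_gpus_per_task(uint64_t job_val, uint64_t step_val)
{
	if (step_val == GPUS_UNSET)
		return job_val;
	return step_val;
}
```

With this shape, a job requesting 2 and a step requesting 0 yields 0, while a step that leaves the option out still inherits 2.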
-
Morris Jette authored
Change some comments and log messages. No change in functionality.
-
Morris Jette authored
Was failing to propagate gpus-per-task for job step.
-
Michael Hinton authored
Since sacct exits the program when it finds unknown parameters, the debug2() statements should be escalated to error() statements. The first debug2() even had "Error: " in it. Get rid of extra newline. Bug 5421
-
Marshall Garey authored
Bug 5786
-
Michael Hinton authored
Bug 5165
-
Morris Jette authored
Bug 5743. Co-authored-by: Dominik Bartkiewicz <bart@schedmd.com>
-
Dominik Bartkiewicz authored
Move reused GRES parsing logic to a shared function. (Corrected author; was previously mis-committed as 433257c6.) Bug 5743.
-
Tim Wickberg authored
This reverts commit 433257c6.
-
Morris Jette authored
This just tests the gpus-per-node option. Additional tests are needed for gpus-per-task, etc. as well as the logic to support those options for job steps.
-
Morris Jette authored
-
Danny Auble authored
Bug 5165
It was still possible for job priority to go down to 0, because I was letting last_prio get set down to 1, then later on we were using last_prio-1 as the new job priority. Since last_prio was 1, then new job priority was 1-1=0. This happened when the job priorities were all at 2 (which I never tested). Michael found it by running test2.26 in a case where initial job priorities were 2:
    spawn /home/marshall/slurm/18.08/voyager/bin/squeue --name=test2.26 -o JOB_ID=%A PRIO=%Q
    JOB_ID=JOBID PRIO=PRIORITY
    JOB_ID=10702 PRIO=2
    JOB_ID=10703 PRIO=2
    JOB_ID=10704 PRIO=2
    Reset and test job priorities
    spawn /home/marshall/slurm/18.08/voyager/bin/scontrol top 10704,10703
    spawn /home/marshall/slurm/18.08/voyager/bin/squeue --name=test2.26 -o JOB_ID=%A PRIO=%Q
    JOB_ID=JOBID PRIO=PRIORITY
    JOB_ID=10704 PRIO=2
    JOB_ID=10703 PRIO=1
    JOB_ID=10702 PRIO=0
    Reset and test job priorities
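
The arithmetic in this message (last_prio of 1, then 1-1=0) can be guarded with a simple floor. The following is a hedged sketch of that idea only; next_lower_prio() is a hypothetical name, and the real fix may handle the case differently.

```c
#include <stdint.h>

/* Hypothetical helper: return the priority to assign just below
 * last_prio, but never less than 1. An unsigned last_prio of 0 would
 * otherwise wrap around on subtraction, so it is floored as well. */
uint32_t next_lower_prio(uint32_t last_prio)
{
	if (last_prio <= 1)
		return 1;
	return last_prio - 1;
}
```

A consequence of flooring is that two jobs can end up sharing priority 1, so relative ordering at the floor is not preserved by this sketch.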
-
Danny Auble authored
Fix memory leak in previous patch. Bug 5479 Coverity 188464
-
Dominik Bartkiewicz authored
The format for this field is "<inst>:<svc_id>:<comp_id>"; if inst or svc_id are unset then they're blank, and the parsing code had assumed they would always be blank, as the systems slurmsmwd was developed on never used them. Bug 5411.
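
A parser for a three-part field like this has to accept both blank and non-blank leading parts. The sketch below is one way to do that, assuming the parts are treated as opaque strings; struct comp_loc, copy_field() and parse_comp_loc() are hypothetical names and not taken from slurmsmwd. It locates the colons with strchr() because strtok() skips empty fields.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three parts of the field. */
struct comp_loc {
	char inst[32];
	char svc_id[32];
	char comp_id[32];
};

/* Copy len bytes from start into dst, truncating to fit and always
 * NUL-terminating. A zero-length field yields an empty string. */
static void copy_field(char *dst, size_t dst_len, const char *start,
		       size_t len)
{
	if (len >= dst_len)
		len = dst_len - 1;
	memcpy(dst, start, len);
	dst[len] = '\0';
}

/* Split "<inst>:<svc_id>:<comp_id>" into its parts. Any part may be
 * blank. Returns 0 on success, -1 if fewer than two colons are found. */
int parse_comp_loc(const char *s, struct comp_loc *out)
{
	const char *c1 = strchr(s, ':');
	const char *c2 = c1 ? strchr(c1 + 1, ':') : NULL;

	if (!c1 || !c2)
		return -1;

	copy_field(out->inst, sizeof(out->inst), s, (size_t)(c1 - s));
	copy_field(out->svc_id, sizeof(out->svc_id), c1 + 1,
		   (size_t)(c2 - (c1 + 1)));
	copy_field(out->comp_id, sizeof(out->comp_id), c2 + 1,
		   strlen(c2 + 1));
	return 0;
}
```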
-
Morris Jette authored
If a job with running steps is resized, that will result in the job's core_bitmap size changing, but not that of the step, resulting in an error message. This adds a job resized flag and prints a more meaningful message when appropriate.
-
Morris Jette authored
Change to bit_ffs/bit_fls rather than a full scan; improves performance.
-
Morris Jette authored
-
- 27 Sep, 2018 9 commits
-
-
Morris Jette authored
-
Morris Jette authored
Check for TaskPlugin != none at start. Print all hostnames associated with launched job (to diagnose problems).
-
Marshall Garey authored
Prevents job priorities from being lowered to 0 or from underflowing. Bug 5165
-
Michael Hinton authored
Bug 5231. Change slurmctld/controller.c's purge_thread_lock from static to global extern, so the mutex can be used in slurmctld/job_mgr.c. slurm_cond_signal() always needs to be wrapped in the same mutex as slurm_cond_[timed]wait(), or else there is a possibility that slurm_cond_signal() will trigger before slurm_cond_[timed]wait() is even listening, most likely causing a deadlock.
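
The locking rule in this message can be illustrated with plain pthread calls. This is a sketch under assumptions, not the slurmctld code: it uses pthread_* directly rather than the slurm_cond_* wrappers, and purge_lock, purge_cond, purge_requested, request_purge() and wait_for_purge_request() are hypothetical names. It also adds a predicate flag, a common companion to the shared mutex, so that a signal sent before the waiter starts waiting is not lost.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state: one mutex guards both the flag and the
 * condition variable. */
pthread_mutex_t purge_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t purge_cond = PTHREAD_COND_INITIALIZER;
bool purge_requested = false;

/* Signaler: set the flag and signal while holding the same mutex the
 * waiter uses. */
void request_purge(void)
{
	pthread_mutex_lock(&purge_lock);
	purge_requested = true;
	pthread_cond_signal(&purge_cond);
	pthread_mutex_unlock(&purge_lock);
}

/* Waiter: check the flag under the mutex before sleeping. If the
 * request arrived first, the loop body never runs and no wakeup is
 * missed. The loop also covers spurious wakeups. */
void wait_for_purge_request(void)
{
	pthread_mutex_lock(&purge_lock);
	while (!purge_requested)
		pthread_cond_wait(&purge_cond, &purge_lock);
	purge_requested = false;
	pthread_mutex_unlock(&purge_lock);
}
```

Calling request_purge() before wait_for_purge_request() returns immediately here, which is the ordering the message describes as dangerous when the signal is sent without the shared mutex.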
-
Jason Booth authored
Bug 5703
-
Michael Hinton authored
Initialize old fairshare association pointer correctly. Bug 5744
-
Broderick Gardner authored
Bug 5458
-
Broderick Gardner authored
Bug 5403
-
Morris Jette authored
Coverity CID 187779
-