- 28 Sep, 2018 19 commits
-
-
Danny Auble authored
# Conflicts: # src/common/gres.c
-
Danny Auble authored
# Conflicts: # src/common/slurm_protocol_pack.c # src/slurmctld/controller.c
-
Morris Jette authored
This adds the ability for a step to explicitly clear a job's gres counter. For example, the job requests --gpus-per-task=2 and the step requests --gpus-per-task=0.
-
Morris Jette authored
Change some comments and log messages. No change in functionality.
-
Morris Jette authored
The gpus-per-task option was not being propagated to the job step.
-
Michael Hinton authored
Since sacct exits the program when it finds unknown parameters, the debug2() statements should be escalated to error() statements. The first debug2() even had "Error: " in it. Get rid of extra newline. Bug 5421
-
Marshall Garey authored
Bug 5786
-
Michael Hinton authored
Bug 5165
-
Morris Jette authored
Bug 5743. Co-authored-by: Dominik Bartkiewicz <bart@schedmd.com>
-
Dominik Bartkiewicz authored
Move reused GRES parsing logic to a shared function. (Corrected author; was previously mis-committed as 433257c6.) Bug 5743.
-
Tim Wickberg authored
This reverts commit 433257c6.
-
Morris Jette authored
This just tests the gpus-per-node option. Additional tests are needed for gpus-per-task, etc. as well as the logic to support those options for job steps.
-
Morris Jette authored
-
Danny Auble authored
Bug 5165

It was still possible for job priority to go down to 0, because I was letting last_prio get set down to 1, then later on we were using last_prio-1 as the new job priority. Since last_prio was 1, then new job priority was 1-1=0. This happened when the job priorities were all at 2 (which I never tested).

Michael found it by running test2.26 in a case where initial job priorities were 2:

spawn /home/marshall/slurm/18.08/voyager/bin/squeue --name=test2.26 -o JOB_ID=%A PRIO=%Q
JOB_ID=JOBID PRIO=PRIORITY
JOB_ID=10702 PRIO=2
JOB_ID=10703 PRIO=2
JOB_ID=10704 PRIO=2
Reset and test job priorities
spawn /home/marshall/slurm/18.08/voyager/bin/scontrol top 10704,10703
spawn /home/marshall/slurm/18.08/voyager/bin/squeue --name=test2.26 -o JOB_ID=%A PRIO=%Q
JOB_ID=JOBID PRIO=PRIORITY
JOB_ID=10704 PRIO=2
JOB_ID=10703 PRIO=1
JOB_ID=10702 PRIO=0
Reset and test job priorities
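The arithmetic in this message can be illustrated with a minimal sketch. This is not the code from the commit: the function names and the MIN_PRIO floor are hypothetical, chosen only to show how last_prio-1 reaches 0 (or wraps around for an unsigned value) and how a clamp avoids it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical floor; the message only says priority must not reach 0. */
#define MIN_PRIO 1u

/* The arithmetic described above: 1 - 1 = 0, and 0 - 1 wraps around. */
uint32_t naive_next_prio(uint32_t last_prio)
{
	return last_prio - 1;
}

/* Clamped variant: the result is never lower than MIN_PRIO. */
uint32_t clamped_next_prio(uint32_t last_prio)
{
	uint32_t next = (last_prio > MIN_PRIO) ? last_prio - 1 : MIN_PRIO;

	assert(next >= MIN_PRIO);	/* invariant the fix is meant to keep */
	return next;
}
```

With all priorities starting at 2, the naive version hands out 2, 1, 0 as in the squeue output above, while the clamped version stops at 1. Note that clamping lets two jobs share the floor value; how the real patch resolves that ordering is not stated in the message.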
-
Danny Auble authored
Fix memory leak in previous patch. Bug 5479 Coverity 188464
-
Dominik Bartkiewicz authored
The format for this field is "<inst>:<svc_id>:<comp_id>"; if inst or svc_id are unset then they're blank, and the parsing code had assumed they would always be blank, as the systems slurmsmwd was developed on never used them. Bug 5411.
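One way to parse such a field without assuming any part is blank is to split on the first two colons. The sketch below is illustrative only and is not the slurmsmwd code: the struct, the buffer sizes, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three parts; buffer sizes are arbitrary. */
struct src_field {
	char inst[32];
	char svc_id[32];
	char comp_id[32];
};

/* Copy len bytes from start into dst, truncating to fit, NUL-terminated. */
static void copy_part(char *dst, size_t dst_size, const char *start,
		      size_t len)
{
	if (len >= dst_size)
		len = dst_size - 1;
	memcpy(dst, start, len);
	dst[len] = '\0';
}

/*
 * Split "<inst>:<svc_id>:<comp_id>" on the first two colons. Any part may
 * be empty. Returns 0 on success, -1 if fewer than two colons are present.
 */
int parse_src_field(const char *field, struct src_field *out)
{
	const char *c1, *c2;

	assert(field && out);
	c1 = strchr(field, ':');
	if (!c1)
		return -1;
	c2 = strchr(c1 + 1, ':');
	if (!c2)
		return -1;

	copy_part(out->inst, sizeof(out->inst), field, (size_t)(c1 - field));
	copy_part(out->svc_id, sizeof(out->svc_id), c1 + 1,
		  (size_t)(c2 - c1 - 1));
	copy_part(out->comp_id, sizeof(out->comp_id), c2 + 1, strlen(c2 + 1));
	return 0;
}
```

Using strchr() rather than strtok() matters here, because strtok() collapses consecutive delimiters and would silently shift comp_id into the first slot when inst and svc_id are blank.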
-
Morris Jette authored
If a job with running steps is resized, that will result in the job's core_bitmap size changing, but not that of the step, resulting in an error message. This adds a job resized flag and prints a more meaningful message when appropriate.
-
Morris Jette authored
Change to bit_ffs/bit_fls rather than a full scan; this improves performance.
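The idea of bounding a loop by the first and last set bits can be sketched as follows. This does not use Slurm's bitstring API: a single 64-bit word stands in for the bitmap, and first_set(), last_set(), and count_in_range() are hypothetical helpers that only mirror the role bit_ffs/bit_fls play in the commit.

```c
#include <assert.h>
#include <stdint.h>

#define NBITS 64

/* Stand-in for a first-set-bit lookup: lowest set index, or -1 if none. */
int first_set(uint64_t map)
{
	for (int i = 0; i < NBITS; i++)
		if (map & (UINT64_C(1) << i))
			return i;
	return -1;
}

/* Stand-in for a last-set-bit lookup: highest set index, or -1 if none. */
int last_set(uint64_t map)
{
	for (int i = NBITS - 1; i >= 0; i--)
		if (map & (UINT64_C(1) << i))
			return i;
	return -1;
}

/*
 * Count set bits, visiting only indexes in [first_set, last_set].
 * *visited reports how many indexes the main loop touched.
 */
int count_in_range(uint64_t map, int *visited)
{
	int first = first_set(map);
	int last, count = 0;

	assert(visited);
	*visited = 0;
	if (first < 0)
		return 0;	/* empty bitmap: nothing to visit */
	last = last_set(map);
	for (int i = first; i <= last; i++) {
		(*visited)++;
		if (map & (UINT64_C(1) << i))
			count++;
	}
	return count;
}
```

The saving comes from the per-index work in the main loop: when the set bits are clustered, that loop runs over the cluster only rather than over every index in the bitmap.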
-
Morris Jette authored
-
- 27 Sep, 2018 20 commits
-
-
Morris Jette authored
-
Morris Jette authored
Check for TaskPlugin != none at start. Print all hostnames associated with the launched job (to diagnose problems).
-
Marshall Garey authored
Prevents job priorities from being lowered to 0 or from underflowing. Bug 5165
-
Michael Hinton authored
Bug 5231. Change slurmctld/controller.c's purge_thread_lock from static to global extern so the mutex can be used in slurmctld/job_mgr.c. slurm_cond_signal() always needs to be wrapped in the same mutex as slurm_cond_[timed]wait(); otherwise slurm_cond_signal() may fire before slurm_cond_[timed]wait() is even listening, most likely causing a deadlock.
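The locking rule described here can be sketched with plain POSIX threads. This is not the slurmctld code: demo_lock, demo_cond, work_pending, and the functions below are hypothetical names. The work_pending predicate is an addition beyond what the message describes; it is the usual companion to a condition variable and keeps a signal sent before the waiter starts from being lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; one mutex guards both the predicate and the wait. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_cond = PTHREAD_COND_INITIALIZER;
static bool work_pending = false;

/* Waiter: checks the predicate and waits under the same mutex. */
void *waiter(void *arg)
{
	(void) arg;
	pthread_mutex_lock(&demo_lock);
	while (!work_pending)
		pthread_cond_wait(&demo_cond, &demo_lock);
	assert(work_pending);	/* holds whenever the loop exits */
	work_pending = false;
	pthread_mutex_unlock(&demo_lock);
	return NULL;
}

/* Signaller: sets the predicate and signals while holding the mutex. */
void notify(void)
{
	pthread_mutex_lock(&demo_lock);
	work_pending = true;
	pthread_cond_signal(&demo_cond);
	pthread_mutex_unlock(&demo_lock);
}

/* Start one waiter, notify it, and join. Returns 0 if it woke and ran. */
int run_once(void)
{
	pthread_t tid;

	if (pthread_create(&tid, NULL, waiter, NULL))
		return -1;
	notify();
	if (pthread_join(tid, NULL))
		return -1;
	return work_pending ? -1 : 0;
}
```

Because the signal and the predicate update happen under the mutex the waiter holds while checking, the waiter either sees work_pending already set or is already blocked in pthread_cond_wait() when the signal arrives; either ordering of the two threads completes.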
-
Jason Booth authored
Bug 5703
-
Michael Hinton authored
Initialize old fairshare association pointer correctly. Bug 5744
-
Broderick Gardner authored
Bug 5458
-
Broderick Gardner authored
Bug 5403
-
Morris Jette authored
Coverity CID 187779
-
Morris Jette authored
Previous logic could abort in some cases
-
Morris Jette authored
The last merge from v18.08 (c6f696db) included changes that are incompatible with cgroup configuration read function changes that were made in master.
-
Morris Jette authored
-
Morris Jette authored
The sinfo output in this test groups all nodes together in one output line (e.g. "nid[000001-000099]"), while the data used to have one node per line. Adding the "-N" option to the test returns the output to one node per line, as required for the test to work.
-
Morris Jette authored
Bug 5750
-
Morris Jette authored
When the node_bitmap scan was changed from checking all nodes to just the nodes within the range set, the node_ptr became wrong, resulting in bad job counts on nodes and nodes marked as allocated when there were no jobs active (bug only in master).
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
No change in logic
-
Felip Moll authored
Bug 5748
-
- 26 Sep, 2018 1 commit
-
-
Danny Auble authored
-