- 27 Aug, 2018 4 commits
-
-
Morris Jette authored
Make test suite aware of how GPUs are bound to sockets
-
Morris Jette authored
Introduced in commit d382a477
-
Morris Jette authored
Modify a gpu/socket binding test to recognize newly available information from "scontrol show node" in commit 63b95516
-
Morris Jette authored
If GRES are associated with specific sockets, identify those sockets in the output of "scontrol show node". For example, if all 4 GPUs on a node are associated with socket zero, then "Gres=gpu:4(S:0)". If associated with sockets 0 and 1, then "Gres=gpu:4(S:0-1)". Which specific GPUs are associated with which sockets is not reported; that information is only available by parsing the gres.conf file.
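As an illustration of the format described above, here is a minimal Python sketch that parses only the two example strings from this commit message. The names `GresEntry` and `parse_gres` are hypothetical and are not part of Slurm. The sketch assumes a single `name:count` entry with an optional `(S:N)` or `(S:N-M)` suffix; real "scontrol show node" output may contain other forms that it does not handle.

```python
import re
from typing import List, NamedTuple, Optional


class GresEntry(NamedTuple):
    """Hypothetical container for one parsed Gres value."""
    name: str
    count: int
    sockets: Optional[List[int]]  # None when no "(S:...)" suffix is present


# Matches e.g. "gpu:4", "gpu:4(S:0)" or "gpu:4(S:0-1)".
# Assumption: only a single socket or a single inclusive range appears.
_GRES_RE = re.compile(
    r"^(?P<name>[A-Za-z_]+):(?P<count>\d+)"
    r"(?:\(S:(?P<first>\d+)(?:-(?P<last>\d+))?\))?$"
)


def parse_gres(field: str) -> GresEntry:
    """Parse a string such as 'Gres=gpu:4(S:0-1)' into a GresEntry."""
    # Drop the optional "Gres=" key so both forms are accepted.
    value = field[len("Gres="):] if field.startswith("Gres=") else field
    match = _GRES_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized Gres value: {field!r}")

    sockets = None
    if match.group("first") is not None:
        first = int(match.group("first"))
        # A missing upper bound means a single socket.
        last = int(match.group("last")) if match.group("last") else first
        sockets = list(range(first, last + 1))

    return GresEntry(match.group("name"), int(match.group("count")), sockets)


if __name__ == "__main__":
    for text in ("Gres=gpu:4(S:0)", "Gres=gpu:4(S:0-1)"):
        print(text, "->", parse_gres(text))
```

A test script could use a helper like this to compare the reported socket list with the expected binding, while still reading gres.conf for the per-GPU mapping that the output omits.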
-
- 25 Aug, 2018 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
Fix possible reference of unset bitmap. Fix some logic problems related to non-socket-bound GRES.
-
- 24 Aug, 2018 8 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
This has never been officially documented, and we would rather see sites use preempt/qos instead. Bug 5628.
-
Morris Jette authored
-
Morris Jette authored
Bug 5629
-
Morris Jette authored
New --gpu options in the commands and some configuration file changes
-
Morris Jette authored
Coverity CID 187783
-
Morris Jette authored
Previously, when a job was resized to zero and its resources were moved to another job (--depend=expand), nothing ran the epilog on its nodes to kill processes and otherwise clean up the allocation. This commit adds that logic.
-
Morris Jette authored
This avoids a double node allocation, as commit 9b55a09b moves the logic to a common location, removing the need for it to be specifically in the cons_tres plugin.
-
- 23 Aug, 2018 9 commits
-
-
Danny Auble authored
Bug 5619. Tim approved.
-
Danny Auble authored
Bug 5618
-
Tim Wickberg authored
Throw away all but the NEWS entry.
-
Tim Wickberg authored
Development continues on the master branch for 19.05, but the snapshot here should not be used and is thus being removed. Remove the commented-out documentation as well.
-
Tim Wickberg authored
-
Morris Jette authored
This error occurs when one job is used to expand the allocation of another job. The node record's "run_job_cnt" is decremented when the dependent job's epilog completes, but the job receiving those resources never has "run_job_cnt" updated for it, which later results in a "comp_job_cnt" underflow when that job ends. This bug was discovered in the course of select/cons_tres development, but it impacts all select plugins.
-
Yu Watanabe authored
-
Morris Jette authored
When a batch job gets resized to 0 nodes, it instantly gets killed, leaving resize scripts around. Also, on a Cray system, the resize cannot happen while the job step NHC is still active. This patch fixes both issues.
-
Morris Jette authored
-
- 22 Aug, 2018 12 commits
-
-
Morris Jette authored
Coverity CID 187784 and 187785
-
Morris Jette authored
Coverity CID 187782
-
Danny Auble authored
# Conflicts: # NEWS
-
Danny Auble authored
Bug 5608. Tim approved.
-
Brian Christiansen authored
If the dbd comes up after a job array has been submitted to the controller, the controller calls _update_job_tres(), which calls assoc_mgr_set_tres_cnt_array(), which allocates memory for the job's tres_alloc_cnt. The job array gets scheduled, but job_array_split() doesn't NULL out the pending job's tres_alloc_cnt, so both the array task and the pending array job point to the same memory. The array task calls job_set_alloc_tres(), which frees the running job's tres_alloc_cnt, leaving the pending array job pointing to bad memory. When the array splits again, the new array task tries to free tres_alloc_cnt in job_set_alloc_tres() and segfaults. Bug 5604.
-
Morris Jette authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Update slurm.spec and slurm.spec-legacy as well.
-
Tim Wickberg authored
-
Danny Auble authored
-
Danny Auble authored
This will need to be revisited; perhaps link to libslurmfull.so instead, since we will need symbols that are not exported. See bug 5605.
-
- 21 Aug, 2018 5 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Danny Auble authored
The previous commit causes spurious errors to be printed. test2.14 was a victim of it.
-
Danny Auble authored
# Conflicts: # src/plugins/jobacct_gather/common/common_jag.c
-
Danny Auble authored
-