- 09 Jan, 2015 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
This is needed for setups like this:
TaskPlugin = affinity
TaskPlugin = task/affinity,task/cgroup
TaskPlugin = affinity,cgroup
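A minimal sketch of the normalization those equivalent forms imply, assuming a hypothetical helper (this is not Slurm's actual source): short-form plugin names such as "affinity" should resolve to the same plugin as their fully qualified "task/affinity" form.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: prepend the "task/" prefix when a TaskPlugin
     * entry is given in short form, so "affinity" and "task/affinity"
     * name the same plugin. */
    static void normalize_task_plugin(const char *in, char *out, size_t len)
    {
        if (strncmp(in, "task/", 5) == 0)
            snprintf(out, len, "%s", in);
        else
            snprintf(out, len, "task/%s", in);
    }

    int main(void)
    {
        const char *examples[] = { "affinity", "task/affinity", "cgroup" };
        char buf[64];
        for (int i = 0; i < 3; i++) {
            normalize_task_plugin(examples[i], buf, sizeof(buf));
            printf("%s -> %s\n", examples[i], buf);
        }
        return 0;
    }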
-
- 08 Jan, 2015 1 commit
-
-
Brian Christiansen authored
-
- 07 Jan, 2015 7 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
David Bigagli authored
Slurm 14.03 nccs
-
Aaron Knister authored
-
Rémi Palancher authored
Intel MPI, during MPI job initialization through PMI, calls PMI_KVS_Put() many times from the task at rank 0, and each of these calls is followed by PMI_KVS_Commit(). Slurm's implementation of PMI_KVS_Commit() imposes a delay to avoid a DDOS against the originating srun. This delay is proportional to the total number of tasks and can reach 3 seconds for large jobs, for example one with 7168 tasks. Therefore, when Intel MPI calls PMI_KVS_Commit() 475 times (measured on a test case) from the task at rank 0, 28 minutes are spent in the delay function. All other tasks in the job are waiting on a PMI_Barrier, so there is no risk of a DDOS from this single task 0. The patch alters the delay calculation to make sure the task at rank 0 is never delayed. All other tasks are spread across the same time range as before.
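A self-contained sketch of the delay policy described above (the formula and constants here are assumptions for illustration, not Slurm's actual code): rank 0 is exempt, while the remaining ranks stay spread over a window that grows with the task count, capped near the 3 seconds quoted above.

    #include <stdio.h>

    /* Illustrative delay, in microseconds, for a PMI_KVS_Commit() from
     * the given rank. Rank 0 is never delayed; other ranks are spread
     * evenly across a window proportional to the job's task count. */
    static unsigned long commit_delay_usec(unsigned int rank,
                                           unsigned int ntasks)
    {
        const unsigned long max_window = 3000000; /* ~3 s cap (see above) */
        unsigned long window = (unsigned long)ntasks * 500;

        if (window > max_window)
            window = max_window;
        if (rank == 0)
            return 0;   /* the chatty rank 0 proceeds immediately */
        return (unsigned long)(((unsigned long long)window * rank) / ntasks);
    }

    int main(void)
    {
        printf("rank 0:    %lu us\n", commit_delay_usec(0, 7168));
        printf("rank 3584: %lu us\n", commit_delay_usec(3584, 7168));
        printf("rank 7167: %lu us\n", commit_delay_usec(7167, 7168));
        return 0;
    }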
-
Aaron Knister authored
-
Artem Polyakov authored
-
- 06 Jan, 2015 10 commits
-
-
David Bigagli authored
-
Morris Jette authored
Amendment to commit 744f114b
-
Morris Jette authored
-
David Bigagli authored
-
Morris Jette authored
Fix race condition that could start a job that is dependent upon a job array before all tasks of that job array complete. bug 1324
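A toy model, with invented names, of the check this fix implies: a dependent job may start only after every task of the job array it depends on has completed; checking any single task would reintroduce the race.

    #include <stdbool.h>
    #include <stdio.h>

    /* Return true only when every task in the array has completed. */
    static bool array_complete(const bool *task_done, int ntasks)
    {
        for (int i = 0; i < ntasks; i++) {
            if (!task_done[i])
                return false;   /* at least one task still running */
        }
        return true;
    }

    int main(void)
    {
        bool tasks[3] = { true, false, true };
        printf("dependent job may start: %s\n",
               array_complete(tasks, 3) ? "yes" : "no");
        tasks[1] = true;
        printf("dependent job may start: %s\n",
               array_complete(tasks, 3) ? "yes" : "no");
        return 0;
    }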
-
Danny Auble authored
Conflicts: src/sbatch/opt.c
-
Brian Christiansen authored
Bug 1350
-
Danny Auble authored
flag from a job while the job is waiting for a block to boot.
-
Danny Auble authored
because of the referenced commit. ntasks_set is always true on a BGQ at this point.
-
Danny Auble authored
-
- 05 Jan, 2015 2 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
- 02 Jan, 2015 2 commits
-
-
Brian Christiansen authored
Bug 1346
-
Danny Auble authored
a normal job.
-
- 01 Jan, 2015 1 commit
-
-
Brian Christiansen authored
-
- 31 Dec, 2014 1 commit
-
-
Brian Christiansen authored
-
- 30 Dec, 2014 4 commits
-
-
Morris Jette authored
It largely prevents Slurm's control over CPU frequency.
-
David Bigagli authored
-
David Bigagli authored
-
Danny Auble authored
-
- 29 Dec, 2014 1 commit
-
-
David Bigagli authored
-
- 26 Dec, 2014 1 commit
-
-
Jason Bacon authored
-
- 24 Dec, 2014 1 commit
-
-
Morris Jette authored
All jobs count against the limit except those which are HELD, have a begin time in the future, or have unsatisfied dependencies.
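A hedged sketch of that rule as a predicate (the struct and field names are invented for illustration, not Slurm's job record):

    #include <stdbool.h>
    #include <stdio.h>
    #include <time.h>

    struct job_rec {            /* illustrative, not Slurm's job record */
        bool held;
        time_t begin_time;
        bool deps_unsatisfied;
    };

    /* A job counts against the limit unless it is held, scheduled to
     * begin in the future, or has unsatisfied dependencies. */
    static bool counts_against_limit(const struct job_rec *j, time_t now)
    {
        if (j->held || (j->begin_time > now) || j->deps_unsatisfied)
            return false;
        return true;
    }

    int main(void)
    {
        time_t now = time(NULL);
        struct job_rec pending = { false, now - 60, false };
        struct job_rec held    = { true,  now - 60, false };
        printf("pending counts: %d, held counts: %d\n",
               counts_against_limit(&pending, now),
               counts_against_limit(&held, now));
        return 0;
    }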
-
- 23 Dec, 2014 4 commits
-
-
Morris Jette authored
Prevent invalid job array task ID value if a task is started using gang scheduling (i.e. the task starts in a SUSPENDED state). The task ID gets set to NO_VAL and the task string is also cleared.
-
Morris Jette authored
-
Morris Jette authored
Prevent a manually suspended job from being resumed by the gang scheduler once free resources are available. Bug 1335
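One way to express that rule, with invented field names: the gang scheduler may resume only jobs it suspended itself.

    #include <stdbool.h>
    #include <stdio.h>

    struct susp_job {           /* illustrative fields, not Slurm's */
        bool suspended;
        bool suspended_by_gang; /* false when suspended manually */
    };

    /* The gang scheduler may resume a job only if it was the one that
     * suspended it; a manually suspended job must stay suspended. */
    static bool gang_may_resume(const struct susp_job *j)
    {
        return j->suspended && j->suspended_by_gang;
    }

    int main(void)
    {
        struct susp_job by_gang  = { true, true };
        struct susp_job by_admin = { true, false };
        printf("gang-suspended resumable: %d\n", gang_may_resume(&by_gang));
        printf("manually suspended resumable: %d\n",
               gang_may_resume(&by_admin));
        return 0;
    }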
-
Dorian Krause authored
We have hit the following problem that seems to be present in Slurm slurm-14-11-2-1 and previous versions. When a node is reserved and an overlapping maint reservation is created and later deleted, the scontrol output will report the node as IDLE rather than RESERVED:

+ scontrol show node node1
+ grep State
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
+ scontrol create reservation starttime=now duration=120 user=usr01000 nodes=node1 ReservationName=X
Reservation created: X
+ sleep 10
+ scontrol show nodes node1
+ grep State
State=RESERVED ThreadsPerCore=1 TmpDisk=0 Weight=1
+ scontrol create reservation starttime=now duration=120 user=usr01000 nodes=ALL flags=maint,ignore_jobs ReservationName=Y
Reservation created: Y
+ sleep 10
+ grep State
+ scontrol show nodes node1
State=MAINT ThreadsPerCore=1 TmpDisk=0 Weight=1
+ scontrol delete ReservationName=Y
+ sleep 10
+ scontrol show nodes node1
+ grep State
*State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1*
+ scontrol delete ReservationName=X
+ sleep 10
+ scontrol show nodes node1
+ grep State
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1

Note that after the deletion of reservation "X" the State=IDLE instead of State=RESERVED. I think that the delete_resv() function in slurmctld/reservation.c should call set_node_maint_mode(true) like update_resv() does. With the patch pasted at the end of this e-mail I get the following output, which matches my expectation:

+ scontrol show node node1
+ grep State
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
+ scontrol create reservation starttime=now duration=120 user=usr01000 nodes=node1 ReservationName=X
Reservation created: X
+ sleep 10
+ scontrol show nodes node1
+ grep State
State=RESERVED ThreadsPerCore=1 TmpDisk=0 Weight=1
+ scontrol create reservation starttime=now duration=120 user=usr01000 nodes=ALL flags=maint,ignore_jobs ReservationName=Y
Reservation created: Y
+ sleep 10
+ scontrol show nodes node1
+ grep State
State=MAINT ThreadsPerCore=1 TmpDisk=0 Weight=1
+ scontrol delete ReservationName=Y
+ sleep 10
+ scontrol show nodes node1
+ grep State
*State=RESERVED ThreadsPerCore=1 TmpDisk=0 Weight=1*
+ scontrol delete ReservationName=X
+ sleep 10
+ scontrol show nodes node1
+ grep State
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1

Thanks, Dorian
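A self-contained toy model of the proposed fix (the real patch calls set_node_maint_mode(true) from delete_resv(); everything below is a simplified illustration with invented types): after a reservation is deleted, the node state is recomputed from the remaining reservations instead of dropping to IDLE.

    #include <stdbool.h>
    #include <stdio.h>

    enum node_state { IDLE, RESERVED, MAINT };

    struct resv {               /* illustrative reservation record */
        const char *name;
        bool maint;
        bool active;
    };

    /* Analogue of recomputing node flags: derive the node's state from
     * every still-active reservation covering it. MAINT dominates. */
    static enum node_state recompute_state(const struct resv *r, int n)
    {
        enum node_state s = IDLE;
        for (int i = 0; i < n; i++) {
            if (!r[i].active)
                continue;
            if (r[i].maint)
                return MAINT;
            s = RESERVED;
        }
        return s;
    }

    int main(void)
    {
        struct resv resvs[] = {
            { "X", false, true },   /* user reservation on node1 */
            { "Y", true,  true },   /* overlapping maint reservation */
        };
        printf("state=%d (expect MAINT=2)\n", recompute_state(resvs, 2));
        resvs[1].active = false;    /* delete ReservationName=Y */
        /* the fix: recompute instead of resetting to IDLE */
        printf("state=%d (expect RESERVED=1)\n", recompute_state(resvs, 2));
        return 0;
    }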
-
- 22 Dec, 2014 1 commit
-
-
Daniel Ahlin authored
Correct parsing of AccountingStoragePass when specified in the old format (just a path name).
-