- 28 Jan, 2015 1 commit
-
-
David Bigagli authored
-
- 27 Jan, 2015 1 commit
-
-
David Bigagli authored
-
- 26 Jan, 2015 1 commit
-
-
Aaron Knister authored
-
- 23 Jan, 2015 2 commits
-
-
Dorian Krause authored
-
Morris Jette authored
-
- 22 Jan, 2015 2 commits
-
-
David Bigagli authored
-
Danny Auble authored
-
- 21 Jan, 2015 4 commits
-
-
Morris Jette authored
If some tasks of a job array are runnable and the meta-job array record is not runnable (e.g. held), the old logic could start a runnable task, then try to start the non-runnable meta-job, discover it cannot run, and set its reason to "BadConstraints". Test case: (1) make it so no jobs can start (partition stopped, slurmd down, etc.); (2) submit a job array; (3) hold the job array; (4) release the first two tasks of the job array; (5) make it so jobs can start.
-
Morris Jette authored
Squeue modified to not merge tasks of a job array if their wait reasons differ. bug 1388
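The merge rule described above can be illustrated with a minimal sketch. This is not squeue's actual code: the `Task` record and `merge_array_tasks` helper are hypothetical names, and the reason strings are only sample values. The idea is that tasks of one job array are collapsed into a single entry only when they share the same wait reason.

```python
from typing import Dict, List, NamedTuple, Tuple


class Task(NamedTuple):
    """Hypothetical record for one pending job array task."""
    array_job_id: int
    task_id: int
    reason: str


def merge_array_tasks(tasks: List[Task]) -> List[Tuple[int, str, List[int]]]:
    """Group tasks by (array job id, wait reason).

    Tasks of the same array with different wait reasons stay in
    separate entries instead of being merged into one line.
    """
    merged: Dict[Tuple[int, str], List[int]] = {}
    for task in tasks:
        merged.setdefault((task.array_job_id, task.reason), []).append(task.task_id)
    return [(job, reason, sorted(ids)) for (job, reason), ids in merged.items()]


pending = [
    Task(100, 0, "Resources"),
    Task(100, 1, "Resources"),
    Task(100, 2, "JobHeldUser"),
    Task(100, 3, "JobHeldUser"),
]
entries = merge_array_tasks(pending)
```

With the sample data, the held tasks and the tasks waiting for resources end up in two separate entries rather than one merged line.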
-
Morris Jette authored
No functions currently exist, only the plugin wrapper and stubbed functions.
-
Morris Jette authored
-
- 20 Jan, 2015 3 commits
-
-
David Bigagli authored
--export sbatch/srun command line option, propagate the users' environ to the execution side. #1367
-
Danny Auble authored
-
Morris Jette authored
Interpret a partition configuration of "Nodes=ALL" in slurm.conf as including all nodes defined in the cluster. bug 1382
-
- 19 Jan, 2015 1 commit
-
-
jette authored
bug 1379
-
- 17 Jan, 2015 1 commit
-
-
jette authored
bug 1375
-
- 15 Jan, 2015 3 commits
-
-
Danny Auble authored
What this does is use the core level binding after each task is laid out to skip all the extra threads in the core so it doesn't give them to another task. It probably isn't perfect, but does solve all the scenarios I found.
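A rough sketch of the idea, not the actual task layout code: after a task receives its threads, the allocation pointer is rounded up to the next core boundary so that leftover threads on a partly used core are never handed to another task. The function name `layout_tasks` and the thread numbering (threads of core `c` are numbered `c * threads_per_core` onward) are assumptions made for illustration.

```python
from typing import List


def layout_tasks(num_cores: int, threads_per_core: int,
                 num_tasks: int, cpus_per_task: int) -> List[List[int]]:
    """Assign thread ids to tasks, starting every task on a fresh core."""
    total_threads = num_cores * threads_per_core
    next_thread = 0
    layout: List[List[int]] = []
    for _ in range(num_tasks):
        if next_thread + cpus_per_task > total_threads:
            raise ValueError("not enough threads for the requested layout")
        layout.append(list(range(next_thread, next_thread + cpus_per_task)))
        next_thread += cpus_per_task
        # Skip the unused threads of the last core this task touched.
        remainder = next_thread % threads_per_core
        if remainder:
            next_thread += threads_per_core - remainder
    return layout


one_cpu_tasks = layout_tasks(num_cores=4, threads_per_core=2, num_tasks=3, cpus_per_task=1)
three_cpu_tasks = layout_tasks(num_cores=4, threads_per_core=2, num_tasks=2, cpus_per_task=3)
```

In the first call each single-thread task lands on its own core; in the second, the fourth thread of each core pair is left idle rather than given to the next task.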
-
Morris Jette authored
Fix for GRES scheduling in which there is CPU topology defined or GRES types defined and there is more than 1 GPU per topology record in slurmctld. Without this fix, only one GRES could be allocated from each defined topology. bug 1369
-
Morris Jette authored
The slurmctld could abort with a gres configuration having Type= configured, but no CPU binding configured.
-
- 14 Jan, 2015 3 commits
-
-
David Bigagli authored
-
Danny Auble authored
-
David Bigagli authored
-
- 13 Jan, 2015 3 commits
-
-
Danny Auble authored
-
Morris Jette authored
For advanced reservation, replace flag "License_only" with flag "Any_Nodes". It can be used to indicate that an advanced reservation's resources (licenses and/or burst buffers) can be used with any compute nodes.
-
Danny Auble authored
Most of these don't matter as they are all NO_LOCK. Fallout from commit f1ebdef1 when the resources were added.
-
- 12 Jan, 2015 2 commits
-
-
Morris Jette authored
This only adds the field to data structures and does not implement support
-
David Bigagli authored
-
- 09 Jan, 2015 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 08 Jan, 2015 1 commit
-
-
Brian Christiansen authored
Bug 1352
-
- 07 Jan, 2015 5 commits
-
-
Brian Christiansen authored
Bug 1352
-
Danny Auble authored
-
Aaron Knister authored
-
Rémi Palancher authored
During MPI job initialisation through PMI, Intel MPI calls PMI_KVS_Put() many times from the task at rank 0, and each of these calls is followed by PMI_KVS_Commit(). The Slurm implementation of PMI_KVS_Commit() imposes a delay to avoid a DDoS on the original srun. This delay is proportional to the total number of tasks and can reach 3 seconds for large jobs, e.g. with 7168 tasks. Therefore, when Intel MPI calls PMI_KVS_Commit() 475 times (measured on a test case) from the task at rank 0, 28 minutes are spent in the delay function. All other tasks in the job are waiting in a PMI_Barrier, so there is no risk of a DDoS from this single task 0. The patch alters the delay calculation so that the task at rank 0 is not delayed. All other tasks are spread over the same time range as before.
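The effect of the change can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the function name `commit_delay_seconds`, the linear spread by rank, and the per-task constant are all hypothetical. Only the two properties described in the commit message are modelled: the delay window grows with the total task count, and rank 0 is exempt.

```python
def commit_delay_seconds(rank: int, total_tasks: int,
                         per_task_seconds: float = 0.0004) -> float:
    """Return a hypothetical delay before a task sends its KVS commit.

    The window is proportional to the total number of tasks. Rank 0
    is never delayed, because every other task is blocked in the
    barrier while rank 0 issues its repeated commits.
    """
    if not 0 <= rank < total_tasks:
        raise ValueError("rank must be in [0, total_tasks)")
    if rank == 0:
        return 0.0
    window = total_tasks * per_task_seconds  # assumed upper bound of the spread
    # Spread the remaining ranks linearly across the window.
    return window * rank / total_tasks


rank0_delay = commit_delay_seconds(0, 7168)
last_rank_delay = commit_delay_seconds(7167, 7168)
```

With the assumed constant, the window for 7168 tasks is just under 3 seconds, and repeated commits from rank 0 no longer accumulate any delay.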
-
Aaron Knister authored
-
- 06 Jan, 2015 5 commits
-
-
Morris Jette authored
Added Makefile for contribs/sgi file. Moved hypercube symbol definitions from select/linear to common. Minor format changes for consistency with other Slurm code. Moved a variable definition (l_distance) to start of code block to avoid error with some compilers. Fix for possible uninitialized variable use (leftover_nodes).
-
Morris Jette authored
Fix race condition that could start a job that is dependent upon a job array before all tasks of that job array complete. bug 1324
-
Brian Christiansen authored
Bug 1350
-
Danny Auble authored
flag from a job while the job is waiting for a block to boot.
-
Danny Auble authored
-