- 08 Sep, 2011 1 commit
-
-
Danny Auble authored
and vice versa the node->base partition lists will be displayed if set up in your .slurm/sviewrc file.
-
- 06 Sep, 2011 1 commit
-
-
Danny Auble authored
-
- 02 Sep, 2011 2 commits
-
-
Morris Jette authored
Fix bug which would crash slurmctld if job's owner (not root) tries to clear a job's licenses by setting value to "".
-
Morris Jette authored
If a job is deferred due to partition limits, then re-test those limits after a partition is modified. Patch from Don Lipari.
-
- 01 Sep, 2011 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
- 31 Aug, 2011 2 commits
-
-
Danny Auble authored
blocks.
-
Danny Auble authored
-
- 25 Aug, 2011 5 commits
-
-
Danny Auble authored
correct geometry of the system.
-
Danny Auble authored
shutdown and then the system size changed. This would probably only happen if you were emulating a system.
-
Danny Auble authored
-
Danny Auble authored
when we are on a small block, not a regular one.
-
Danny Auble authored
-
- 24 Aug, 2011 5 commits
-
-
Danny Auble authored
isn't sent to the slurmd where environment variables would be overwritten incorrectly.
-
Danny Auble authored
position in the block instead of absolute.
-
Morris Jette authored
If salloc was run as interactive, with job control, reset the foreground process group of the terminal to the process group of the parent pid before exiting. Patch from Don Albert, Bull.
-
Morris Jette authored
Add cray.conf parameter SyncTimeout: the maximum time to defer job scheduling if SLURM node or job state is out of synchronization with ALPS.
-
Danny Auble authored
-
- 23 Aug, 2011 1 commit
-
-
Danny Auble authored
-
- 22 Aug, 2011 2 commits
-
-
Danny Auble authored
_job_create() consistent with similar logic in select_nodes().
-
Danny Auble authored
partition in the slurm.conf.
-
- 19 Aug, 2011 1 commit
-
-
Morris Jette authored
One of our testers created an illegal topology.conf file. He has a config you probably wouldn't see in production, but can see in testing when you are sometimes given a collection of miscellaneous resources.

          |-- nodes
switch1 --|
          |-- switch2 -- nodes

He tried the topology.conf file below. Switch s1 is defined twice. Slurm accepted this config, but wouldn't allocate nodes from both switches to one job.

SwitchName=s1 Nodes=xna[14-26]
SwitchName=s2 Nodes=xna[41-43]
SwitchName=s1 Switches=s2

I believe slurm shouldn't allow the second definition of switch s1. The attached patch checks for duplicate switch names. Patch from Rod Schultz, Bull.
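A duplicate-name check like the one this commit message describes can be sketched as follows. This is an illustration only, not the patch itself: the function name `find_duplicate_switches` and the simple line-based parsing are assumptions, and the real topology.conf parser handles far more than this.

```python
import re
from collections import Counter


def find_duplicate_switches(conf_text):
    """Return switch names defined more than once, in first-seen order.

    Hypothetical helper: it only looks at the SwitchName= key at the
    start of each line and ignores everything else on the line.
    """
    names = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"SwitchName=(\S+)", line, re.IGNORECASE)
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    # dict.fromkeys keeps first-seen order while removing repeats
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


# The configuration quoted in the commit message above.
conf = """\
SwitchName=s1 Nodes=xna[14-26]
SwitchName=s2 Nodes=xna[41-43]
SwitchName=s1 Switches=s2
"""

print(find_duplicate_switches(conf))  # ['s1']
```

A caller could reject the file whenever the returned list is non-empty, which matches the behavior the message asks for.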
-
- 17 Aug, 2011 1 commit
-
-
Danny Auble authored
This reverts commit 350ef5dc.
-
- 16 Aug, 2011 1 commit
-
-
Danny Auble authored
-
- 12 Aug, 2011 2 commits
-
-
Danny Auble authored
next parallel step is run on a sub block, SLURM won't oversubscribe cnodes.
-
Danny Auble authored
-
- 11 Aug, 2011 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
BLUEGENE - Modify "scontrol show step" to show I/O nodes (BGL and BGP) or c-nodes (BGQ) allocated to each step. Change field name from "Nodes=" to "BP_List=".
-
- 10 Aug, 2011 3 commits
-
-
Danny Auble authored
cannot fit into the available shape.
-
Morris Jette authored
Previous code would fail when trying to launch more than 4096 tasks, which is a problem on BGQ systems where SLURM actually launches job steps.
-
Danny Auble authored
or not.
-
- 09 Aug, 2011 3 commits
-
-
Morris Jette authored
This change applies only to Cray systems and only when the srun wrapper for aprun is used. Map --exclusive to -F exclusive and --share to -F share. Note this does not consider the partition's Shared configuration, so it is an imperfect mapping of options.
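The option mapping described above can be sketched as a small translation table. This is not the wrapper's actual code; the function name `map_sharing_options` is hypothetical, and the sketch deliberately ignores every option other than the two named in the message.

```python
def map_sharing_options(srun_args):
    """Translate srun sharing options into aprun-style '-F' arguments.

    Hypothetical helper: only --exclusive and --share are handled, and,
    as the commit message notes, the partition's Shared setting is not
    consulted.
    """
    mapping = {"--exclusive": "exclusive", "--share": "share"}
    aprun_args = []
    for arg in srun_args:
        if arg in mapping:
            aprun_args += ["-F", mapping[arg]]
    return aprun_args


print(map_sharing_options(["--exclusive", "-n", "4"]))  # ['-F', 'exclusive']
```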
-
Morris Jette authored
A node DOWN to ALPS will be marked DOWN to SLURM only after reaching SlurmdTimeout. In the interim, the node state will be NO_RESPOND. This change makes SLURM's handling of the node DOWN state more consistent with ALPS. This change affects only Cray systems.
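The state transition described above can be summarized in a few lines. This is a sketch under stated assumptions: the function name, the "UNCHANGED" return value, and the idea of measuring elapsed seconds since the node last responded are all illustrative and are not taken from the SLURM source.

```python
def slurm_state_for_alps_down(alps_reports_down, seconds_unresponsive, slurmd_timeout):
    """Pick the SLURM-side state for a node, given what ALPS reports.

    Assumption: 'reaching SlurmdTimeout' is modeled as the elapsed time
    being greater than or equal to the timeout.
    """
    if not alps_reports_down:
        return "UNCHANGED"   # placeholder: this sketch only covers the DOWN case
    if seconds_unresponsive >= slurmd_timeout:
        return "DOWN"        # timeout reached, so SLURM also marks the node DOWN
    return "NO_RESPOND"      # interim state until the timeout is reached


print(slurm_state_for_alps_down(True, 30, 300))   # NO_RESPOND
print(slurm_state_for_alps_down(True, 300, 300))  # DOWN
```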
-
Morris Jette authored
Fix the node state accounting to be consistent with the node state set by ALPS.
-
- 05 Aug, 2011 2 commits
-
-
Danny Auble authored
be the same.
-
Danny Auble authored
previously marked down by ALPS.
-
- 04 Aug, 2011 2 commits
-
-
Morris Jette authored
Require SchedulerTimeSlice configuration parameter to be at least 5 seconds to avoid thrashing the slurmd daemon. Addresses Cray bug 774692.
-
Morris Jette authored
Change in GRES behavior for job steps: A job step's default generic resource allocation will be set to that of the job. If a job step's --gres value is set to "none" then none of the generic resources which have been allocated to the job will be allocated to the job step. Add srun environment value of SLURM_STEP_GRES to set default --gres value for a job step.
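The defaulting rules in the message above can be sketched as a small resolver. The function name and string representation are hypothetical. Letting an explicit --gres value take precedence over SLURM_STEP_GRES is an assumption: the message says only that the variable sets the default.

```python
import os


def resolve_step_gres(job_gres, step_gres_option=None, environ=None):
    """Return the GRES string a job step would receive.

    Assumed precedence: explicit --gres option, then the
    SLURM_STEP_GRES environment variable, then the job's allocation.
    """
    environ = os.environ if environ is None else environ
    requested = step_gres_option
    if requested is None:
        requested = environ.get("SLURM_STEP_GRES")
    if requested is None:
        return job_gres   # default: the step inherits the job's allocation
    if requested == "none":
        return ""         # "none": the step gets none of the job's GRES
    return requested


print(resolve_step_gres("gpu:2", environ={}))                            # gpu:2
print(resolve_step_gres("gpu:2", "none", environ={}))                    # (empty line)
print(resolve_step_gres("gpu:2", environ={"SLURM_STEP_GRES": "gpu:1"}))  # gpu:1
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.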
-
- 03 Aug, 2011 2 commits
-
-
Morris Jette authored
On Bluegene systems, smap's command-line mode would generate an invalid memory reference due to an uninitialized variable.
-
Danny Auble authored
a POLLERR the dbd_fail callback is called.
-