- 23 May, 2011 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Describe how to get the Dakota application to work with SLURM and add a contributor to the SLURM team list.
-
Morris Jette authored
Improve how the Cray srun/aprun wrapper handles the ntasks and nnodes options to better distribute tasks over the allocated nodes. Support for these options is imperfect for resource allocations in which the number of tasks per node is not uniform, but that case cannot be handled properly due to differences between srun and aprun.
-
- 19 May, 2011 6 commits
-
-
Danny Auble authored
-
Danny Auble authored
Added logic so that a Cray system can be emulated without enforcing the restrictions that apply on a real one.
-
Moe Jette authored
-
Moe Jette authored
Transfer all of the node fields to reorder them by node_rank. Old logic only transferred some fields, which caused problems on heterogeneous clusters.
-
Moe Jette authored
Fix some typos and improve the technical content of the SLURM design documents for job launch and GRES support.
-
Moe Jette authored
Added a web page to describe the job launch and termination process and made a minor enhancement to the GRES design document.
-
- 18 May, 2011 19 commits
-
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
Conflicts: src/slurmctld/node_mgr.c src/slurmctld/node_scheduler.c
-
Moe Jette authored
Patch from Andriy Grytsenko (Massive Solutions Limited).
-
Moe Jette authored
Synchronize power-save module better with scheduler. Without this change, returning a node to service was typically delayed longer than necessary. Patch from Andriy Grytsenko (Massive Solutions Limited).
-
Moe Jette authored
In scontrol job output, report PreemptTime=None rather than PreemptTime=NO_VAL if not set. Patch from Bill Brophy, Bull.
-
Moe Jette authored
-
Moe Jette authored
This expands the description of how to build slurm using a git repository.
-
Moe Jette authored
Modify job expansion logic to support licenses, generic resources, and currently running job steps in the job which is expanding.
-
Danny Auble authored
-
Morris Jette authored
This improves the initial configuration code:
a) Better handling of DownNodes lines. The previous basil_geometry() would set the node Reason field on failure, irrespective of whether that node had been marked using a DownNodes line.
b) Check all cases of nodes being invisible to ALPS. Up until now basil_geometry() had to be fixed each time a new source of discrepancy between ALPS and SDB state was discovered (the most recent case was NULL coordinates when taking out a blade). Depending on ALPS interface changes, there may be other possibilities. Instead of fixing the SLURM code for each new case, it is better to check whether SLURM and ALPS agree. The price is a tiny delay at SLURM initialisation time (since each node is first looked up in the ALPS inventory), but it pays off well because it eases system administration by pointing to the source of the error. Any node that has suddenly disappeared from the ALPS horizon will now show up in the logs, and will also be marked down in sinfo.
c) At initialisation time, give a summary of how many ALPS nodes are online.
d) Turn the ALPS-node-invisibility error into a warning message, since such nodes may already have been covered in a DownNodes statement.
By merging basil_get_initial_state() into basil_geometry(), the previously separate knowledge about system state (database state, ALPS inventory) is combined, making it easier to identify sources of failure. Patch from Gerrit Renker, CSCS.
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Moe Jette authored
One logging message was misleading and another used an incorrect pointer.
-
Moe Jette authored
Former logic failed to properly allocate resources to a job step when specifying both a task count and a node count range on a heterogeneous cluster.
-
- 17 May, 2011 11 commits
-
-
Danny Auble authored
-
Danny Auble authored
BLUEGENE - Fixed printing of the geo portion of the select_jobinfo struct to work correctly with the regression tests.
-
Danny Auble authored
-
Moe Jette authored
-
Danny Auble authored
-
Moe Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Latest Cray-specific modifications
-
Morris Jette authored
The enum is only needed and referenced in basil_geometry() and otherwise has no special meaning, since it directly depends on the selected output columns. Patch from Gerrit Renker, CSCS.
-