- 19 Aug, 2011 4 commits
-
-
Morris Jette authored
One of our testers created an illegal topology.conf file. He has a config you probably wouldn't see in production, but can see in testing when you are sometimes given a collection of miscellaneous resources.

          |-- nodes
switch1 --|
          |-- switch2 -- nodes

He tried the topology.conf file below. Switch s1 is defined twice. Slurm accepted this config, but wouldn't allocate nodes from both switches to one job.

SwitchName=s1 Nodes=xna[14-26]
SwitchName=s2 Nodes=xna[41-43]
SwitchName=s1 Switches=s2

I believe slurm shouldn't allow the second definition of switch s1. The attached patch checks for duplicate switch names. Patch from Rod Schultz, Bull.
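The duplicate-name check described above can be illustrated with a small standalone sketch. This is not the patch itself (which lives in Slurm's C code); the function name `find_duplicate_switches` and the simplified line parsing are assumptions made for illustration only.

```python
import re


def find_duplicate_switches(conf_text):
    """Return switch names defined more than once, in first-seen order.

    Hypothetical helper: a simplified stand-in for the duplicate check,
    not Slurm's actual topology.conf parser.
    """
    seen = set()
    duplicates = []
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        match = re.search(r"(?i)\bSwitchName=(\S+)", line)
        if not match:
            continue
        name = match.group(1)
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


# The configuration from the report: s1 is defined twice.
EXAMPLE_CONF = """\
SwitchName=s1 Nodes=xna[14-26]
SwitchName=s2 Nodes=xna[41-43]
SwitchName=s1 Switches=s2
"""

if __name__ == "__main__":
    print(find_duplicate_switches(EXAMPLE_CONF))
```

A configuration that declares each switch once yields an empty list, so a caller could reject the file whenever the result is non-empty.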
-
Morris Jette authored
Bug in select/cons_res using tasks-per-node and cpus-per-task
-
Danny Auble authored
-
Carles Fenoy authored
tasks_per_node and cpus_per_task were not getting a proper allocation, causing srun to fail with "Requested node configuration is not available"
-
- 18 Aug, 2011 10 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
window and not running in ncurses.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
through that extra logic when starting up from smap.
-
- 17 Aug, 2011 6 commits
-
-
Danny Auble authored
-
Danny Auble authored
into the block allocator system of bluegene.
-
Danny Auble authored
This reverts commit e245c7bd.
-
Danny Auble authored
-
Danny Auble authored
This reverts commit bb8477b3.
-
Danny Auble authored
This reverts commit 350ef5dc.
-
- 16 Aug, 2011 7 commits
-
-
Morris Jette authored
Modify salloc, sbatch, and srun node specification parsing to accept a number followed by a suffix of "M" to multiply the numeric value by 1,048,576 (1024 x 1024).
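As a rough illustration of the suffix handling described above, the following sketch multiplies a numeric value by 1,048,576 when it ends in "M". The function name `parse_node_count` is hypothetical, and only the uppercase "M" suffix named in the commit message is handled; the real parsing is done in Slurm's C code and may accept more forms.

```python
MEBI = 1024 * 1024  # 1,048,576


def parse_node_count(text):
    """Parse a node count such as "512" or "2M" into an integer.

    Hypothetical helper for illustration; not Slurm's actual parser.
    """
    value = text.strip()
    multiplier = 1
    if value.endswith("M"):
        multiplier = MEBI
        value = value[:-1]
    # Require plain ASCII digits so inputs like "M" or "1.5M" are rejected.
    if not (value.isascii() and value.isdigit()):
        raise ValueError("invalid node count: %r" % text)
    return int(value) * multiplier


if __name__ == "__main__":
    print(parse_node_count("2M"))
```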
-
Morris Jette authored
Large-scale update of bluegene.conf man page to incorporate BlueGene/Q information plus some general updates.
-
Morris Jette authored
Major update to BlueGene web page specifically to include BlueGene/Q information.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 15 Aug, 2011 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
The squeue command was printing bad I/O nodes or c-nodes information for pending jobs. There was no midplane name, just the I/O or c-node specification.
-
Morris Jette authored
test8.22 Bluegene/Q only: Stress test of running many job step allocations within the job's allocation
test8.23 Bluegene/Q only: Test that multiple job allocations are properly packed within a midplane
-
- 13 Aug, 2011 3 commits
-
-
Morris Jette authored
These changes more thoroughly test Bluegene/Q job step placement algorithms and validate several recent bug fixes in the SLURM code.
-
Morris Jette authored
On Bluegene/Q systems, the job step allocation needs to be larger than requested in some cases due to the job allocation geometry (e.g. a 5 cnode allocation needs to be scaled up to at least 6 cnodes). This enhancement fixes that logic if multiple size increases are needed.
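One way to picture why a request may need more than one size increase is the sketch below. It assumes, purely for illustration, that a step must occupy a rectangular box inside a 4x4x4x4x2 midplane, so only sizes that are a product of per-dimension side lengths are valid. The function name `smallest_valid_size` and this box model are assumptions, not a description of SLURM's actual geometry code.

```python
from itertools import product


def smallest_valid_size(requested, dims=(4, 4, 4, 4, 2)):
    """Return the smallest box volume >= requested that fits within dims.

    Hypothetical model: every valid allocation is a box whose side in each
    dimension is between 1 and that dimension's length.
    """
    volumes = set()
    for sides in product(*(range(1, d + 1) for d in dims)):
        volume = 1
        for side in sides:
            volume *= side
        volumes.add(volume)
    candidates = [v for v in volumes if v >= requested]
    if not candidates:
        raise ValueError("request exceeds the available geometry")
    return min(candidates)


if __name__ == "__main__":
    for size in (5, 13):
        print(size, "->", smallest_valid_size(size))
```

Under this model a request for 5 cnodes grows to 6 in a single step, while a request for 13 has to pass over 14 and 15 before reaching 16, which is the kind of repeated increase the commit message refers to.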
-
Danny Auble authored
-
- 12 Aug, 2011 7 commits
-
-
Danny Auble authored
next parallel step is run on a sub block, SLURM won't oversubscribe cnodes.
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This reverts commit c5d63854 from 8/11/2011. The memory copy is not a leak, but is required to avoid memory corruption.
-
Morris Jette authored
Make sure that a job has a step_list before creating an iterator for it
-