- 15 Oct, 2014 4 commits
-
-
Nicolas Joly authored
This reverts commit 4d03d0b4. Make sure the correct Author is attributed here.
-
Danny Auble authored
This reverts commit 1891936e.
-
Danny Auble authored
This has apparently been broken from the get go. This fixes bug 1172. test21.22 should be updated to test the dump and load of a file that is generated.
-
Danny Auble authored
using --ntasks-per-node. This is related to bug 1145. All the CPUs were being allocated on one socket instead of cyclically across sockets. While this is allowed, it is strange and resulted in this bug. There appears to be a separate bug that caused the tasks to be laid out in block fashion in the first place.
-
- 14 Oct, 2014 2 commits
-
-
Danny Auble authored
with no way to get them out. This fixes bug 1134. It is advised that the pro/epilog call xtprocadmin in the script instead of returning a non-zero exit code.
-
Nicolas Joly authored
Signed-off-by: Danny Auble <da@schedmd.com>
-
- 10 Oct, 2014 7 commits
-
-
Danny Auble authored
-
Brian Christiansen authored
Bug #1143
-
Dorian Krause authored
This commit fixes a bug we observed when combining select/linear with gres. If an allocation was requested with a --gres argument, an srun execution within that allocation would stall indefinitely:

    -bash-4.1$ salloc -N 1 --gres=gpfs:100
    salloc: Granted job allocation 384049
    bash-4.1$ srun -w j3c017 -n 1 hostname
    srun: Job step creation temporarily disabled, retrying

The slurmctld log showed:

    debug3: StepDesc: user_id=10034 job_id=384049 node_count=1-1 cpu_count=1
    debug3: cpu_freq=4294967294 num_tasks=1 relative=65534 task_dist=1 node_list=j3c017
    debug3: host=j3l02 port=33608 name=hostname network=(null) exclusive=0
    debug3: checkpoint-dir=/home/user checkpoint_int=0
    debug3: mem_per_node=62720 resv_port_cnt=65534 immediate=0 no_kill=0
    debug3: overcommit=0 time_limit=0 gres=(null) constraints=(null)
    debug: Configuration for job 384049 complete
    _pick_step_nodes: some requested nodes j3c017 still have memory used by other steps
    _slurm_rpc_job_step_create for job 384049: Requested nodes are busy

If srun --exclusive had been used instead, everything would have worked fine. The reason is that in exclusive mode the code properly checks whether memory is a reserved resource in the _pick_step_nodes() function. This commit modifies the alternate code path to do the same.
-
Danny Auble authored
(i.e. ArchiveJobs, PurgeJobs). This is only a cosmetic change.
-
Nicolas Joly authored
on slurmdbd startup.
-
Danny Auble authored
-
Danny Auble authored
lots of jobs.
-
- 09 Oct, 2014 2 commits
-
-
Danny Auble authored
did the ALPS reservation. Bug 1115
-
Morris Jette authored
Take more job options into consideration when estimating a job's node count.
-
- 08 Oct, 2014 3 commits
-
-
Danny Auble authored
-
inodb authored
At work in Sweden we often fika (coffee, buns and what have you) at 3PM. I sometimes accidentally give a start time of 'teatime', so when I return from 'fika' I see my job's just getting started. This fix should make life even easier for the Swedes.
-
Danny Auble authored
-
- 07 Oct, 2014 4 commits
-
-
Danny Auble authored
a reservation.
-
Danny Auble authored
which they have access to (rather than preventing them from seeing ANY reservation). Backport from 14.11 commit 77c2bd25.
-
Danny Auble authored
arbitrary layouts (test1.59).
-
Brian Christiansen authored
option (since it isn't).
-
- 04 Oct, 2014 1 commit
-
-
Morris Jette authored
Do not cause it to be rebooted (powered up).
-
- 03 Oct, 2014 4 commits
-
-
Morris Jette authored
When a node's state is set to power_down, then execute SuspendProgram even if previously executed for that node.
-
Morris Jette authored
Fix logic determining when job configuration (i.e. running node power up logic) is complete. (Will look at better solution for v14.11).
-
Morris Jette authored
When a node's state is set to power_up, then execute ResumeProgram even if previously executed for that node.
-
Danny Auble authored
different times when reservations are using the associations that are being deleted.
-
- 02 Oct, 2014 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 30 Sep, 2014 1 commit
-
-
Morris Jette authored
-
- 29 Sep, 2014 5 commits
-
-
Danny Auble authored
-
Morris Jette authored
Remove logic that was creating GRES bitmap for node when not needed (only needed when GRES mapped to specific files).
-
Morris Jette authored
Correct logic to support job GRES specification over 31 bits (problem in logic converting int to uint32_t).
-
Morris Jette authored
-
Danny Auble authored
-
- 26 Sep, 2014 1 commit
-
-
David Bigagli authored
when terminating the job.
-
- 25 Sep, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
different number of dimensions than the cluster.
-
- 24 Sep, 2014 1 commit
-
-
David Bigagli authored
when terminating the job.
-