- 02 Apr, 2013 6 commits
-
-
Morris Jette authored
-
Danny Auble authored
and when reading in state from DB2 we find a block that can't be created. You can now do a clean start to get rid of the bad block.
-
Danny Auble authored
the slurmctld there were software errors on some nodes.
-
Danny Auble authored
without it still existing there. This is extremely rare.
-
Danny Auble authored
a pending job on it we don't kill the job.
-
Danny Auble authored
while it was free cnodes would go into software error and kill the job.
-
- 01 Apr, 2013 1 commit
-
-
Morris Jette authored
Fix for bug 224
-
- 29 Mar, 2013 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 27 Mar, 2013 3 commits
-
-
Jason Bacon authored
-
Morris Jette authored
Without this patch, when the slurmd cold starts or slurmstepd terminates abnormally, the job script file can be left around. bug 243
-
Morris Jette authored
Previously such a job submitted to a DOWN partition would be queued. bug 187
-
- 26 Mar, 2013 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
a reservation when it has the "Ignore_Jobs" flag set. Since jobs could run on its nodes outside of the reservation, without this the time could be counted twice.
-
- 25 Mar, 2013 2 commits
-
-
Morris Jette authored
This is not applicable with launch/aprun
-
Morris Jette authored
-
- 22 Mar, 2013 1 commit
-
-
Morris Jette authored
These changes are required so that select/cray can load select/linear, which is a bit more complex than the other select plugin structures. Export plugin_context_create and plugin_context_destroy symbols from libslurm.so. Correct typo in exported hostlist_sort symbol name. Define some functions in select/cray to avoid undefined symbols if the plugin is loaded via libslurm rather than from a slurm command (which has all of the required symbols).
-
- 20 Mar, 2013 2 commits
-
-
Hongjia Cao authored
-
Danny Auble authored
cluster.
-
- 19 Mar, 2013 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 14 Mar, 2013 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 13 Mar, 2013 1 commit
-
-
Morris Jette authored
If a step requests more CPUs than are possible in the specified node count of the job allocation, return ESLURM_TOO_MANY_REQUESTED_CPUS rather than ESLURM_NODES_BUSY (which would cause the request to be retried).
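A minimal sketch of the check described here, using hypothetical names and placeholder error values rather than the actual slurmctld code: a request that can never fit in the allocation gets a permanent error, while one that could fit later keeps the retryable "busy" error.

```c
#include <stdint.h>

/* Placeholder error values for illustration only. */
#define ESLURM_NODES_BUSY               1
#define ESLURM_TOO_MANY_REQUESTED_CPUS  2

static int check_step_cpus(uint32_t req_cpus, uint32_t node_cnt,
                           uint32_t cpus_per_node, uint32_t idle_cpus)
{
    if (req_cpus > node_cnt * cpus_per_node)
        return ESLURM_TOO_MANY_REQUESTED_CPUS; /* can never be satisfied */
    if (req_cpus > idle_cpus)
        return ESLURM_NODES_BUSY;              /* retry once CPUs free up */
    return 0;                                  /* request can proceed now */
}
```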
-
- 12 Mar, 2013 1 commit
-
-
Morris Jette authored
-
- 11 Mar, 2013 2 commits
-
-
Nathan Yee authored
Without this change, when the sbatch --export option is used, many Slurm environment variables are not set unless explicitly exported.
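A rough sketch of the intent, assuming a hypothetical helper rather than the real sbatch code: the SLURM_* variables describing the job are set explicitly, so they reach the batch script even when --export filters the user environment.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: inject job-description variables regardless of
 * any user-level environment filtering. */
static void set_slurm_job_env(uint32_t job_id, uint32_t num_nodes)
{
    char val[32];

    snprintf(val, sizeof(val), "%u", (unsigned) job_id);
    setenv("SLURM_JOB_ID", val, 1);          /* overwrite=1: set even if filtered */

    snprintf(val, sizeof(val), "%u", (unsigned) num_nodes);
    setenv("SLURM_JOB_NUM_NODES", val, 1);
}
```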
-
Danny Auble authored
-
- 08 Mar, 2013 2 commits
-
-
Morris Jette authored
-
Danny Auble authored
success
-
- 07 Mar, 2013 1 commit
-
-
jette authored
This problem would affect systems in which specific GRES are associated with specific CPUs. One possible result is that the CPUs identified as usable could be inappropriate and the job would be held when trying to lay out the tasks on CPUs (all done as part of the job allocation process). The other problem is that if multiple GRES are linked to specific CPUs, there was a CPU bitmap OR which should have been an AND, resulting in some CPUs being identified as usable but not available to all GRES.
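A generic sketch of the bitmap logic (not the actual gres plugin code): with several GRES each tied to specific CPUs, the usable CPU set is the intersection (AND) of the per-GRES masks, not their union (OR).

```c
#include <stdint.h>

/* One 64-bit mask per GRES type; bit i set means CPU i can serve that GRES. */
static uint64_t usable_cpus(const uint64_t *gres_cpu_mask, int gres_cnt)
{
    uint64_t cpus = ~(uint64_t)0;            /* start from "all CPUs usable" */

    for (int i = 0; i < gres_cnt; i++) {
        /* Intersection: a CPU is usable only if every GRES can reach it.
         * A bitwise OR here is the bug described above: it would report
         * CPUs reachable by only some of the GRES as usable. */
        cpus &= gres_cpu_mask[i];
    }
    return cpus;
}
```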
-
- 06 Mar, 2013 2 commits
-
-
Danny Auble authored
options in srun, and push that logic to salloc and sbatch. Bug 201
-
Danny Auble authored
and timeout in the runjob_mux trying to send in this situation. Bug 223
-
- 04 Mar, 2013 2 commits
-
-
Morris Jette authored
The original reservation data structure is deleted and its backup added to the reservation list, but jobs can retain a pointer to the original (now invalid) reservation data structure. Bug 250
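A minimal sketch of the hazard and the fix, using hypothetical structures rather than the slurmctld ones: when the backup record replaces the original, any job still pointing at the original must be re-pointed before the original is freed.

```c
#include <stdlib.h>

typedef struct resv_record { char *name; /* ... */ } resv_record_t;
typedef struct job_record  { resv_record_t *resv_ptr; /* ... */ } job_record_t;

/* Re-point every job that referenced the old reservation record before
 * freeing it; otherwise those jobs hold a dangling pointer. */
static void swap_resv_record(job_record_t *jobs, size_t job_cnt,
                             resv_record_t *old_resv, resv_record_t *backup)
{
    for (size_t i = 0; i < job_cnt; i++) {
        if (jobs[i].resv_ptr == old_resv)
            jobs[i].resv_ptr = backup;
    }
    free(old_resv);    /* safe now: no job references the old record */
}
```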
-
Alejandro Lucero Palau authored
-
- 01 Mar, 2013 1 commit
-
-
Danny Auble authored
-
- 28 Feb, 2013 1 commit
-
-
Danny Auble authored
energy data.
-
- 27 Feb, 2013 2 commits
-
-
Danny Auble authored
-
Matthieu Hautreux authored
-