- 22 Jul, 2014 1 commit
-
-
Morris Jette authored
Unload job tables rather than windows at job end. The table unload also unloads the windows.
-
- 19 Jul, 2014 1 commit
-
-
Morris Jette authored
-
- 18 Jul, 2014 6 commits
-
-
David Bigagli authored
lost should the slurmctld restart.
-
David Bigagli authored
-
Morris Jette authored
Correct NumCPUs count for jobs with --exclusive option. bug 909
-
Morris Jette authored
This probably only happens on native Cray systems due to the deallocation delays related to node health check. In any case, the symptom is an error message of this sort: "job # dealloc of node ... bad node_offset 0 count is 0". It then fails to deallocate the node's GRES back for use by other jobs. bug 973
-
Danny Auble authored
slurm_conf_reinit.
-
Danny Auble authored
counting as multiple nodes.
-
- 17 Jul, 2014 3 commits
-
-
Gennaro Oliva authored
-
Morris Jette authored
-
David Bigagli authored
slurmstepd attempts to create it, for example left over from a previous requeue or crash, delete it and recreate it. #961.
-
- 16 Jul, 2014 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
switch/nrt - Unload job tables (in addition to windows) in user space mode to avoid leaking NRT. bug 964
-
- 15 Jul, 2014 2 commits
-
-
Morris Jette authored
Fix race condition which could result in requeue if batch job exit and node registration occur at the same time.
-
Danny Auble authored
(From that commit) There was a problem when building from source where, for example, @bindir@ would resolve to ${prefix}/bin. This patch fixes it, based on http://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Installation-Directory-Variables.html. It also changes opt_modulefiles_slurm to opt_modulefiles_slurm.in, but I couldn't figure out how to get git diff to show that.
-
- 14 Jul, 2014 2 commits
-
-
David Bigagli authored
and should there be problems accessing the state files.
-
Morris Jette authored
Fix for possible abort on change in GRES configuration. bug 958
-
- 11 Jul, 2014 2 commits
-
-
Remi Palancher authored
-
Remi Palancher authored
(commit 4cd63575) with sacctmgr load when Parent has "'" around it
-
- 10 Jul, 2014 4 commits
-
-
Nathan Yee authored
page.
-
Morris Jette authored
-
Danny Auble authored
special characters in their names like ':'
-
Morris Jette authored
It seems to have been broken for some time because the logic blocking SIGWINCH was removed from srun, which could result in the signal being delivered to any thread rather than the thread designed to handle window resizing.
-
- 09 Jul, 2014 8 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
failure as well as interactive user.
-
Danny Auble authored
epilog complete message comes from that node do not process the batch step information since the job has already been requeued because the epilog script running isn't guaranteed in this situation.
-
Danny Auble authored
start of requeued jobs.
-
David Bigagli authored
-
- 08 Jul, 2014 7 commits
-
-
David Bigagli authored
-
Morris Jette authored
Expand from 8 to 16 bits for more than 256 sizes
-
Morris Jette authored
sched/backfill - Fix anomaly that could result in jobs being scheduled out of order. bug 911
-
Morris Jette authored
Without this, a batch script could be bound to a single CPU rather than the full job allocation. This is a revision to commit 399cb897, without which batch scripts have no binding to CPUs via hwloc.
-
Morris Jette authored
squeue and scontrol to report a job's "shared" value based upon partition options rather than reporting "unknown" if job submission does not use --exclusive or --shared option. bug 939
-
Morris Jette authored
-
John Morrissey authored
-
- 07 Jul, 2014 1 commit
-
-
David Bigagli authored
-
- 03 Jul, 2014 1 commit
-
-
Danny Auble authored
-