- 22 Jul, 2014 1 commit
-
-
Morris Jette authored
switch/nrt - Do not explicitly unload windows for a job on termination, only unload its table (which automatically unloads its windows).
-
- 19 Jul, 2014 1 commit
-
-
Morris Jette authored
-
- 18 Jul, 2014 10 commits
-
-
Morris Jette authored
-
David Bigagli authored
lost should the slurmctld restart.
-
David Bigagli authored
-
Morris Jette authored
Correct NumCPUs count for jobs with --exclusive option. bug 909
-
David Bigagli authored
lost should the slurmctld restart.
-
David Bigagli authored
-
Morris Jette authored
Correct NumCPUs count for jobs with --exclusive option. bug 909
-
Morris Jette authored
This probably only happens on native Cray systems due to the deallocation delays related to node health check. In any case, the symptom is an error message of this sort: "job # dealloc of node ... bad node_offset 0 count is 0". It then fails to deallocate the node's GRES back for use by other jobs. bug 973
-
Danny Auble authored
slurm_conf_reinit.
-
Danny Auble authored
counting as multiple nodes.
-
- 17 Jul, 2014 3 commits
-
-
Gennaro Oliva authored
-
Morris Jette authored
-
David Bigagli authored
slurmstepd attempts to create it, for example left over from a previous requeue or crash, delete it and recreate it. #961.
-
- 16 Jul, 2014 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
switch/nrt - Unload job tables (in addition to windows) in user space mode to avoid leaking NRT. bug 964
-
- 15 Jul, 2014 3 commits
-
-
Morris Jette authored
Fix race condition which could result in a requeue if a batch job exit and a node registration occur at the same time.
-
Danny Auble authored
(From that commit) There was a problem when building from source where, for example, @bindir@ would resolve to ${prefix}/bin. This patch fixes it, based on http://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Installation-Directory-Variables.html. It also changes opt_modulefiles_slurm to opt_modulefiles_slurm.in, but I couldn't figure out how to get git diff to show that.
-
Danny Auble authored
and will avoid the need for all the daemons to link to libhwloc
-
- 14 Jul, 2014 2 commits
-
-
David Bigagli authored
and should there be problems accessing the state files.
-
Morris Jette authored
Fix for possible abort on change in GRES configuration. bug 958
-
- 11 Jul, 2014 2 commits
-
-
Remi Palancher authored
-
Remi Palancher authored
(commit 4cd63575) with sacctmgr load when Parent has "'" around it
-
- 10 Jul, 2014 9 commits
-
-
Morris Jette authored
-
Nathan Yee authored
page.
-
Danny Auble authored
-
David Bigagli authored
configured then srun will use its dynamic ports only from the configured range.
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
special characters in their names like ':'
-
Morris Jette authored
It seems to have been broken for some time because the logic that blocked SIGWINCH was removed from srun, so the signal could be delivered to any thread rather than to the thread designed to handle window resizing.
-
Morris Jette authored
Added support for job email triggers: TIME_LIMIT, TIME_LIMIT_90 (reached 90% of time limit), TIME_LIMIT_80 (reached 80% of time limit). Applies to salloc, sbatch and srun commands.
-
- 09 Jul, 2014 7 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
Morris Jette authored
Added CpuFreqDef configuration parameter in slurm.conf to specify the default CPU frequency and governor to be set at job end.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
failure as well as interactive user.
-
Danny Auble authored
epilog complete message comes from that node, do not process the batch step information, since the job has already been requeued because the epilog script is not guaranteed to run in this situation.
-