- 18 Jul, 2014 4 commits
-
-
Morris Jette authored
Correct NumCPUs count for jobs with --exclusive option. bug 909
-
Morris Jette authored
This probably only happens on native Cray systems due to the deallocation delays related to node health check. In any case, the symptom is an error message of this sort: "job # dealloc of node ... bad node_offset 0 count is 0". It then fails to release the nodes' GRES for use by other jobs. bug 973
-
Danny Auble authored
slurm_conf_reinit.
-
Danny Auble authored
counting as multiple nodes.
-
- 17 Jul, 2014 3 commits
-
-
Gennaro Oliva authored
-
Morris Jette authored
-
David Bigagli authored
slurmstepd attempts to create it, for example left over from a previous requeue or crash, delete it and recreate it. #961.
-
- 16 Jul, 2014 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
switch/nrt - Unload job tables (in addition to windows) in user space mode to avoid leaking NRT. bug 964
-
- 15 Jul, 2014 3 commits
-
-
Morris Jette authored
Fix race condition which could result in requeue if batch job exit and node registration occur at the same time.
-
Danny Auble authored
(From that commit) There was a problem when building from source where for example @bindir@ would resolve to ${prefix}/bin. This patch fixes it, based on http://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Installation-Directory-Variables.html It also changes opt_modulefiles_slurm to opt_modulefiles_slurm.in but I couldn't figure out how to get git diff to show that.
-
Danny Auble authored
and will avoid the need for all the daemons to link to libhwloc
-
- 14 Jul, 2014 2 commits
-
-
David Bigagli authored
and should there be problems accessing the state files.
-
Morris Jette authored
Fix for possible abort on change in GRES configuration. bug 958
-
- 11 Jul, 2014 2 commits
-
-
Remi Palancher authored
-
Remi Palancher authored
(commit 4cd63575) with sacctmgr load when Parent has "'" around it
-
- 10 Jul, 2014 9 commits
-
-
Morris Jette authored
-
Nathan Yee authored
page.
-
Danny Auble authored
-
David Bigagli authored
configured then srun will use its dynamic ports only from the configured range.
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
special characters in their names like ':'
-
Morris Jette authored
It seems to have been broken for some time because the logic that blocked SIGWINCH was removed from srun, so the signal could be delivered to any thread rather than the thread designed to handle window resizing.
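The failure mode in this commit message is a general property of POSIX threads: a signal sent to a process may be delivered to any thread that does not block it. The usual remedy is to block the signal before other threads are created and to receive it in one dedicated thread. The following is a minimal Python sketch of that pattern, assuming a Unix system and a call from the main thread. It is not srun's code (srun is written in C), and the function name `wait_for_winch` is hypothetical.

```python
import os
import signal
import threading


def wait_for_winch():
    """Block SIGWINCH everywhere, then receive it in one dedicated thread.

    Hypothetical illustration only; must be called from the main thread.
    """
    # Install a no-op handler so the signal is not discarded on systems
    # where SIGWINCH is ignored by default.
    signal.signal(signal.SIGWINCH, lambda signum, frame: None)

    # Block the signal in this thread BEFORE starting any other thread.
    # Threads created afterwards inherit the blocked mask, so no thread
    # can receive SIGWINCH asynchronously.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGWINCH})

    received = []

    def resize_thread():
        # sigwait() suspends this thread until a blocked signal from the
        # set is pending, then returns its number.
        received.append(signal.sigwait({signal.SIGWINCH}))

    worker = threading.Thread(target=resize_thread, daemon=True)
    worker.start()

    # Simulate a terminal resize by signalling our own process.
    os.kill(os.getpid(), signal.SIGWINCH)
    worker.join(timeout=5)
    return received[0] if received else None
```

If the blocking step is omitted, the signal is handled by whichever thread the kernel picks, which matches the symptom described above.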
-
Morris Jette authored
Added support for job email triggers: TIME_LIMIT, TIME_LIMIT_90 (reached 90% of time limit), TIME_LIMIT_80 (reached 80% of time limit). Applies to salloc, sbatch and srun commands.
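The threshold arithmetic behind these triggers can be sketched in Python. This is an illustration, not Slurm's implementation: the trigger names come from the commit message, while the function `due_time_limit_triggers` and its arguments are hypothetical. How often each email is actually sent is not modeled.

```python
def due_time_limit_triggers(elapsed_seconds, limit_seconds):
    """Return the names of the time-limit triggers whose threshold is reached.

    Hypothetical helper for illustration; thresholds are percentages of
    the job's time limit, as described in the commit message.
    """
    thresholds = [
        ("TIME_LIMIT_80", 80),
        ("TIME_LIMIT_90", 90),
        ("TIME_LIMIT", 100),
    ]
    # Integer comparison avoids floating-point rounding at the boundary.
    return [
        name
        for name, percent in thresholds
        if elapsed_seconds * 100 >= limit_seconds * percent
    ]


# A job with a 60-minute limit that has run for 54 minutes is at 90%.
print(due_time_limit_triggers(54 * 60, 60 * 60))
# prints ['TIME_LIMIT_80', 'TIME_LIMIT_90']
```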
-
- 09 Jul, 2014 10 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
Morris Jette authored
Added CpuFreqDef configuration parameter in slurm.conf to specify the default CPU frequency and governor to be set at job end.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
failure as well as interactive user.
-
Danny Auble authored
epilog complete message comes from that node do not process the batch step information since the job has already been requeued because the epilog script running isn't guaranteed in this situation.
-
Danny Auble authored
start of requeued jobs.
-
Danny Auble authored
-
David Bigagli authored
-
- 08 Jul, 2014 5 commits
-
-
David Bigagli authored
-
Morris Jette authored
Expand from 8 to 16 bits for more than 256 sizes
-
Morris Jette authored
sched/backfill - Fix anomaly that could result in jobs being scheduled out of order. bug 911
-
Morris Jette authored
Without this, a batch script could be bound to a single CPU rather than the full job allocation. This is a revision to commit 399cb897, without which batch scripts have no binding to CPUs via hwloc.
-
Morris Jette authored
squeue and scontrol to report a job's "shared" value based upon partition options rather than reporting "unknown" if job submission does not use --exclusive or --shared option. bug 939
-