- 07 Nov, 2003 1 commit
-
-
Moe Jette authored
re-issuing it (reduces communications overhead).
-
- 06 Nov, 2003 6 commits
-
-
Moe Jette authored
-
Moe Jette authored
KILL_JOB RPC to the slurmd, repeatedly if needed, but don't confirm it). (gnats:326)
-
Moe Jette authored
sent in a more timely fashion.
-
Moe Jette authored
using node hostlist format.
-
Moe Jette authored
for node registration, which is relatively rare).
-
Moe Jette authored
very quickly due to a race condition (shared memory was purged before it was completely set up).
-
- 05 Nov, 2003 13 commits
-
-
Moe Jette authored
getting the configuration from slurmctld (which could be down).
-
Moe Jette authored
primary and secondary controllers.
-
Moe Jette authored
between primary and backup. The request has a brief window in which it can abort, and we want to decrease the likelihood of that happening by retrying less frequently when we know control is transitioning.
-
Moe Jette authored
is still waiting to take control (rather than one long wait).
-
Moe Jette authored
-
Moe Jette authored
Slurmctld must be restarted for changes in these configuration parameters to take effect. Previously the values would change, but the already loaded plugin would not.
-
Moe Jette authored
(jobcomp/none) in the configuration data structure.
-
Moe Jette authored
data structure.
-
Moe Jette authored
to take effect.
-
Moe Jette authored
-
Mark Grondona authored
-
Mark Grondona authored
-
Moe Jette authored
-
- 04 Nov, 2003 7 commits
-
-
Mark Grondona authored
-
Mark Grondona authored
force -gstabs if TotalView support is needed
-
Mark Grondona authored
-
Mark Grondona authored
-
Mark Grondona authored
-
Mark Grondona authored
-
Mark Grondona authored
o Allow processing of UNSTABLE release for SLURM versioning
-
- 03 Nov, 2003 2 commits
- 31 Oct, 2003 3 commits
- 30 Oct, 2003 2 commits
- 29 Oct, 2003 6 commits
-
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
and/or job step(s) will have their resources de-allocated and be killed. A resource allocation will not be released unless no job steps are active for at least InactiveLimit seconds. DPCS jobs will be subject to this forced de-allocation if they remain inactive for an extended period of time, which can get SLURM and DPCS back in sync if DPCS does a cold-start.
-
Moe Jette authored
were starting multiple jobs simultaneously and slurmd was not able to respond to all of the requests without generating a message timeout. (gnats:319)
-