- 05 Nov, 2003 13 commits
-
-
Moe Jette authored
getting the configuration from slurmctld (which could be down).
-
Moe Jette authored
primary and secondary controllers.
-
Moe Jette authored
between primary and backup. The request has a brief window in which it can abort and we want to decrease the likelihood of that happening by retrying less frequently when we know control is transitioning.
-
Moe Jette authored
is still waiting to take control (rather than one long wait).
-
Moe Jette authored
-
Moe Jette authored
Slurmctld must be restarted for changes in these configuration parameters to take effect. Previously these values would change, but there would be no change in the already loaded plugin.
-
Moe Jette authored
(jobcomp/none) in the configuration data structure.
-
Moe Jette authored
data structure.
-
Moe Jette authored
to take effect.
-
Moe Jette authored
-
Mark Grondona authored
-
Mark Grondona authored
-
Moe Jette authored
-
- 04 Nov, 2003 7 commits
-
-
Mark Grondona authored
-
Mark Grondona authored
force -gstabs if TotalView support is needed
-
Mark Grondona authored
-
Mark Grondona authored
-
Mark Grondona authored
-
Mark Grondona authored
-
Mark Grondona authored
o Allow processing of UNSTABLE release for SLURM versioning
-
- 03 Nov, 2003 2 commits
- 31 Oct, 2003 3 commits
- 30 Oct, 2003 2 commits
- 29 Oct, 2003 6 commits
-
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
and/or job step(s) will have their resources de-allocated and be killed. A resource allocation will not be released unless no job steps are active for at least InactiveLimit seconds. DPCS jobs will be subject to this forced de-allocation if they remain inactive for an extended period of time, which can get SLURM and DPCS back in sync if DPCS does a cold-start.
-
Moe Jette authored
were starting multiple jobs simultaneously and slurmd was not able to respond to all of the requests without generating a message timeout. (gnats:319)
-
- 28 Oct, 2003 1 commit
-
-
Moe Jette authored
-
- 24 Oct, 2003 6 commits
-
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
report an error message.
-
Moe Jette authored
-
Moe Jette authored
avoid highly fragmented resource allocations. Add list of excluded nodes to job info dumped and reported. Fix how mismatched RPC version numbers are handled. Let error code get back to the API function. Dump job state information upon each job's termination via plugin. Re-issue incomplete write requests in job/partition state save. Make slurmctld continue proper operation without any default partition (gnats:317). Add command/RPC to delete a partition. Retry socket connection for slurmd/io.c as needed (gnats:253).
-
Moe Jette authored
and quitting.
-