- 24 Sep, 2003 2 commits
-
-
jwindley authored
-
Mark Grondona authored
srun erroneously expected replies from these hosts (gnats:291)
-
- 23 Sep, 2003 5 commits
-
-
Mark Grondona authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
scalability. An arbitrary number of requests may be queued. They are processed one per second until the queue is empty or all pending requests were last attempted recently (the minimum retry interval is configured as 60 seconds).
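The queueing behaviour described in this commit message can be illustrated with a minimal Python sketch. It is not the slurmctld implementation: the class name `RetryQueue`, the `tick` method, and the `send` callback are hypothetical names chosen for illustration, and only the one-request-per-tick pacing and the 60-second minimum retry interval come from the message above.

```python
from collections import deque

MIN_RETRY_INTERVAL = 60  # seconds, per the commit message


class RetryQueue:
    """Hypothetical queue that retries failed requests no sooner than
    min_retry_interval seconds after their previous attempt."""

    def __init__(self, min_retry_interval=MIN_RETRY_INTERVAL):
        self.min_retry_interval = min_retry_interval
        # Each entry is [request, time of last attempt or None].
        self._pending = deque()

    def enqueue(self, request):
        self._pending.append([request, None])

    def __len__(self):
        return len(self._pending)

    def tick(self, now, send):
        """Intended to be called once per second. Attempts at most one
        eligible request and returns it, or None if nothing was attempted."""
        index = None
        for i, (_, last_attempt) in enumerate(self._pending):
            if last_attempt is None or now - last_attempt >= self.min_retry_interval:
                index = i
                break
        if index is None:
            return None  # queue empty, or everything was attempted recently
        request = self._pending[index][0]
        del self._pending[index]
        if not send(request):
            # Failed attempt: keep the request and remember when it was tried.
            self._pending.append([request, now])
        return request
```

A caller would invoke `tick` from a once-per-second loop, passing the current time and a function that returns `True` when the request was delivered.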
-
Moe Jette authored
These jobs are reported by slurmd on node registration. They are logged but otherwise ignored by slurmctld. Several slurmd logging messages were changed to report job id and step id using %u format rather than %d format (which shows no-allocate job id values as negative numbers).
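The %d versus %u difference comes down to how the same 32 bits are interpreted. The short Python sketch below reinterprets an unsigned 32-bit value as signed, which is what a %d conversion typically shows on two's-complement platforms. The job id value is a hypothetical example, not an actual SLURM constant, and the helper name `as_signed_32` is an assumption for illustration.

```python
import struct


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


# Hypothetical job id with the high bit set (illustrative value only).
job_id = 0xFFFFFFFE

print(job_id)                # unsigned view, as %u would show: 4294967294
print(as_signed_32(job_id))  # signed view, as %d would show: -2
```

Any id at or above 2^31 appears negative in the signed view, so logging such ids with an unsigned conversion keeps them readable.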
-
- 22 Sep, 2003 2 commits
- 21 Sep, 2003 11 commits
-
-
Moe Jette authored
--relative option is lower than the node count specified. The --relative option takes precedence.
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
that reach the slurm inactivity time limit.
-
Moe Jette authored
control (it needs to complete all pending RPCs and save state before the primary reads state and takes over).
-
Moe Jette authored
responsibilities (backup was routinely reading at startup).
-
Moe Jette authored
server to shutdown request.
-
Moe Jette authored
SIGPWR, and SIGLOST.
-
Moe Jette authored
and when returned to service went improperly back into state DRAINING (job counter was inconsistent).
-
Moe Jette authored
transition to DRAINED state.
-
- 20 Sep, 2003 6 commits
-
-
Moe Jette authored
data will not be used and the process is too slow anyway.
-
Moe Jette authored
EPILOG_COMPLETE_MESSAGE. At this time the job is COMPLETED and all associated nodes are available.
-
Mark Grondona authored
-
Moe Jette authored
-
Mark Grondona authored
on nodes relative to the current allocation.
o srun no longer sends SIGKILL to the job if one task is killed, except when --no-allocate is used (the job will otherwise be killed by the controller anyway).
-
Mark Grondona authored
-
- 19 Sep, 2003 11 commits
-
-
Mark Grondona authored
function.
-
Mark Grondona authored
memory appears to be full.
-
Mark Grondona authored
write() call.
-
Mark Grondona authored
- instead of attempting to kill pending threads, immediately exit wait_for_procs if a thread is already waiting for the job.
- if wait_for_procs fails (thread already waiting), exit without sending the epilog complete RPC.
-
Moe Jette authored
type debug.
-
Moe Jette authored
descriptors. This was needed in several grouped functions (e.g. slurm_send_recv_rc_msg and slurm_send_only_node_msg, which combine open, send, receive, and close functions for simplicity).
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
RPC server threads to that number - 2. This should slightly reduce the incoming RPC load.
-
Moe Jette authored
and may have been exhausting virtual memory, resulting in the death of slurmctld.
-
Mark Grondona authored
-
- 18 Sep, 2003 3 commits
-
-
Moe Jette authored
the left-over slurmctld process on abnormal termination.
-
Moe Jette authored
clarity. No change in functionality or logic.
-
Moe Jette authored
If the "-c" option is not specified, then only the jobs and some node state information will be preserved: specifically, the state of DOWN, DRAINED, or DRAINING nodes and the associated reason field for those nodes.
-