Fix for record job state on successful allocation but failed reply message. (8dd79221) · Commits · Manuel G. Marciani / ces_slurm_simulator

Commit 8dd79221 authored Dec 20, 2017 by

Alejandro Sanchez

Fix for record job state on successful allocation but failed reply message.

On a job [pack]allocation RPC request, if the allocation succeeds but
sending the response message back to the client fails (e.g. srun was killed
before it could receive the response), then update the job record through
the job_record pointer so that the job_state is set to FAILED, the exit_code
is set as if the job had received a SIGTERM signal, and the state_reason is
set to FAIL_LAUNCH. Users querying the job with sacct can then discern that
something went wrong in this scenario, instead of seeing STATE shown as
COMPLETED and ExitCode as 0:0.
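
The state change described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `sketch_*` types, enum values, and function names are hypothetical stand-ins for Slurm's real job record and constants, and storing the bare signal number in `exit_code` is an assumption about the encoding.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for Slurm's job states and reasons. */
enum sketch_job_state { SKETCH_JOB_RUNNING, SKETCH_JOB_COMPLETE, SKETCH_JOB_FAILED };
enum sketch_state_reason { SKETCH_WAIT_NO_REASON, SKETCH_FAIL_LAUNCH };

/* Hypothetical, cut-down job record; the real structure has many more fields. */
struct sketch_job_record {
    uint32_t job_id;
    enum sketch_job_state job_state;
    enum sketch_state_reason state_reason;
    uint32_t exit_code;
};

/* Record that the allocation worked but the reply never reached the client. */
void sketch_mark_reply_failed(struct sketch_job_record *job)
{
    assert(job != NULL);
    job->job_state = SKETCH_JOB_FAILED;
    job->exit_code = SIGTERM;              /* assumed encoding: bare signal number */
    job->state_reason = SKETCH_FAIL_LAUNCH;
}

/*
 * Called after the allocation has succeeded. reply_sent is the outcome of
 * sending the response message back to the client (e.g. srun).
 */
void sketch_finish_allocation(struct sketch_job_record *job, bool reply_sent)
{
    if (!reply_sent)
        sketch_mark_reply_failed(job);
}
```

Calling `sketch_finish_allocation(&job, false)` on a running job leaves it in the failed state with the launch-failure reason, while a successful reply leaves the record untouched.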

Bug 4513.

parent f4494704
