- 26 Aug, 2015 12 commits
-
-
Morris Jette authored
This is a more thorough treatment of bug 1790 and commit d2545ca746ea0a2f9653664a601052cfd5eb8ad. It clears the task_id_bitmap once the last task is scheduled and task_cnt becomes zero.
-
Morris Jette authored
-
Morris Jette authored
Prevent job array task ID from being reported as NO_VAL if the last task in the array gets requeued. The problem is that when that task starts, the task bitmap entry for it stays set, but the task counter gets decremented. If that job then gets requeued, under some conditions a failure to schedule it results in the array_task_id in the job record getting set to NO_VAL. Then, when building the job info to report for squeue/scontrol, the string showing the pending task IDs is not rebuilt because that counter is zero. All indications are that the job runs fine; only the information reported to squeue/scontrol is wrong. bug 1790
-
Thomas Cadeau authored
-
David Bigagli authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
accounting correctly to avoid meaningless errors about overflow.
-
Morris Jette authored
-
- 25 Aug, 2015 17 commits
-
-
Brian Christiansen authored
-
Brian Christiansen authored
Conflicts: config.h.in src/sreport/job_reports.c
-
Nathan Yee authored
-
Brian Christiansen authored
Bug 1873
-
Morris Jette authored
-
Morris Jette authored
Fix shutdown race condition that could cause the plugin to deadlock. Improve validation of burst buffer options. Hold job with a burst_buffer specification later discovered to be bad.
-
Morris Jette authored
-
Danny Auble authored
intended for the head of the tree.
-
Morris Jette authored
Handle out-of-band (i.e. outside of Slurm) persistent burst buffer creation/deletion with respect to their reservations, as used for limit enforcement.
-
Douglas Jacobsen authored
-
Nathan Yee authored
-
David Bigagli authored
-
Morris Jette authored
Avoid creating a duplicate burst buffer record due to a race condition between the "create_persistent" thread and the "load_state" thread.
-
Morris Jette authored
-
Morris Jette authored
Improve precision of poll timeout logic
-
Danny Auble authored
or binary.
-
Morris Jette authored
Previously we reported an underflow of the node's comp_job_cnt. (The message actually reported an underflow of run_job_cnt, but was really checking the value of comp_job_cnt.) In any case, the count is cleared when the node goes down, as we are not counting on getting any responses from the down node.
-
- 24 Aug, 2015 9 commits
-
-
Morris Jette authored
-
Morris Jette authored
Major refactoring of burst buffer option parsing for interactive job support
-
Morris Jette authored
-
Morris Jette authored
The --bbf option reads the job's burst buffer options from a file rather than an inline option
-
Danny Auble authored
meaning the limit is removed.
-
Danny Auble authored
-
Danny Auble authored
wouldn't get the correct protocol version to launch a step.
-
jette authored
Wait for job TRES limit check before starting burst buffer allocation
-
jette authored
Trigger the job scheduler as soon as the last persistent buffer operation for a pending job completes. At that point, the job buffer can be allocated.
-
- 22 Aug, 2015 2 commits
-
-
Danny Auble authored
This reverts commit 7bdf6917. I believe c4545110 should fix this. I have verified this doesn't happen with just this commit.
-
Morris Jette authored
-