- 14 Jul, 2016 14 commits
-
-
Morris Jette authored
Preserve variable resp_msg for use in error message and use a different variable for temporary storage.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Fix gang scheduling and license release logic if a single-node job is killed on a bad node. Notifying the gang scheduler and releasing licenses is normally done when epilog completion happens, but if the node(s) assigned to a job are all down, that does not happen. This results in the licenses being reserved indefinitely and the gang scheduler being left with a bad (old) job pointer that can result in various failure modes. Bug 2867.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Add hotels. Other minor changes.
-
Danny Auble authored
-
Danny Auble authored
anyway to attempt to log the backtraces of the potential unkillable processes.
-
Danny Auble authored
667f1105.
-
Danny Auble authored
-
Morris Jette authored
Used wrong symbol name in commit c4e34cb9 a few hours ago
-
Morris Jette authored
Match sessions and instances using new DataWarp data format
-
Morris Jette authored
-
- 13 Jul, 2016 11 commits
-
-
Morris Jette authored
Correction to logic in commit c0919263
-
Morris Jette authored
Move the real_size function to after the buffer has been set up, per Cray documentation
-
Danny Auble authored
We have decided to go back to the way 15.08 called NHC instead of calling it first before sending a SIGKILL to the step's tasks. With this patch we only start the NHC early when we have to resend the SIGKILL for unkillable processes. This will hopefully get us the backtrace of the unkillable processes, which was the reason we did it this way in the first place :).
-
Danny Auble authored
processes.
-
Morris Jette authored
Job array scripts were not being found in some cases
-
Brian Christiansen authored
-
Morris Jette authored
Don't treat an old version of dw_wlm_cli which does not support the real_size function as an error. Just log it using debug.
-
Morris Jette authored
The Cray burst buffer (DataWarp) API syntax has changed. Modify test to work with new syntax.
-
Morris Jette authored
Test system with old kernel doesn't define O_CLOEXEC, so partially revert commit b1ca7526
-
Morris Jette authored
-
Morris Jette authored
-
- 12 Jul, 2016 14 commits
-
-
Nicolas Joly authored
Bug 2892.
-
Morris Jette authored
-
Danny Auble authored
Bug 2874. We will most likely redo this logic (as it appears to be duplicated) in a following patch.
-
Morris Jette authored
bug 1884
-
Morris Jette authored
No changes in logic
-
Morris Jette authored
-
Morris Jette authored
Don't generate an error when a batch job is submitted that must wait for stage-in before starting.
-
Danny Auble authored
-
Danny Auble authored
Bug 2886
-
Morris Jette authored
Add new SchedulerParameters option of bb_array_stage_cnt=# to indicate how many pending tasks of a job array should be made available for burst buffer resource allocation. bug 1884
-
Tim Wickberg authored
-
Tim Wickberg authored
Conflicts: src/sstat/options.c
-
Jacek Budzowski authored
Was incorrectly translating request to job.extern if part of a comma-separated list. Bug 2890.
-
Morris Jette authored
bug 2858
-
- 11 Jul, 2016 1 commit
-
-
Morris Jette authored
bug 2858
-