- 10 Feb, 2017 2 commits
-
-
Alejandro Sanchez authored
Bug 3446
-
Morris Jette authored
bug 3446
-
- 09 Feb, 2017 6 commits
-
-
Morris Jette authored
burst_buffer/cray - Support a default pool that is not the first pool reported by DataWarp, and log in Slurm when pools are added to or removed from DataWarp. bug 3453
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
This reverts commit fd690a9c.
-
Danny Auble authored
-
Artem Polyakov authored
-
- 08 Feb, 2017 3 commits
-
-
Brian Christiansen authored
sstat doesn't work on Cray ALPS but works on native Cray setups.
-
Alejandro Sanchez authored
Jobs preempted with PreemptMode=REQUEUE were incorrectly recorded as REQUEUED in the accounting. Bug 3444
-
Morris Jette authored
bug 3448
-
- 07 Feb, 2017 1 commit
-
-
Dominik Bartkiewicz authored
Bug 3447
-
- 03 Feb, 2017 1 commit
-
-
Alejandro Sanchez authored
Bug 3444
-
- 31 Jan, 2017 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Alejandro Sanchez authored
-
- 30 Jan, 2017 3 commits
-
-
Danny Auble authored
e3a7bdcc f9804256 d72b13f2 Reference bug 3366. If you are running on a Bluegene system, we rely on the prolog to take us out of the configuring state. These commits work well for systems that reboot the nodes where the prolog is running, but on Bluegene the opposite is desired :). On a Bluegene, these commits pretty much make it so a batch job never gets launched.
-
Morris Jette authored
Clear a job's reason of "BeginTime" in a more timely fashion and/or prevent jobs from being stuck in a PENDING state. There are multiple ways of clearing the reason, especially on a lightly loaded system, but the state can persist indefinitely on a heavily loaded system. bug 3368
-
Morris Jette authored
Fix logic for getting the expected start time of an existing job ID with an explicit begin time that is in the past. The previous logic compared that (past) begin time, rather than the current time, with the advanced reservations that would compete with the job.
-
- 29 Jan, 2017 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
On Cray systems with step NHC, the step launches are delayed and produce a pair of messages that caused the test to fail: "srun: Job step creation temporarily disabled, retrying" and "srun: Job step created".
-
Morris Jette authored
-
Morris Jette authored
-
- 28 Jan, 2017 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
Avoid a test failure if all nodes in a partition are not usable (down, drained, reserved, or otherwise unusable).
-
Morris Jette authored
-
Morris Jette authored
Disable the test if select/linear is in use as the underlying plugin.
-
- 27 Jan, 2017 2 commits
-
-
Danny Auble authored
Turns out this never worked, ever. What used to happen, if the protocol_version that was read in didn't match, is that the rpc_version given to unpack things was just 0. What this does now is set the rpc_version to what was stored, making it all good.
-
Danny Auble authored
correctly.
-
- 26 Jan, 2017 3 commits
-
-
Alejandro Sanchez authored
Bug 3431
-
Morris Jette authored
-
Alejandro Sanchez authored
bug 3433
-
- 25 Jan, 2017 8 commits
-
-
Morris Jette authored
burst_buffer/cray - Fix race condition that could cause multiple batch job launch requests resulting in downed nodes. bug 3366
-
Dominik Bartkiewicz authored
-
Danny Auble authored
This reverts commit b9bff82f.
-
Danny Auble authored
-
Morris Jette authored
It was leaking memory otherwise
-
Tim Wickberg authored
Commit 63b7e3a8 changed the --mem limit to 1MB for the job if not using a memory SelectType, but this can cause the job to fail if the JobAcctGatherFrequency is frequent enough to notice that the "sleep" command is using more than 1MB of resources. Refactor test to avoid specifying job memory. Use --wrap to avoid creating a temporary job script as well.
-
Tim Wickberg authored
Commit 63b7e3a8 changed the --mem limit to 1MB for the step if not using a memory SelectType, but this can cause the job to fail if the JobAcctGatherFrequency is frequent enough to notice that the "sleep" command is using more than 1MB of resources. Refactor test to avoid specifying job memory. Use --wrap to avoid creating a temporary job script.
-
Tim Wickberg authored
Commit 63b7e3a8 changed the --mem limit to 1MB for the step if not using a memory SelectType, but this can cause the job to fail if the JobAcctGatherFrequency is frequent enough to notice that the "sleep" command is using more than 1MB of resources. Refactor test to avoid specifying memory; and since only one step is checked for, only run a single step in the job. Use --wrap to avoid creating a temporary job script.
-