- 15 Feb, 2018 2 commits
-
-
Doug Jacobsen authored
Fixes issue with bf_min_prio_reserve not being respected, leading to significantly impacted backfill performance. Bug 4760.
-
Morris Jette authored
Log that support for the ChosLoc configuration parameter will end in Slurm version 18.08. Bug 4791
-
- 14 Feb, 2018 1 commit
-
-
Tim Wickberg authored
And would prevent transmission of a file. Bug 4787.
-
- 13 Feb, 2018 4 commits
-
-
Danny Auble authored
Bug 4736 Bug 4784
-
Morris Jette authored
Partial write can happen under high system load, leading to step termination when finishing the write would let the step launch properly instead. Fix suggested by Matthieu Hautreux. Bug 4778.
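The retry idea described above can be sketched as follows. This is a minimal illustration, not the actual Slurm patch: `write_all` is a hypothetical helper name, and it assumes a blocking file descriptor. It keeps calling `write()` until every byte has been sent, treating a short write as progress rather than as a failure.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Slurm's code): write exactly len bytes to fd.
 * A short write is not an error; continue from where it stopped.
 * Returns 0 on success, -1 on a real error (errno is left set).
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t left = len;

	while (left > 0) {
		ssize_t n = write(fd, p, left);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			return -1;		/* genuine failure */
		}
		p += n;				/* skip the bytes already written */
		left -= (size_t) n;
	}
	return 0;
}
```

A caller would then treat only a -1 return as grounds for terminating the step, instead of any write that returned fewer bytes than requested.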
-
Felip Moll authored
Add a privilege check for when an unprivileged user tries to modify a resource. Min level set to Operator. Bug 4735.
-
Felip Moll authored
Bug 4747.
-
- 12 Feb, 2018 1 commit
-
-
Felip Moll authored
Fixes some issues around differences in lua package naming. Bug 4568.
-
- 08 Feb, 2018 1 commit
-
-
Dominik Bartkiewicz authored
Bug 4709.
-
- 07 Feb, 2018 9 commits
-
-
Alejandro Sanchez authored
Previously it was taking the MIN, without respecting the order. Also add a note to the resource_limits.html page to clarify the exception for Max[Wall|Time] and/or [Max|Min]Nodes limits, where by default the Partition limit takes precedence, unless the respective job's QOS flags Partition[Min|Max|Time]Limit are set. Bug 4681.
-
Danny Auble authored
This prevents a hard-to-diagnose issue where slurmstepd may fail to start due to a missing library. This now ensures slurmd will fail, and keep the node down until the library issue can be fixed. Bug 4645, 4644.
-
Danny Auble authored
fatal() calls exit(1) which precludes getting a backtrace. That's fine on configuration issues and other types of problem, but for hitting "impossible" edge cases getting a core dump may be the only way to isolate the issue. Adding to 17.11 so we can easily provide diagnostic patches without needing users to back-port this implementation. Further use will come in 18.08. Bug 4599.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
This reverts commit 18b35709.
-
Tim Wickberg authored
This reverts commit 6a91845b.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 06 Feb, 2018 3 commits
-
-
Danny Auble authored
Wait for the prolog to complete, and then clean up after it. Otherwise orphaned "sleep" executables will be created by the still-launching step_extern. Bug 4718.
-
Felip Moll authored
Preserve node features when slurmctld daemons are reconfigured, including active and available KNL features. Bug 4734.
-
Isaac Hartung authored
Bug 4630.
-
- 05 Feb, 2018 1 commit
-
-
Brian Christiansen authored
Bug 4722
-
- 01 Feb, 2018 3 commits
-
-
Regine Gaudin authored
keyvalue_initialized is reset on 'scontrol reconfigure' and other cases, which can lead to additional atfork handlers being registered. These can eventually lead to a segfault if an excessive number of handlers have been re-registered. Set a separate boolean to protect against this. Clear that boolean as part of the atfork handler. Bug 4628.
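The guard described above might look roughly like this. It is a sketch under assumptions, not the actual patch: all names are hypothetical, the counter exists only to make the effect observable, and clearing the flag in the child handler is a guess, since the log does not say which atfork handler clears it.

```c
#include <pthread.h>
#include <stdbool.h>

/* All names here are hypothetical, chosen for illustration. */
static bool table_initialized;		/* may be reset, e.g. on reconfigure */
static bool atfork_installed;		/* separate flag: guards registration only */
static int atfork_registrations;	/* counter, for illustration only */

/* Assumption: the flag is cleared in the child handler. The log only
 * says it is cleared "as part of the atfork handler". */
static void child_after_fork(void)
{
	table_initialized = false;
	atfork_installed = false;
}

static void init_table(void)
{
	if (!table_initialized) {
		/* ... (re)build the table here ... */
		table_initialized = true;
	}

	/* Register the handler only once, even if the table flag above
	 * has been reset and the table rebuilt many times. */
	if (!atfork_installed &&
	    pthread_atfork(NULL, NULL, child_after_fork) == 0) {
		atfork_installed = true;
		atfork_registrations++;
	}
}
```

With a single shared flag, every reset of `table_initialized` would trigger another `pthread_atfork()` call; keeping the registration flag separate holds the handler count at one per process.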
-
Felip Moll authored
UsePss was correct, but UsePSS and usepss would be silently ignored, leading to confusion as to whether the option was working or not. Treat all JobAcctGatherParams as case-insensitive to avoid confusion. Bug 4637.
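Case-insensitive matching of this kind can be sketched with the POSIX `strncasecmp()` function. The helper name `has_param` and the assumption that the parameters arrive as one comma-separated string are illustrative only; Slurm's own parsing code may be structured differently.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper: return true if the comma-separated list
 * `params` contains the option `opt`, ignoring letter case.
 * Only whole tokens match, so "UsePssX" does not match "UsePss".
 */
static bool has_param(const char *params, const char *opt)
{
	size_t opt_len = strlen(opt);
	const char *p = params;

	while (p && *p) {
		size_t tok_len = strcspn(p, ",");	/* length of this token */

		if (tok_len == opt_len && strncasecmp(p, opt, opt_len) == 0)
			return true;

		p += tok_len;
		if (*p == ',')
			p++;				/* step over the separator */
	}
	return false;
}
```

With this approach, `UsePss`, `UsePSS`, and `usepss` are all accepted as the same option instead of the latter two being silently ignored.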
-
Brian Christiansen authored
This reverts commit 516b0d59, with the NEWS file fixed. We want to keep the idea of only checking one federation.
-
- 30 Jan, 2018 9 commits
-
-
Brian Christiansen authored
Bug 4548
-
Brian Christiansen authored
This reverts commit fb73b8a4. # Conflicts: # NEWS
-
Brian Christiansen authored
message before purging the job record to get the uid of the revoked job. Bug 4502
-
Danny Auble authored
This reverts commit fc0c3e6c.
-
Danny Auble authored
message before purging the job record to get the uid of the revoked job. Bug 4502
-
Morris Jette authored
Bug 4651
-
Brian Christiansen authored
Bug 4548
-
Danny Auble authored
Bug 4634
-
David Gloe authored
job container where if the step was canceled would also cancel the stepd erroneously. Bug 4634
-
- 29 Jan, 2018 3 commits
-
-
Morris Jette authored
one already exists owned by a different user will be logged and the job held. Bug 4614
-
Tim Wickberg authored
-
Alejandro Sanchez authored
Bug 4681
-
- 25 Jan, 2018 3 commits
-
-
Danny Auble authored
from when last started. Signed-off-by: Danny Auble <da@schedmd.com>
-
Felip Moll authored
if LaunchParameter test_exec is set. Bug 4439
-
Felip Moll authored
rights by a secondary group id. Bug 4439
-