- 01 Mar, 2016 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
Ensure that a job is completely launched before trying to suspend it. Previous logic would start suspend logic early in the life of the slurmstepd process, after its listening socket was open but before the tasks were launched. This defers the suspend logic until after all prologs and setup complete and the tasks are launched. This is important in the case of gang scheduling, in which newly launched jobs can be immediately suspended. bug 2494
-
Morris Jette authored
-
Danny Auble authored
-
- 29 Feb, 2016 14 commits
-
-
Danny Auble authored
commit 75317972.
-
Danny Auble authored
commit ff2c5b88
-
Danny Auble authored
ff2c5b88
-
Danny Auble authored
QOS flag affects.
-
Danny Auble authored
deal with time limit with TRES limits.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
to use this function with associations.
-
Danny Auble authored
Bug 1976
-
Morris Jette authored
If power save mode was configured along with PrologSlurmctld, then when PrologSlurmctld completed, it was clearing the node's PowerUp state flag, which launched the job before boot completed. New logic waits for the boot to complete and slurmd to register on the node before clearing the PowerUp flag.
-
Morris Jette authored
Previous logic was not waiting for all nodes to go into the "on" state.
-
Morris Jette authored
-
Tim Wickberg authored
Found with -Werror=old-style-declaration, remove it rather than move 'inline' to the start of the declaration to silence the warning.
-
Tim Wickberg authored
Default value is 1. Weight is uint32_t so this check was always succeeding.
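The note above about uint32_t can be illustrated with a small sketch. It assumes the removed check was a non-negativity test, which the commit message does not state; the function names `weight_is_non_negative` and `store_weight` are hypothetical and do not come from the Slurm source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the removed check (assumption: it tested
 * that the weight was not negative). Every uint32_t value widens to a
 * non-negative int64_t, so this returns true for all inputs.
 */
bool weight_is_non_negative(uint32_t weight)
{
    return (int64_t)weight >= 0;
}

/*
 * Hypothetical helper showing why a "negative" weight never reaches
 * such a check: converting to uint32_t reduces the value modulo 2^32,
 * so -1 is stored as UINT32_MAX.
 */
uint32_t store_weight(int64_t requested)
{
    return (uint32_t)requested;
}
```

Under these assumptions, `store_weight(-1)` yields `UINT32_MAX`, and `weight_is_non_negative` accepts it, so a check of this shape cannot reject anything and can be dropped without changing behavior.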
-
- 27 Feb, 2016 5 commits
-
-
Tim Wickberg authored
Moe noted in 2008 that this was not needed on recent kernels. Eight years seems like enough time.
-
Tim Wickberg authored
Coverity issue 70420.
-
Tim Wickberg authored
Remove the unused PMI2U_FUNC macro rather than changing it. No functional changes.
-
Morris Jette authored
Refactor resource selection logic to support the node_feature AllowUserBoot configuration parameter
-
Morris Jette authored
-
- 26 Feb, 2016 13 commits
-
-
Tim Wickberg authored
No functional change.
-
Danny Auble authored
-
Alejandro Sanchez authored
-
Morris Jette authored
-
Tim Wickberg authored
-
Morris Jette authored
-
Morris Jette authored
-
Tim Wickberg authored
Add note to slurm.conf man page about setting "--cpu_bind=no" as part of SallocDefaultCommand if a TaskPlugin is in use.
-
Maksym Planeta authored
-
Bjørn-Helge Mevik authored
Test 14.10 in the test suite (of Slurm 15.08.8, at least) uses $sinfo -tidle -h -o%n to find idle nodes. This only works if NodeHostname == NodeName on the nodes. The following should work regardless: $scontrol show hostnames \$($sinfo -tidle -h -o%N)
-
Morris Jette authored
-
Tim Wickberg authored
-
Morris Jette authored
Revert the call to getaddrinfo (introduced in Slurm 16.05.0pre1), which was failing on some systems, and restore gethostbyaddr. Specifically, test7.2 was failing on some systems with getaddrinfo() returning an error of "System error: Resource temporarily unavailable". Partial reversion of commit 89621f65
-
- 25 Feb, 2016 4 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
Because the function is inlined, the single definition let GCC build everything properly, but debug builds (which disable inlining) failed when running srun --cpu_bind=v with: slurmstepd: [465.0]: symbol lookup error: (trimmed path)/task_cgroup.so: undefined symbol: val_to_char. task/affinity already had this definition; task/cgroup did not.
-
Tim Wickberg authored
Kernel option cgroup_enable=memory is likely what you want to fix; at least Debian and Ubuntu ship with it disabled.
-
Tim Wickberg authored
Nothing checks their return code, and it was always success.
-