- 16 Oct, 2014 1 commit
-
-
jette authored
Specifically, if a job is terminated from the suspended state, end_time will be the time at which the job suspension began.
-
- 15 Oct, 2014 15 commits
-
-
Morris Jette authored
If set, powered down nodes in the cloud will be visible.
-
Morris Jette authored
This fixes a race condition if the slurmctld needed to power up a node shortly after startup. Previously it would execute the ResumeProgram twice for affected nodes.
-
Morris Jette authored
Without this change, a node in the cloud that failed to power up would not have its NoResponding flag cleared, which would prevent its later use. The NoResponding flag is now cleared when the node is manually modified to PowerDown.
-
Morris Jette authored
If a batch job launch to the cloud fails, permit an unlimited number of job requeues. Previously the job would abort on the second launch failure.
-
Nicolas Joly authored
to commit fa73701e.
-
Danny Auble authored
Conflicts: testsuite/expect/test1.91
-
Nicolas Joly authored
This reverts commit 4d03d0b4. Make sure the correct Author is attributed here.
-
Danny Auble authored
This reverts commit 1891936e.
-
Danny Auble authored
This has apparently been broken from the get go. This fixes bug 1172. test21.22 should be updated to test the dump and load of a file that is generated.
-
Danny Auble authored
since this represents the user could be exaggerating their system.
-
Danny Auble authored
-
Danny Auble authored
using --ntasks-per-node. This is related to bug 1145. What was happening is that all the CPUs were allocated on one socket instead of being distributed cyclically. While this is allowed, it is strange and resulted in this bug. There appears to be a different bug as to why the tasks were laid out in a block fashion in the first place.
-
Nicolas Joly authored
-
Morris Jette authored
-
Morris Jette authored
-
- 14 Oct, 2014 17 commits
-
-
Morris Jette authored
This adds checks for NULL pointers in the gres data structures to avoid memory references including string compares with NULL pointers.
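The kind of guard described above can be sketched as a string compare that tolerates NULL arguments. This is only an illustration: `safe_strcmp` is a hypothetical helper name, not a function taken from the Slurm gres code, and the ordering chosen for NULL values is an assumption.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Slurm's actual code): strcmp() that never
 * dereferences a NULL pointer. Two NULLs compare equal, and NULL sorts
 * before any non-NULL string.
 */
int safe_strcmp(const char *a, const char *b)
{
	if (a == NULL && b == NULL)
		return 0;
	if (a == NULL)
		return -1;
	if (b == NULL)
		return 1;
	return strcmp(a, b);
}
```

A caller comparing two gres names can then use `safe_strcmp(name1, name2) == 0` without first checking either pointer.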
-
Morris Jette authored
-
Morris Jette authored
Attempted to free a variable that was not a pointer. Bug 1166.
-
Danny Auble authored
with no way to get them out. This fixes bug 1134. It is advised that the prolog/epilog call xtprocadmin in the script instead of returning a non-zero exit code.
-
Danny Auble authored
It was needed at one point, since in versions <=2.5 srun was only a wrapper around aprun that needed help. This is no longer the case, and has not been since we made srun do all the heavy lifting, so we can remove it.
-
Danny Auble authored
-
Brian Christiansen authored
The job could have been purged because of a short MinJobAge, and the trigger would then point to an invalid job. Bug #1144
-
Danny Auble authored
-
Morris Jette authored
Conflicts: src/slurmd/slurmd/slurmd.c
-
Morris Jette authored
Note that PlugStackConfig defaults to plugstack.conf in the same directory as slurm.conf. The added logic tests whether the file actually exists (using stat) and, if it is not found, does not fork/exec slurmstepd to invoke the spank prolog/epilog. This saves about 14 msec on startup and 14 msec on shutdown if no spank plugins are configured. It also eliminates some possible failures (e.g. if fork() fails, or the slurmstepd processes cannot exec()). This logic also caches the PlugStackConfig value and reads it again on reconfigure, but avoids reading the value for each job. Bug 982.
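A minimal sketch of the check described above, assuming a cached copy of the configured path and a stat() call as the existence test. The names `plugstack_cache_set` and `spank_config_present` are hypothetical and chosen for illustration; they are not the identifiers used in slurmd.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical cache of the PlugStackConfig path (not slurmd's variable). */
static char *cached_plugstack_path = NULL;

/*
 * Store a private copy of the configured path. Intended to be called at
 * startup and again on reconfigure, so the value is not re-read per job.
 */
void plugstack_cache_set(const char *path)
{
	char *copy = NULL;

	if (path != NULL) {
		size_t len = strlen(path) + 1;
		copy = malloc(len);
		if (copy != NULL)
			memcpy(copy, path, len);
	}
	free(cached_plugstack_path);
	cached_plugstack_path = copy;
}

/*
 * Return true only if the cached path exists according to stat().
 * A caller would skip the fork/exec of the helper process when this
 * returns false.
 */
bool spank_config_present(void)
{
	struct stat st;

	if (cached_plugstack_path == NULL)
		return false;
	return stat(cached_plugstack_path, &st) == 0;
}
```

With this shape, the per-job path costs one stat() call rather than a fork/exec, and a missing file means the spank prolog/epilog step is skipped entirely.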
-
Morris Jette authored
Add "void" argument to a function and rename a local function to have a prefix of "_"
-
Danny Auble authored
Issue is only in rc1. Fix regression from commit bfd4697b for bug
-
Danny Auble authored
-
Nicolas Joly authored
-
Danny Auble authored
9b00f12c
-
Nicolas Joly authored
Signed-off-by: Danny Auble <da@schedmd.com>
-
Morris Jette authored
-
- 13 Oct, 2014 4 commits
-
-
Nicolas Joly authored
-
jette authored
-
jette authored
-
- 11 Oct, 2014 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
If a node is down, then permit setting its state to power down, which causes the SuspendProgram to run and set the node state back to cloud.
-
Morris Jette authored
If a node is powered down, then do not power it up on slurmctld restart.
-