- 11 Sep, 2017 6 commits
-
-
Tim Wickberg authored
After auditing the four calls using this function, it's clear that none of these fd's are ever meant to leak to a fork()'d process.
-
Tim Wickberg authored
-
Tim Wickberg authored
Created by VIM if .swp is already in use.
-
Morris Jette authored
-
Morris Jette authored
-
Ole H Nielsen authored
-
- 09 Sep, 2017 6 commits
-
-
Tim Wickberg authored
Per the creat() man page, creat() is equivalent to calling open with flags of O_CREAT|O_WRONLY|O_TRUNC. Add O_CLOEXEC as well.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Only cut over when the heartbeat file is not being updated any longer. Bug 4142.
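A rough sketch of the check on the backup side, assuming the heartbeat file holds a single decimal timestamp and that "not being updated" means older than a timeout. `read_heartbeat()`, `heartbeat_is_stale()`, and the threshold are illustrative assumptions, not the commit's actual code.

```c
#include <stdio.h>
#include <time.h>

/* Read the timestamp stored in the heartbeat file. Returns 0 on success. */
int read_heartbeat(const char *path, time_t *stamp)
{
	long long value;
	int rc;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	rc = (fscanf(fp, "%lld", &value) == 1) ? 0 : -1;
	fclose(fp);
	if (rc == 0)
		*stamp = (time_t) value;
	return rc;
}

/*
 * Returns 1 if the last heartbeat is more than `timeout` seconds old,
 * meaning the backup may cut over; returns 0 otherwise.
 */
int heartbeat_is_stale(time_t last_beat, time_t now, unsigned timeout)
{
	return difftime(now, last_beat) > (double) timeout;
}
```

How a missing or unreadable heartbeat file should be treated is a policy decision that this sketch leaves to the caller.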
-
Tim Wickberg authored
Will write out a timestamp into a 'heartbeat' file in StateSaveLocation every (SlurmctldTimeout / 4) seconds to demonstrate that the primary controller still has access to the directory, and thus the backup should avoid taking control. Bug 4142.
-
- 08 Sep, 2017 22 commits
-
-
Morris Jette authored
-
Isaac Hartung authored
-
Brian Christiansen authored
The change from strncpy to strlcpy was chopping off the last character of the name.
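One common way such a regression arises is that strncpy() takes the number of bytes to copy, while strlcpy() takes the size of the destination including the terminating NUL. The sketch below illustrates that pitfall. It is not the code from this commit, and `my_strlcpy()` is a portable stand-in because strlcpy() is not available in every libc.

```c
#include <string.h>

/* Portable stand-in for BSD strlcpy(): copies at most size - 1 bytes. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = (len >= size) ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* strncpy idiom: copy name_len bytes, then terminate by hand. */
void copy_name_strncpy(char *dst, const char *src, size_t name_len)
{
	strncpy(dst, src, name_len);
	dst[name_len] = '\0';
}

/* Naive conversion: the same count now has to hold the NUL as well. */
void copy_name_strlcpy_buggy(char *dst, const char *src, size_t name_len)
{
	my_strlcpy(dst, src, name_len);		/* drops the last character */
}

/* Corrected conversion: leave room for the terminating NUL. */
void copy_name_strlcpy_fixed(char *dst, const char *src, size_t name_len)
{
	my_strlcpy(dst, src, name_len + 1);
}
```

In each case `dst` must have room for at least `name_len + 1` bytes.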
-
Tim Wickberg authored
xgroup_set_param() avoids the string parsing overhead and should be used instead.
-
Tim Wickberg authored
-
Tim Wickberg authored
Save re-parsing the input string back into the components.
-
Tim Wickberg authored
Since ReleaseAgent is no longer required, we can strip out all the supporting logic for it.
-
Morris Jette authored
Accidentally removed bracket in checked-in code.
-
Morris Jette authored
-
Morris Jette authored
-
Dominik Bartkiewicz authored
If /proc was inaccessible, proc_name would leak. Put an explicit length cap in sprintf to avoid a warning. The size is checked immediately before this point, so this just makes the 10-char limit explicit. Bug 4062.
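Both fixes can be illustrated with a small sketch. Only the `proc_name` variable and the 10-character limit come from the description above; the function names, the `/proc/<pid>/stat` path, and the buffer sizes are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "/proc/<pid>/stat"; the %.10s precision makes the cap explicit. */
int build_proc_path(char *out, size_t out_size, const char *pid_str)
{
	int n;

	if (strlen(pid_str) > 10)
		return -1;
	n = snprintf(out, out_size, "/proc/%.10s/stat", pid_str);
	return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}

/* Returns a malloc'd process name, or NULL. No error path leaks. */
char *read_proc_name(const char *pid_str)
{
	char path[32];
	char *proc_name = malloc(64);
	FILE *fp;

	if (!proc_name)
		return NULL;
	if (build_proc_path(path, sizeof(path), pid_str) != 0 ||
	    !(fp = fopen(path, "r"))) {
		free(proc_name);	/* /proc inaccessible: release buffer */
		return NULL;
	}
	/* The second field of /proc/<pid>/stat is "(comm)". */
	if (fscanf(fp, "%*d (%63[^)]", proc_name) != 1) {
		fclose(fp);
		free(proc_name);
		return NULL;
	}
	fclose(fp);
	return proc_name;
}
```

The caller owns the returned string and must free() it.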
-
Morris Jette authored
-
Morris Jette authored
-
Dominik Bartkiewicz authored
-
Tim Wickberg authored
If the network path to the shared storage used for the StateSaveLocation is separate from the one used to communicate with the cluster, both the primary and backup controllers can end up acting as master when the cluster network is lost. Alter the HA takeover code path to make sure that the job state save file is not still being updated by the primary slurmctld. If it is, refuse to take over and retry later. Bug 3592.
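One way to detect that a file is "still being updated" is to compare its modification time against the current time. The sketch below assumes that approach; the function name and the freshness window are hypothetical, and the actual takeover code may use a different test.

```c
#include <sys/stat.h>
#include <time.h>

/*
 * Returns 1 if `path` was modified within the last `window` seconds
 * (the primary still appears to be writing state), 0 if it is older,
 * and -1 if the file cannot be examined.
 */
int state_file_recently_updated(const char *path, time_t now, time_t window)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return ((now - st.st_mtime) < window) ? 1 : 0;
}
```

A backup controller using this check would decline to take over while the result is 1 and try again on its next pass.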
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Bug 3921.
-
Tim Wickberg authored
-
Dominik Bartkiewicz authored
Bug 4062.
-
Marshall Garey authored
-
Marshall Garey authored
Bug 4084.
-
- 07 Sep, 2017 6 commits
-
-
Morris Jette authored
-
Dominik Bartkiewicz authored
Bug 3824.
-
Morris Jette authored
-
Morris Jette authored
Do not run the Node Health Check (NHC) when the external step terminates. This happens when the job allocation ends, and the job NHC will be executed anyway. Bug 4074.
-
Dominik Bartkiewicz authored
-
Danny Auble authored
-