- 12 Jan, 2016 3 commits
-
-
Tim Wickberg authored
Match behavior of other PBS-like resource managers. Bug 2330.
-
Alejandro Sanchez authored
-
Dorian Krause authored
Don't allow user-specified reservation names to disrupt the normal reservation sequence numbering scheme. bug 2318
-
- 11 Jan, 2016 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
anything. The slurmd will process things correctly after the fact.
-
Morris Jette authored
The restriction from Cray has been lifted. bug 2317
-
- 08 Jan, 2016 1 commit
-
-
Tim Wickberg authored
Otherwise upgrading slurm on a compute node while tasks are running will cause a plugin mismatch, as slurmstepd previously would not load the library until task completion. Bug 2319.
-
- 07 Jan, 2016 6 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Tim Wickberg authored
Bug 2314.
-
Danny Auble authored
this happens anywhere in the code but just in case it ever does, let's fix it.
-
Morris Jette authored
This can be caused by a core reservation on nodes which get taken out of the system or fail. bug 2296
-
Danny Auble authored
-
- 06 Jan, 2016 2 commits
-
-
Tim Wickberg authored
salloc/sbatch/srun did not mention this. Also reference OverTimeLimit as another option affecting the final run time. Bug 2309.
-
Danny Auble authored
the job starts update the cpus_per_task appropriately. This also moves update num_tasks to after the setting of node counts on an update. It didn't appear to matter, but the cpus_per_task and pn_min_cpus had to be figured out after the cpus and nodes were set but before tasks. Bug 2302
-
- 05 Jan, 2016 3 commits
-
-
Morris Jette authored
burst_buffer/cray - Improve tracking of allocated resources to handle race condition when reading state while buffer allocation is in progress. Also initialize a mutex
-
Danny Auble authored
DBD for the first time. The corruption is only noticed at shutdown. Bug 2293
-
Morris Jette authored
-
- 04 Jan, 2016 4 commits
-
-
Morris Jette authored
Set job's reason to "Priority" when higher priority job in that partition (or reservation) can not start rather than leaving the reason set to "Resources". bug 2285
-
Danny Auble authored
error message.
-
Danny Auble authored
-
Morris Jette authored
If a reservation's nodes value is "all" then track the current nodes in the system, even if those nodes change. Nodes will automatically be added to or removed from a reservation when slurm.conf changes. bug 2204
-
- 02 Jan, 2016 1 commit
-
-
Brian Christiansen authored
Bug 2281
-
- 31 Dec, 2015 2 commits
-
-
Tim Wickberg authored
Rename the variable to match rest of codebase while here. This is related to bug 2295, although snprintf() protects against buffer overflow in 15.08 and up.
-
Tim Wickberg authored
Later releases have switched over to snprintf to avoid this issue, but 14.11 did not get that patch. Bug 2295.
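The difference between the two calls can be sketched as follows. This is a minimal illustration, not the Slurm patch itself: the helper name `format_label`, its arguments, and the buffer sizes are hypothetical. `snprintf()` never writes more than the given size and returns the length the full string would have had, so truncation can be detected instead of overflowing the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the Slurm source): format "<prefix>_<id>"
 * into buf without ever writing past buf[size - 1].
 * Returns 1 if the output was truncated, 0 if it fit.
 */
static int format_label(char *buf, size_t size, const char *prefix, int id)
{
	assert(buf != NULL && size > 0);

	/* sprintf(buf, "%s_%d", prefix, id) could overflow a small buffer. */
	int needed = snprintf(buf, size, "%s_%d", prefix, id);

	/* snprintf returns the untruncated length, excluding the NUL. */
	return needed < 0 || (size_t)needed >= size;
}
```

With an 8-byte buffer, a long prefix is cut to 7 characters plus the terminating NUL, and the return value reports the truncation.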
-
- 30 Dec, 2015 1 commit
-
-
Danny Auble authored
-
- 29 Dec, 2015 3 commits
-
-
Alejandro Sanchez authored
time.
-
Danny Auble authored
static/overlap systems when some hardware issue happens when restarting the slurmctld.
-
Danny Auble authored
a dynamic system and mark the block in error on a static/overlap system. Bug 2273
-
- 28 Dec, 2015 2 commits
-
-
Morris Jette authored
Don't use lower weight nodes for job allocation when topology/tree used. bug 2284
-
Morris Jette authored
Preemption/gang scheduling: If a job is suspended at slurmctld restart or reconfiguration time, then leave it suspended rather than resume+suspend. bug 2274
-
- 23 Dec, 2015 2 commits
-
-
Morris Jette authored
task/affinity: Disable core-level task binding if more CPUs required than available cores. bug 2267
-
Morris Jette authored
Log an error if there are more than 3 aeld connections per second; the likely cause is a duplicate slurmctld daemon. bug 2278
-
- 22 Dec, 2015 1 commit
-
-
Morris Jette authored
This is needed to properly enforce limits and account for usage.
-
- 19 Dec, 2015 1 commit
-
-
John Hensley authored
Remove the 1024-character limit on lines in batch scripts, which was causing long lines to be silently truncated. I noticed it when jobs were getting created with fewer dependencies than specified. Also increase the line length when showing job info.
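One common way to read lines without a fixed length limit is POSIX `getline()`, which grows its buffer as needed. The sketch below only illustrates that general technique; it is not the change made in this commit, and the helper name `longest_line` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the length of the longest line in fp,
 * not counting the trailing newline. getline() reallocates `line`
 * as needed, so no line is truncated at a fixed size such as 1024.
 */
static size_t longest_line(FILE *fp)
{
	char *line = NULL;
	size_t cap = 0, longest = 0;
	ssize_t len;

	assert(fp != NULL);
	while ((len = getline(&line, &cap, fp)) != -1) {
		if (len > 0 && line[len - 1] == '\n')
			len--;
		if ((size_t)len > longest)
			longest = (size_t)len;
	}
	free(line);
	return longest;
}
```

By contrast, a single `fgets()` call into a 1024-byte buffer returns at most 1023 characters, so a longer line, for example one carrying many job dependencies, is cut off unless the caller keeps reading the remainder.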
-
- 18 Dec, 2015 3 commits
-
-
jette authored
sched/backfill: If a job can not be started within the configured backfill_window, set its start time to 0 (unknown) rather than the end of the backfill_window. bug 2100
-
Danny Auble authored
-
jette authored
If a pending job array has multiple reasons for being in a pending state, then print all reasons in a comma separated list.
Before:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
97354_[1-4] debug tmp jette PD 0:00 1 (Resources)
After:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
97354_[1-4] debug tmp jette PD 0:00 1 (Resources,JobHeldUser)
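A comma-separated reason list like the one above could be assembled along these lines. This is a sketch under stated assumptions, not the code from this commit: the helper name `join_reasons` and its signature are hypothetical, and the reason strings are taken from the example output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: join `count` reason strings into buf, separated
 * by commas. Returns 0 on success, -1 if buf is too small.
 */
static int join_reasons(char *buf, size_t size,
			const char *const *reasons, size_t count)
{
	size_t used = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < count; i++) {
		/* Prefix every reason except the first with a comma. */
		int n = snprintf(buf + used, size - used, "%s%s",
				 i ? "," : "", reasons[i]);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Called with the two reasons `"Resources"` and `"JobHeldUser"`, the helper fills the buffer with `Resources,JobHeldUser`.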
-
- 16 Dec, 2015 2 commits
-
-
Brian Christiansen authored
Bug 2130
-
Morris Jette authored
Move slurmctld mail handler to separate thread for improved performance. Original logic did fork/exec without separate thread and if the slurmctld memory size is huge, then the time required for fork() to complete can be significant. bug 2252
-