- 09 Jun, 2017 1 commit
-
-
Morris Jette authored
-
- 08 Jun, 2017 2 commits
-
-
Dominik Bartkiewicz authored
Improve selection of jobs to preempt when there are multiple partitions with jobs subject to preemption. bug 3824
-
Dominik Bartkiewicz authored
Prevent segfault from pointer dereference to the QOS that is being deleted. Fix to commit 3e8aa451.
-
- 07 Jun, 2017 2 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 06 Jun, 2017 1 commit
-
-
Morris Jette authored
-
- 03 Jun, 2017 1 commit
-
-
Danny Auble authored
Fix regression from commit c05dcb8a (bug 1923) that did not treat a blank char * as a valid option. This fixes scenarios like sacctmgr list associations user='', which would only print account associations. Bug 3862
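One reading of this message is that an unset filter and an empty-string filter must be handled differently: unset means "no filtering", while '' means "match records whose user field is blank" (that is, account associations). The sketch below illustrates that distinction only. The function name and record layout are hypothetical and are not taken from sacctmgr or the Slurm sources.

```python
def filter_assocs(assocs, user=None):
    """Filter association records by user.

    Hypothetical record layout: dicts with "account" and "user" keys,
    where account-level associations carry an empty user string.
    """
    # None means the caller gave no user filter at all: return everything.
    if user is None:
        return list(assocs)
    # An empty string is a real filter value: it matches only records
    # with a blank user, i.e. the account associations.
    return [a for a in assocs if a["user"] == user]
```

Testing `if not user:` instead of `if user is None:` would collapse the two cases, which is the kind of mistake the commit message appears to describe.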
-
- 02 Jun, 2017 2 commits
-
-
Danny Auble authored
a good return code. This also fixes the situation where the step was ending but had not yet ended, so it sends the KILL_TASK_FAILED error instead of JOB_NOTRUNNING. It also removes the abort in favor of exit, which it should have been anyway. Bug 3758
-
Dominik Bartkiewicz authored
list_for_each)
-
- 01 Jun, 2017 10 commits
-
-
Mark Klein authored
Inadvertently set to one when requested. Bug 3855.
-
Tim Wickberg authored
Bug 3857.
-
Danny Auble authored
-
Danny Auble authored
purge_files_list.
-
Danny Auble authored
-
Tim Wickberg authored
File deletion can be slow, especially when StateSaveLocation is on NFS or other network filesystems. Since purge_old_job() holds all the slurmctld write locks, this is especially performance sensitive. Moving this to an independent thread lets the slower filesystem cleanup happen without owning these locks. purge_old_job() then results in the purged job ids being queued in the purge_list. A race with the job id potentially wrapping around again is already prevented by _dup_job_file_test() in get_next_job_id(). Bug 3763.
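The pattern described here, queueing ids under the lock and deleting files from a separate thread, can be sketched as follows. This is a minimal Python illustration, not the slurmctld implementation: every name (purge_queue, state_lock, job_file, purge_old_jobs, purge_worker) is hypothetical, a single lock stands in for the slurmctld write locks, and one file per job stands in for the real state directory layout.

```python
import os
import queue
import threading

# Hypothetical stand-ins, not slurmctld data structures.
purge_queue = queue.Queue()    # job ids whose files still need deleting
state_lock = threading.Lock()  # stands in for the slurmctld write locks


def job_file(state_dir, job_id):
    """Assumed layout: one state file per job id."""
    return os.path.join(state_dir, f"job.{job_id}")


def purge_old_jobs(jobs, expired_ids):
    """Remove expired jobs from the in-memory table while holding the lock.

    Only the ids are queued here; no filesystem call runs under the lock.
    """
    with state_lock:
        for job_id in expired_ids:
            if jobs.pop(job_id, None) is not None:
                purge_queue.put(job_id)


def purge_worker(state_dir):
    """Delete queued job files without holding state_lock.

    A None entry in the queue tells the worker to stop.
    """
    while True:
        job_id = purge_queue.get()
        if job_id is None:
            break
        try:
            os.remove(job_file(state_dir, job_id))
        except FileNotFoundError:
            pass  # already gone; nothing to clean up
```

The point of the split is that the time spent in os.remove(), which may be slow on a network filesystem, no longer extends the time the lock is held.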
-
Tim Wickberg authored
Only called from _list_delete_job once the MinJobAge has passed.
-
Tim Wickberg authored
This will need to be handled differently. The timeout can lead to the purge process falling further and further behind on high throughput systems if the number of job scripts that can be deleted within a second is lower than the job submission and completion rate of the cluster, eventually leading to the MaxJobCount limit being reached. Bug 3763.
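The backlog argument above is simple arithmetic: if jobs complete faster than their records can be purged, the job table grows at the difference of the two rates until it reaches MaxJobCount. A rough sketch, using a hypothetical helper and made-up rates rather than measurements from any real cluster:

```python
def seconds_until_limit(completion_rate, purge_rate, max_job_count, current_jobs=0):
    """Estimate seconds until the job table reaches max_job_count.

    Rates are in jobs per second. Returns None when purging keeps up
    with (or outpaces) job completion, so the table does not grow.
    """
    growth = completion_rate - purge_rate
    if growth <= 0:
        return None
    return (max_job_count - current_jobs) / growth
```

For example, with 100 jobs completing per second, 80 purged per second, and a limit of 10000, seconds_until_limit(100, 80, 10000) gives 500.0, so the limit would be hit in under ten minutes under these assumed numbers.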
-
Danny Auble authored
-
Danny Auble authored
-
- 31 May, 2017 6 commits
-
-
Danny Auble authored
it works better on multi-slurmd installs.
-
Tim Wickberg authored
Revert some of my b50f4661. Elaborate on tradeoffs, and point to HTC page as well which is a better location for this info.
-
Danny Auble authored
-
Tim Wickberg authored
This is better discussed in the high_throughput.shtml doc. Also, "Contrain" is misspelled, adding to the confusion.
-
Tim Shaw authored
Bug 3840.
-
Tim Shaw authored
-
- 30 May, 2017 6 commits
-
-
Tim Shaw authored
node_features/knl_cray plugin: Don't clear configured GRES from non-KNL node. bug 3768
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 26 May, 2017 6 commits
-
-
Danny Auble authored
-
Dominik Bartkiewicz authored
Initial fix for handling floating partitions that use qos grp limits. Bug 3776
-
Danny Auble authored
the SchedulerParameters=reduce_completing_frag option. NOTE: reduce_completing_frag, whether on or off, only has an effect when CompletingWait is set. Bug 3756
-
Dominik Bartkiewicz authored
This will improve performance and simplify the code. bug 3757
-
Gary authored
bug 3754
-
Gary authored
For jobs submitted to multiple partitions, report the job's earliest start time for any partition. bug 3754
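The reporting rule in this message amounts to taking a minimum over the per-partition estimates. A minimal sketch, assuming a hypothetical mapping from partition name to expected start time (epoch seconds, or None when no estimate exists); this is not the data structure Slurm uses:

```python
def earliest_start(start_times):
    """Return the earliest known start time across partitions.

    start_times maps partition name -> expected start time in epoch
    seconds, or None if no estimate is available for that partition.
    Returns None when no partition has an estimate.
    """
    known = [t for t in start_times.values() if t is not None]
    return min(known) if known else None
```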
-
- 25 May, 2017 3 commits
-
-
Isaac Hartung authored
Burst buffer jobs currently cannot be run as root; change the test to prevent that. Bug 3723.
-
Danny Auble authored
Bug 3756
-
Dominik Bartkiewicz authored
Bug 3756
-