- 06 Jul, 2015 2 commits
-
-
Morris Jette authored
Backfill scheduler now considers the OverTimeLimit and KillWait configuration parameters to estimate when running jobs will exit. Initially the job's end time is estimated based upon its time limit. After the time limit is reached, the end time estimate is based upon the OverTimeLimit and KillWait configuration parameters. Bug 1774
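The two-phase estimate described in this commit message can be sketched as follows. This is a minimal illustration, not the scheduler's code: the function name, the use of seconds for every argument, and the assumption that the two parameters are simply added to the time limit are all hypothetical.

```python
def estimate_end_time(start_time, time_limit, over_time_limit, kill_wait, now):
    """Estimate when a running job will exit (all values in seconds).

    Hypothetical helper: the parameter names mirror the slurm.conf options
    named in the commit message, but adding them to the limit is an assumption.
    """
    limit_end = start_time + time_limit
    if now < limit_end:
        # Phase 1: the job is still inside its time limit.
        return limit_end
    # Phase 2: the limit has passed, so allow for the configured
    # overrun (OverTimeLimit) plus the termination grace period (KillWait).
    return limit_end + over_time_limit + kill_wait
```

For example, a job started at time 0 with a 3600 second limit is expected to end at 3600 while it is within the limit, and at 3600 plus the two allowances once the limit has passed.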
-
Morris Jette authored
Backfill scheduler: The configured backfill_interval value (default 30 seconds) is now interpreted as a maximum run time for the backfill scheduler. Once reached, the scheduler will build a new job queue and start over, even if not all jobs have been tested. Bug 1774
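A time-bounded pass over the job queue, as this commit message describes, might look like the sketch below. The names `backfill_pass`, `try_schedule` and `make_fake_clock` are hypothetical and do not come from the scheduler; the fake clock is included only so the behaviour can be exercised deterministically.

```python
import time


def backfill_pass(jobs, try_schedule, max_run_time, clock=time.monotonic):
    """Test jobs in order until the queue is exhausted or the budget is spent.

    Returns the jobs that were tested. The caller is expected to build a
    fresh queue and start a new pass for whatever was left untested.
    """
    start = clock()
    tested = []
    for job in jobs:
        if clock() - start >= max_run_time:
            break  # budget reached: stop even though jobs remain
        try_schedule(job)
        tested.append(job)
    return tested


def make_fake_clock(step):
    """Return a clock that starts at 0 and advances by `step` on every call."""
    state = {"now": -step}

    def clock():
        state["now"] += step
        return state["now"]

    return clock
```

With a 30 second budget and a fake clock that advances 10 seconds per reading, only the first two of four queued jobs are tested before the pass stops.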
-
- 03 Jul, 2015 1 commit
-
-
Morris Jette authored
-
- 30 Jun, 2015 3 commits
-
-
Thomas Cadeau authored
Bug 1745
-
Brian Christiansen authored
This reverts commit 3f91f4b2.
-
Danny Auble authored
and test21.* updated to use them.
-
- 29 Jun, 2015 1 commit
-
-
Nathan Yee authored
Bug 1745
-
- 25 Jun, 2015 3 commits
-
-
David Bigagli authored
-
Danny Auble authored
ESLURM_DB_CONNECTION when in error.
-
Morris Jette authored
-
- 24 Jun, 2015 2 commits
-
-
Morris Jette authored
-
David Bigagli authored
-
- 23 Jun, 2015 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
-
- 22 Jun, 2015 9 commits
-
-
Morris Jette authored
Updates of existing bluegene advanced reservations did not work at all. Some multi-core configurations resulted in an abort because the core_bitmaps created for the reservation had only one bit per node rather than one bit per core. These bugs were introduced in commit 5f258072
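The sizing mismatch behind the abort can be illustrated with a small sketch. It assumes a bitmap laid out with one bit per core, nodes in order; the helper names and the layout are hypothetical and are not taken from the code this commit changed.

```python
def core_bitmap_size(cores_per_node):
    """Number of bits needed for one bit per core across all nodes."""
    return sum(cores_per_node)


def core_bit_range(cores_per_node, node_index):
    """Bit positions that belong to the cores of one node."""
    first = sum(cores_per_node[:node_index])
    return range(first, first + cores_per_node[node_index])
```

For three nodes with 4, 4 and 8 cores, a per-core bitmap needs 16 bits, whereas a per-node bitmap would hold only 3, so indexing it by core position would run past its end.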
-
Morris Jette authored
-
Morris Jette authored
-
David Bigagli authored
-
Thomas Cadeau authored
-
David Bigagli authored
-
Moe Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 19 Jun, 2015 2 commits
-
-
David Bigagli authored
-
David Bigagli authored
job data structure.
-
- 18 Jun, 2015 3 commits
-
-
David Bigagli authored
-
-
Morris Jette authored
-
- 17 Jun, 2015 3 commits
-
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Morris Jette authored
-
- 15 Jun, 2015 2 commits
-
-
Brian Christiansen authored
-
Morris Jette authored
The logic assumed that the reservation had a node bitmap, which was used to check for overlapping jobs. If there is no node bitmap (e.g. a licenses-only reservation), an abort would result.
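The kind of guard this commit message implies can be sketched as below. Python sets stand in for node bitmaps, and `None` stands in for a reservation without one; the function name and the decision to report "no overlap" in that case are assumptions made for illustration.

```python
def reservation_overlaps_job(resv_nodes, job_nodes):
    """Return True if the reservation and the job share at least one node.

    `resv_nodes` may be None for a reservation that holds no nodes,
    such as a licenses-only reservation.
    """
    if resv_nodes is None:
        # No node bitmap to compare against, so there is no node overlap.
        return False
    return bool(resv_nodes & job_nodes)
```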
-
- 12 Jun, 2015 2 commits
-
-
Brian Christiansen authored
Bug 1739
-
Brian Christiansen authored
Bug 1743
-
- 11 Jun, 2015 5 commits
-
-
Brian Christiansen authored
Prevent double free.
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 1733
-
Didier GAZEN authored
In your node_mgr fix to keep rebooted nodes down (commit 9cd15dfe), you forgot to consider the case of nodes that are powered up but respond only after ResumeTimeout seconds (the maximum time permitted). Such nodes are marked DOWN (because they didn't respond within ResumeTimeout seconds) and should then silently become available when ReturnToService=1 (as stated in the slurm.conf manual). With your modification, when such nodes finally respond, they are seen as rebooted nodes and remain in the DOWN state (with the new reason: Node unexpectedly rebooted) even when ReturnToService=1. Correction of commit 3c2b46af
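The behaviour requested in this commit message can be summarised in a short sketch. It models only ReturnToService values 0 and 1; the function name, the boolean flag and the "IDLE" label for an available node are hypothetical and are not taken from node_mgr.

```python
def state_after_registration(down_for_non_response, return_to_service):
    """State of a DOWN node once it finally responds.

    `down_for_non_response` is True when the node was marked DOWN only
    because it did not respond in time (for example within ResumeTimeout).
    """
    if return_to_service == 1 and down_for_non_response:
        # The node was merely late, so it returns to service silently.
        return "IDLE"
    # Any other case keeps the node DOWN until an administrator acts.
    return "DOWN"
```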
-
Didier GAZEN authored
-