- 09 May, 2018 2 commits
-
-
Tim Wickberg authored
Made obsolete by structural changes in 17.11. Bug 4953.
-
Tim Wickberg authored
Partition is deleted immediately, not flagged. Tangentially related to bug 5136.
-
- 08 May, 2018 14 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
Bug 5133.
-
Brian Christiansen authored
Bug 5146
-
Tim Wickberg authored
-
Tim Wickberg authored
Caused by a corrupted protocol_version field value being received by the slurmstepd: a uint16_t cannot be safely written to or read from the pipe as if it were an int. Regression caused by commit 90b116c2. Bug 5133.
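The failure mode described here is a size mismatch between the type held in memory and the number of bytes moved across the pipe. A minimal sketch of the safe pattern, assuming a plain POSIX pipe: `send_u16`, `recv_u16`, and `roundtrip_u16` are hypothetical helpers written for illustration and are not Slurm functions. Both ends use `sizeof` on the same `uint16_t` type, so exactly two bytes are written and exactly two bytes are read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not a Slurm function): write exactly
 * sizeof(uint16_t) bytes, never sizeof(int). */
static int send_u16(int fd, uint16_t value)
{
	ssize_t n = write(fd, &value, sizeof(value));
	return (n == (ssize_t) sizeof(value)) ? 0 : -1;
}

/* Hypothetical helper: read back the same number of bytes into a
 * variable of the same type. */
static int recv_u16(int fd, uint16_t *value)
{
	ssize_t n = read(fd, value, sizeof(*value));
	return (n == (ssize_t) sizeof(*value)) ? 0 : -1;
}

/* Send one uint16_t through a pipe and read it back.
 * Returns 0 on success, -1 on any error. */
int roundtrip_u16(uint16_t in, uint16_t *out)
{
	int fds[2];
	int rc;

	assert(out != NULL);
	if (pipe(fds) < 0)
		return -1;
	rc = send_u16(fds[1], in);
	if (rc == 0)
		rc = recv_u16(fds[0], out);
	close(fds[0]);
	close(fds[1]);
	return rc;
}
```

If either side instead passed `sizeof(int)` while the other held a `uint16_t`, the extra bytes would come from or land in adjacent memory, which is one way a field such as protocol_version could arrive corrupted.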
-
Danny Auble authored
-
Brian Christiansen authored
since it's not allocated anymore. Found while investigating Bugs 5137,4522.
-
Danny Auble authored
Regression caused in commit fa3a8ff1. Coverity issue 182984
-
Brian Christiansen authored
Requeued jobs are marked as PENDING|COMPLETING until the epilog checks in. The issue is that if job_set_alloc_tres gets called while in the PENDING|COMPLETING state, the job's alloc_tres_str will be freed. If this job then gets checkpointed in this state (PENDING|COMPLETING + no tres_alloc_str), on startup the controller would crash because it expected the job to have a tres_alloc_str/cnt when in the COMPLETING state. This could be triggered by starting the controller without the dbd up: when the dbd comes up, the assoc_cache_mgr calls _update_job_tres(), which calls job_set_alloc_tres. It could also be triggered by adding new tres. This most likely started happening in 17.11.5 because of commit 865b672f, which introduced calling _update_job_tres() on each job after the dbd comes up. Bugs 5137, 4522.
-
Morris Jette authored
Coverity CID 185507
-
Morris Jette authored
Coverity CID 185506
-
Morris Jette authored
Coverity CID 185503
-
Morris Jette authored
Coverity CID 185505
-
Morris Jette authored
Coverity CID 185504
-
- 07 May, 2018 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 05 May, 2018 1 commit
-
-
Tim Wickberg authored
-
- 04 May, 2018 3 commits
-
-
Brian Christiansen authored
Only when the connection has timed out. If the connection is timing out, consider increasing TCPTimeout in slurm.conf. Bug 4574.
-
Danny Auble authored
-
Danny Auble authored
# Conflicts: # src/slurmctld/job_mgr.c
-
- 03 May, 2018 16 commits
-
-
Boris Karasev authored
Bug 5129.
-
Alejandro Sanchez authored
Bug 5110.
-
Alejandro Sanchez authored
Bug 5110.
-
Tim Wickberg authored
Continuation of d0deea4f. Bug 4841.
-
Alejandro Sanchez authored
Use setenv() instead of setenvfs(), since setenvfs() allocates memory with xmalloc() and fini_setproctitle() (which is called on reconfigure) frees that memory with free(), leading to a "free(): invalid size" malloc_printerr error. Continuation of dce83a23. Bug 5095.
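The underlying rule is that memory must be released by the allocator family that produced it. One reason setenv() avoids the mismatch is that POSIX setenv() copies the name and value into storage owned by the C library, so the caller keeps ownership of its own buffer. A minimal sketch of that property follows; `export_copy` is a hypothetical helper for illustration and does not reproduce Slurm's setenvfs() or fini_setproctitle().

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Slurm function): build a value in a
 * malloc()'d buffer, export it with setenv(), then free the buffer.
 * This is safe because setenv() stores its own copy of the string. */
int export_copy(const char *name, const char *value)
{
	char *tmp;
	int rc;

	assert(name != NULL && value != NULL);
	tmp = malloc(strlen(value) + 1);
	if (tmp == NULL)
		return -1;
	strcpy(tmp, value);

	rc = setenv(name, tmp, 1);	/* 1 = overwrite an existing value */

	/* Allocated with malloc(), so released with free(). The
	 * environment entry is unaffected by this call. */
	free(tmp);
	return rc;
}
```

After `export_copy("SKETCH_VAR", "abc")` returns 0, `getenv("SKETCH_VAR")` still yields "abc" even though the caller's buffer has been freed. The variable name is an arbitrary example.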
-
Felip Moll authored
Due to the current design, job limits are checked before the allocation is made when one specifies a generic gres and a specific gres type is configured. The workaround for now is to define a job submit plugin to control the user request and successfully apply limits. Bug 4767.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
With the previous commit, a segfault would occur if the partition's nodes were updated after a reconfigure happened, because reconfigure blows away the node_record_table_ptr and the nodes' tres_cnt values don't get rebuilt until after PERIODIC_NODE_ACCT (300 seconds).
-
Brian Christiansen authored
or when creating partitions with nodes. Though the partition tres_cnts will be updated every 5 minutes (PERIODIC_NODE_ACCT), this will update them on create or update.
-
Brian Christiansen authored
Bug 4274
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 4274
-
Brian Christiansen authored
Verify that a tres weight is an integer.
-
Brian Christiansen authored
This allows job limits to be enforced at submission -- with QOS DenyOnLimit flag. Note that the values could be expanded at schedule time (e.g. request one task but get all cpus on a core). The expanded values are considered when scheduling.
-
Brian Christiansen authored
-