- 23 May, 2019 20 commits
-
-
Brian Christiansen authored
Node state is 32-bit, so changing the packing routines has to wait until 20.02. See 845ff7d4. Bug 6964
-
Brian Christiansen authored
Bug 6964
-
Brian Christiansen authored
The reason was being set after the message was sent to the db. Also clear the draining and reboot states before the message is sent so that the event state will show DOWN. Bug 6964
-
Brian Christiansen authored
Bug 6964
-
Brian Christiansen authored
so that new jobs can't get on the node. Bug 6964
-
Morris Jette authored
unnecessarily. Bug 7106
-
Danny Auble authored
Bug 6927
-
Dominik Bartkiewicz authored
for completing job. Bug 6927
-
Morris Jette authored
If GRES are not bound to specific sockets in a multi-socket node, then the sock_gres->sock_cnt variable will be zero and no usable GRES will be found on the node. Bug 7095
-
Moe Jette authored
specific sockets. Bug 7019
-
Danny Auble authored
This reverts commit 9cd7e5f4.
-
Moe Jette authored
specific sockets. Bug 7095
-
Brian Christiansen authored
Continuations of 45bfc4dc Bug 6926
-
Brian Christiansen authored
Commits: c2bc255c f591f0c9
-
Dominik Bartkiewicz authored
Bug 6926
-
Felip Moll authored
The name variable hasn't been set yet, so this is always NULL. Print the uid/gid instead. While here, treat uid/gid as uint32_t, and use strtoul() rather than atoi() to avoid issues with high-number uid/gid values. Fixes GCC 9 warning. Bug 7101.
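The reasoning about high-number uid/gid values can be illustrated with a small Python sketch. It is not the code from the commit (which is C and uses strtoul()); the helper names parse_id and as_signed_32 are hypothetical and only show why a value that fits in uint32_t does not survive being stored in a signed 32-bit integer.

```python
import ctypes

UINT32_MAX = 0xFFFFFFFF  # largest value representable as uint32_t


def parse_id(text):
    """Parse a decimal uid/gid string, rejecting values outside uint32_t.

    Hypothetical helper, loosely mirroring a strtoul()-style parse
    followed by a range check.
    """
    value = int(text, 10)
    if not 0 <= value <= UINT32_MAX:
        raise ValueError("id out of uint32_t range: " + text)
    return value


def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return ctypes.c_int32(value).value


high_uid = parse_id("4294967294")
print(high_uid)                # the value is kept intact as an unsigned id
print(as_signed_32(high_uid))  # the same bits read as a signed int
```

Under these assumptions the unsigned parse keeps the id intact, while the signed reinterpretation turns it into a negative number.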
-
Alejandro Sanchez authored
Continuation of 89b791bf. Bug 7045.
-
Alejandro Sanchez authored
To indicate that a job is dependent or has an invalid dependency. Not used for now, just added and removed according to its meaning. Bug 7045.
-
Albert Gil authored
After 1d66b395, 18.08 and 17.11 are the same, so we can just reuse the 18.08 block instead of making a new one. Bug 7080
-
Albert Gil authored
Bug 7080
-
- 22 May, 2019 5 commits
-
-
Brian Christiansen authored
Bug 6467
-
Marshall Garey authored
Job steps that run on cloud nodes and use the alias_list - in other words, SlurmctldParameters=cloud_dns is not in slurm.conf - all talk directly back to the slurmctld. To make that happen, we set the parent rank of each stepd to -1. However, we also set the rank of each stepd to 0. This meant that when each stepd sent a REQUEST_STEP_COMPLETE RPC to the slurmctld, it would tell slurmctld to clean up node 0 in the step allocation. As a result, multi-node step allocations weren't cleaned up after the steps completed, which caused subsequent job steps to hang. The step allocations would only clean up properly at the end of the job. Ensure that each stepd uses the correct rank so that job steps are properly cleaned up after each step completes. Bug 6467.
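The effect described above can be modelled with a toy Python sketch. It assumes, purely for illustration, that the controller tracks a set of node ranks still pending cleanup and removes the rank named in each completion message; the function name remaining_after_completions is hypothetical and this is not slurmctld's actual bookkeeping.

```python
def remaining_after_completions(node_count, reported_ranks):
    """Return the node ranks still awaiting cleanup after completions.

    node_count: number of nodes in the step allocation.
    reported_ranks: the rank each stepd put in its completion message.
    """
    pending = set(range(node_count))
    for rank in reported_ranks:
        # Each message only clears the rank it names.
        pending.discard(rank)
    return pending


# Every stepd reporting rank 0: nodes 1 and 2 are never cleaned up.
print(remaining_after_completions(3, [0, 0, 0]))

# Every stepd reporting its own rank: nothing is left pending.
print(remaining_after_completions(3, [0, 1, 2]))
```

In this model a three-node step whose stepds all report rank 0 leaves two nodes pending, which matches the hang described in the commit message.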
-
Alejandro Sanchez authored
They were associated with these two commits: b4d7de48 6871185a Bug 5562.
-
Alejandro Sanchez authored
They were associated with these two commits: b4d7de48 6871185a Bug 5562.
-
Morris Jette authored
Bug 6998.
-
- 21 May, 2019 8 commits
-
-
Dominik Bartkiewicz authored
Bug 6822
-
Moe Jette authored
Bug 7061
-
Brian Christiansen authored
-
Dominik Bartkiewicz authored
unlimited could get overwritten with default queue depth preventing the whole queue from being looked at -- especially in a high-throughput environment. Bug 6822 Co-authored-by: Morris Jette <jette@schedmd.com>
-
Alejandro Sanchez authored
Node memory overallocation wouldn't be properly detected since we would just be interpreting the available memory as RealMemory - MemSpecLimit, ignoring other jobs' memory usage. Bug 5562.
-
Alejandro Sanchez authored
This compares a job memory request against each selected node's available memory, interpreting the latter for now as RealMemory - MemSpecLimit. Bug 5562.
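The two Bug 5562 entries above describe a check based on RealMemory - MemSpecLimit and the gap it leaves when other jobs already hold memory on the node. A minimal Python sketch of that arithmetic follows; the function names and the allocated_memory parameter are assumptions made for illustration, not Slurm's internal API.

```python
def node_available_memory(real_memory, mem_spec_limit, allocated_memory=0):
    """Memory (MB) a new job could use on a node.

    With allocated_memory left at 0 this is the RealMemory - MemSpecLimit
    interpretation; passing the memory held by other jobs closes the gap.
    """
    return max(real_memory - mem_spec_limit - allocated_memory, 0)


def job_fits(job_mem, real_memory, mem_spec_limit, allocated_memory=0):
    """True if the job's memory request fits in the node's available memory."""
    return job_mem <= node_available_memory(
        real_memory, mem_spec_limit, allocated_memory
    )


# Ignoring other jobs, a 6000 MB request appears to fit on this node.
print(job_fits(6000, real_memory=8000, mem_spec_limit=1000))

# Counting 4000 MB already allocated, the same request does not fit.
print(job_fits(6000, real_memory=8000, mem_spec_limit=1000,
               allocated_memory=4000))
```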
-
Dominik Bartkiewicz authored
Bug 6508
-
Alejandro Sanchez authored
Previously, when no memory was explicitly requested, the job was assigned the DefMemPer[CPU|Node] from the first partition in the list (or the cluster-wide value if the partition wasn't configured with it), even when evaluating against a different partition. Bug 6950.
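The lookup described above can be sketched in Python. The dictionary layout, the partition names, and the function default_mem_for are hypothetical; the sketch only illustrates resolving the default per evaluated partition, with a fallback to the cluster-wide value, instead of always using the first partition in the list.

```python
def default_mem_for(partition, part_defaults, cluster_default):
    """Default memory (MB) for a job evaluated against one partition.

    part_defaults maps partition name to its configured default, or None
    when the partition does not set one (assumed layout).
    """
    value = part_defaults.get(partition)
    return cluster_default if value is None else value


part_defaults = {"debug": 1024, "batch": 4096, "gpu": None}
job_partitions = ["debug", "batch", "gpu"]

# Old behaviour per the commit message: first partition's default everywhere.
old = {p: default_mem_for(job_partitions[0], part_defaults, 2048)
       for p in job_partitions}

# Per-partition evaluation: each partition is checked with its own default.
new = {p: default_mem_for(p, part_defaults, 2048) for p in job_partitions}

print(old)
print(new)
```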
-
- 17 May, 2019 2 commits
-
-
Tim Wickberg authored
This is select/cons_res, not select/cons_tres.
-
Morris Jette authored
Previous select/cons_res logic would allocate one CPU per task on the node. Bug 6981
-
- 16 May, 2019 5 commits
-
-
Morris Jette authored
Previous select/cons_tres logic would allocate one CPU per task on the node. Bug 6981
-
Morris Jette authored
Modify task layout with --overcommit option plus a heterogeneous job allocation so that a cyclic task distribution can start happening before all CPUs on all nodes are fully allocated. The number of tasks per node will be unchanged from the previous algorithm, but tasks will be distributed in a cyclic fashion first and then extra tasks placed on nodes with more CPUs. Previously all CPUs would be fully allocated in a cyclic fashion, then excess tasks distributed evenly across all allocated nodes. Bug 6981
-
Dominik Bartkiewicz authored
Add warning to slurm.h.in that no new reservation flags can be stored in slurmdbd in 19.05. (Although they could still be used by slurmctld without issue.) Note that the underlying RPC still uses uint32_t, but this will be changed before 20.02 on master, and changing the column to uint32_t in 19.05 just to change it again in 20.02 is best avoided. Bug 6969.
-
Nathan Rini authored
Free format_list, plugin_id_select_list, rpc_version_list in _free_cluster_cond_members(). Bug 7020.
-
Marshall Garey authored
There was a syntax error in the MySQL statement that inserts event records into the event table, introduced by commit 3d61b6aa. The error was a semicolon in the middle of the query, for example: insert into "voyager_event_table" (time_start, time_end, node_name, cluster_nodes, reason, reason_uid, state, tres) values ('1538669453', '1539298628', 'v1', '', 'cold-start', '1017', '0', '1=8,2=4000,5=8,1001=4,1002=1');, (<... another record>);, ... Bug 7025.
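For contrast with the broken query above, here is a minimal Python sketch of assembling a multi-row insert in which value tuples are separated by commas and a single terminator appears only at the end. The function build_multirow_insert is hypothetical and is not how slurmdbd builds its queries; the manual quoting is for illustration only, and real code should rely on the database driver's escaping or parameter binding.

```python
def build_multirow_insert(table, columns, rows):
    """Build one INSERT statement covering every row in rows."""
    col_list = ", ".join(columns)
    tuples = []
    for row in rows:
        # Naive quoting for the sketch: double any embedded single quote.
        quoted = ", ".join(
            "'" + str(value).replace("'", "''") + "'" for value in row
        )
        tuples.append("(" + quoted + ")")
    # Tuples are joined with commas; the only semicolon closes the statement.
    return 'insert into "{}" ({}) values {};'.format(
        table, col_list, ", ".join(tuples)
    )


query = build_multirow_insert(
    "voyager_event_table",
    ["time_start", "node_name"],
    [("1", "v1"), ("2", "v2")],
)
print(query)
```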
-