- 28 Sep, 2016 2 commits
-
-
Tim Wickberg authored
Remove from build system, and delete L/P specific files. Run autogen.sh as well.
-
Morris Jette authored
Add "sbatch_wait_nodes" to SchedulerParameters to control default sbatch behaviour with respect to waiting for all allocated nodes to be ready for use. A job can override the configuration option using the --wait-all-nodes=# option. bug 3120
-
- 27 Sep, 2016 2 commits
-
-
Morris Jette authored
Prior logic would treat an execute line like this: $ sbatch --wait-all-nodes -N3 tmp as having "-N3" as the argument to the "--wait-all-nodes" option. See bug 3120
-
Morris Jette authored
Add salloc/sbatch/srun option --use-min-nodes to prefer smaller node counts when a range of node counts is specified (e.g. "-N 2-4"). bug 2996
-
- 26 Sep, 2016 1 commit
-
-
Morris Jette authored
Add salloc/sbatch/srun --priority option of "TOP" to set job priority to the highest possible value. This option is only available to Slurm operators and administrators. bug 3115
-
- 24 Sep, 2016 2 commits
-
-
Morris Jette authored
bug 3090
-
Morris Jette authored
Make sure no attempt is made to schedule a requeued job until all steps are cleaned (Node Health Check completes for all steps on a Cray). bug 3082
-
- 23 Sep, 2016 1 commit
-
-
Morris Jette authored
Make sure no attempt is made to schedule a requeued job until all steps are cleaned (Node Health Check completes for all steps on a Cray). bug 3082
-
- 22 Sep, 2016 6 commits
-
-
Dominik Bartkiewicz authored
Otherwise limit is checking the node count against the midplane count. Bug 3049.
-
Alejandro Sanchez authored
Check if node names are contiguous with respect to the node list assigned to the partition, rather than just monotonically increasing. Bug 3006.
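The difference between the two checks can be illustrated with a minimal sketch. This is not the Slurm implementation; the function name `is_contiguous` and the list-based representation of the partition are assumptions made purely for illustration. The idea is that node names are mapped to their position in the partition's node list, and the selected nodes count as contiguous only if those positions form an unbroken run.

```python
def is_contiguous(job_nodes, partition_nodes):
    """Return True if job_nodes occupy adjacent slots in partition_nodes.

    Hypothetical helper: both arguments are plain lists of node names.
    A node missing from partition_nodes raises KeyError.
    """
    # Position of each node within the partition's own node list.
    index = {name: i for i, name in enumerate(partition_nodes)}
    positions = sorted(index[name] for name in job_nodes)
    # Contiguous means every neighbouring pair of positions differs by one.
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


# The partition has no "n3", so "n2" and "n4" are neighbours within it.
partition = ["n1", "n2", "n4", "n5"]
adjacent = is_contiguous(["n2", "n4"], partition)
gapped = is_contiguous(["n1", "n4"], partition)
```

Both node sets above have monotonically increasing names, so a name-ordering check alone could not tell them apart; only the first is contiguous with respect to the partition.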
-
Tim Wickberg authored
-
Janne Blomqvist authored
Bugs 2681 and 2703. Conflicts: NEWS
-
Adam Moody authored
-
Alejandro Sanchez authored
license of a certain type.
-
- 21 Sep, 2016 8 commits
-
-
Morris Jette authored
node_features/knl_cray plugin: Increase default CapmcTimeout parameter from 10 to 60 seconds. bug 3100
-
Morris Jette authored
capmc_suspend/resume - If a request to modify NUMA or MCDRAM state on a set of nodes, or to reboot a set of nodes, fails, then just requeue the job and abort the entire operation rather than trying to operate on individual nodes. bug 3100
-
Morris Jette authored
Allow a node's PowerUp state flag to be cleared using update_node RPC. bug 3100
-
Morris Jette authored
When powering up a node to change its state (e.g. KNL NUMA or MCDRAM mode), pass to the ResumeProgram the job ID assigned to the nodes in the SLURM_JOB_ID environment variable. bug 3100
-
Morris Jette authored
Don't log error for job end_time being zero if node health check is still running. bug 3053
-
Morris Jette authored
capmc_suspend/resume - If a request to modify NUMA or MCDRAM state on a set of nodes, or to reboot a set of nodes, fails, then just requeue the job and abort the entire operation rather than trying to operate on individual nodes. bug 3100
-
Morris Jette authored
Allow a node's PowerUp state flag to be cleared using update_node RPC. bug 3100
-
Morris Jette authored
When powering up a node to change its state (e.g. KNL NUMA or MCDRAM mode), pass to the ResumeProgram the job ID assigned to the nodes in the SLURM_JOB_ID environment variable. bug 3100
-
- 20 Sep, 2016 1 commit
-
-
Morris Jette authored
Don't log error for job end_time being zero if node health check is still running. bug 3053
-
- 17 Sep, 2016 2 commits
-
-
Danny Auble authored
the same logic that was found in the slurmdbd. Now both functionalities share the same code. This was done with the merge right before this commit.
-
Morris Jette authored
Restore ability to manually power down nodes, broken in 15.08.12 in commit b4904661. The patch introduced in commit b4904661 (not powering down dead node) has a bad side effect. Adding the "(node_ptr->last_idle != 0)" condition prevents powering down nodes with the following command: scontrol update nodename=nX state=power_down because the state update function relies on zeroing the "last_idle" variable when a power_down is requested (see src/slurmctld/node_mgr.c, line 1589). Reverting this commit should solve the problem...but I let you decide... Didier GAZEN
-
- 16 Sep, 2016 1 commit
-
-
Morris Jette authored
node_features/knl_cray: If a node is rebooted outside of Slurm's direction, update its active features with current MCDRAM and NUMA mode information. bug 3071
-
- 15 Sep, 2016 3 commits
-
-
Tim Wickberg authored
Will be appended to usernames if --mail-user is not explicitly set for the job and email notifications are requested. Bug 3089.
-
Morris Jette authored
Fix race condition that could result in MCDRAM state information coming from capmc rather than cnselect (used state for next boot rather than latest boot). bug 3080
-
Nicolas Joly authored
-
- 14 Sep, 2016 2 commits
-
-
Alejandro Sanchez authored
No functional change, just silencing the warning message in this instance. Bug 3079.
-
Alejandro Sanchez authored
Bug 3073.
-
- 12 Sep, 2016 1 commit
-
-
Tim Wickberg authored
-
- 09 Sep, 2016 3 commits
-
-
Morris Jette authored
Modify srun task completion handling to only build the task/node string for logging purposes if it is needed. Modified for performance purposes. bug 3044
-
Tim Wickberg authored
This reverts commit 1ec2a4ae.
-
Alejandro Sanchez authored
Bug 3063.
-
- 08 Sep, 2016 2 commits
-
-
Brian Christiansen authored
In scontrol show nodes.
-
Morris Jette authored
Restructure srun command locking for task_exit processing logic for improved parallelism. This change decreases the amount of time consumed by serial logic by 2 orders of magnitude. bug 3044
-
- 07 Sep, 2016 2 commits
-
-
Morris Jette authored
Preserve node "RESERVATION" state when one of multiple overlapping reservations ends. Previous logic would clear the node's RESERVATION state flag when any one of the reservations on the node ended rather than keeping the node in RESERVATION state until the last reservation ended. bug 3057
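The corrected behaviour can be sketched as follows. This is an illustrative model only, not Slurm's code: the names `end_reservation`, `active_reservations`, and `node_flags` are hypothetical, and reservations are modelled as plain sets of node names. When one reservation ends, the RESERVATION flag is cleared only on nodes that no remaining reservation still covers.

```python
def node_in_reservation(node, active_reservations):
    """Return True while at least one active reservation still covers node."""
    return any(node in nodes for nodes in active_reservations.values())


def end_reservation(name, active_reservations, node_flags):
    """End one reservation; clear RESERVATION only where nothing else overlaps.

    Hypothetical data model: active_reservations maps a reservation name to a
    set of node names, node_flags maps a node name to a set of state flags.
    """
    ended_nodes = active_reservations.pop(name)
    for node in ended_nodes:
        # The prior logic described above would clear the flag unconditionally.
        if not node_in_reservation(node, active_reservations):
            node_flags[node].discard("RESERVATION")


# Two overlapping reservations share node "n2".
reservations = {"resv_a": {"n1", "n2"}, "resv_b": {"n2", "n3"}}
flags = {n: {"RESERVATION"} for n in ("n1", "n2", "n3")}
end_reservation("resv_a", reservations, flags)
```

After `resv_a` ends in this sketch, `n1` loses the flag while `n2` keeps it until `resv_b` also ends.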
-
Morris Jette authored
Handle the case when the slurmctld daemon restarts while a compute node reboot is in progress. Return the node to service rather than setting it DOWN. bug 3042
-
- 06 Sep, 2016 1 commit
-
-
Morris Jette authored
Add salloc_wait_nodes option to the SchedulerParameters parameter in the slurm.conf file controlling when the salloc command returns in relation to when nodes are ready for use (i.e. booted). bug 3043
-