- 12 Oct, 2016 10 commits
-
-
Tim Wickberg authored
This introduced an inadvertent dependency on the config file, which does not exist when setting up a new cluster. Bug 3026. This reverts commit c39f9ac9.
-
Tim Wickberg authored
-
Morris Jette authored
task/affinity plugin: Honor a job's --ntasks-per-socket and --ntasks-per-core options in task binding. bug 3118
-
Pär Lindfors authored
-
Brian Christiansen authored
Changed in df70b651
-
Brian Gilmer authored
-
Morris Jette authored
Preserve non-KNL node features when updating the KNL node features for a multi-node job in which the non-KNL node features vary by node.
-
Morris Jette authored
node_features/knl_cray plugin: If the reconfiguration of nodes for an interactive job fails, kill the job (it can't be requeued like a batch job).
-
Morris Jette authored
Execute "capmc node_status" at more frequent intervals to handle nodes getting added or removed from the system using Cray tools (i.e. try to keep Slurm and Cray software better synchronized).
-
Morris Jette authored
node_features/knl_cray plugin: Add separate thread to interact with capmc in response to unexpected node reboots. bug 3153
-
- 11 Oct, 2016 9 commits
-
-
Alejandro Sanchez authored
bug 3091
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
bug 3155
-
Morris Jette authored
Prevent possible divide by zero in select/cons_res if a node's board count is higher than its socket count. bug 3155
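A minimal sketch of the kind of guard this fix describes, not the actual select/cons_res code. It assumes the failure comes from integer division: when the board count exceeds the socket count, `sockets / boards` truncates to zero, and any later division by that result faults. The function name `sockets_per_board` and the fallback of treating the node as a single board are hypothetical choices made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a Slurm function): return the number of
 * sockets per board, never letting an inconsistent board count
 * produce a zero that a caller might later divide by.
 */
uint16_t sockets_per_board(uint16_t sockets, uint16_t boards)
{
    /* Assumption: an invalid board count (zero, or more boards than
     * sockets) is handled by treating the node as a single board. */
    if (boards == 0 || boards > sockets)
        boards = 1;

    /* boards is now in the range 1..sockets (or 1 if sockets is 0),
     * so this division cannot fault. */
    return sockets / boards;
}
```

With this sketch, a node reporting 2 sockets and 4 boards yields 2 sockets per board instead of 0. A socket count of 0 still returns 0, so a caller would need to validate that separately.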
-
Morris Jette authored
If a node's socket or core count is changed at registration time (e.g. a KNL node's NUMA mode is changed), change its board count to match. bug 3155
-
Morris Jette authored
Cray: The slurmd can manipulate the socket/core/thread values reported based upon the configuration. The logic failed to consider select/cray with SelectTypeParameters=other_cons_res as equivalent to select/cons_res. bug 3155
-
Tim Wickberg authored
abs() should not be used on long long variables as it would truncate if strictly conforming to C99. Use llabs() instead. Fix to commit 2aefc66b.
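The difference can be illustrated with a short sketch. In C99, `abs()` is declared as `int abs(int)`, so a `long long` argument is converted to `int` before the call, while `llabs()` takes and returns `long long`. The helper name `abs_diff_ll` below is hypothetical and is not taken from the commit.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: absolute difference of two long long values.
 * llabs() operates on long long, so large differences are preserved.
 * Using abs() here would first convert the argument to int.
 */
long long abs_diff_ll(long long a, long long b)
{
    /* Note: a - b can itself overflow for extreme inputs; this
     * sketch assumes the difference fits in a long long. */
    return llabs(a - b);
}
```

For example, `abs_diff_ll(0, 5000000000LL)` returns 5000000000, a value that does not fit in a 32-bit `int`.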
-
Tim Wickberg authored
-
- 09 Oct, 2016 1 commit
-
-
Morris Jette authored
These tests fail with a configuration of CR_ONE_TASK_PER_CORE. The tests are being disabled now for such a configuration, but could be modified to run in such an environment in the future.
-
- 07 Oct, 2016 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
Correct SchedulerParameters=bf_busy_nodes logic with respect to the job's minimum node count. Previous logic would not decrement the counter in two locations and would reject valid job requests for not reaching the minimum node count.
-
- 06 Oct, 2016 13 commits
-
-
Danny Auble authored
always just AccountingPolicy.
-
Tim Wickberg authored
-
Danny Auble authored
daf91a80
-
Danny Auble authored
-
Danny Auble authored
configure.
-
Morris Jette authored
Add description of how to test for disabled Cray nodes and avoid trying to use them in job allocations. Access to Cray required to flesh out the code and test.
-
Morris Jette authored
-
Morris Jette authored
Accidentally added in commit f4887c74 bug 3124
-
Morris Jette authored
See commit 37161057 Bug 3124.
-
Morris Jette authored
Avoid printing sinfo output in test setup. Exit immediately on job submit failure (job_id == 0). Preserve files if a test fails to help determine what happened. Initial commit in 37161057. bug 3124
-
Alejandro Sanchez authored
Bug 3124.
-
Morris Jette authored
Commit 59b118bf moved the call to _build_bitmaps() lower in the code, but that prevented node config_record data structures getting their node_bitmap fields set, causing the build_feature_list_eq() function to fail with a NULL node_bitmap field
-
Morris Jette authored
node_features plugin - Add "mode" argument to node_features_p_node_xlate() function to fix some bugs updating a node's features using the node update RPC. Without this change it is impossible to clear the active features of a node or reset non-KNL node features.
-
- 05 Oct, 2016 5 commits
-
-
Morris Jette authored
If a node is initially determined to NOT be of type KNL, then don't ever look for its MCDRAM or NUMA modes in cnselect output when dealing with job launch failures or unexpected node reboots.
-
Morris Jette authored
node_features/knl_cray plugin: Substantially streamline and speed up logic to load current node state on reconfigure failure or unexpected node boot. Completely eliminate capmc calls and just use cnselect to load current node mode information.
-
Morris Jette authored
node_features/knl_cray plugin: drain any node not reported by "capmc node_status" on startup or reconfig. Also re-tests on failed node restart for job.
-
Morris Jette authored
node_features/knl_cray plugin: Remove any KNL MCDRAM or NUMA features from node's configuration if capmc does NOT report the node as being KNL. For example, we don't want a non-KNL node with features="quad,cache".
-
Brian Christiansen authored
Found by clang. Continuation of 76d62ae4
-