- 06 Oct, 2016 8 commits
-
-
Morris Jette authored
-
Morris Jette authored
See commit 37161057. Bug 3124.
-
Morris Jette authored
Avoid printing sinfo output in test setup. Exit immediately on job submit failure (job_id == 0). Preserve files if a test fails to help determine what happened. Initial commit in 37161057. Bug 3124.
-
Alejandro Sanchez authored
Bug 3124.
-
Morris Jette authored
This continues the work of commit f735a0a3 to fix a few places missed in the previous commit.
-
Morris Jette authored
-
Morris Jette authored
Commit 59b118bf moved the call to _build_bitmaps() lower in the code, but that prevented node config_record data structures from getting their node_bitmap fields set, causing the build_feature_list_eq() function to fail with a NULL node_bitmap field.
-
Morris Jette authored
node_features plugin - Add "mode" argument to node_features_p_node_xlate() function to fix some bugs updating a node's features using the node update RPC. Without this change it is impossible to clear the active features of a node or reset non-KNL node features.
-
- 05 Oct, 2016 14 commits
-
-
Morris Jette authored
If a node is initially determined to NOT be of type KNL, then don't ever look for its MCDRAM or NUMA modes in cnselect output when dealing with job launch failures or unexpected node reboots.
-
Morris Jette authored
-
Morris Jette authored
node_features/knl_cray plugin: drain any node not reported by "capmc node_status" on startup or reconfig. Also re-tests on failed node restart for job.
-
Morris Jette authored
node_features/knl_cray plugin: Remove any KNL MCDRAM or NUMA features from node's configuration if capmc does NOT report the node as being KNL. For example, we don't want a non-KNL node with features="quad,cache".
-
Morris Jette authored
Add new knl.conf configuration parameter CapmcRetries. Modify capmc_suspend and capmc_resume to retry operations when Cray State Manager is down. Add retry logic to node_features/knl_cray to handle Cray State Manager being down. Bug 3100.
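The retry behaviour described in this commit can be pictured with a minimal sketch. Everything here is hypothetical and not taken from the Slurm source: `run_with_retries`, `capmc_op_fn`, and `fail_n_times` are illustrative names, and the sketch assumes that a CapmcRetries-style limit counts additional attempts after the first failure. The real plugin may also wait between attempts, which is omitted here.

```c
#include <stddef.h>

/* Hypothetical signature for an operation that may fail while the
 * state manager is down: returns 0 on success, non-zero on failure. */
typedef int (*capmc_op_fn)(void *arg);

/* Illustrative stand-in for a flaky operation: fails until the counter
 * pointed to by arg reaches zero, then succeeds. Not a Slurm function. */
static int fail_n_times(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return -1;
	}
	return 0;
}

/* Run op once, then retry up to max_retries more times on failure.
 * Assumption: max_retries plays the role of a CapmcRetries-like limit. */
static int run_with_retries(capmc_op_fn op, void *arg, int max_retries)
{
	int rc = -1;

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		rc = op(arg);
		if (rc == 0)
			break;	/* success, stop retrying */
	}
	return rc;
}
```

With a counter of 2 and a limit of 5, the third attempt succeeds and the helper returns 0. With a counter of 3 and a limit of 1, both attempts fail and the last error code is returned.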
-
Danny Auble authored
from commit ee4a9776.
-
Danny Auble authored
-
Danny Auble authored
reset by list_count. Also remove a nested if for cleaner code.
-
Danny Auble authored
-
Tim Wickberg authored
Logs go to both locations when running in non-daemonized mode. Don't refer to this as "debug" mode; while useful for debugging, it's not directly related. Bug 3146.
-
Morris Jette authored
node_features/knl_cray plugin: Substantially streamline and speed up logic to load current node state on reconfigure failure or unexpected node boot. Completely eliminate capmc calls and just use cnselect to load current node mode information.
-
Morris Jette authored
node_features/knl_cray plugin: drain any node not reported by "capmc node_status" on startup or reconfig. Also re-tests on failed node restart for job.
-
Morris Jette authored
node_features/knl_cray plugin: Remove any KNL MCDRAM or NUMA features from node's configuration if capmc does NOT report the node as being KNL. For example, we don't want a non-KNL node with features="quad,cache".
-
Brian Christiansen authored
Found by clang. Continuation of 76d62ae4.
-
- 04 Oct, 2016 9 commits
-
-
Brian Christiansen authored
-
Morris Jette authored
Missed a flag that needed to move on Cray systems too
-
Morris Jette authored
Add new knl.conf configuration parameter CapmcRetries. Modify capmc_suspend and capmc_resume to retry operations when Cray State Manager is down. Add retry logic to node_features/knl_cray to handle Cray State Manager being down. Bug 3100.
-
Danny Auble authored
from commit ee4a9776.
-
Danny Auble authored
-
Danny Auble authored
reset by list_count. Also remove a nested if for cleaner code.
-
Danny Auble authored
-
Tim Wickberg authored
Logs go to both locations when running in non-daemonized mode. Don't refer to this as "debug" mode; while useful for debugging, it's not directly related. Bug 3146.
-
Morris Jette authored
-
- 03 Oct, 2016 2 commits
-
-
Dominik Bartkiewicz authored
-
Tim Wickberg authored
Removing the bl_bgq Makefile from configure.ac was a mistake. Add back and run autogen.sh to fix build.
-
- 30 Sep, 2016 7 commits
-
-
Alejandro Sanchez authored
Otherwise they'll truncate when packed into the RPC and end up as some bizarre value at the controller. Bug 3098.
-
Tim Wickberg authored
CID 44797.
-
Tim Wickberg authored
Coverity doesn't like the odd structure that was left behind. No functional difference. CID 44793.
-
Dominik Bartkiewicz authored
Set completed time for pending/running runaway jobs to the max of (start, eligible, submit) times. Bug 3075.
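As a rough illustration of the rule in this commit message, the sketch below picks the latest of the three timestamps. The helper name `runaway_end_time` is hypothetical and is not claimed to match the actual accounting code; it only shows the "max of (start, eligible, submit)" selection.

```c
#include <time.h>

/* Hypothetical helper: choose the completion time for a runaway job as
 * the latest of its start, eligible, and submit times. */
static time_t runaway_end_time(time_t start, time_t eligible, time_t submit)
{
	time_t end = start;

	if (eligible > end)
		end = eligible;
	if (submit > end)
		end = submit;
	return end;
}
```

A pending job that never started (start of 0) would, under this rule, be closed out at its eligible or submit time, whichever is later.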
-
Morris Jette authored
-
Morris Jette authored
Change "sched_params" to "power_params" because that's what it contains.
-
Morris Jette authored
Added new SchedulerParameters options step_retry_count and step_retry_time to control scheduling behaviour of job steps waiting for resources. Bug 3121.
-