- 26 Feb, 2015 4 commits
-
-
Morris Jette authored
Previously, there was no binding of tasks to the appropriate NUMA node. Based upon work by Josko Plazonic <plazonic@princeton.edu>.
-
Morris Jette authored
Improved logging and some code restructuring. No change in logic.
-
David Bigagli authored
This reverts commit e24a418b.
-
David Bigagli authored
-
- 25 Feb, 2015 5 commits
-
-
Morris Jette authored
Mail notifications on job BEGIN, END and FAIL now apply to a job array as a whole rather than generating individual email messages for each task in the job array.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This is a variation on commit 5391b8cc. Check $HOME/.my.cnf last rather than first to follow a more standard search order.
-
Morris Jette authored
-
- 24 Feb, 2015 10 commits
-
-
Brian Christiansen authored
Bug 1469
-
Michael A. Raymond authored
-
Morris Jette authored
-
Nina Suvanphim authored
The /root/.my.cnf would typically contain the login credentials for root. If those are needed for Slurm, then it should be checking that directory.

(In reply to Nina Suvanphim from comment #0)
...
> const char *default_conf_paths[] = {
>     "/root/.my.cnf", <<<<<<<<<<<<<<<<<------- add this line
>     "/etc/my.cnf", "/etc/opt/cray/MySQL/my.cnf",
>     "/etc/mysql/my.cnf", NULL };

I'll also note that typically the $HOME/.my.cnf file would be checked last rather than first.
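The search order discussed above (system-wide files first, the per-user file last) can be sketched roughly as follows. This is only an illustration, not the code in the Slurm tree: `build_conf_search_order()` and its parameters are hypothetical names, and only the three system paths are taken from the quoted array.

```c
#include <stddef.h>
#include <stdio.h>

/* System-wide locations taken from the array quoted in the message. */
static const char *const system_conf_paths[] = {
    "/etc/my.cnf", "/etc/opt/cray/MySQL/my.cnf", "/etc/mysql/my.cnf"
};
#define N_SYSTEM_PATHS \
    (sizeof(system_conf_paths) / sizeof(system_conf_paths[0]))

/*
 * Hypothetical helper: fill out[] (capacity max) with candidate config
 * files in the order they should be probed. System-wide files come
 * first; "<home>/.my.cnf" is appended last, and only if home is set and
 * the path fits in home_buf. Returns the number of entries written.
 */
size_t build_conf_search_order(const char *home,
                               char *home_buf, size_t home_len,
                               const char *out[], size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < N_SYSTEM_PATHS && n < max; i++)
        out[n++] = system_conf_paths[i];

    if (home && n < max) {
        int len = snprintf(home_buf, home_len, "%s/.my.cnf", home);
        if (len > 0 && (size_t)len < home_len)
            out[n++] = home_buf;
    }
    return n;
}
```

A caller would typically pass `getenv("HOME")` as `home` and then probe each returned path in turn (for example with `access(path, R_OK)`), stopping at the first readable file.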
-
Morris Jette authored
Fix some logic related to power distribution across nodes
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
don't support strong_alias
-
Morris Jette authored
Update power management web page: Add notes about powering nodes down/up
Prevent underflow in power distribution logic
Add logic to identify nodes in "ready" state. Only ready nodes can have their power caps modified
Don't change power cap if node not in ready state
Various improvements to logging
Refactor code to eliminate duplicate/repeated building of full NID list
Plug some memory leaks
-
- 23 Feb, 2015 1 commit
-
-
Morris Jette authored
Modify test 12.7 so that we specify a reason when setting a node DOWN. A recent change to the Slurm code now requires a reason.
-
- 21 Feb, 2015 1 commit
-
-
Morris Jette authored
-
- 20 Feb, 2015 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
Correct capmc arguments to set power cap. Convert "capmc get_node_energy_counter" to use a hostlist expression rather than listing every node in a comma-separated list. Log commands and args run by the plugin via the power_run_script() function in src/plugins/power/common/power_common.c. Use hostlist to build a condensed nid list for power cap set/clear functions.
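To illustrate what condensing a NID list means here, the sketch below collapses a sorted list of node IDs into a range expression instead of naming every node. It is a standalone illustration and does not use Slurm's hostlist code; `condense_nids()` is a hypothetical helper, and the exact range syntax accepted by capmc is assumed rather than taken from the commit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a condensed range expression for nids[]
 * into buf. nids[] must be sorted ascending with no duplicates.
 * Consecutive runs become "first-last"; isolated IDs stay as-is.
 * Returns 0 on success, -1 if buf is too small.
 */
int condense_nids(const int *nids, size_t count, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < count; ) {
        size_t j = i;
        int n;

        /* Extend the run while the next NID is consecutive. */
        while (j + 1 < count && nids[j + 1] == nids[j] + 1)
            j++;

        if (j > i)
            n = snprintf(buf + used, len - used, "%s%d-%d",
                         used ? "," : "", nids[i], nids[j]);
        else
            n = snprintf(buf + used, len - used, "%s%d",
                         used ? "," : "", nids[i]);

        /* snprintf reports the length it wanted; detect truncation. */
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
        i = j + 1;
    }
    return 0;
}
```

For a large allocation this keeps the argument passed to the external command short, since one contiguous block of nodes costs a single range rather than one entry per node.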
-
Morris Jette authored
-
Dorian Krause authored
We came across the following error message in the slurmctld logs when using non-consumable resources:

    error: gres/potion: job 39 dealloc of node node1 bad node_offset 0 count is 0

The error comes from _job_dealloc():

    node_gres_data=0x7f8a18000b70, node_offset=0, gres_name=0x1999e00 "potion", job_id=46, node_name=0x1987ab0 "node1") at gres.c:3980
    (job_gres_list=0x199b7c0, node_gres_list=0x199bc38, node_offset=0, job_id=46, node_name=0x1987ab0 "node1") at gres.c:4190
    job_ptr=0x19e9d50, pre_err=0x7f8a31353cb0 "_will_run_test", remove_all=true) at select_linear.c:2091
    bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, max_share=1, req_nodes=1, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40) at select_linear.c:3176
    bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, req_nodes=1, mode=2, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40, exc_core_bitmap=0x0) at select_linear.c:3390
    bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, req_nodes=1, mode=2, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40, exc_core_bitmap=0x0) at node_select.c:588
    avail_bitmap=0x7f8a2f910d38, min_nodes=1, max_nodes=1, req_nodes=1, exc_core_bitmap=0x0) at backfill.c:367

The cause of this problem is that _node_state_dup() in gres.c does not duplicate the no_consume flag. The cr_ptr passed to _rm_job_from_nodes() is created with _dup_cr() which calls _node_state_dup().

Below is a simple patch to fix the problem. A "future-proof" alternative might be to memcpy() from gres_ptr to new_gres and only handle pointers separately.
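The "future-proof" alternative mentioned above might look roughly like the sketch below: copy the whole structure with memcpy() so that scalar fields such as no_consume can never be forgotten, then fix up the owned pointers one by one. The structure here is a hypothetical, heavily simplified stand-in, not the real GRES node state from gres.c; only the no_consume name comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a per-node GRES state record. */
struct node_state {
    bool     no_consume;   /* the flag the original dup forgot to copy */
    uint32_t count_avail;  /* illustrative scalar field */
    uint32_t count_alloc;  /* illustrative scalar field */
    char    *name;         /* owned string, must be deep-copied */
};

/*
 * Duplicate a record. memcpy() carries over every scalar field,
 * including ones added to the struct later; pointer members are then
 * replaced with private copies so the two records share no memory.
 * Returns NULL on allocation failure.
 */
struct node_state *node_state_dup(const struct node_state *src)
{
    struct node_state *dst = malloc(sizeof(*dst));

    if (!dst)
        return NULL;
    memcpy(dst, src, sizeof(*dst));

    dst->name = NULL;
    if (src->name) {
        size_t n = strlen(src->name) + 1;

        dst->name = malloc(n);
        if (!dst->name) {
            free(dst);
            return NULL;
        }
        memcpy(dst->name, src->name, n);
    }
    return dst;
}

/* Release a record created by node_state_dup(). */
void node_state_free(struct node_state *s)
{
    if (s) {
        free(s->name);
        free(s);
    }
}
```

The trade-off is that every pointer member must still be handled explicitly after the memcpy(); a pointer left untouched would be shared between the original and the copy and could be freed twice.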
-
Morris Jette authored
-
- 19 Feb, 2015 4 commits
-
-
Brian Christiansen authored
Bug 1471
-
Morris Jette authored
-
Morris Jette authored
"If you specify a maximum node count and the host list contains more nodes, the extra node names will be silently ignored." Not so.
-
Danny Auble authored
runs certain sreport reports.
-
- 18 Feb, 2015 10 commits
-
-
Morris Jette authored
-
Morris Jette authored
For srun command with the --no-alloc option, the dummy credential created did not have two new fields (job_constraints and job_gres_list) initialized, resulting in invalid memory references. Bug introduced earlier today.
-
Morris Jette authored
Added "--mail=stage_out" option to job submission commands to notify user when burst buffer stage out is complete.
-
Morris Jette authored
Add new job_descriptor fields to the job_submit/lua interface: clusters, power_flags, and sicp_mode
-
Morris Jette authored
Add SLURM_JOB_CONSTRAINTS to environment variables available to the Prolog. Bug 1458
-
Morris Jette authored
Add job credential to "Run Prolog" RPC used with a configuration of PrologFlags=alloc. This allows the Prolog to be passed identification of GPUs allocated to the job.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Add SLURM_JOB_GPUS environment variable to those available in Prolog. Also add a list of the environment variables available in the various prologs and epilogs to the web page. Bug 1458
-
Danny Auble authored
-