- 22 Jun, 2011 9 commits
-
-
Morris Jette authored
If an salloc allocation is revoked and the job is in a suspended state, send the child processes a SIGCONT before sending SIGHUP or SIGTERM so that the processes can terminate immediately.
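The ordering matters because a stopped process does not act on SIGHUP or SIGTERM until it is continued. A minimal sketch of that ordering using plain POSIX calls follows; the helper names `resume_then_signal` and `demo_terminate_stopped_child` are hypothetical and are not taken from the salloc source, which may signal process groups rather than single PIDs.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not Slurm code): wake a possibly stopped process
 * with SIGCONT, then deliver the terminating signal (SIGHUP or SIGTERM). */
static int resume_then_signal(pid_t pid, int term_sig)
{
	assert(pid > 0);
	if (kill(pid, SIGCONT) < 0)
		return -1;
	return kill(pid, term_sig);
}

/* Illustration only: fork a child, stop it, then terminate it.
 * Returns the signal that ended the child, or -1 on error. */
static int demo_terminate_stopped_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		for (;;)
			pause();	/* child idles until it is signalled */
	}

	/* Put the child into the stopped state, as a suspended job would be. */
	if (kill(pid, SIGSTOP) < 0)
		return -1;
	if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
		return -1;

	/* SIGCONT first, so that SIGTERM is acted on immediately. */
	if (resume_then_signal(pid, SIGTERM) < 0)
		return -1;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Without the SIGCONT, the SIGTERM would stay pending and the final `waitpid()` would block until something else resumed the child.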
-
Morris Jette authored
For front-end architectures on which job steps are run (emulated Cray and BlueGene systems only), fix bug that would free memory still in use.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
-
Morris Jette authored
The processes to suspend and resume are identified by process group ID and parent process ID, so some processes may be missed. Since salloc runs as a normal user, its ability to identify processes associated with a job is limited.
-
- 21 Jun, 2011 13 commits
-
-
Morris Jette authored
Only suspend the salloc command's children when suspending the job on Cray systems. This is required to prevent additional aprun commands from being spawned.
-
Danny Auble authored
-
Danny Auble authored
-
Moe Jette authored
Modify srun_job_suspend() to return status (message sent or not)
-
Moe Jette authored
-
Moe Jette authored
Modify slurmctld logic to send SRUN_REQUEST_SUSPEND so that it does not wait for a reply from salloc or srun.
-
Moe Jette authored
This initializes the job_suspend callback function in srun.
-
Moe Jette authored
Improve efficiency of select/linear plugin with topology/tree plugin configured. Patch by Andriy Grytsenko (Massive Solutions Limited).
-
-
Moe Jette authored
Modify smap and sview to display all nodes even if multiple nodes exist at each coordinate.
-
Moe Jette authored
Modify NODES_PER_COORDINATE as desired in src/plugins/select/cray/libemulate/alps_emulate.c
-
Moe Jette authored
-
dannyauble authored
this is a resubmission of https://github.com/chaos/slurm/pull/24
-
- 20 Jun, 2011 6 commits
-
-
Moe Jette authored
Cray systems: Add support to suspend/resume salloc command to ensure that aprun does not get initiated when the job is suspended.
-
Moe Jette authored
Eliminate memory leak if cray.conf file is not found. Eliminate assert due to not using xmalloc if cray.conf file is found.
-
Jimmy Tang authored
Jimmy Tang: add underscores to internal helper functions; remove static definitions of some variables; use xfree and xmalloc when available. Peter Vermeulen: remove unneeded prototypes and dead code.
-
moe authored
Updated drop-in replacement of the contribs/cray libalps test programs.
* libalps brought up to date (in sync with slurm's libalps)
* support for Accelerators: alps_tests/reassemble_inventory and alps_tests/test_reservation now have accelerator support
* alps_tests/test_reservation implements the key/value format for passing parameters to e.g. the gres/gpu plugin (but here it is just a dumb test program)
Patch from Gerrit Renker and Stephen Trofinoff, CSCS.
-
moe authored
With regard to forthcoming Accelerator support in Basil 1.2/Alps 4.0, this adds interface support for passing the following Accelerator parameters:
* accelerator type (currently only "GPU" is supported),
* model/rank information (uninterpreted "family" string),
* amount of on-board memory in MB.
02_Cray-Accelerator-params.diff
Patch from Gerrit Renker and Stephen Trofinoff, CSCS.
-
moe authored
This adds support to parse Basil 1.2/Alps 4.0 per-node accelerator information. 01_Cray-Accelerator-basic-support.diff Patch from Gerrit Renker and Stephen Trofinoff, CSCS
-
- 17 Jun, 2011 5 commits
-
-
Moe Jette authored
-
Moe Jette authored
-
Moe Jette authored
NOTE: THERE HAS BEEN A NEW FIELD ADDED TO THE CONFIGURATION RESPONSE RPC AS SHOWN BY "SCONTROL SHOW CONFIG". THIS FUNCTION WILL ONLY WORK WHEN THE SERVER AND CLIENT ARE BOTH RUNNING SLURM VERSION 2.3.0.pre6
-
Moe Jette authored
Improve variable names, comments, variable types and various cosmetic changes in select/linear. Patch from Andriy Grytsenko (Massive Solutions Limited).
-
Moe Jette authored
Fix bug in layout of job step with --nodelist option plus node count. Old code could allocate too few nodes by double counting some nodes.
-
- 16 Jun, 2011 7 commits
-
-
Moe Jette authored
-
Danny Auble authored
-
Danny Auble authored
system.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
changes
-
Danny Auble authored
-