- 26 Nov, 2014 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
Job wait reason of "ReqNodeNotAvail" expanded to identify unavailable nodes (e.g. "ReqNodeNotAvail(Unavailable:tux[3-6])"). bug 1277
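As an illustration of how a script might consume the expanded reason string, here is a minimal sketch that pulls the node names out of it. It assumes the exact form shown in the example above and handles only a single `prefix[a-b]` range or a plain node name; real Slurm hostlist expressions are richer, and the function name `unavailable_nodes` is hypothetical, not part of Slurm.

```python
import re


def unavailable_nodes(reason):
    """Return the node names listed in a ReqNodeNotAvail reason string.

    Assumption: the string looks like the example in the commit message,
    e.g. "ReqNodeNotAvail(Unavailable:tux[3-6])". Anything else yields [].
    """
    match = re.fullmatch(r"ReqNodeNotAvail\(Unavailable:(.+)\)", reason)
    if match is None:
        return []
    expr = match.group(1)
    # Handle only the simple "prefix[low-high]" form; otherwise treat the
    # expression as a single node name.
    ranged = re.fullmatch(r"(\w+)\[(\d+)-(\d+)\]", expr)
    if ranged is None:
        return [expr]
    prefix = ranged.group(1)
    low, high = int(ranged.group(2)), int(ranged.group(3))
    return [f"{prefix}{i}" for i in range(low, high + 1)]


print(unavailable_nodes("ReqNodeNotAvail(Unavailable:tux[3-6])"))
```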
-
Morris Jette authored
-
David Bigagli authored
an error message and return 1.
-
- 24 Nov, 2014 2 commits
-
-
Morris Jette authored
Permit "SuspendTime=NONE" as a slurm.conf value rather than only a numeric value, to match "scontrol show config" output. This removes some special-case logic in the test of "scontrol write config" and in the generation of the output file.
-
Artem Polyakov authored
Double max string that Slurm can pack from 16MB to 32MB to support larger MPI2 configurations.
-
- 22 Nov, 2014 1 commit
-
-
Morris Jette authored
Added SchedulerParameters option of "bf_busy_nodes". When selecting resources for pending jobs to reserve for future execution (i.e. the job can not be started immediately), then preferentially select nodes that are in use. This will tend to leave currently idle resources available for backfilling longer running jobs, but may result in allocations having less than optimal network topology. This option is currently only supported by the select/cons_res plugin.
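The preference described above (enabled with `SchedulerParameters=bf_busy_nodes` in slurm.conf) can be pictured with a toy model. This is only a sketch of the ordering idea, not the select/cons_res implementation; the function `pick_nodes` and the node tuples are hypothetical.

```python
def pick_nodes(nodes, count, prefer_busy):
    """Pick `count` node names from a list of (name, in_use) tuples.

    With prefer_busy=True, nodes already in use are considered first,
    which leaves idle nodes free for backfill. Hypothetical model only.
    """
    if prefer_busy:
        # sorted() is stable: in-use nodes move to the front and the
        # original order is kept within each group.
        ordered = sorted(nodes, key=lambda node: not node[1])
    else:
        ordered = list(nodes)
    return [name for name, _ in ordered[:count]]


cluster = [("tux0", False), ("tux1", True), ("tux2", False), ("tux3", True)]
print(pick_nodes(cluster, 2, prefer_busy=False))
print(pick_nodes(cluster, 2, prefer_busy=True))
```

In this model the second call reserves only nodes that are already busy, so tux0 and tux2 stay idle for other jobs, at the cost of ignoring any topology considerations.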
-
- 21 Nov, 2014 4 commits
-
-
David Bigagli authored
-
Danny Auble authored
-
Dominik Bartkiewicz authored
This can happen if the specified job ID is not found.
-
Morris Jette authored
Add advance reservation flag of "replace" that causes allocated resources to be replaced with idle resources. This keeps the pool of available resources in the reservation at a constant size (to the extent possible).
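A toy model of the "replace" behaviour may help: nodes that become allocated leave the reservation and idle nodes take their place, so the pool keeps its size when enough idle nodes exist. This is a sketch under that reading of the commit message; `replace_allocated` and its arguments are hypothetical and do not mirror Slurm's internal code.

```python
def replace_allocated(reserved, allocated, idle):
    """Return the reservation's node list after swapping out allocated nodes.

    reserved:  node names currently in the reservation
    allocated: node names that have been allocated to jobs
    idle:      idle node names that could be used as replacements
    """
    kept = [node for node in reserved if node not in allocated]
    spare = [node for node in idle if node not in reserved]
    needed = len(reserved) - len(kept)
    # If fewer idle nodes exist than needed, the pool shrinks
    # ("to the extent possible").
    return kept + spare[:needed]


print(replace_allocated(["tux0", "tux1", "tux2"], {"tux1"}, ["tux5", "tux6"]))
```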
-
- 20 Nov, 2014 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
This plugin was never completed, so it is not worth the effort of continued support at this time.
-
David Bigagli authored
-
David Bigagli authored
-
- 18 Nov, 2014 1 commit
-
-
Morris Jette authored
Added new job state of STOPPED, indicating that the job's processes have been stopped with a SIGSTOP (using scancel or sview) but the job retains its allocated CPUs. The job state returns to RUNNING when SIGCONT is sent (also using scancel or sview). bug 1263
-
- 14 Nov, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 13 Nov, 2014 3 commits
-
-
Danny Auble authored
-
Brian Christiansen authored
Bug 1253
-
Brian Christiansen authored
Bug 1255
-
- 12 Nov, 2014 6 commits
-
-
Morris Jette authored
This applies to salloc, sbatch, and srun. bug 1254
-
Morris Jette authored
Avoid including unused CPUs in a job's allocation when cores or sockets are allocated. A minor variation of this has been tested with version 14.11, but I do not want to risk introducing a problem with this new logic. bug 1212
-
Hongjia Cao authored
-
David Bigagli authored
-
Danny Auble authored
-
Morris Jette authored
Do not requeue a batch job from the slurmd daemon if it is killed while in the process of being launched (a race condition introduced in v14.03.9). This partially reverts commit 2bc9bc29.
-
- 11 Nov, 2014 2 commits
-
-
David Bigagli authored
-
Danny Auble authored
CR_CORE/SOCKET system lay out tasks correctly. This is a complete fix for bug 1247. See also commit 7461c119 for tests that show the correct behavior.
-
- 10 Nov, 2014 2 commits
-
-
Danny Auble authored
with CR_PACK_NODES. Really do commit d388dd67 a different way to get the same info and be able to lay out tasks correctly when --hint=nomultithread. Tests on a 4 core 8 thread system are

    srun -n6 --hint=nomultithread --exclusive whereami | sort -h
    srun: cpu count 6
    0 snowflake0 - MASK:0x1
    1 snowflake0 - MASK:0x2
    2 snowflake0 - MASK:0x4
    3 snowflake0 - MASK:0x8
    4 snowflake1 - MASK:0x1
    5 snowflake1 - MASK:0x2

and

    srun -n10 -N5 --hint=nomultithread --exclusive whereami | sort -h
    srun: cpu count 10
    0 snowflake0 - MASK:0x1
    1 snowflake0 - MASK:0x2
    2 snowflake0 - MASK:0x4
    3 snowflake0 - MASK:0x8
    4 snowflake1 - MASK:0x1
    5 snowflake1 - MASK:0x2
    6 snowflake1 - MASK:0x4
    7 snowflake2 - MASK:0x1
    8 snowflake3 - MASK:0x1
    9 snowflake4 - MASK:0x1
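To read the MASK values in the test output above, it can help to turn each hex mask into the CPU indices it selects. The following is a minimal sketch; the line format is assumed from the whereami output quoted in this commit message, and both function names are hypothetical.

```python
def cpus_from_mask(mask):
    """Return the CPU indices whose bits are set in a hex mask like '0x8'."""
    value = int(mask, 16)  # base 16 accepts the "0x" prefix
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def parse_whereami(line):
    """Split a line like '3 snowflake0 - MASK:0x8' into (rank, host, cpus)."""
    rank, host, _, mask_field = line.split()
    mask = mask_field.split(":", 1)[1]
    return int(rank), host, cpus_from_mask(mask)


print(parse_whereami("3 snowflake0 - MASK:0x8"))
```

Each task in the output above has exactly one bit set in its mask, so every task is bound to a single CPU.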
-
Nate Coraor authored
Move will-run test for multiple clusters from the sbatch code into the API so that it can be used with DRMAA.
-
- 07 Nov, 2014 2 commits
-
-
David Bigagli authored
a maintenance reservation that is not active yet.
-
Danny Auble authored
word "partition". Reference bug 1246.
-
- 06 Nov, 2014 5 commits
-
-
Danny Auble authored
is requested. This is a re-factor of commit e5635a76 related to bug 1148 to handle the cases where a job could run, but an error was given when selecting the nodes.
-
Danny Auble authored
-
Danny Auble authored
lock was locked outside of the function or not. This also fixes a race condition when adding a QOS and planning on using it right away when the controller is busy with previous requests.
-
David Bigagli authored
from slurm.h
-
Danny Auble authored
PerCPU. Previously it did not take into account whether the user was requesting per-node memory or whether the job was told it needed to use less than the node allowed.
-
- 05 Nov, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
released.
-