- 13 May, 2014 3 commits
-
-
Morris Jette authored
Correct squeue's job node and CPU counts for requeued jobs. Previously, when a job was requeued, the CPU count reported was that of its previous execution. When combined with the --ntasks-per-node option, squeue would use that stale CPU count to compute the expected node count. If the --exclusive option was also used, the node count reported by squeue could be off by a large margin (e.g. "sbatch --exclusive --ntasks-per-node=1 -N1 .." on requeue would use the number of CPUs on the allocated node to recompute the expected node count). bug 756
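The arithmetic behind the inflated node count can be sketched as follows. This is only an illustration, not squeue's actual code: the function name `expected_nodes` and the 16-CPU node are assumptions made for the example.

```python
import math


def expected_nodes(cpu_count, ntasks_per_node, cpus_per_task=1):
    """Hypothetical estimate of the node count from a CPU count.

    Assumes one task per `cpus_per_task` CPUs and at most
    `ntasks_per_node` tasks on each node.
    """
    tasks = cpu_count // cpus_per_task
    return math.ceil(tasks / ntasks_per_node)


# The job asked for one task on one node, so one CPU is the requested count.
requested = expected_nodes(cpu_count=1, ntasks_per_node=1)

# With --exclusive the job was allocated a whole node (assumed 16 CPUs here).
# Reusing that allocated CPU count after a requeue inflates the estimate.
stale = expected_nodes(cpu_count=16, ntasks_per_node=1)

print(requested, stale)
```

Under these assumptions the estimate grows from one node to sixteen, which matches the kind of large discrepancy the commit message describes.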
-
Danny Auble authored
jobacct_gather/cgroup.
-
Morris Jette authored
Support SLURM_CONF path which does not have "slurm.conf" as the file name. bug 803
-
- 12 May, 2014 4 commits
-
-
Morris Jette authored
If a job has a non-responding node, retry job step creation rather than returning with a DOWN node error. bug 734
-
Morris Jette authored
-
Puenlap Lee authored
Also correct related documentation
-
Hongjia Cao authored
Completing nodes are removed when calling _try_sched() for a job, which is the case in select_nodes(). If _try_sched() thinks the job can run now but select_nodes() returns ESLURM_NODES_BUSY, the backfill loop ends.
-
- 09 May, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 08 May, 2014 2 commits
-
-
Morris Jette authored
Fix sinfo -R to print each down/drained node once, rather than once per partition. This was broken in the sinfo change to process each partition's information in a separate pthread.
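A minimal sketch of the deduplication idea, assuming node records are gathered per partition and then merged. The function name `down_node_report` and the data layout are hypothetical and do not reflect sinfo's internal structures.

```python
def down_node_report(partitions):
    """Return each (node, reason) pair once, even if the node
    belongs to several partitions.

    `partitions` maps a partition name to a list of (node, reason)
    tuples, a layout assumed only for this example.
    """
    seen = set()
    report = []
    for nodes in partitions.values():
        for node, reason in nodes:
            if node not in seen:
                seen.add(node)
                report.append((node, reason))
    return report


# Node "n1" is down and is a member of both partitions.
partitions = {
    "debug": [("n1", "bad disk")],
    "batch": [("n1", "bad disk"), ("n2", "maintenance")],
}
report = down_node_report(partitions)
print(report)
```

Tracking already reported node names in a set keeps the output independent of how many partitions contain the node.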
-
Morris Jette authored
Correct sinfo --sort fields to match documentation: E -> Reason, H -> Reason Time (new), R -> Partition Name, u/U -> Reason User (new)
-
- 07 May, 2014 4 commits
-
-
Morris Jette authored
Without this patch, jobs with an infinite time limit would have their preemption GraceTime ignored.
-
Morris Jette authored
related to bug 789
-
Danny Auble authored
-
Danny Auble authored
them.
-
- 06 May, 2014 5 commits
-
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
In slurm.spec file, replace "Requires cray-MySQL-devel-enterprise" with "Requires mysql-devel" per David Gloe.
-
- 05 May, 2014 5 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
Related to bug 771
-
Morris Jette authored
Version 14.03.2 was using "slurm_<jobid>_4294967294.out" due to an error in the job array logic.
-
Danny Auble authored
cnode counts.
-
- 02 May, 2014 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
This is for bug 775
-
Danny Auble authored
-
Danny Auble authored
-
- 01 May, 2014 4 commits
-
-
Danny Auble authored
regression from 2a674aee
-
Danny Auble authored
-
Danny Auble authored
is running.
-
Danny Auble authored
-
- 30 Apr, 2014 3 commits
-
-
David Bigagli authored
together.
-
Morris Jette authored
Switch/nrt - Properly track usage of CAU and RDMA resources with multiple tasks per compute node. Previous logic would allocate resources once per task and then deallocate once per node, leaking CAU and RDMA resources and preventing their use by future jobs.
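The leak pattern described here, allocating per task but releasing per node, can be illustrated with a toy counter. This is a sketch under stated assumptions: `ResourcePool` and `run_job` are hypothetical names, and balancing the count by releasing once per task is just one way to do it, not necessarily what the switch/nrt plugin does.

```python
class ResourcePool:
    """Toy counter standing in for a node's CAU or RDMA resources."""

    def __init__(self, total):
        self.total = total
        self.in_use = 0

    def allocate(self, count=1):
        if self.in_use + count > self.total:
            raise RuntimeError("pool exhausted")
        self.in_use += count

    def release(self, count=1):
        self.in_use = max(0, self.in_use - count)


def run_job(pool, tasks_per_node, release_per_task):
    """Allocate one resource per task, then release either once per
    task (balanced) or once per node (the leaking pattern)."""
    for _ in range(tasks_per_node):
        pool.allocate()
    releases = tasks_per_node if release_per_task else 1
    for _ in range(releases):
        pool.release()
    return pool.in_use


leaked = run_job(ResourcePool(8), tasks_per_node=4, release_per_task=False)
balanced = run_job(ResourcePool(8), tasks_per_node=4, release_per_task=True)
print(leaked, balanced)
```

With four tasks on the node, the unbalanced path leaves three resources marked in use after the job ends, so repeated jobs would eventually exhaust the pool.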
-
Morris Jette authored
If a job is held, then only release it with the "scontrol release <jobid>" command rather than a simple reset of the job's priority. This is needed to support job arrays better. Otherwise a priority reset of a job array would free all requeued/held jobs from that job array rather than leaving them held.
-
- 28 Apr, 2014 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
in 2.0 :)
-
Morris Jette authored
Previously partition priority was only considered when used as a component of a job's priority with the priority/multifactor plugin. Now the partition priority is considered first, as documented, and the job priority is considered second. bug 764
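The two-level ordering can be sketched as a sort on a compound key. This is an illustration only: `PendingJob` and `schedule_order` are hypothetical names, and the sketch assumes that a larger number means a higher priority at both levels.

```python
from typing import NamedTuple


class PendingJob(NamedTuple):
    job_id: int
    partition_priority: int
    job_priority: int


def schedule_order(jobs):
    """Order jobs by partition priority first, then by job priority.

    Larger values are treated as higher priority (an assumption of
    this sketch).
    """
    return sorted(
        jobs,
        key=lambda j: (j.partition_priority, j.job_priority),
        reverse=True,
    )


jobs = [
    PendingJob(job_id=1, partition_priority=1, job_priority=900),
    PendingJob(job_id=2, partition_priority=10, job_priority=100),
    PendingJob(job_id=3, partition_priority=10, job_priority=500),
]
order = [j.job_id for j in schedule_order(jobs)]
print(order)
```

Job 1 has the highest job priority but sits in the lowest-priority partition, so it is considered last; within the higher-priority partition, job 3 precedes job 2.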
-
- 26 Apr, 2014 1 commit
-
-
Stuart Midgley authored
Add --priority option to the salloc, sbatch and srun commands.
-