- 13 May, 2014 7 commits
-
-
Morris Jette authored
If a batch job launch request cannot be built (the script file is missing, a credential cannot be created, or the user does not exist on the selected compute node), then cancel the job in a graceful fashion. Previously, the bad RPC would be sent to the compute node and that node would be DRAINED. See bug 807
-
Morris Jette authored
Correct SelectTypeParameters=CR_LLN with job selection of specific nodes. Previous logic would in most instances allocate resources on all nodes to the job.
-
Morris Jette authored
Correct squeue's job node and CPU counts for requeued jobs. Previously, when a job was requeued, its CPU count reported was that of the previous execution. When combined with the --ntasks-per-node option, squeue would compute the expected node count. If the --exclusive option is also used, the node count reported by squeue could be off by a large margin (e.g. "sbatch --exclusive --ntasks-per-node=1 -N1 .." on requeue would use the number of CPUs on the allocated node to recompute the expected node count). bug 756
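The arithmetic behind the bad node count can be sketched as follows. This is an illustration only, not squeue's actual code: the function name `expected_node_count`, the one-CPU-per-task assumption, and the 16-CPU node are all hypothetical.

```python
import math

def expected_node_count(cpu_count: int, ntasks_per_node: int) -> int:
    """Hypothetical estimate: assume one CPU per task and pack
    ntasks_per_node tasks onto each node."""
    return math.ceil(cpu_count / ntasks_per_node)

# Fresh submission of "-N1 --ntasks-per-node=1": one task, so one CPU.
fresh = expected_node_count(cpu_count=1, ntasks_per_node=1)

# After a requeue the CPU count is stale. With --exclusive the job held
# the whole node, assumed here to have 16 CPUs, so the estimate inflates.
stale = expected_node_count(cpu_count=16, ntasks_per_node=1)

print(fresh, stale)
```

Under these assumptions a one-node job would be reported as needing 16 nodes, which matches the kind of large error the commit message describes.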
-
Danny Auble authored
jobacct_gather/cgroup.
-
Morris Jette authored
Support SLURM_CONF path which does not have "slurm.conf" as the file name. bug 803
-
Morris Jette authored
-
Morris Jette authored
-
- 12 May, 2014 7 commits
-
-
Morris Jette authored
If a job has a non-responding node, retry job step create rather than returning with DOWN node error. bug 734
-
Morris Jette authored
-
Morris Jette authored
-
Puenlap Lee authored
Also correct related documentation
-
Nathan Yee authored
Add force option to all file removals ("rm ..." to "rm -f ..."). bug 673
-
Morris Jette authored
-
Hongjia Cao authored
Completing nodes are removed when calling _try_sched() for a job, which is the case in select_nodes(). If _try_sched() thinks the job can run now but select_nodes() returns ESLURM_NODES_BUSY, the backfill loop ends.
-
- 09 May, 2014 5 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Martin Perry authored
-
Morris Jette authored
Related to bug 795
-
- 08 May, 2014 4 commits
-
-
jette authored
-
Morris Jette authored
Fix sinfo -R to print each down/drained node once, rather than once per partition. This was broken in the sinfo change to process each partition's information in a separate pthread.
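A minimal sketch of the de-duplication idea, assuming per-partition results arrive as (partition, node, reason) tuples. The record layout and the helper name `unique_down_nodes` are hypothetical and do not reflect sinfo's internal data structures.

```python
def unique_down_nodes(records):
    """Return one (node, reason) pair per node name, keeping the first
    occurrence and ignoring repeats reported by other partitions."""
    seen = set()
    result = []
    for _partition, node, reason in records:
        if node in seen:
            continue  # already reported via another partition
        seen.add(node)
        result.append((node, reason))
    return result

# Node n1 belongs to two partitions, so it shows up twice in the input.
records = [
    ("debug", "n1", "bad disk"),
    ("batch", "n1", "bad disk"),
    ("batch", "n2", "not responding"),
]
deduped = unique_down_nodes(records)
print(deduped)
```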
-
Morris Jette authored
Conflicts: src/sinfo/sort.c
-
Morris Jette authored
Correct sinfo --sort fields to match documentation: E -> Reason, H -> Reason Time (new), R -> Partition Name, u/U -> Reason User (new)
-
- 07 May, 2014 8 commits
-
-
David Gloe authored
it turns out SLES 11 SP3 (at least) defines it with a newline, so this will be a problem for anyone building RPMs on that OS.
-
Morris Jette authored
Without this patch, jobs with an infinite time limit would have their preemption GraceTime ignored.
-
Morris Jette authored
For the salloc, sbatch and srun commands, report usage of the --signal option when the user requests command help.
-
Morris Jette authored
Related to bug 789
-
Danny Auble authored
-
Danny Auble authored
them.
-
Morris Jette authored
-
Morris Jette authored
commit 8ddadea5 combined all pending jobs, even if they had the special exit flag set. This treats pending and special_exit state jobs differently. Only those jobs with state pending (and NOT special_exit) are combined in squeue.
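The distinction described above can be sketched as a simple filter. The `Job` class, its field names, and the `split_for_display` helper are hypothetical stand-ins chosen for illustration; squeue's real job records and combining logic are not shown in this log.

```python
from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    base_state: str       # e.g. "PENDING" or "RUNNING" (assumed labels)
    special_exit: bool    # assumed flag for the special exit state

def split_for_display(jobs):
    """Separate jobs eligible for a combined entry from those listed
    individually. Only pending jobs without special_exit are combined."""
    combined, individual = [], []
    for job in jobs:
        if job.base_state == "PENDING" and not job.special_exit:
            combined.append(job.job_id)
        else:
            individual.append(job.job_id)
    return combined, individual

jobs = [
    Job(101, "PENDING", False),
    Job(102, "PENDING", True),   # special_exit: must stay separate
    Job(103, "PENDING", False),
    Job(104, "RUNNING", False),
]
combined, individual = split_for_display(jobs)
print(combined, individual)
```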
-
- 06 May, 2014 9 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
-