- 05 Jun, 2014 1 commit
-
-
David Bigagli authored
when specified escaped.
-
- 04 Jun, 2014 4 commits
-
-
Morris Jette authored
Attempt to create duplicate event trigger now generates ESLURM_TRIGGER_DUP ("Duplicate event trigger").
-
Morris Jette authored
Modify strigger to accept arguments to the program to execute when an event trigger occurs.
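A hedged sketch of the new usage; the script path, its arguments, and the trigger condition are illustrative, not taken from the commit:
    # Arguments after the program path are now passed through to the trigger program
    strigger --set --node --down \
             --program="/usr/local/sbin/notify_admin --channel=ops --severity=high"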
-
Morris Jette authored
Added strigger option of -N, --noheader to not print the header when displaying a list of triggers.
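For example, an illustrative listing without the header line:
    # -N / --noheader suppresses the header row when listing triggers
    strigger --get --noheader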
-
Morris Jette authored
batch jobs have cpus_per_task set to zero, which resulted in an error of "task/cgroup: task[0] unable to set taskset '0x0'"
-
- 03 Jun, 2014 4 commits
-
-
David Bigagli authored
requeue, requeuehold and release operations.
-
Morris Jette authored
Do not purge the script and environment files for completed jobs on slurmctld reconfiguration or restart (they might later be requeued). Purge the files only when the job record is purged. bug 834
-
Morris Jette authored
If a job --mem-per-cpu limit exceeds the partition or system limit, then scale the job's memory limit and CPUs per task to satisfy the limit. bug 848
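A minimal sketch of the scaling, assuming a partition limit of MaxMemPerCPU=2048 (all numbers are illustrative):
    # The requested 4096 MB per CPU exceeds the 2048 MB per-CPU partition limit...
    sbatch --mem-per-cpu=4096 --cpus-per-task=1 job.sh
    # ...so the request is scaled to satisfy the limit, roughly equivalent to:
    #   sbatch --mem-per-cpu=2048 --cpus-per-task=2 job.sh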
-
David Bigagli authored
not finished yet, otherwise if requeued the job may enter an invalid COMPLETING state.
-
- 29 May, 2014 1 commit
-
-
Morris Jette authored
select/cons_res plugin: Fix memory leak related to job preemption. bug 837
-
- 28 May, 2014 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
This gives system administrators the option on AMD Opteron 6000 series processors of either considering each NUMA node on a socket as a separate socket (resulting in some incorrect logging of socket count information) or not (resulting in sub-optimal job allocations, since each core in the socket will be considered equivalent even if on different NUMA nodes within the socket). bug 838
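As a hedged illustration of the two choices (the node name and core counts are assumptions for a two-socket part with two NUMA nodes of 8 cores per socket, not details from the commit), the slurm.conf node definition could be written either way:
    # Treat each NUMA node as a socket (socket count may be logged incorrectly):
    NodeName=opteron01 Sockets=4 CoresPerSocket=8 ThreadsPerCore=1 State=UNKNOWN
    # Or describe the physical sockets (allocations may ignore NUMA locality):
    NodeName=opteron01 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN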
-
- 23 May, 2014 3 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
Danny Auble authored
not able to be separated into multiple patches. If EnforcePartLimits=Yes and the QOS the job is using can override limits, allow it. Fix issues when a partition allows or denies accounts or QOSes and either list is not set. If a job requests a partition that does not allow the job's QOS or account, the job now pends unless EnforcePartLimits=Yes. Before, the job would always be killed at submit.
-
- 21 May, 2014 5 commits
-
-
David Bigagli authored
-
Danny Auble authored
based on the mask given.
-
Danny Auble authored
task/affinity.
-
Danny Auble authored
thread in a core.
-
Danny Auble authored
it can bind cyclically across sockets.
-
- 20 May, 2014 3 commits
-
-
Morris Jette authored
cpus-per-task support: Try to pack all CPUs of each task onto one socket. Previous logic could spread a task's CPUs across multiple sockets.
-
Danny Auble authored
This reverts commit b22268d8.
-
Danny Auble authored
-
- 19 May, 2014 1 commit
-
-
Morris Jette authored
Properly enforce job --requeue and --norequeue options. Previous logic failed to do so in three places (either ignoring the value, ANDing it with the JobRequeue configuration option, or using the JobRequeue configuration option by itself). bug 821
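For reference, a sketch of the interacting settings (values are illustrative); note that sbatch spells the second flag --no-requeue:
    # Per-job policy, which should now be honored as specified:
    sbatch --requeue job.sh
    sbatch --no-requeue job.sh
    # Cluster-wide default in slurm.conf, applied only when the job does not specify one:
    #   JobRequeue=1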
-
- 15 May, 2014 2 commits
-
-
Morris Jette authored
Add SelectTypeParameters option of CR_PACK_NODES to pack a job's tasks tightly on its allocated nodes rather than distributing them evenly across the allocated nodes. bug 819
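A hedged slurm.conf sketch; combining the new option with CR_Core_Memory is an assumption, not part of the commit:
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory,CR_PACK_NODES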
-
Danny Auble authored
something you also get a signal which would produce deadlock. Fix Bug 601.
-
- 14 May, 2014 2 commits
-
-
Morris Jette authored
Run EpilogSlurmctld for a job that is killed during slurmctld reconfiguration. bug 806
-
Morris Jette authored
Only if ALL of its partitions are hidden will a job be hidden by default. bug 812
-
- 13 May, 2014 4 commits
-
-
Morris Jette authored
Correct SelectTypeParameters=CR_LLN with job selection of specific nodes. Previous logic would in most instances allocate resources on all nodes to the job.
-
Morris Jette authored
Correct squeue's job node and CPU counts for requeued jobs. Previously, when a job was requeued, its CPU count reported was that of the previous execution. When combined with the --ntasks-per-node option, squeue would compute the expected node count. If the --exclusive option is also used, the node count reported by squeue could be off by a large margin (e.g. "sbatch --exclusive --ntasks-per-node=1 -N1 .." on requeue would use the number of CPUs on the allocated node to recompute the expected node count). bug 756
-
Danny Auble authored
jobacct_gather/cgroup.
-
Morris Jette authored
Support SLURM_CONF path which does not have "slurm.conf" as the file name. bug 803
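For example (the path is illustrative):
    # The configuration file no longer needs to be named slurm.conf
    export SLURM_CONF=/etc/slurm/cluster_a.conf
    sinfo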
-
- 12 May, 2014 4 commits
-
-
Morris Jette authored
If a job has a non-responding node, retry job step creation rather than returning a DOWN node error. bug 734
-
Morris Jette authored
-
Puenlap Lee authored
Also correct related documentation
-
Hongjia Cao authored
Completing nodes are removed when calling _try_sched() for a job, which is the case in select_nodes(). If _try_sched() thinks the job can run now but select_nodes() returns ESLURM_NODES_BUSY, the backfill loop will be ended.
-
- 09 May, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 08 May, 2014 2 commits
-
-
Morris Jette authored
Fix sinfo -R to print each down/drained node once, rather than once per partition. This was broken in the sinfo change to process each partition's information in a separate pthread.
-
Morris Jette authored
Correct sinfo --sort fields to match documentation: E -> Reason, H -> Reason Time (new), R -> Partition Name, u/U -> Reason user (new)
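An illustrative invocation using the corrected sort letters:
    # Sort down/drained node reasons by reason time, then by reason user
    sinfo -R --sort=H,u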
-