- 09 Jun, 2014 1 commit
-
-
Morris Jette authored
Add child_forked() function to the slurm_acct_gather_profile plugin to close open files, leaving application with no extra open file descriptors.
-
- 07 Jun, 2014 2 commits
-
-
David Bigagli authored
it is already running.
-
David Bigagli authored
job is JOB_COMPLETING or already pending.
-
- 06 Jun, 2014 3 commits
-
-
David Bigagli authored
last epilog completes, either slurmd epilog or slurmctld epilog, whichever comes last.
-
Morris Jette authored
Describe cgroup-based core specialization support.
-
David Bigagli authored
don't clear the dependency if the job is completing.
-
- 05 Jun, 2014 6 commits
-
-
Danny Auble authored
(Also remove extra pending check, no reason to check it twice ;))
-
Morris Jette authored
If the backup slurmctld assumes primary status, then do NOT purge any job state files (batch script and environment files) but if any attempt is made to re-use them consider this a fatal error. It may indicate that multiple primary slurmctld daemons are active (e.g. both backup and primary are functioning as primary and there is a split brain problem).
-
Danny Auble authored
-
Morris Jette authored
Test time when job_state file was written to detect multiple primary slurmctld daemons (e.g. both backup and primary are functioning as primary and there is a split brain problem).
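
As a rough illustration of the timestamp test described in this commit, here is a minimal Python sketch. It is not the slurmctld implementation: the function names and the exact comparison rule are assumptions made for the example. The idea is only that a daemon remembers the modification time it left on the state file and treats any other value as a sign that a second primary has written it.

```python
import os

def write_state(path, data):
    """Write the state file and return the mtime (ns) left by our own write.

    Hypothetical helper; the real daemon's save path is not shown in the log.
    """
    with open(path, "w") as f:
        f.write(data)
    return os.stat(path).st_mtime_ns

def written_by_someone_else(path, last_write_ns):
    """Return True if the file's mtime no longer matches our last write.

    A mismatch suggests another process (e.g. a second primary) saved state.
    """
    return os.stat(path).st_mtime_ns != last_write_ns
```

A caller would run `written_by_someone_else()` just before each save and treat a `True` result as a possible split brain condition.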
-
Stephen Trofinoff authored
Signed-off-by: Danny Auble <da@schedmd.com>
-
David Bigagli authored
when specified escaped.
-
- 04 Jun, 2014 4 commits
-
-
Morris Jette authored
Attempt to create duplicate event trigger now generates ESLURM_TRIGGER_DUP ("Duplicate event trigger").
-
Morris Jette authored
Modify strigger to accept arguments to the program to execute when an event trigger occurs.
-
Morris Jette authored
Added strigger option of -N, --noheader to not print the header when displaying a list of triggers.
-
Morris Jette authored
batch jobs have cpus_per_task set to zero, which resulted in an error of "task/cgroup: task[0] unable to set taskset '0x0'"
-
- 03 Jun, 2014 4 commits
-
-
David Bigagli authored
requeue, requeuehold and release operations.
-
Morris Jette authored
Do not purge the script and environment files for completed jobs on slurmctld reconfiguration or restart (they might be later requeued). Purge the files only when the job record is purged. bug 834
-
Morris Jette authored
If a job --mem-per-cpu limit exceeds the partition or system limit, then scale the job's memory limit and CPUs per task to satisfy the limit. bug 848
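
The scaling described here can be sketched in Python as follows. This is an assumption about the arithmetic, not the code from the commit: the CPU count per task is multiplied by the smallest integer factor that brings the per-CPU memory under the limit, and the per-CPU memory is reduced so the task's total memory is preserved (rounded up). The function and parameter names are hypothetical.

```python
def scale_mem_per_cpu(mem_per_cpu, cpus_per_task, limit):
    """Return (mem_per_cpu, cpus_per_task) adjusted to satisfy `limit`.

    All memory values are in the same unit (e.g. MB). Illustrative only.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if mem_per_cpu <= limit:
        return mem_per_cpu, cpus_per_task      # already within the limit
    total = mem_per_cpu * cpus_per_task        # memory the task asked for
    factor = -(-mem_per_cpu // limit)          # ceiling division
    new_cpus = cpus_per_task * factor
    new_mem = -(-total // new_cpus)            # spread total over more CPUs
    return new_mem, new_cpus
```

For example, a request of 4096 MB per CPU with one CPU per task under a 2048 MB limit becomes 2048 MB per CPU with two CPUs per task.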
-
David Bigagli authored
not finished yet otherwise if requeued the job may enter an invalid COMPLETING state.
-
- 31 May, 2014 1 commit
-
-
Danny Auble authored
-
- 29 May, 2014 1 commit
-
-
Morris Jette authored
select/cons_res plugin: Fix memory leak related to job preemption. bug 837
-
- 28 May, 2014 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
This gives system administrators the option on AMD Opteron 6000 series processors of either considering each NUMA node on a socket as a separate socket (resulting in some incorrect logging of socket count information) or not (resulting in sub-optimal job allocations since each core in the socket will be considered equivalent, even if on different NUMA nodes within the socket). bug 838
-
- 23 May, 2014 4 commits
-
-
Morris Jette authored
Replace round-robin front-end node selection with least-loaded algorithm.
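
A least-loaded choice can be sketched in a few lines of Python. The load metric (number of jobs per front-end node) and the tie-break by node name are assumptions for illustration; the commit message does not say how load is measured.

```python
def pick_front_end(job_counts):
    """Pick the front-end node with the fewest jobs.

    job_counts: dict mapping a node name to its current job count
    (hypothetical load metric). Ties go to the lexically smallest name.
    """
    if not job_counts:
        raise ValueError("no front-end nodes available")
    return min(sorted(job_counts), key=job_counts.__getitem__)
```

Unlike round-robin, this keeps sending new work to a node that has just drained, rather than to whichever node is next in the rotation.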
-
David Bigagli authored
-
David Bigagli authored
-
Danny Auble authored
not able to be separated into multiple patches. If EnforcePartLimits=Yes and the QOS the job is using can override limits, allow it. Fix issues if a partition allows or denies accounts or QOSs and either is not set. If a job requests a partition that doesn't allow the QOS or account the job is requesting, the job pends unless EnforcePartLimits=Yes. Previously the job was always killed at submit time.
-
- 21 May, 2014 6 commits
-
-
David Bigagli authored
-
Danny Auble authored
based on the mask given.
-
Danny Auble authored
task/affinity.
-
Danny Auble authored
thread in a core.
-
Danny Auble authored
it can bind cyclically across sockets.
-
Morris Jette authored
Add a PriorityFlags option of CALCULATE_RUNNING. If set, then the priority of running jobs will continue to be recalculated periodically. The PriorityFlags value reported by sview and "scontrol show config" will now be a string rather than its numeric value.
-
- 20 May, 2014 3 commits
-
-
Morris Jette authored
cpus-per-task support: Try to pack all CPUs of each task onto one socket. Previous logic could spread a task's CPUs across multiple sockets.
-
Danny Auble authored
This reverts commit b22268d8.
-
Danny Auble authored
-
- 19 May, 2014 2 commits
-
-
Morris Jette authored
Properly enforce job --requeue and --norequeue options. Previous logic failed to do so in three places (either ignoring the value, ANDing it with the JobRequeue configuration option or using the JobRequeue configuration option by itself). bug 821
-
Morris Jette authored
Add support for a job step's CPU governor and/or frequency to be reset on suspend/resume (or gang scheduling). The default for an idle CPU will now be "ondemand" rather than "userspace" with the lowest frequency (to recover from hard slurmd failures and support gang scheduling).
-
- 16 May, 2014 1 commit
-
-
Morris Jette authored
Add srun --cpu-freq options to set the CPU governor (OnDemand, Performance, Conservative, PowerSave or UserSpace). task/affinity: support setting cpu_freq without cpuset (using hwloc and sched functions). Fix calculation used to set --cpu-freq=highm1 (relied upon ordering of possible CPU frequencies).
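
To illustrate the kind of ordering dependence mentioned for highm1, the Python sketch below selects the second-highest available frequency without assuming the input list is sorted. It is not the code from the commit; the function name and the fallback when only one frequency exists are assumptions.

```python
def high_minus_one(freqs):
    """Return the second-highest distinct frequency in `freqs`.

    Sorting first makes the result independent of input order.
    Falls back to the only frequency if there is just one (assumed behavior).
    """
    distinct = sorted(set(freqs))
    if not distinct:
        raise ValueError("no CPU frequencies available")
    return distinct[-2] if len(distinct) >= 2 else distinct[-1]
```

Indexing an unsorted list at a fixed position would return different values depending on how the frequencies were reported, which is the class of error this avoids.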
-