- 30 May, 2015 1 commit
-
-
Danny Auble authored
-
- 29 May, 2015 5 commits
-
-
Brian Christiansen authored
Bug 1495
-
Morris Jette authored
Correct count of CPUs allocated to job on system with hyperthreads. The bug was introduced in commit a6d3074d.
On a system with hyperthreads:
srun -n1 --ntasks-per-core=1 hostname
you would get:
slurmctld: error: job_update_cpu_cnt: cpu_cnt underflow on job_id 67072
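As a rough illustration of how an underflow message like the one above can arise, here is a minimal sketch of a guarded decrement on an unsigned CPU counter. The function name `release_cpus` and its parameters are hypothetical and are not taken from the Slurm source; the sketch only shows the general pattern of detecting a release that exceeds the recorded allocation.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not Slurm's job_update_cpu_cnt): subtract the
 * released CPUs from a job's recorded CPU count. If more CPUs are
 * released than were recorded, report the underflow and clamp to zero
 * instead of letting the unsigned counter wrap around.
 * Returns 0 on success, -1 on underflow.
 */
int release_cpus(uint32_t *cpu_cnt, uint32_t released, uint32_t job_id)
{
	if (released > *cpu_cnt) {
		fprintf(stderr, "error: cpu_cnt underflow on job_id %u\n",
			(unsigned) job_id);
		*cpu_cnt = 0;
		return -1;
	}
	*cpu_cnt -= released;
	return 0;
}
```

If the allocation is recorded per core but released per hardware thread (or the reverse), the two counts disagree and a check of this kind fires.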
-
Morris Jette authored
preempt/job_prio plugin: Implement the concept of Warm-up Time here. Use the QoS GraceTime as the amount of time to wait before preempting. Basically, skip preemption if your time is not up.
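The "skip preemption if your time is not up" rule can be sketched as a simple elapsed-time check. This is an illustration only: the function name `preempt_allowed` is hypothetical, and measuring the warm-up period from the job's start time is an assumption, since the commit message does not say which timestamp the plugin uses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical check (not the plugin's actual code): a job may be
 * preempted only after it has run for at least grace_time_secs.
 * ASSUMPTION: the warm-up period is counted from job_start.
 */
bool preempt_allowed(time_t job_start, uint32_t grace_time_secs, time_t now)
{
	if (now < job_start)	/* clock skew or job not yet started */
		return false;
	return (uint64_t) (now - job_start) >= grace_time_secs;
}
```

With a grace time of 300 seconds, a job that started 100 seconds ago would be skipped, while one that started 300 or more seconds ago would be eligible.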
-
Morris Jette authored
-
Danny Auble authored
a job runs past its time limit.
-
- 28 May, 2015 2 commits
-
-
David Bigagli authored
-
Brian Christiansen authored
Bug 1705
-
- 27 May, 2015 1 commit
-
-
Morris Jette authored
However, --mem=0 now reflects the appropriate amount of memory in the system; --mem-per-cpu=0 hasn't changed. This allows all the memory to be allocated in a cgroup, but it is not "consumed" and is available for other jobs running on the same host. Eric Martin, Washington University School of Medicine
-
- 26 May, 2015 3 commits
-
-
David Bigagli authored
-
Morris Jette authored
Correct list of unavailable nodes reported in a job's "reason" field when that job cannot start. bug 1614
-
Danny Auble authored
which can be used to aggregate messages to the slurmctld into a single message to reduce communication to the slurmctld. Currently only epilog complete messages and node registration messages use this logic.
-
- 22 May, 2015 3 commits
-
-
Morris Jette authored
bug 1679
-
Morris Jette authored
-
Morris Jette authored
Change some variable names from "norelation" to "no_relation"
Replace some blocks of spaces with tabs
Add definition of layouts "free" call to slurm.h.in
Add "layout" information to scontrol help message
Fix typo in error message
Translate a French error message to English ("valide" to "valid")
-
- 21 May, 2015 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 20 May, 2015 2 commits
-
-
Brian Christiansen authored
Bug 1679
-
Morris Jette authored
-
- 19 May, 2015 1 commit
-
-
Morris Jette authored
switch/cray: Revert logic added to 14.11.6 that set "PMI_CRAY_NO_SMP_ENV=1" if CR_PACK_NODES is configured. bug 1585
-
- 16 May, 2015 1 commit
-
-
David Bigagli authored
-
- 15 May, 2015 2 commits
-
-
Morris Jette authored
preempt/job_prio plugin: Implement the concept of Warm-up Time here. Use the QoS GraceTime as the amount of time to wait before preempting. Basically, skip preemption if your time is not up.
-
Morris Jette authored
-
- 14 May, 2015 3 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
Brian Christiansen authored
Bug 1548
-
- 13 May, 2015 4 commits
-
-
Morris Jette authored
Add PrologFlags option of "Contain" to create a proctrack container at job resource allocation time. At job allocation time, a slurmstepd is spawned on every allocated compute node in which to place external processes (e.g. PAM can place ssh processes into a cgroup). This entity is accounted for and reported by sacct as "<jobid>.extern". Some more testing and development remain, but it mostly works.
-
Brian Christiansen authored
Bug 1627
-
Brian Christiansen authored
-
Brian Christiansen authored
-
- 12 May, 2015 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
- 11 May, 2015 1 commit
-
-
Morris Jette authored
Make sure that old step data is purged when a job is requeued. Without this logic, if a job terminates abnormally then old step data may be left in slurmctld. If the job is then requeued and started on a different node, referencing that old job step data can result in abnormal events. One specific failure mode is if the job is requeued on a node with a different number of cores, and the step terminated RPC arrives later, the job and step bitmaps of allocated cores can differ in size generating an abort. bug 1660
-
- 08 May, 2015 4 commits
-
-
Danny Auble authored
-
David Gloe authored
Bug 1657
-
Brian Christiansen authored
Bug 1618
-
Jonathon Nelson authored
-
- 07 May, 2015 1 commit
-
-
Danny Auble authored
cpu count.
-
- 06 May, 2015 2 commits
-
-
Morris Jette authored
Add re-entrant versions of glibc time functions (e.g. localtime) to Slurm in order to eliminate rare deadlock of slurmstepd fork and exec calls. bug 1638
-
Danny Auble authored
utilization.
-