- 11 Jul, 2013 1 commit
-
-
Morris Jette authored
Processes which can be signalled will be. Others will be signalled when possible from a pthread.
-
- 10 Jul, 2013 2 commits
-
-
Morris Jette authored
-
Danny Auble authored
values.
-
- 09 Jul, 2013 4 commits
-
-
Danny Auble authored
-
Francois Diakhate authored
show information to users that had no associations on the system.
-
Morris Jette authored
Fix for bug 330
-
jette authored
-
- 08 Jul, 2013 1 commit
-
-
Thomas Cadeau authored
-
- 06 Jul, 2013 2 commits
-
-
Morris Jette authored
Previously srun would fail if the hostlist expression was larger than 0xffff bytes.
-
Morris Jette authored
-
- 05 Jul, 2013 2 commits
-
-
John Thiltges authored
When using ThreadsPerCore > 1, it appears that DefMemPerCPU is being scaled by slurmctld, but not by slurmd/slurmstepd. For example, we set ThreadsPerCore=2 and DefMemPerCPU=100. Running a single-core job, we would expect two threads to be allocated and AllocMem on the assigned node to increase by 200MB. scontrol reports that AllocMem increased by 200MB, but the task/cgroup plugin only sees 100MB of RAM. It looks like the problem may lie in common/slurm_cred.c:format_core_allocs(). The function counts the job/step cores and multiplies the mem_limit values, but it does not scale the CPU count as slurmd/slurmd/req.c:_check_job_credential() does. See bug 309.
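The arithmetic in the report above can be sketched as follows. This is only an illustration of the expected versus observed numbers, not Slurm's code; the function names are hypothetical.

```python
def scaled_mem_limit_mb(cores, threads_per_core, def_mem_per_cpu_mb):
    """Expected limit: each allocated core counts as threads_per_core CPUs."""
    cpus = cores * threads_per_core
    return cpus * def_mem_per_cpu_mb


def unscaled_mem_limit_mb(cores, def_mem_per_cpu_mb):
    """Observed limit when the core count is not scaled by threads per core."""
    return cores * def_mem_per_cpu_mb


# Values from the report: ThreadsPerCore=2, DefMemPerCPU=100, one core.
print(scaled_mem_limit_mb(1, 2, 100))   # 200, what scontrol reports
print(unscaled_mem_limit_mb(1, 100))    # 100, what task/cgroup sees
```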
-
jette authored
-
- 02 Jul, 2013 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
The terminating task's PID is available in task->pid.
-
- 28 Jun, 2013 4 commits
-
-
Morris Jette authored
Affects jobs with the --exclusive and --cpus-per-task options. See bug 355.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
enum.
-
- 26 Jun, 2013 3 commits
-
-
Morris Jette authored
Note: we are switching to the Ubuntu version numbering scheme: the last two digits of the year for the major version number and the month number for the minor version number. For example, 13.12 == 2013 December.
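A minimal sketch of that numbering scheme. The helper name is hypothetical, and zero-padding single-digit months (as Ubuntu does) is an assumption that the note above does not state.

```python
from datetime import date


def version_for(release: date) -> str:
    """Build a year.month version string from a release date.

    Zero-padding the month is an assumption, not stated in the commit note.
    """
    major = release.year % 100   # last two digits of the year
    minor = release.month        # month number
    return f"{major:02d}.{minor:02d}"


print(version_for(date(2013, 12, 1)))  # 13.12
```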
-
Morris Jette authored
-
Dominik Friedrich authored
-
- 25 Jun, 2013 3 commits
-
-
Thomas Cadeau authored
-
Danny Auble authored
for as the job/step id.
-
David Gloe authored
The SLURM Makefile.am scripts use pkglibexecdir. One source indicates that this was not added until automake 1.10.2 (https://github.com/rerun/rerun/issues/167), so we made that the minimum required version.
-
- 24 Jun, 2013 1 commit
-
-
jette authored
Under very heavy load with many thousands of batch job submissions or job signals, the write lock can be held for very long periods of time preventing job scheduling, squeue response, etc. This code inserts a timing break to permit other functions to get the locks.
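The idea of a timing break can be sketched as below. This is a generic illustration in Python, not the slurmctld implementation; the function name, the `max_hold_s` budget, and the `pause_s` delay are all hypothetical.

```python
import threading
import time


def process_with_breaks(items, lock, handle, max_hold_s=0.5, pause_s=0.0):
    """Handle items under `lock`, releasing it once it has been held for
    max_hold_s so that other threads get a chance to acquire it.

    Returns the number of times the lock was released early.
    """
    it = iter(items)
    releases = 0
    done = False
    while not done:
        with lock:
            start = time.monotonic()
            for item in it:
                handle(item)
                if time.monotonic() - start >= max_hold_s:
                    break            # time budget used up: drop the lock
            else:
                done = True          # iterator exhausted without a break
        if not done:
            releases += 1
            time.sleep(pause_s)      # give waiting threads a window
    return releases
```

A caller would pass the shared lock and a per-item handler, for example `process_with_breaks(requests, lock, handle_request)`, where `requests` and `handle_request` stand in for whatever work is done under the lock.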
-
- 21 Jun, 2013 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
default and appeared to break other things.
-
Danny Auble authored
-
Martin Perry authored
specified.
-
- 18 Jun, 2013 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 12 Jun, 2013 1 commit
-
-
Morris Jette authored
if on "scontrol reconfig" when AllowNodes is manually changed using scontrol since last slurmctld restart.
-
- 10 Jun, 2013 1 commit
-
-
Morris Jette authored
due to either down nodes or explicit resizing. Generated slurmctld errors of this type:
[2013-06-04T12:43:46+06:00] error: gres/gpu: step_test 68662.4294967294 gres_bit_alloc is NULL
This is a movement of the logic introduced in commit https://github.com/SchedMD/slurm/commit/6fff97bb77d2d88aa808c47fd7880246a0c1d090 to eliminate a memory leak.
-
- 07 Jun, 2013 1 commit
-
-
Danny Auble authored
overwritten by the parallel versions, so we need to handle both cases.
-
- 06 Jun, 2013 1 commit
-
-
Mark Nelson authored
-
- 05 Jun, 2013 5 commits
-
-
Danny Auble authored
This is needed since we don't currently track energy usage per task (only per step); otherwise we get double the energy.
-
Danny Auble authored
-
Janne Blomqvist authored
Andy Wettstein (University of Chicago) reported privately to me that slurmctld 2.5.4 crashed with a division-by-zero error after he enabled the priority/multifactor2 plugin. I was able to reproduce the crash by creating an account hierarchy where all the accounts and users had zero shares. See bug 315.
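The failure mode can be illustrated with a small sketch. This is not the priority/multifactor2 formula; it only assumes, hypothetically, that a share is normalized by dividing by the total shares of its siblings, which is zero when every account and user has zero shares.

```python
def normalized_share(shares, sibling_total):
    """Return shares / sibling_total, guarding the all-zero case.

    Hypothetical helper: the zero fallback is an illustrative choice,
    not necessarily what the actual fix does.
    """
    if sibling_total == 0:
        return 0.0
    return shares / sibling_total


print(normalized_share(0, 0))   # 0.0 instead of a division by zero
print(normalized_share(1, 4))   # 0.25
```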
-
David Bigagli authored
Revert premature change of META
-
jette authored
Without this change, it appears that POE ignores the -procs argument, resulting in a job step request with multiple host names but only one task required.
-