- 15 Aug, 2013 2 commits
-
-
Danny Auble authored
could end up before the job started. Bug 371
-
Danny Auble authored
-
- 14 Aug, 2013 4 commits
-
-
Morris Jette authored
This avoids waiting for the job's initiation to fail.
-
Morris Jette authored
Fix job state recovery logic in which a job's accounting frequency was not set. This would result in a value of 65534 seconds being used (the equivalent of NO_VAL in uint16_t), which could result in the job being requeued or aborted.
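As a rough illustration of where 65534 comes from, the sketch below assumes that NO_VAL is the 32-bit sentinel 0xfffffffe and shows what is left when it is stored in a uint16_t. The macro and function names are hypothetical and are not taken from the Slurm source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the 32-bit "no value" sentinel is 0xfffffffe. */
#define EXAMPLE_NO_VAL 0xfffffffeU

/* Truncating the sentinel to 16 bits keeps only the low half, 0xfffe. */
static uint16_t example_no_val16(void)
{
    return (uint16_t) EXAMPLE_NO_VAL;
}

/*
 * Hypothetical guard: treat the truncated sentinel as "unset" and fall
 * back to a default rather than interpreting it as a number of seconds.
 */
static uint16_t example_acct_freq(uint16_t stored, uint16_t fallback)
{
    assert(fallback != example_no_val16());
    return (stored == example_no_val16()) ? fallback : stored;
}
```

With these definitions, example_no_val16() evaluates to 65534, the value quoted in the commit message.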
-
David Bigagli authored
-
Morris Jette authored
Problem reported by BYU. slurm.conf included a file one byte in length. Logic created a buffer one byte long and used fgets() to read the file. fgets() reads one byte less than the buffer size to include a trailing '\0', so it fails to read the file.
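The fgets() behaviour described above can be reproduced with a minimal sketch. The helper below is hypothetical (it is not the slurm.conf parsing code): it writes a string to a temporary file and reads it back into a buffer of a chosen size, returning how many bytes fgets() delivered.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: store `content` in a temporary file, then read it
 * back with fgets() into a zeroed buffer of `buf_size` bytes.
 * Returns the number of bytes obtained, or -1 if setup fails.
 */
static int read_with_fgets(const char *content, size_t buf_size)
{
    assert(buf_size > 0);

    FILE *fp = tmpfile();
    if (!fp)
        return -1;
    fputs(content, fp);
    rewind(fp);

    char *buf = calloc(buf_size, 1);
    if (!buf) {
        fclose(fp);
        return -1;
    }

    /* fgets() stores at most buf_size - 1 bytes plus the trailing '\0'. */
    int n = 0;
    if (fgets(buf, (int) buf_size, fp) != NULL)
        n = (int) strlen(buf);

    free(buf);
    fclose(fp);
    return n;
}
```

For a one-byte file, a one-byte buffer yields no data, while sizing the buffer as the file length plus one lets the single byte through.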
-
- 13 Aug, 2013 3 commits
-
-
Morris Jette authored
-
jette authored
This problem was reported by Harvard University and could be reproduced with a command line of "srun -N1 --tasks-per-node=2 -O id". With other job types, the error message could be logged many times for each job. This change logs the error once per job and only if the job request does not include the -O/--overcommit option.
-
Danny Auble authored
was down (slurmctld not running) during that time period.
-
- 09 Aug, 2013 1 commit
-
-
Danny Auble authored
version of Slurm.
-
- 07 Aug, 2013 1 commit
-
-
Danny Auble authored
-
- 06 Aug, 2013 1 commit
-
-
Danny Auble authored
of at multifactor poll.
-
- 01 Aug, 2013 1 commit
-
-
David Bigagli authored
to drain the node and log an error in the slurmd log file.
-
- 31 Jul, 2013 1 commit
-
-
David Bigagli authored
-
- 30 Jul, 2013 1 commit
-
-
Thomas Cadeau authored
-
- 26 Jul, 2013 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
-
- 25 Jul, 2013 1 commit
-
-
Alexander Bersenev authored
gres_alloc, gres_req, and gres_used fields were empty if the job was not started immediately. Bug 380
-
- 23 Jul, 2013 4 commits
-
-
David Bigagli authored
-
Danny Auble authored
-
David Bigagli authored
-
Danny Auble authored
-
- 22 Jul, 2013 1 commit
-
-
Danny Auble authored
-
- 18 Jul, 2013 1 commit
-
-
Tim Wickberg authored
-
- 16 Jul, 2013 2 commits
-
-
Danny Auble authored
accounting_storage/filetxt.
-
Danny Auble authored
erroneously on a bluegene system.
-
- 11 Jul, 2013 2 commits
-
-
Danny Auble authored
message.
-
Francois Diakhate authored
users.
-
- 10 Jul, 2013 2 commits
-
-
Morris Jette authored
-
Danny Auble authored
values.
-
- 09 Jul, 2013 4 commits
-
-
Danny Auble authored
-
Francois Diakhate authored
show information to users that had no associations on the system.
-
Morris Jette authored
Fix for bug 330
-
jette authored
-
- 08 Jul, 2013 1 commit
-
-
Thomas Cadeau authored
-
- 06 Jul, 2013 2 commits
-
-
Morris Jette authored
Previously srun would fail if the hostlist expression was larger than 0xffff bytes.
-
Morris Jette authored
-
- 05 Jul, 2013 2 commits
-
-
John Thiltges authored
When using ThreadsPerCore > 1, it appears that DefMemPerCPU is being scaled by slurmctld, but not by slurmd/slurmstepd. For example, we set ThreadsPerCore=2 and DefMemPerCPU=100. Running a single-core job, we would expect two threads to be allocated and AllocMem on the assigned node to increase by 200MB. scontrol reports that AllocMem increased by 200MB, but the task/cgroup plugin only sees 100MB of RAM. It looks like the problem may lie in common/slurm_cred.c:format_core_allocs(). The function counts the job/step cores and multiplies the mem_limits, but it does not scale the CPU count as slurmd/slurmd/req.c:_check_job_credential() does. See bug 309.
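The arithmetic behind the mismatch in this report can be sketched as follows. The two functions are hypothetical illustrations of the two calculations described, not the code in slurm_cred.c or req.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical: memory limit in MB when every hardware thread on an
 * allocated core counts as a CPU (the value the report expects).
 */
static uint32_t mem_limit_scaled(uint32_t cores, uint32_t threads_per_core,
                                 uint32_t mem_per_cpu_mb)
{
    assert(threads_per_core > 0);
    return cores * threads_per_core * mem_per_cpu_mb;
}

/*
 * Hypothetical: memory limit in MB when only cores are counted and the
 * thread count is ignored (the value the report observes in the cgroup).
 */
static uint32_t mem_limit_unscaled(uint32_t cores, uint32_t mem_per_cpu_mb)
{
    return cores * mem_per_cpu_mb;
}
```

With ThreadsPerCore=2 and DefMemPerCPU=100, a single-core job gives 200 from the scaled calculation and 100 from the unscaled one, matching the 200MB and 100MB figures in the report.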
-
jette authored
-
- 28 Jun, 2013 1 commit
-
-
Morris Jette authored
Affects jobs with the --exclusive and --cpus-per-task options. Bug 355
-