- 09 Aug, 2016 6 commits
-
-
Morris Jette authored
Make EnforcePartLimit support logic work with any ordering of partitions in job submit request. Developed jointly with Alejandro Sanchez <alex@schedmd.com>. bug 2920
-
Dominik Bartkiewicz authored
The calculation used the node count in place of the CPU count, which resulted in incorrect estimates. CID 44784.
-
Dominik Bartkiewicz authored
CID 44787.
-
Tim Wickberg authored
Bug 2950. Also identified as CID 56684 (copy+paste error).
-
Morris Jette authored
-
Dominik Bartkiewicz authored
CID 45023 and 45024.
-
- 08 Aug, 2016 3 commits
-
-
Morris Jette authored
Fix task:CPU binding logic for some processors. This bug was introduced in version 16.05.1 by a change to address a KNL binding problem. bug 2972
-
Dominik Bartkiewicz authored
Needed due to part_filter_set() calls; without write lock this can race returning inconsistent results to 'sinfo'. Bug 2958.
-
Morris Jette authored
Regression test fixes if SelectTypePlugin not managing memory and no node memory size set (defaults to 1 MB per node).
-
- 07 Aug, 2016 1 commit
-
-
Morris Jette authored
Fix race condition in the account_gather plugin that could result in a job being stuck in COMPLETING state. bug 2973
-
- 03 Aug, 2016 1 commit
-
-
Morris Jette authored
Prior logic used to create an advanced reservation based upon a core count would ignore the specialized cores. Then when a job tried to use the reservation, it would consider the specialized cores and not be able to use the core count used in the reservation creation. This change considers specialized cores when creating the reservation.
-
- 02 Aug, 2016 1 commit
-
-
Sergey Meirovich authored
If slurmstepd had been swapped out before an upgrade happened, this could easily lead to a SIGBUS at any time after the upgrade. Prevent that by mlocking it. bug 2334
-
- 29 Jul, 2016 4 commits
-
-
Moe Jette authored
SLURM_JOB_RESERVATION environment variables are set for the salloc command. Document the same environment variables for the salloc, sbatch and srun commands in their man pages.
-
Danny Auble authored
Also we are making extern lower case in the api to match this and sacct.
-
Danny Auble authored
-
Danny Auble authored
that had a partition in them.
-
- 28 Jul, 2016 2 commits
-
-
Morris Jette authored
Document that the SLURM_JOB_ACCOUNT and SLURM_JOB_QOS environment variables are set for the srun command in its man page. bug 2945
-
Morris Jette authored
Document that the SLURM_JOB_ACCOUNT and SLURM_JOB_QOS environment variables are set for the salloc and sbatch commands in their man pages. bug 2945
-
- 27 Jul, 2016 4 commits
-
-
Morris Jette authored
Document that persistent burst buffers cannot be created or destroyed using the salloc or srun --bb options. bug 2404
-
Brian Christiansen authored
Missed in b5bba34c
-
Danny Auble authored
on batch script completes.
-
Danny Auble authored
-
- 26 Jul, 2016 2 commits
-
-
Morris Jette authored
-
Danny Auble authored
(difference between start of job and when it was eligible).
-
- 25 Jul, 2016 2 commits
-
-
David Gloe authored
Bug 2939.
-
Danny Auble authored
-
- 23 Jul, 2016 1 commit
-
-
Morris Jette authored
-
- 22 Jul, 2016 4 commits
-
-
Dominik Bartkiewicz authored
Inadvertently broken in commit 05eac196. Bug 2912.
-
Danny Auble authored
or failed based on the signal that would always be killing it.
-
Danny Auble authored
end of the job to do it.
-
Danny Auble authored
make them using the master job ID instead of the normal job ID.
-
- 21 Jul, 2016 1 commit
-
-
Morris Jette authored
Treat invalid user ID in AllowUserBoot option of knl.conf file as error rather than fatal (log and do not exit).
-
- 20 Jul, 2016 3 commits
-
-
Morris Jette authored
Prevent slurmctld abort if a job is killed or requeued while waiting for reboot of its allocated compute nodes. The _wait_boot() function would reference job_ptr->node_bitmap, which would be NULL.
-
Boris Karasev authored
Bug 2908
-
Tim Wickberg authored
The step hasn't been assigned resources, so the select_jobinfo struct hasn't yet been populated. Calling select_g_step_finish would dereference it, causing a segfault. Bug 2922.
-
- 19 Jul, 2016 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
If the user is not allowed to use the partition, then do not check that user's group access again for 5 seconds. bug 2913
-
Morris Jette authored
Improve partition AllowGroups caching. Update the table of UIDs permitted to use a partition based upon its AllowGroups configuration parameter as new valid UIDs are found, rather than looking up that user's group information for every job they submit, which can involve considerable overhead for some systems. bug 2913
-
Morris Jette authored
Minimize preempted jobs for configurations with multiple jobs per node. Previous logic would preempt every job on a node allocated to the pending job. bug 2906
-
Morris Jette authored
Fix for core selection with job --gres-flags=enforce-binding option. Previous logic would in some cases allocate a job zero cores, resulting in slurmctld abort. bug 2808
-
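The AllowGroups caching described in the 19 Jul entries for bug 2913 can be pictured roughly as follows. This is a minimal Python sketch, not Slurm's C implementation: the class name `PartitionAccessCache`, the `lookup` callback, and the injectable clock are all hypothetical, and it assumes the 5-second window applies to users whose group check failed.

```python
import time


class PartitionAccessCache:
    """Hypothetical per-partition cache of UID access decisions."""

    def __init__(self, lookup, deny_ttl=5.0, clock=time.monotonic):
        self._lookup = lookup      # expensive group lookup: uid -> bool (assumed)
        self._deny_ttl = deny_ttl  # seconds to remember a failed check
        self._clock = clock        # injectable so the sketch can be tested
        self._allowed = set()      # UIDs already found to be permitted
        self._denied_at = {}       # uid -> time of the last failed check

    def is_allowed(self, uid):
        # Known-good UIDs never trigger another group lookup.
        if uid in self._allowed:
            return True
        now = self._clock()
        denied = self._denied_at.get(uid)
        # A recent failure is trusted until the deny window expires.
        if denied is not None and now - denied < self._deny_ttl:
            return False
        # Otherwise do the expensive lookup and remember the outcome.
        if self._lookup(uid):
            self._allowed.add(uid)
            self._denied_at.pop(uid, None)
            return True
        self._denied_at[uid] = now
        return False
```

With this shape, a user who submits many jobs costs one group lookup when permitted, and at most one lookup per 5 seconds when not.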