- 02 May, 2014 14 commits
-
-
Danny Auble authored
Conflicts: META NEWS
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
cbcea672
-
Danny Auble authored
situations where the association tree has multiple grp limits, only honor the first one.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
This is for bug 775
-
Danny Auble authored
-
Danny Auble authored
Conflicts: src/slurmd/slurmstepd/slurmstepd_job.c
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 01 May, 2014 8 commits
-
-
Danny Auble authored
regression from 2a674aee
-
Danny Auble authored
Conflicts: NEWS
-
Danny Auble authored
-
Danny Auble authored
is running.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
is running.
-
Danny Auble authored
-
- 30 Apr, 2014 11 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
David Bigagli authored
together.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Switch/nrt - Properly track usage of CAU and RDMA resources with multiple tasks per compute node. Previous logic would allocate resources once per task and then deallocate once per node, leaking CAU and RDMA resources and preventing their use by future jobs.
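The leak described above comes from a mismatch between how often resources are allocated and how often they are released. The following is a minimal sketch of that mismatch, not the switch/nrt code itself: `NodeResources`, `run_job_buggy`, and `run_job_fixed` are hypothetical names, and a plain counter stands in for the real CAU and RDMA bookkeeping.

```python
class NodeResources:
    """Hypothetical per-node counter standing in for CAU/RDMA resources."""

    def __init__(self, total):
        self.total = total
        self.in_use = 0

    def allocate(self, count=1):
        if self.in_use + count > self.total:
            raise RuntimeError("no free resources on node")
        self.in_use += count

    def release(self, count=1):
        # Never drop below zero, even if the caller over-releases.
        self.in_use = max(0, self.in_use - count)


def run_job_buggy(node, tasks_on_node):
    # Allocate once per task ...
    for _ in range(tasks_on_node):
        node.allocate()
    # ... but release only once per node: (tasks_on_node - 1) units leak.
    node.release()


def run_job_fixed(node, tasks_on_node):
    # Allocate once per task ...
    for _ in range(tasks_on_node):
        node.allocate()
    # ... and release the same number of units when the job ends.
    node.release(tasks_on_node)
```

With four tasks on a node, the first variant leaves three units marked as in use after the job ends, which is why later jobs could not obtain them; the second variant returns the counter to zero.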
-
Morris Jette authored
-
Morris Jette authored
If a job is held, then only release it with the "scontrol release <jobid>" command rather than a simple reset of the job's priority. This is needed to support job arrays better. Otherwise a priority reset of a job array would free all requeued/held jobs from that job array rather than leaving them held.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
If a job's priority is set to a non-zero value, then always clear the JOB_SPECIAL_EXIT job state flag, not only when the prior state is HELD_USER or HELD. I'm not sure how the job could have cleared the HELD state and changed to NO_REASON, but this would fix the problem. Bug 760.
-
- 29 Apr, 2014 7 commits
-
-
Morris Jette authored
Modify slurmd to keep track of which jobs have already been launched. If the launch is complete, then process suspend requests immediately. Previously the suspend request was always delayed by 1 second, which adversely impacts gang scheduling performance. If the job can't be found (say after a slurmd restart), then delay the suspend by up to 3 seconds, but only once.
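The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not slurmd's implementation: `SuspendDelayTracker` and its methods are hypothetical names, the sketch returns the maximum delay rather than waiting, and it treats any job not recorded as launched as "not found", since the commit message does not say how a known but still-launching job is handled.

```python
class SuspendDelayTracker:
    """Hypothetical tracker deciding how long to delay a suspend request."""

    MAX_UNKNOWN_JOB_DELAY = 3  # seconds, per the commit message

    def __init__(self):
        self.launched = set()      # job ids whose launch has completed
        self.delayed_once = set()  # unknown job ids already delayed once

    def record_launch_complete(self, job_id):
        self.launched.add(job_id)

    def suspend_delay(self, job_id):
        """Return the maximum number of seconds to wait before suspending."""
        if job_id in self.launched:
            # Launch is known to be complete: suspend immediately.
            return 0
        if job_id in self.delayed_once:
            # Unknown job that was already delayed: do not delay again.
            return 0
        # Unknown job (for example after a restart): delay, but only once.
        self.delayed_once.add(job_id)
        return self.MAX_UNKNOWN_JOB_DELAY
```

For example, after `record_launch_complete(42)`, a suspend request for job 42 gets no delay, while a request for an unrecorded job gets a delay of up to 3 seconds the first time and none on later requests.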
-
Morris Jette authored
Change the integer-to-hex function to support 32-bit unsigned integers and exit on systems with more than 32 CPUs per node, since Expect cannot work with numbers so large.
-
David Bigagli authored
-
Morris Jette authored
Change the integer-to-hex function to support 32-bit unsigned integers and exit on systems with more than 32 CPUs per node, since Expect cannot work with numbers so large.
-
David Bigagli authored
-
David Bigagli authored
-
David Bigagli authored
-