- 03 Dec, 2013 4 commits
-
-
Morris Jette authored
Use hash function to locate job records for improved performance.
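As a rough illustration of the idea (not the actual slurmctld code), a job ID can be hashed into a fixed-size table of chained records so that a lookup touches one bucket instead of walking the whole job list. The names `job_rec`, `job_hash_add`, `job_hash_find`, and the table size are hypothetical and chosen only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define JOB_HASH_SIZE 1024	/* hypothetical table size */

/* Hypothetical, stripped-down job record */
struct job_rec {
	uint32_t job_id;
	struct job_rec *hash_next;	/* next record in the same bucket */
};

static struct job_rec *job_hash[JOB_HASH_SIZE];

/* Map a job ID to a bucket index */
static size_t job_hash_inx(uint32_t job_id)
{
	return job_id % JOB_HASH_SIZE;
}

/* Insert a record at the head of its bucket's chain */
static void job_hash_add(struct job_rec *job)
{
	size_t inx = job_hash_inx(job->job_id);

	job->hash_next = job_hash[inx];
	job_hash[inx] = job;
}

/* Return the record for job_id, or NULL if it is not in the table */
static struct job_rec *job_hash_find(uint32_t job_id)
{
	struct job_rec *job = job_hash[job_hash_inx(job_id)];

	while (job && job->job_id != job_id)
		job = job->hash_next;
	return job;
}
```

With a reasonably sized table, each chain stays short, so the cost of a lookup depends on the chain length rather than on the total number of job records.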
-
Morris Jette authored
Change partition write lock to a read lock as we use a different mechanism for hidden partitions in getting individual jobs.
-
Morris Jette authored
-
Morris Jette authored
Correct logic returning remaining job dependencies in job information reported by scontrol and squeue. Eliminates vestigial descriptors with no job ID values (e.g. "afterany"). As dependencies were removed, the job ID values were removed from the strings, but not the descriptors. This eliminates both. It also checks the full job ID to make sure we do not remove "afterany:1234" when job "123" completes.
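The behaviour described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper names `append` and `dep_prune` are hypothetical, and the dependency string is assumed to be a comma-separated list of `descriptor:id[:id...]` terms. Each ID is parsed as a whole number before comparison, so "1234" is never treated as a match for job 123, and a descriptor whose IDs have all been removed is dropped along with them.

```c
#include <stdlib.h>
#include <string.h>

/* Append n bytes of src to the string in out, truncating if out is full */
static void append(char *out, size_t out_size, const char *src, size_t n)
{
	size_t used = strlen(out);

	if (used + 1 >= out_size)
		return;
	if (n > out_size - used - 1)
		n = out_size - used - 1;
	memcpy(out + used, src, n);
	out[used + n] = '\0';
}

/*
 * Hypothetical helper: copy deps into out, omitting done_id.
 * A descriptor that loses all of its IDs is omitted too; a descriptor
 * that never had IDs (assumed here to be valid on its own) is kept.
 */
static void dep_prune(const char *deps, unsigned long done_id,
		      char *out, size_t out_size)
{
	const char *term = deps;

	if (out_size == 0)
		return;
	out[0] = '\0';

	while (*term) {
		const char *term_end = term + strcspn(term, ",");
		size_t type_len = strcspn(term, ":,");
		const char *p = term + type_len;	/* at ':' or term_end */
		size_t rollback = strlen(out);
		int had_ids = 0, kept = 0;

		if (rollback > 0)
			append(out, out_size, ",", 1);
		append(out, out_size, term, type_len);

		while (p < term_end) {			/* p points at ':' */
			const char *id = p + 1;
			size_t id_len = strcspn(id, ":,");
			char *end;
			unsigned long val = strtoul(id, &end, 10);

			had_ids = 1;
			/* Keep the ID unless the entire token equals done_id */
			if (id_len > 0 &&
			    (end != id + id_len || val != done_id)) {
				append(out, out_size, ":", 1);
				append(out, out_size, id, id_len);
				kept = 1;
			}
			p = id + id_len;
		}
		if (had_ids && !kept)
			out[rollback] = '\0';	/* drop the empty descriptor */

		term = term_end;
		if (*term == ',')
			term++;
	}
}
```

For example, pruning job 123 from "afterany:123:1234,afterok:123" leaves "afterany:1234": the second term disappears entirely because no IDs remain under its descriptor.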
-
- 02 Dec, 2013 3 commits
-
-
Morris Jette authored
Fix race condition on batch job termination that could result in a job exit code of 0xfffffffe if the slurmd on node zero registers its active jobs at the same time that slurmstepd is recording the job's exit code. bug 535
-
Morris Jette authored
-
David Bigagli authored
-
- 29 Nov, 2013 6 commits
-
-
Morris Jette authored
-
Morris Jette authored
There was already cgroup locking in the version 14.03 code base using different variable names and slightly different logic from that in commit 3f6d9e36. This commit is a variant of that commit in order to make the logic in version 2.6 match that of our next release (logic which is already pretty well tested). bug 447
-
Morris Jette authored
proctrack/cgroup - Add locking to prevent race condition where one job step is ending for a user or job at the same time another job step is starting and the user or job container is deleted from under the starting job step. bug 447
-
Morris Jette authored
This eliminates some now-redundant arrays and variable copying introduced in commit 74d1a4b4. bug 525
-
David Bigagli authored
Substantial performance improvement for systems with Shared=YES or FORCE and large numbers of running jobs (replace bubble sort with quick sort). Bug 525
-
David Bigagli authored
Remove trailing spaces. No changes in logic.
-
- 27 Nov, 2013 5 commits
-
-
Morris Jette authored
Original code worked only for Cray systems. For other systems it set gres_alloc to the total number of each GRES allocated on each node to any job
-
Morris Jette authored
-
Morris Jette authored
-
Jason Bacon authored
-
Morris Jette authored
-
- 26 Nov, 2013 5 commits
-
-
Chris Scheller authored
-
Morris Jette authored
-
Morris Jette authored
Logs errors related to apbasil use
-
Morris Jette authored
No change in logic, just move the logic that resets a batch job's accounting information into its own function.
-
Morris Jette authored
-
- 25 Nov, 2013 5 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
jette authored
No change in underlying logic
-
jette authored
This fixes a problem where a job contains a license that is removed in a slurmctld reconfiguration. Without this change, the job would be left with a non-zero license_list pointer referencing memory that had been freed. bug 527
-
jette authored
Increase the range of possible reservation time values to allow for a really long RPC delay (possibly due to slurmctld fail over from primary to backup controller). Also change to a #define value for clarity. bug 527
-
- 24 Nov, 2013 3 commits
- 18 Nov, 2013 1 commit
-
-
Morris Jette authored
The time/resource allocation matrix is rebuilt on each job exit, which severely impacts performance at large counts of running jobs (say >10k jobs).
-
- 14 Nov, 2013 4 commits
-
-
Morris Jette authored
bug 511
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 13 Nov, 2013 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
This makes it simpler to enable detailed debugging for reservations. This includes more information than we probably want to see with the DebugFlag=reservation and would be only for developer debugging.
-
Morris Jette authored
This might have worked fine for core reservations or when there are sufficient idle nodes to use, but the select_g_resv_test() function clears the node bitmap for nodes that it cannot use, and the reservation create logic did not restore that bitmap after a failed resource selection attempt. This logic restores the node bitmap on a failed call to select_g_resv_test() so we can add nodes to the bitmap of available nodes rather than having it repeatedly cleared. The logic also adds some performance enhancements that I will add to in the next commit.
-
Morris Jette authored
-