- 17 Oct, 2012 5 commits
-
-
Morris Jette authored
-
Danny Auble authored
Missing spaces in dodump function of sacct
-
Carles Fenoy authored
-
jette authored
No real changes to logic other than some additional error checking.
-
Blomqvist Janne authored
-
- 16 Oct, 2012 1 commit
-
-
Morris Jette authored
Preempt jobs only when insufficient idle resources exist to start job, regardless of the node weight.
-
- 15 Oct, 2012 2 commits
-
-
Morris Jette authored
Conflicts: NEWS RELEASE_NOTES
-
Morris Jette authored
-
- 05 Oct, 2012 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
Preemptor was not being scheduled. Fix for bugzilla #3.
-
Morris Jette authored
While this change lets gang scheduling happen, it overallocates resources from different priority partitions when gang scheduling is not running.
-
- 04 Oct, 2012 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
Preemptor was not being scheduled. See bugzilla #3 for details.
-
- 03 Oct, 2012 9 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Nathan Yee authored
-
Morris Jette authored
-
Morris Jette authored
Tried to use uint32_t to store a negative number.
-
- 02 Oct, 2012 10 commits
-
-
Morris Jette authored
-
Morris Jette authored
See bugzilla bug 132.

When using select/cons_res and CR_Core_Memory, hyperthreaded nodes may be overcommitted on memory when CPU counts are scaled. I've tested 2.4.2 and HEAD (2.5.0-pre3).

Conditions:
-----------
* SelectType=select/cons_res
* SelectTypeParameters=CR_Core_Memory
* Using threads
  - Ex. "NodeName=linux0 Sockets=1 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=400"

Description:
------------
In the cons_res plugin, _verify_node_state() in job_test.c checks if a node has sufficient memory for a job. However, the per-CPU memory limits appear to be scaled by the number of threads. This new value may exceed the available memory on the node. And, once a node is overcommitted on memory, future memory checks in _verify_node_state() will always succeed.

Scenario to reproduce:
----------------------
With the example node linux0, we run a single-core job with 250MB/core:
    srun --mem-per-cpu=250 sleep 60
cons_res checks that it will fit: ((real - alloc) >= job mem), i.e. ((400 - 0) >= 250), and the job starts.

Then, the memory requirement is doubled:
    "slurmctld: error: cons_res: node linux0 memory is overallocated (500) for job X"
    "slurmd: scaling CPU count by factor of 2"
This job should not have started.

While the first job is still running, we submit a second, identical job:
    srun --mem-per-cpu=250 sleep 60
cons_res checks that it will fit: ((400 - 500) >= 250), the unsigned int wraps, the test passes, and the job starts.
This second job also should not have started.
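The wraparound described in this report can be reproduced in isolation. The sketch below is not the cons_res code from job_test.c: the function names `fits_unsigned` and `fits_guarded` and their parameters are hypothetical, chosen only to mirror the ((real - alloc) >= job mem) check quoted above. It assumes the memory counters are 32-bit unsigned values, as the report's mention of the unsigned int wrapping suggests.

```c
#include <stdint.h>

/*
 * Hypothetical illustration only; not taken from the SLURM sources.
 * Mirrors the check quoted in the report: (real - alloc) >= job_mem.
 * With unsigned operands, alloc_mem > real_mem makes the subtraction
 * wrap around to a very large value, so the comparison passes.
 */
static int fits_unsigned(uint32_t real_mem, uint32_t alloc_mem,
                         uint32_t job_mem)
{
    /* The cast keeps the arithmetic modulo 2^32 on every platform. */
    return (uint32_t)(real_mem - alloc_mem) >= job_mem;
}

/*
 * One possible guard (an assumption, not necessarily the fix that was
 * committed): reject the node outright when it is already overallocated,
 * and only subtract when the result cannot wrap.
 */
static int fits_guarded(uint32_t real_mem, uint32_t alloc_mem,
                        uint32_t job_mem)
{
    if (alloc_mem > real_mem)
        return 0;               /* node is already overcommitted */
    return (real_mem - alloc_mem) >= job_mem;
}
```

With the numbers from the report, `fits_unsigned(400, 500, 250)` evaluates to true because 400 - 500 wraps to 4294967196, while `fits_guarded(400, 500, 250)` evaluates to false. Both functions accept the first job, where nothing is allocated yet.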
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
- 01 Oct, 2012 1 commit
-
-
Danny Auble authored
-
- 29 Sep, 2012 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
- 28 Sep, 2012 1 commit
-
-
Don Lipari authored
-
- 27 Sep, 2012 4 commits
-
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
purged from the system if its front-end node goes down.
-
Danny Auble authored
-