- 15 Apr, 2015 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
Prevent slurmdbd error if cluster added or removed while rollup in progress. Removing a cluster can cause slurmdbd to abort. Adding a cluster can cause the slurmdbd rollup to hang.
-
Morris Jette authored
Specialized memory (a node's MemSpecLimit configuration parameter) is not available for allocation to jobs. Select/linear, cons_res, and serial plugins fixed. bug 1548
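As a rough illustration of the accounting this fix implies, the memory a node can offer to jobs is its real memory minus the specialized memory. The function and parameter names below are hypothetical and are not taken from the Slurm source; this is a sketch of the arithmetic only.

```c
#include <stdint.h>

/* Hypothetical helper, not Slurm source: memory (in MB) that a node can
 * allocate to jobs once its specialized memory has been set aside. */
uint64_t allocatable_mem_mb(uint64_t real_mem_mb, uint64_t mem_spec_limit_mb)
{
	/* Guard against a limit that is larger than the node's memory,
	 * which would otherwise wrap the unsigned subtraction. */
	if (mem_spec_limit_mb >= real_mem_mb)
		return 0;
	return real_mem_mb - mem_spec_limit_mb;
}
```

With these assumed names, a node with 64000 MB of memory and a 2000 MB limit would offer 62000 MB to jobs.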
-
Morris Jette authored
-
- 14 Apr, 2015 14 commits
-
-
Morris Jette authored
When allocating resources with resolution of sockets, charge the job for all CPUs on allocated sockets rather than just the CPUs on used cores.
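The difference between the two charging rules can be sketched with simple arithmetic. The helper below is hypothetical (its name, parameters, and the choice to count hardware threads as CPUs are assumptions for illustration, not Slurm code).

```c
#include <stdint.h>

/* Hypothetical illustration: CPUs charged when whole sockets are allocated.
 * Every core on an allocated socket counts, whether or not the job uses it. */
uint32_t cpus_charged_by_socket(uint32_t sockets_allocated,
				uint32_t cores_per_socket,
				uint32_t threads_per_core)
{
	return sockets_allocated * cores_per_socket * threads_per_core;
}

/* Hypothetical illustration: CPUs charged when only used cores count. */
uint32_t cpus_charged_by_core(uint32_t cores_used, uint32_t threads_per_core)
{
	return cores_used * threads_per_core;
}
```

Under these assumptions, a job using 3 cores on one allocated 8-core socket with 2 threads per core would be charged 16 CPUs under the socket rule, compared with 6 under the core rule.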
-
David Bigagli authored
-
David Bigagli authored
-
Remi Palancher authored
-
Morris Jette authored
-
Danny Auble authored
-
Nicolas Joly authored
bug 1595
-
Morris Jette authored
This is a correction to commit df8e3447
-
Morris Jette authored
job_submit/lua: Enable reading and writing job environment variables. For example: if (job_desc.environment.LANGUAGE == "en_US") then ...
-
David Bigagli authored
-
David Bigagli authored
This reverts commit b87df0fb2ae474f815ea707844ae30c7b8606dc9.
-
Remi Palancher authored
-
David Bigagli authored
-
Remi Palancher authored
-
- 13 Apr, 2015 3 commits
-
-
Morris Jette authored
This removes some old logic that was just commented out when the LUA tables were converted to a new structure.
-
Morris Jette authored
-
Morris Jette authored
The error was triggered when an RPC to collect accounting information had started but not yet completed at the time the logic tested whether to start a node ping RPC. In other words, the logic wasn't testing for the completion of one RPC against the start of that same RPC, but against the start of a different RPC. This change moves all of the timeout logic into the ping_nodes.c module, where we can make sure that the timing of different RPCs does not get confused. bug 1190
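One way to keep the timing of different RPCs separate is to hold one start timestamp per RPC kind and compare each kind only against its own start time. The sketch below is an assumption about the general technique, not the contents of ping_nodes.c; the enum values and function names are hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical RPC kinds; the names are illustrative only. */
enum rpc_kind { RPC_PING, RPC_ACCT_GATHER, RPC_KIND_COUNT };

/* One start timestamp per RPC kind; 0 means "not in progress". */
static time_t rpc_start[RPC_KIND_COUNT];

/* Record that an RPC of this kind started at time `now`. */
void rpc_begin(enum rpc_kind kind, time_t now)
{
	rpc_start[kind] = now;
}

/* Record that the RPC of this kind completed. */
void rpc_end(enum rpc_kind kind)
{
	rpc_start[kind] = 0;
}

/* True only if an RPC of this same kind is in progress and has been
 * running for longer than `timeout` seconds. */
bool rpc_timed_out(enum rpc_kind kind, time_t now, time_t timeout)
{
	return rpc_start[kind] != 0 && (now - rpc_start[kind]) > timeout;
}
```

Because each kind has its own slot, a slow accounting RPC cannot make a ping RPC appear to have timed out.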
-
- 11 Apr, 2015 8 commits
-
-
Morris Jette authored
-
Morris Jette authored
This is a correction to commit 1eb32d90 so the select/cray plugin can be built on a non-Cray system (for testing). bug 1587
-
Morris Jette authored
GRES counters changed in v15.08 from 32- to 64-bit.
-
Morris Jette authored
-
Morris Jette authored
slurmd was logging "Error reading step 2031886.0 memory limits" (with various job/step IDs) because it was treating SLURM_SUCCESS, rather than SLURM_ERROR, as an error. bug 1589
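The class of bug described here can be sketched as an inverted return-code test. Everything below is a stand-in written for illustration: the constants are defined locally so the sketch compiles on its own, and the reader function and its argument are hypothetical, not slurmd code.

```c
#include <stdio.h>

/* Local stand-ins for the success and error return codes. */
#define DEMO_SUCCESS 0
#define DEMO_ERROR   (-1)

/* Hypothetical reader: succeeds or fails depending on `ok`. */
static int read_step_limits(int ok)
{
	return ok ? DEMO_SUCCESS : DEMO_ERROR;
}

/* Returns 1 if an error was logged, 0 otherwise. */
int check_limits(int ok)
{
	int rc = read_step_limits(ok);

	/* Log only on failure. A test that logs when rc == DEMO_SUCCESS
	 * would reproduce the symptom described above: an error message
	 * on every successful read. */
	if (rc != DEMO_SUCCESS) {
		fprintf(stderr, "Error reading step memory limits\n");
		return 1;
	}
	return 0;
}
```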
-
Morris Jette authored
Add logging of premature EOF
-
Danny Auble authored
-
Morris Jette authored
Disable changes to GRES count while jobs are running on the node. Previous logic could clear GRES count data and then generate counter underflow when attempting to deallocate the job's resources (including future job terminations when computing when and where pending jobs will start as part of backfill scheduling logic). bug 1589
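The underflow mentioned here is what happens when an unsigned counter is decremented by more than it holds. A defensive decrement can be sketched as follows; the helper name and signature are hypothetical, and the actual fix in this commit is to disallow the count change, not to add a guard like this one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: subtract a job's GRES count from a node's allocated
 * count. Refuses the update instead of letting the unsigned value wrap. */
bool gres_dealloc(uint64_t *node_alloc, uint64_t job_count)
{
	if (job_count > *node_alloc)
		return false;	/* would underflow; leave the counter unchanged */
	*node_alloc -= job_count;
	return true;
}
```

If the node's count were cleared to 0 while a job still held 2 units, an unguarded subtraction would wrap to a very large 64-bit value; with the guard the call simply fails and the counter stays at 0.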
-
- 10 Apr, 2015 1 commit
-
-
Morris Jette authored
-
- 09 Apr, 2015 3 commits
-
-
Morris Jette authored
-
Danny Auble authored
if many steps/jobs finish at once.
-
Morris Jette authored
* Add "--thread-spec" option to salloc, sbatch and srun commands. This is the count of threads reserved for system use per node.
* Add ability for scontrol to get/set job ThreadSpec
* sview: Add job ThreadSpec field to get/set
* Modify select/cons_res to manage job allocations while leaving specialized threads for system use.
* core_spec plugins: Fix system task binding logic and logging
* Modify squeue output for thread_spec values
* task/affinity and cgroup: Enhanced task binding logic
-
- 08 Apr, 2015 7 commits
-
-
David Bigagli authored
-
Morris Jette authored
bug 1589
-
Morris Jette authored
-
Morris Jette authored
bug 1570
-
Morris Jette authored
-
Morris Jette authored
A function in sshare changed, which broke several tests. The commit that broke the tests is 5dacc0fc
-
Danny Auble authored
-