- 14 Aug, 2015 37 commits
-
-
Morris Jette authored
Add logic to burst_buffer/cray and generic to set job's buffer size in TRES array for limits and accounting. This includes only the job-specific buffers, not persistent buffers, which will be handled separately.
-
Brian Christiansen authored
-
Danny Auble authored
-
Carlos Fenoy authored
The problem is that with high rate profiling, the cpu usage becomes a binary field (see the output below). I would like to have a more accurate value of the cpu usage. I'm thinking about writing another profiling plugin that may be very useful in our environment, and probably also in other environments where compute nodes are shared. Bug 1748
-
Danny Auble authored
change is to ease future transition to IPv6 by reducing the number of places that use AF_INET. AF_SLURM currently points to AF_INET, though.
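
The aliasing idea in this commit can be illustrated with a minimal Python analogue. This is a sketch only: the actual change is in Slurm's C sources, and the helper name `open_stream_socket` is hypothetical. The only thing taken from the commit message is that `AF_SLURM` currently resolves to `AF_INET`.

```python
import socket

# Assumption: the alias is a single constant. The commit message only says
# that AF_SLURM currently points to AF_INET; how it is defined is not shown.
AF_SLURM = socket.AF_INET


def open_stream_socket() -> socket.socket:
    """Hypothetical helper: callers name AF_SLURM, never AF_INET directly."""
    return socket.socket(AF_SLURM, socket.SOCK_STREAM)
```

With call sites written this way, a later move to IPv6 would mean changing the one alias rather than every place a socket is created.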
-
Danny Auble authored
-
Danny Auble authored
char in the tres string.
-
Danny Auble authored
-
Danny Auble authored
removed, so make sure we process correctly, sort it while we are at it.
-
Danny Auble authored
-
Danny Auble authored
slurmdb_combine_tres_strings.
-
Brian Christiansen authored
Bug 858
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Morris Jette authored
Plus some cosmetic changes to a couple of tests
-
Danny Auble authored
-
Brian Christiansen authored
-
Danny Auble authored
-
Danny Auble authored
end up pointing to qos_ptr_2 erroneously.
-
Danny Auble authored
clearing the pointer in the partition record.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Brian Christiansen authored
Fix case where job would get the wrong cpu count when using --ntasks-per-core and --cpus-per-task together. Bug 1370
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 1295 Continuation of 166a4eb8
-
Morris Jette authored
-
Daniel Ahlin authored
We are using a non-default AuthInfo configuration and, based on the log messages we see, I believe this is not properly handled in certain parts of the code. Typical log message:
Aug 12 17:06:15 t02n20 slurmd[27001]: error: Munge encode failed: Failed to access "/var/run/munge/munge.socket.2": No such file or directory
Aug 12 17:06:15 t02n20 slurmd[27001]: error: Creating authentication credential: Socket communication error
Aug 12 17:06:15 t02n20 slurmd[27001]: error: stepd_connect to 3165.0 failed: Protocol authentication error
Aug 12 17:06:15 t02n20 slurmd[27001]: error: If munged is up, restart with --num-threads=10
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Unfortunately the reservation's core_bitmap is a global bitmap, and the nodes in the system may have changed in terms of their node count, or nodes may have been added or removed. Make a best effort to rebuild the reservation's core_bitmap from the limited information currently available. The specific cores might change, but this logic at least leaves their count constant and uses the same nodes. Bug 1850
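
The best-effort rebuild described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the function name `rebuild_core_bitmap`, the per-node reserved-core counts, and the list-of-booleans bitmap are all hypothetical stand-ins for Slurm's internal structures.

```python
from typing import Dict, List, Tuple


def rebuild_core_bitmap(
    reserved_cores: Dict[str, int],
    layout: List[Tuple[str, int]],
) -> List[bool]:
    """Rebuild a global core bitmap against the current node layout.

    reserved_cores: node name -> number of cores the reservation held there
                    (hypothetical representation of the saved information).
    layout:         current (node name, core count) pairs in global order.
    """
    bitmap: List[bool] = []
    for node, core_cnt in layout:
        # Assumption: take the first cores on each reserved node. Which cores
        # are chosen may differ from the original reservation. The count is
        # capped if the node now has fewer cores than were reserved.
        want = min(reserved_cores.get(node, 0), core_cnt)
        bitmap.extend([True] * want + [False] * (core_cnt - want))
    return bitmap
```

For example, a reservation that held two cores on `n1` and one on `n3` can be rebuilt after a node has been inserted ahead of them; the global bit offsets shift, but the per-node counts and the nodes used stay the same:

```python
layout = [("n0", 4), ("n1", 4), ("n2", 4), ("n3", 4)]
bitmap = rebuild_core_bitmap({"n1": 2, "n3": 1}, layout)
```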
-
- 13 Aug, 2015 3 commits
-
-
Morris Jette authored
Add logic to get default account/qos information for newly discovered buffers in order to properly enforce limits. Add logic to get TRES index for limits. Refactor bb_find_user_rec() function argument.
-
Danny Auble authored
-
Danny Auble authored
we are running with associations.
-