- 16 Mar, 2016 9 commits
-
-
Danny Auble authored
-
Danny Auble authored
time. https://bugs.schedmd.com/show_bug.cgi?id=2547 The code just wasn't fully baked before and was probably written before a lot of the other supporting code was done, i.e. assoc_mgr_set_assoc|qos_tres_cnt were written specifically for this kind of thing. Many of the usage structures weren't realloced either, and the tres_cnt local to each qos and assoc wasn't updated. So all in all pretty bad code - bad Danny. This makes sure all of this is set up and no memory corruption happens.
-
Morris Jette authored
-
Morris Jette authored
Generate burst buffer use completion email immediately after teardown completes rather than at job purge time (likely minutes later). Bug 2539.
-
Morris Jette authored
Change burst buffer use completion message from "SLURM Job_id=1360353 Name=tmp Staged Out, StageOut time 00:01:47" to "SLURM Job_id=1360353 Name=tmp StageOut/Teardown time 00:01:47"
-
Alejandro Sanchez authored
-
Morris Jette authored
This is being fixed shortly by creating a separate library for bcast functionality.
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 2396
-
- 15 Mar, 2016 9 commits
-
-
Alejandro Sanchez authored
-
Morris Jette authored
-
Tim Wickberg authored
Bug 2548. No functional change, documentation only.
-
Tim Wickberg authored
Conflicts: src/plugins/burst_buffer/generic/burst_buffer_generic.c
-
Tim Wickberg authored
Otherwise "not found" value of -1 for tres_pos would cause out-of-bounds memory access.
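The failure mode described above can be sketched as follows. This is a minimal illustration, not the Slurm code: `find_tres_pos` and `get_tres_cnt` are hypothetical helpers, and the arrays of names and counts are assumed stand-ins for the real TRES tables. The point is only that a "not found" result of -1 must be checked before it is used as an array index.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lookup: index of `name` in `names`, or -1 if it is absent. */
static int find_tres_pos(const char *const *names, int cnt, const char *name)
{
    for (int i = 0; i < cnt; i++) {
        if (strcmp(names[i], name) == 0)
            return i;
    }
    return -1;
}

/* Store the count for `name` in *out and return 0, or return -1 if the
 * name is unknown. The guard keeps us from ever reading cnts[-1]. */
static int get_tres_cnt(const char *const *names, const uint64_t *cnts,
                        int cnt, const char *name, uint64_t *out)
{
    assert(out != NULL);

    int pos = find_tres_pos(names, cnt, name);
    if (pos < 0)
        return -1;      /* "not found": do not index with it */

    *out = cnts[pos];
    return 0;
}
```

A caller would treat a -1 return as "this TRES is not tracked" and skip the update rather than touch the counts array.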
-
Tim Wickberg authored
Conflicts: src/plugins/burst_buffer/cray/burst_buffer_cray.c
-
Tim Wickberg authored
Bug 2543.
-
Tim Wickberg authored
Fix bad cast in 3a604563, and update pct to 64-bits to prevent truncation of intermediate value (pct * 100).
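To illustrate why the intermediate value needs 64 bits, here is a small sketch. The function names and the percentage formula are assumptions made for the example, not the code touched by this commit. With 32-bit arithmetic, `part * 100` wraps once `part` exceeds 42949672; widening before the multiply avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Widen to 64 bits before multiplying so (part * 100) cannot wrap. */
static uint64_t percent_of(uint32_t part, uint32_t total)
{
    assert(total != 0);
    uint64_t pct = (uint64_t)part * 100;
    return pct / total;
}

/* For contrast only: the multiply happens in 32 bits and wraps
 * modulo 2^32 for large inputs, giving a wrong percentage. */
static uint32_t percent_of_32(uint32_t part, uint32_t total)
{
    assert(total != 0);
    uint32_t pct = part * 100u;
    return pct / total;
}
```

For a 100 MiB input (104857600 bytes) compared against itself, the 64-bit version yields 100 while the 32-bit version does not, because the product no longer fits in 32 bits.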
-
Morris Jette authored
-
- 14 Mar, 2016 7 commits
-
-
Danny Auble authored
resolve NoInAddrAny when doing a strstr. Continuation of commit 775c46de.
-
Danny Auble authored
on only one port like TopologyParam=NoInAddrAny does for everything else.
-
Brian Christiansen authored
-
Tim Wickberg authored
-
Tim Wickberg authored
There's no /proc on *BSD, and BSD handles OOM in a completely different way.
-
Tim Wickberg authored
-
Tim Wickberg authored
Dividing a negative int by a positive can have unexpected behavior - C99 requires "truncation towards zero". This led to an incorrect output of: sbcast: File compressed from 104857600 to 104889678 (40 percent) in 2160081 usec when testing with a file of random data. This is actually negative 0 (point something that was truncated) compression, not "40".
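The truncation rule can be seen with the sizes quoted in the message. The helper below is a sketch under stated assumptions: `compression_pct` and its formula are hypothetical, chosen only to show how C99 integer division treats a small negative quotient, and are not taken from sbcast.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage by which new_size is smaller than orig_size.
 * Signed 64-bit math keeps the intermediate product in range, and
 * C99 division truncates toward zero, so a file that grows slightly
 * reports 0 rather than a negative or bogus value. */
static int64_t compression_pct(int64_t orig_size, int64_t new_size)
{
    assert(orig_size > 0);
    return ((orig_size - new_size) * 100) / orig_size;
}
```

With the figures from the message, `compression_pct(104857600, 104889678)` evaluates (-32078 * 100) / 104857600, roughly -0.03, which truncates to 0.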
-
- 12 Mar, 2016 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
- 11 Mar, 2016 7 commits
-
-
Morris Jette authored
Conflicts: NEWS src/smap/Makefile.am
-
Morris Jette authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Return [0-100:2] formatting, rather than [0,2,4,6,8,...] when using a step function. Was inadvertently broken in 14.11 with commit 5ffdca92. Bug 2535.
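The difference between the two output styles can be sketched with a small formatter. `format_range` is a hypothetical function written for this illustration and is not the Slurm routine fixed here; it assumes a sorted array of integers and emits the compact stepped form only when every gap is the same.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format sorted values as "[lo-hi]", "[lo-hi:step]", or "[a,b,c]".
 * Returns 0 on success, -1 if buf is too small. */
static int format_range(const int *vals, int n, char *buf, size_t len)
{
    assert(vals != NULL && buf != NULL && n > 0);

    /* A run counts as uniform only with 3+ values and one positive step. */
    int step = (n > 1) ? vals[1] - vals[0] : 0;
    int uniform = (n > 2 && step > 0);
    for (int i = 2; uniform && i < n; i++) {
        if (vals[i] - vals[i - 1] != step)
            uniform = 0;
    }

    if (uniform) {
        int w = (step == 1)
            ? snprintf(buf, len, "[%d-%d]", vals[0], vals[n - 1])
            : snprintf(buf, len, "[%d-%d:%d]", vals[0], vals[n - 1], step);
        return (w < 0 || (size_t)w >= len) ? -1 : 0;
    }

    /* Fallback: explicit comma-separated list. */
    size_t off = 0;
    for (int i = 0; i < n; i++) {
        int w = snprintf(buf + off, len - off, "%s%d", i ? "," : "[", vals[i]);
        if (w < 0 || (size_t)w >= len - off)
            return -1;
        off += (size_t)w;
    }
    if (len - off < 2)
        return -1;
    buf[off] = ']';
    buf[off + 1] = '\0';
    return 0;
}
```

Given the values 0, 2, 4, 6, 8 this produces the stepped form instead of listing every element, which is the behavior the commit restores.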
-
Morris Jette authored
-
Morris Jette authored
Need higher count for KNL processor.
-
- 10 Mar, 2016 6 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Conflicts: NEWS
-
Morris Jette authored
Fix Cray NHC spawning on job requeue. Previous logic would leave nodes allocated to a requeued job as non-usable on job termination. Specifically, each job has a "cleaning/cleaned" flag. Once a job terminates, the cleaning flag is set, then after the job node health check completes, the value gets set to cleaned. If the job is requeued, on its second (or subsequent) termination, the select/cray plugin is called to launch the NHC. The plugin sees the "cleaned" flag already set, logs: error: select_p_job_fini: Cleaned flag already set for job 1283858, this should never happen and then returns, never launching the NHC. Since the termination of the job NHC triggers releasing job resources (CPUs, memory, and GRES), those resources are never released for use by other jobs. Bug 2384.
-
David Gloe authored
An error in slurmconfgen_smw.py caused it to parse the nic as the nid. On some systems those values differ, causing the generated slurm.conf file to be incorrect. Bug 2532.
-
Tim Wickberg authored
_set_collectors() already has a run_in_daemon("slurmd") that precludes this from being an issue.
-