- 27 May, 2016 3 commits
-
-
Danny Auble authored
-
Tim Wickberg authored
Add missing unlock before return. Coverity 44888.
-
Morris Jette authored
This reverts commit cc242de3. That patch fixed bug 2745, but it breaks tests 1.89 and 1.91 on typical Xeon processors.
-
- 26 May, 2016 1 commit
-
-
Morris Jette authored
Fix for uninitialized variable in task binding logic, which could leave tasks with fewer CPUs than intended. Bug 2766.
-
- 25 May, 2016 2 commits
-
-
Morris Jette authored
-
Tim Wickberg authored
Coverity 44891.
-
- 24 May, 2016 8 commits
-
-
Tim Wickberg authored
sizeof(optarg) is incorrect: that is the size of the pointer, not the length of the character string that must be parsed. Coverity 53128.
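The distinction in the message above can be illustrated with a minimal sketch. The helper names arg_len_wrong and arg_len_right are hypothetical and do not come from the Slurm source; they only contrast the two expressions.

```c
#include <stddef.h>
#include <string.h>

/* Wrong: sizeof applied to a char * yields the size of the pointer itself
 * (commonly 8 on 64-bit platforms), whatever string it points to. */
static size_t arg_len_wrong(const char *arg)
{
	return sizeof(arg);
}

/* Right: strlen counts the characters up to the terminating NUL. */
static size_t arg_len_right(const char *arg)
{
	return strlen(arg);
}
```

Any length check or copy bounded by the first form silently uses the pointer size, so option values longer than that are truncated or mishandled.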
-
Tim Wickberg authored
Coverity 44992.
-
Tim Wickberg authored
Needs to unlock here, not re-lock the lock.
-
Tim Wickberg authored
-
Tim Wickberg authored
Prevent '--preserve' from being inadvertently enabled by '-j'.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Morris Jette authored
Logic introduced in v16.05.0-rc2 could attempt a state change for nid00000 even if no such node exists. Bug 2758.
-
- 20 May, 2016 1 commit
-
-
Morris Jette authored
Change how Slurm determines the NUMA count of a node. Ignore KNL NUMA nodes that only include memory. Bug 2745.
-
- 19 May, 2016 2 commits
-
-
Brian Christiansen authored
Need thread_id to distinguish between multiple threads with the same name.
-
Brian Christiansen authored
-
- 18 May, 2016 6 commits
-
-
Danny Auble authored
and the slurmctld doesn't wait long enough for the response, it would give up, leaving the connection open, and create a situation where the next message sent could receive the response to the first one. Bug 2739.
-
Morris Jette authored
Correct logic that calculates a step's cpus_per_task allocation on a heterogeneous job allocation. Mixing a KNL with a Xeon resulted in a count that was between the CPU counts of the two node types and invalid on the node with the smaller CPU count (e.g. 272 CPUs on KNL, 8 on Xeon, and 2 tasks, cpus_per_task = 140).
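The figures in the message above are consistent with a plain average over the allocation: (272 + 8) / 2 = 140. The sketch below reproduces that arithmetic and shows one possible guard, capping the result at the smallest node's CPU count. Both function names are hypothetical, the averaging is an assumption about the failure mode, and the cap is an illustration rather than the actual fix in the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive value: total CPUs in the allocation divided by the task count.
 * Assumed here to be how the invalid count arose. */
static uint32_t avg_cpus_per_task(const uint32_t *node_cpus, size_t node_cnt,
				  uint32_t ntasks)
{
	uint64_t total = 0;

	for (size_t i = 0; i < node_cnt; i++)
		total += node_cpus[i];
	return ntasks ? (uint32_t)(total / ntasks) : 0;
}

/* Hypothetical guard: never exceed the CPU count of the smallest node,
 * so the value is valid on every node in the allocation. */
static uint32_t capped_cpus_per_task(const uint32_t *node_cpus,
				     size_t node_cnt, uint32_t ntasks)
{
	uint32_t cap = avg_cpus_per_task(node_cpus, node_cnt, ntasks);

	for (size_t i = 0; i < node_cnt; i++) {
		if (node_cpus[i] < cap)
			cap = node_cpus[i];
	}
	return cap;
}
```

With node CPU counts of 272 and 8 and 2 tasks, the naive value is 140, which no task could receive on the 8-CPU node; the capped value is 8.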
-
Brian Christiansen authored
-
Alejandro Sanchez authored
Bug #2713.
-
Alejandro Sanchez authored
Bug #2713.
-
Nicolas Joly authored
-
- 17 May, 2016 1 commit
-
-
Tim Wickberg authored
-
- 16 May, 2016 3 commits
-
-
Josko Plazonic authored
Update slurm.spec file to have seff depend on slurm-perlapi.
-
Jason Bacon authored
-
Morris Jette authored
-
- 13 May, 2016 3 commits
-
-
Morris Jette authored
-
Danny Auble authored
when in use. The problem here is that the polling threads in the various acct_gather codes were detached and could still be polling after the plugin had been unloaded, causing a seg fault with a backtrace like this:
#0 0x00007fe7af008c00 in ?? ()
#1 0x00007fe7b1138479 in __nptl_deallocate_tsd () at pthread_create.c:175
#2 0x00007fe7b11398b0 in __nptl_deallocate_tsd () at pthread_create.c:326
#3 start_thread (arg=0x7fe7b1f12700) at pthread_create.c:346
#4 0x00007fe7b0e6fb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
The fix was to make the threads non-detached and join them before calling dlclose.
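The pattern described above (a joinable polling thread that is stopped and joined before its code is unloaded) can be sketched as follows. All names (poller_start, poller_stop, poll_loop and the static variables) are hypothetical and not taken from the acct_gather plugins. The loop waits on a condition variable only; a real poller would more likely use pthread_cond_timedwait with its sampling interval.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_t poll_thread;
static pthread_mutex_t poll_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t poll_cond = PTHREAD_COND_INITIALIZER;
static bool poll_shutdown = false;
static bool poll_started = false;
static int poll_count = 0;

/* Polling loop: does one unit of work, then sleeps until woken. */
static void *poll_loop(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&poll_lock);
	while (!poll_shutdown) {
		poll_count++;	/* stand-in for gathering accounting data */
		pthread_cond_wait(&poll_cond, &poll_lock);
	}
	pthread_mutex_unlock(&poll_lock);
	return NULL;
}

/* Start the poller. NULL attributes create a joinable (non-detached) thread. */
static int poller_start(void)
{
	int rc;

	poll_shutdown = false;
	rc = pthread_create(&poll_thread, NULL, poll_loop, NULL);
	if (rc == 0)
		poll_started = true;
	return rc;
}

/* Stop the poller and wait for it to exit. Only after this returns is it
 * safe to unload the code the thread was running (e.g. with dlclose). */
static int poller_stop(void)
{
	int rc;

	if (!poll_started)
		return 0;
	pthread_mutex_lock(&poll_lock);
	poll_shutdown = true;
	pthread_cond_signal(&poll_cond);
	pthread_mutex_unlock(&poll_lock);
	rc = pthread_join(poll_thread, NULL);
	poll_started = false;
	return rc;
}
```

Because the shutdown flag is checked under the same mutex the thread holds while deciding whether to wait, the wakeup cannot be lost, and pthread_join guarantees the thread has finished before the caller proceeds to unload anything.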
-
Morris Jette authored
Whenever possible, avoid allocating nodes that require a reboot. Previous logic failed to re-sort the job set table based upon the need for rebooting to achieve the desired features (e.g. KNL MCDRAM or CACHE mode). bug 2726
-
- 12 May, 2016 3 commits
-
-
Danny Auble authored
trying to verify the cluster name (which may try to /create/ files or directories) *before* dropping privs results in a fatal error, as slurmctld tries to create items and ultimately fails. Moving this step until after the privs and uid have changed allows it to succeed. Reported by Jon Nelson <jdnelson@dyn.com>. Bug 2728.
-
Morris Jette authored
Reject invalid step at submit time rather than leaving it queued. Bug 2722 describes one of the use cases triggering the bug.
-
Morris Jette authored
This partially restores commit 03b2cfb5. Logic was not closing the file descriptor, which left the file locked and leaked an open file descriptor.
-
- 11 May, 2016 4 commits
-
-
Danny Auble authored
tasks-per-node/nodes != tasks, print a warning and ignore ntasks-per-node. Bug 2520.
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
make it to the slurmctld when using message aggregation.
-
- 10 May, 2016 3 commits
-
-
Danny Auble authored
-
Tim Wickberg authored
-
Marlys Kohnke authored
for better robustness. This cray/select plugin code has been modified to remove a possible timing window where two aeld pthreads could exist, interfering with each other through the global aeld_running variable. An additional validity check has been added to the data provided to aeld through an alpsc_ev_set_application_info() call. If an error is returned from that call, only certain errors need the current socket connection closed to aeld and a new connection established. Other error returns will log an error message and keep the current session established with aeld.
-