1. 24 May, 2016 2 commits
  2. 23 May, 2016 1 commit
    • Fix scancel(1) uninitialized condition variable · 370e828e
      Nicolas Joly authored
      Still testing 16.05 on my NetBSD/amd64 workstation ...
      Just encountered a crash with scancel(1).
      njoly@lanfeust [~]> sbatch --wrap "sleep 3600"
      Submitted batch job 4680
      njoly@lanfeust [~]> scancel 4680
      scancel: Error detected by libpthread: Invalid condition variable.
      Detected by file "/local/src/NetBSD/src/lib/libpthread/pthread_cond.c", line 140, function "pthread_cond_timedwait".
      See pthread(3) for information.
      zsh: abort (core dumped)  scancel 4680
      Checking the code shows that the pthread_cond_wait() call in scancel.c:_signal_job_by_str() does indeed use an uninitialised condition variable, "num_active_threads_cond".
      The attached patch, which adds the missing pthread_cond_init() call, seems to fix it.
      bug 2753
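
The pattern described above can be sketched as follows. This is a minimal illustration, not the actual scancel.c code: only the name num_active_threads_cond comes from the report, while run_workers(), worker(), MAX_WORKERS, the lock, and the counter are hypothetical stand-ins. The point is that pthread_cond_init() runs before any thread can reach pthread_cond_wait() or pthread_cond_signal().

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16  /* hypothetical limit for this sketch */

/* Only num_active_threads_cond is named in the report; the lock and
 * counter are assumed companions for illustration. */
static pthread_mutex_t num_active_threads_lock;
static pthread_cond_t  num_active_threads_cond;
static int             num_active_threads;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&num_active_threads_lock);
    num_active_threads--;
    pthread_cond_signal(&num_active_threads_cond);
    pthread_mutex_unlock(&num_active_threads_lock);
    return NULL;
}

/* Start nthreads workers and wait for all of them. Returns 0 on success. */
int run_workers(int nthreads)
{
    pthread_t tids[MAX_WORKERS];
    int i, started = 0, rc = 0;

    if (nthreads < 0 || nthreads > MAX_WORKERS)
        return -1;

    /* Initialize both objects before any thread can touch them. */
    if (pthread_mutex_init(&num_active_threads_lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&num_active_threads_cond, NULL) != 0) {
        pthread_mutex_destroy(&num_active_threads_lock);
        return -1;
    }
    num_active_threads = 0;

    for (i = 0; i < nthreads; i++) {
        pthread_mutex_lock(&num_active_threads_lock);
        num_active_threads++;
        pthread_mutex_unlock(&num_active_threads_lock);
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0) {
            pthread_mutex_lock(&num_active_threads_lock);
            num_active_threads--;
            pthread_mutex_unlock(&num_active_threads_lock);
            rc = -1;
            break;
        }
        started++;
    }

    /* Wait on the (now initialized) condition variable. */
    pthread_mutex_lock(&num_active_threads_lock);
    while (num_active_threads > 0)
        pthread_cond_wait(&num_active_threads_cond, &num_active_threads_lock);
    assert(num_active_threads == 0);
    pthread_mutex_unlock(&num_active_threads_lock);

    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&num_active_threads_cond);
    pthread_mutex_destroy(&num_active_threads_lock);
    return rc;
}
```

For a statically allocated condition variable with default attributes, PTHREAD_COND_INITIALIZER is an alternative to the explicit pthread_cond_init() call.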
  3. 18 May, 2016 2 commits
  4. 17 May, 2016 4 commits
  5. 16 May, 2016 2 commits
  6. 13 May, 2016 2 commits
    • Performance fix for commit b1fbeb85 · d73d56ec
      Danny Auble authored
    • Fix race condition with respect to cleaning up the profiling threads when in use · b1fbeb85
      Danny Auble authored
      
      The problem here is that the polling threads in the various acct_gather
      plugins were detached and could still be polling after the plugin had
      been unloaded, causing a segfault with a backtrace like this:
      
      #0  0x00007fe7af008c00 in ?? ()
      #1  0x00007fe7b1138479 in __nptl_deallocate_tsd () at pthread_create.c:175
      #2  0x00007fe7b11398b0 in __nptl_deallocate_tsd () at pthread_create.c:326
      #3  start_thread (arg=0x7fe7b1f12700) at pthread_create.c:346
      #4  0x00007fe7b0e6fb5d in clone ()
          at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
      The fix was to make the threads joinable (non-detached) and to join them
      before calling dlclose().
  7. 12 May, 2016 2 commits
  8. 11 May, 2016 2 commits
  9. 10 May, 2016 7 commits
  10. 09 May, 2016 2 commits
  11. 06 May, 2016 2 commits
    • Add another explanation for test failure · b5dabfe8
      Morris Jette authored
    • Fix for slurmstepd segfault · db0fe22e
      John Thiltges authored
      With slurm-15.08.10, we're seeing occasional segfaults in slurmstepd. The logs point to the following line: slurm-15.08.10/src/slurmd/slurmstepd/mgr.c:2612
      
      On that line, _get_primary_group() is accessing the results of getpwnam_r():
          *gid = pwd0->pw_gid;
      
      If getpwnam_r() cannot find a matching password record, it will set the result (pwd0) to NULL, but still return 0. When the pointer is accessed, it will cause a segfault.
      
      Checking the result variable (pwd0) to determine success should fix the issue.
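
A minimal sketch of that check, assuming a fixed-size lookup buffer. The helper lookup_primary_gid() is hypothetical and is not the actual _get_primary_group() from mgr.c. It treats both a non-zero return value and a NULL result pointer as failure before dereferencing anything.

```c
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper for illustration only. Stores the user's primary
 * group in *gid and returns 0, or returns -1 if the lookup failed or
 * no matching password record exists. */
int lookup_primary_gid(const char *user_name, gid_t *gid)
{
    struct passwd pwd, *pwd0 = NULL;
    char buf[16384];  /* assumed large enough for this sketch */
    int rc;

    assert(user_name != NULL && gid != NULL);

    rc = getpwnam_r(user_name, &pwd, buf, sizeof(buf), &pwd0);
    if (rc != 0)
        return -1;  /* lookup error: rc holds an errno value */
    if (pwd0 == NULL)
        return -1;  /* call succeeded, but no such user */

    *gid = pwd0->pw_gid;  /* safe: pwd0 is known to be non-NULL */
    return 0;
}
```

Checking only the return value is not enough, because a missing record is reported through the result pointer rather than through an error code.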
  12. 05 May, 2016 3 commits
  13. 04 May, 2016 3 commits
  14. 03 May, 2016 6 commits