1. 10 May, 2016 2 commits
    • Changes have been made to the cray/select plugin aeld code · 2959a1e6
      Marlys Kohnke authored
          for better robustness.  This cray/select plugin code has
          been modified to remove a possible timing window where two
          aeld pthreads could exist, interfering with each other through
          the global aeld_running variable.
      
          An additional validity check has been added to the data provided
          to aeld through an alpsc_ev_set_application_info() call.
          If an error is returned from that call, only certain errors
          need the current socket connection closed to aeld and a new
          connection established.  Other error returns will log an
          error message and keep the current session established with
          aeld.
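
The error-handling policy described above can be sketched as follows. This is only an illustration: the return codes, the session struct, and the function name are all hypothetical, since the commit message does not say which alpsc_ev_set_application_info() errors require a reconnect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes; the real alpsc values are not given
 * in the commit message. */
enum sketch_rc {
	SKETCH_OK = 0,
	SKETCH_BAD_DATA = 1,   /* assumed: invalid data, session still usable */
	SKETCH_CONN_LOST = 2   /* assumed: session unusable, must reconnect */
};

/* Hypothetical stand-in for the plugin's aeld session state. */
struct sketch_session {
	bool connected;
	int reconnects;
};

/* Log every failure, but only tear down and re-establish the session
 * for the errors that actually invalidate the connection. */
static void handle_set_info_result(struct sketch_session *s, enum sketch_rc rc)
{
	assert(s != NULL);

	if (rc == SKETCH_OK)
		return;

	fprintf(stderr, "aeld sketch: set_application_info failed (rc=%d)\n",
		(int) rc);

	if (rc == SKETCH_CONN_LOST) {
		s->connected = false;  /* stands in for closing the socket */
		s->connected = true;   /* stands in for a new connection */
		s->reconnects++;
	}
	/* Any other error: keep the current session as it is. */
}
```

With these assumed codes, a SKETCH_BAD_DATA result leaves the session untouched, while SKETCH_CONN_LOST triggers exactly one reconnect.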
    • Fix issue where daemons would only listen on specific address given in · 79c9a499
      Danny Auble authored
      slurm.conf instead of all.  If looking for specific addresses use
      TopologyParam options No*InAddrAny.
      
      This was broken in 15.08 with the advent of the referenced TopologyParams.
      The commits 9378f195 and c5312f52 are no longer needed.
      
      Bug 2696
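
The difference between listening on all addresses and on one specific address can be sketched with plain sockets. This is a minimal illustration, not Slurm's code: fill_listen_addr() is a hypothetical helper, and passing NULL for the address is an assumed stand-in for the default behavior (no No*InAddrAny option set).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: prepare an IPv4 address for bind().
 * specific_ip == NULL  -> listen on every local address (INADDR_ANY).
 * specific_ip != NULL  -> listen only on that dotted-quad address.
 * Returns 0 on success, -1 if specific_ip cannot be parsed. */
static int fill_listen_addr(struct sockaddr_in *addr, uint16_t port,
			    const char *specific_ip)
{
	assert(addr != NULL);

	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_port = htons(port);

	if (specific_ip == NULL) {
		addr->sin_addr.s_addr = htonl(INADDR_ANY);
		return 0;
	}

	/* inet_pton() returns 1 only when the string is a valid address. */
	if (inet_pton(AF_INET, specific_ip, &addr->sin_addr) != 1)
		return -1;
	return 0;
}
```

The filled structure would then be passed to bind() on a listening socket; the port number is whatever the caller supplies.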
  2. 09 May, 2016 2 commits
  3. 06 May, 2016 1 commit
    • Fix for slurmstepd segfault · db0fe22e
      John Thiltges authored
      With slurm-15.08.10, we're seeing occasional segfaults in slurmstepd. The logs point to the following line: slurm-15.08.10/src/slurmd/slurmstepd/mgr.c:2612
      
      On that line, _get_primary_group() is accessing the results of getpwnam_r():
          *gid = pwd0->pw_gid;
      
      If getpwnam_r() cannot find a matching password record, it will set the result (pwd0) to NULL, but still return 0. When the pointer is accessed, it will cause a segfault.
      
      Checking the result variable (pwd0) to determine success should fix the issue.
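
The check described above might look like the following. This is a sketch only, not the actual _get_primary_group() from mgr.c: the function name, the fixed buffer size, and the -1 error convention are assumptions made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: look up a user's primary group.
 * Returns 0 and stores the GID on success, -1 otherwise.
 * *gid is left untouched on failure. */
static int lookup_primary_group(const char *user_name, gid_t *gid)
{
	struct passwd pwd;
	struct passwd *result = NULL;
	char buf[16384];  /* assumed large enough for this sketch */
	int rc;

	assert(user_name != NULL && gid != NULL);

	rc = getpwnam_r(user_name, &pwd, buf, sizeof(buf), &result);
	if (rc != 0)
		return -1;  /* the lookup itself failed */

	/* A return value of 0 does not mean a record was found:
	 * "no such user" is reported by setting result to NULL. */
	if (result == NULL)
		return -1;

	*gid = result->pw_gid;
	return 0;
}
```

The key point is that both the return value and the result pointer are tested before pw_gid is read.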
  4. 05 May, 2016 2 commits
  5. 03 May, 2016 4 commits
  6. 29 Apr, 2016 4 commits
  7. 28 Apr, 2016 3 commits
  8. 27 Apr, 2016 2 commits
  9. 26 Apr, 2016 2 commits
  10. 23 Apr, 2016 1 commit
  11. 20 Apr, 2016 1 commit
    • burst_buffer/cray - fix create/destroy buffer only · 1391d29a
      Morris Jette authored
      burst_buffer/cray - Don't call Datawarp "paths" function if script includes
          only create or destroy of persistent burst buffer. Some versions of Datawarp
          software return an error for such scripts, causing the job to be held.
      bug 2624
  12. 13 Apr, 2016 2 commits
  13. 12 Apr, 2016 2 commits
  14. 11 Apr, 2016 4 commits
  15. 09 Apr, 2016 1 commit
    • backfill scheduling enhancement · e62a9270
      Morris Jette authored
      When determining when a pending job will be able to start, rather
        than testing after removing each running job and trying to schedule
        the pending jobs, remove multiple jobs that all end about the
        same time before testing. This reduces the number of calls to
        the job placement logic, which is time consuming.
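
The saving can be illustrated by counting how many placement tests are needed when running jobs are grouped by end time. This is a simplified sketch, not the backfill plugin's code: the function name is hypothetical, the end times are assumed to be sorted in ascending order, and the grouping window is an assumed parameter (the commit message only says "about the same time").

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: given the sorted end times of running jobs,
 * return how many times the placement logic would be invoked if all
 * jobs ending within `window` seconds of the first job in a group
 * are removed together before testing. */
static size_t count_placement_tests(const time_t *end_times, size_t n,
				    time_t window)
{
	size_t tests = 0;
	size_t i = 0;

	assert(end_times != NULL || n == 0);

	while (i < n) {
		time_t group_start = end_times[i];

		/* Remove every job that ends close to the group's first job. */
		while (i < n && end_times[i] - group_start <= window)
			i++;

		tests++;  /* one placement test per group, not per job */
	}
	return tests;
}
```

For end times 100, 105, 110, 300, 302, 900 and a 30-second window, the groups are {100, 105, 110}, {300, 302}, and {900}, so three tests are made instead of six.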
  16. 07 Apr, 2016 2 commits
  17. 06 Apr, 2016 5 commits