1. 06 May, 2016 1 commit
    • Fix for slurmstepd segfault · db0fe22e
      John Thiltges authored
      With slurm-15.08.10, we're seeing occasional segfaults in slurmstepd. The logs point to the following line: slurm-15.08.10/src/slurmd/slurmstepd/mgr.c:2612
      
      On that line, _get_primary_group() is accessing the results of getpwnam_r():
          *gid = pwd0->pw_gid;
      
      If getpwnam_r() cannot find a matching password record, it will set the result (pwd0) to NULL, but still return 0. When the pointer is accessed, it will cause a segfault.
      
      Checking the result variable (pwd0) to determine success should fix the issue.
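
The check described above can be sketched as follows. This is a minimal illustration, not the actual patch to mgr.c: the helper name `lookup_primary_gid`, its return convention, and the 16384-byte fallback buffer size are assumptions made for this example. It treats the lookup as successful only when `getpwnam_r()` returns 0 and the result pointer is non-NULL.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the Slurm source): look up a user's primary
 * group ID. Returns 0 on success, -1 if the lookup failed or no
 * matching password record exists. *gid is written only on success.
 */
int lookup_primary_gid(const char *user_name, gid_t *gid)
{
	struct passwd pwd;
	struct passwd *result = NULL;
	long size_hint;
	size_t buf_size;
	char *buf;
	int rc;

	assert(user_name != NULL && gid != NULL);

	/* Ask the system for a suggested buffer size; fall back to an
	 * assumed default if it has no opinion. */
	size_hint = sysconf(_SC_GETPW_R_SIZE_MAX);
	buf_size = (size_hint > 0) ? (size_t) size_hint : 16384;
	buf = malloc(buf_size);
	if (buf == NULL)
		return -1;

	/* Retry if the call was interrupted by a signal. */
	do {
		rc = getpwnam_r(user_name, &pwd, buf, buf_size, &result);
	} while (rc == EINTR);

	/* A return value of 0 alone is not enough: when no record
	 * matches, rc is 0 and result is NULL. Check both before
	 * dereferencing. */
	if (rc != 0 || result == NULL) {
		free(buf);
		return -1;
	}

	*gid = result->pw_gid;
	free(buf);
	return 0;
}
```

A caller would pass a user name and a `gid_t` to fill in, and use the group ID only when the function returns 0.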
  2. 05 May, 2016 3 commits
  3. 04 May, 2016 3 commits
  4. 03 May, 2016 6 commits
  5. 02 May, 2016 1 commit
  6. 29 Apr, 2016 4 commits
  7. 28 Apr, 2016 4 commits
  8. 27 Apr, 2016 4 commits
  9. 26 Apr, 2016 7 commits
  10. 23 Apr, 2016 1 commit
  11. 20 Apr, 2016 2 commits
  12. 15 Apr, 2016 1 commit
  13. 14 Apr, 2016 1 commit
      Set burst buffer reason for job · 49d483db
      Morris Jette authored
      If a job fails stage-in, set its reason to BurstBufferOperation
      with a string describing what happened. Previously, the reason was
      set to AdminHeld on stage-in failure.
  14. 13 Apr, 2016 2 commits