1. 12 May, 2016 1 commit
    • If the cluster name and state are stored on NFS (with root_squash), · e422127c
      Danny Auble authored
      trying to verify the cluster name (which may try to /create/ files or
      directories) *before* dropping privileges results in a fatal error:
      slurmctld is still running as root, which root_squash maps to an
      unprivileged user, so its attempts to create those items fail. Moving
      this check until after the privileges and uid have changed allows
      it to succeed.
      
      Reported by Jon Nelson <jdnelson@dyn.com>
      
      Bug 2728
  2. 11 May, 2016 2 commits
  3. 10 May, 2016 4 commits
  4. 09 May, 2016 2 commits
  5. 06 May, 2016 1 commit
    • Fix for slurmstepd segfault · db0fe22e
      John Thiltges authored
      With slurm-15.08.10, we're seeing occasional segfaults in slurmstepd. The logs point to the following line: slurm-15.08.10/src/slurmd/slurmstepd/mgr.c:2612
      
      On that line, _get_primary_group() is accessing the results of getpwnam_r():
          *gid = pwd0->pw_gid;
      
      If getpwnam_r() cannot find a matching password record, it sets the result (pwd0) to NULL but still returns 0. Dereferencing that NULL pointer then causes the segfault.
      
      Checking the result variable (pwd0) to determine success should fix the issue.
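The check described in this commit can be sketched as follows. This is a minimal illustration, not the Slurm code: `lookup_primary_gid` is a hypothetical helper name, and the fallback buffer size is an assumption. It treats the lookup as successful only when getpwnam_r() returns 0 and the result pointer is non-NULL.

```c
#include <assert.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the Slurm function): store the primary group
 * of user_name in *gid. Returns 0 on success, or -1 if the lookup failed
 * or no matching password record exists. *gid is left untouched on failure.
 */
static int lookup_primary_gid(const char *user_name, gid_t *gid)
{
	struct passwd pwd, *result = NULL;
	long buf_size = sysconf(_SC_GETPW_R_SIZE_MAX);
	char *buf;
	int rc;

	assert(user_name != NULL && gid != NULL);

	if (buf_size <= 0)
		buf_size = 16384;	/* assumed fallback size */
	buf = malloc((size_t) buf_size);
	if (!buf)
		return -1;

	rc = getpwnam_r(user_name, &pwd, buf, (size_t) buf_size, &result);

	/*
	 * rc == 0 only means "no error occurred". A missing record is
	 * reported by result == NULL, so both must be checked before
	 * the pointer is dereferenced. An ERANGE return (buffer too
	 * small) is simply treated as a failure in this sketch.
	 */
	if (rc != 0 || result == NULL) {
		free(buf);
		return -1;
	}

	*gid = result->pw_gid;
	free(buf);
	return 0;
}
```

A caller would test the return value before using the gid; a lookup for a user that does not exist returns -1 instead of crashing.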
  6. 05 May, 2016 2 commits
  7. 03 May, 2016 4 commits
  8. 29 Apr, 2016 4 commits
  9. 28 Apr, 2016 3 commits
  10. 27 Apr, 2016 2 commits
  11. 26 Apr, 2016 2 commits
  12. 23 Apr, 2016 1 commit
  13. 20 Apr, 2016 1 commit
    • burst_buffer/cray - fix create/destroy buffer only · 1391d29a
      Morris Jette authored
      burst_buffer/cray - Don't call the DataWarp "paths" function if the script
      only creates or destroys a persistent burst buffer. Some versions of the
      DataWarp software return an error for such scripts, causing the job to be held.
      Bug 2624
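The decision this commit describes could look roughly like the sketch below. It is not the plugin's code. The function name `script_needs_paths` is hypothetical, and it assumes that persistent-buffer create/destroy requests appear as `#BB` lines while any use of a buffer by the job itself appears as a `#DW` line. The directive arguments in the usage note are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the batch script contains at least
 * one "#DW" directive at the start of a line, i.e. it does more than
 * create or destroy a persistent burst buffer. Scripts containing only
 * "#BB" lines (or no directives at all) return false, and the caller
 * would then skip the "paths" call.
 */
static bool script_needs_paths(const char *script)
{
	const char *line = script;

	assert(script != NULL);

	while (*line) {
		const char *newline;

		/* Assumed marker for directives that use a buffer in the job */
		if (strncmp(line, "#DW ", 4) == 0)
			return true;

		/* Advance to the start of the next line, if any */
		newline = strchr(line, '\n');
		if (!newline)
			break;
		line = newline + 1;
	}
	return false;
}
```

Under these assumptions, a script whose only directive is `#BB create_persistent name=alpha capacity=1GB` yields false, while adding a line such as `#DW jobdw type=scratch capacity=1GB` yields true.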
  14. 13 Apr, 2016 2 commits
  15. 12 Apr, 2016 2 commits
  16. 11 Apr, 2016 4 commits
  17. 09 Apr, 2016 1 commit
    • backfill scheduling enhancement · e62a9270
      Morris Jette authored
      When determining when a pending job will be able to start, remove
      multiple running jobs that all end at about the same time before
      testing, rather than testing after removing each running job
      individually. This reduces the number of calls to the job placement
      logic, which is time-consuming.
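The saving can be illustrated with a small sketch. It is not the scheduler's code: `count_placement_tests` is a hypothetical function, and the grouping rule (jobs ending within `window` seconds of the first job in a group) is an assumption, since the commit message only says "about the same time". Given running-job end times sorted in ascending order, it counts how many placement tests are needed when each group of jobs is removed before a single test.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: end_times must be sorted in ascending order.
 * Jobs whose end time falls within `window` seconds of the first job
 * in the current group are removed together, followed by one placement
 * test for the whole group. Returns the number of placement tests.
 */
static size_t count_placement_tests(const time_t *end_times, size_t n,
				    time_t window)
{
	size_t tests = 0;
	size_t i = 0;

	assert(end_times != NULL || n == 0);

	while (i < n) {
		time_t group_start = end_times[i];

		/* Remove every job that ends close to the group's first job */
		while (i < n && (end_times[i] - group_start) <= window)
			i++;

		/* One call to the placement logic per group */
		tests++;
	}
	return tests;
}
```

With end times 100, 110, 120, 400, 405 and 900 and a 30-second window, the jobs fall into three groups, so three placement tests are run instead of six. A window of 0 gives one test per distinct end time.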
  18. 07 Apr, 2016 2 commits