1. 19 Jul, 2016 1 commit
    • gres-flags=enforce-binding fix · 5df8509f
      Morris Jette authored
      Fix for core selection with job --gres-flags=enforce-binding option.
          Previous logic would in some cases allocate a job zero cores, resulting in
          slurmctld abort.
      bug 2808
  2. 16 Jul, 2016 2 commits
    • Add SLURM_PENDING_STEP id so it won't be confused with SLURM_EXTERN_CONT. · 0c7bd6d0
      Danny Auble authored
      In commit b8190e5d, many places that were meant to use pending step ids
      were changed to use the extern step id.  The root problem was that when we
      came up with the idea of the extern step, we reused -1 (INFINITE) for its
      id, so pending steps also appeared to be extern steps.  Hopefully this
      fixes the situation.
      
      Bug 2907
    • Move startup of power save thread · fb8e3558
      Morris Jette authored
      Start power save thread only after the partition information is read
        in order to avoid trying to interpret the SuspendExcParts configuration
        information before the partition information is available, which would
        result in a slurmctld abort.
  3. 15 Jul, 2016 2 commits
  4. 14 Jul, 2016 2 commits
    • Fix gang scheduling and license release logic · 111e3b48
      Morris Jette authored
      Fix gang scheduling and license release logic when a single-node job is
          killed on a bad node. Notifying the gang scheduler and releasing
          licenses is normally done at epilog completion, but if the node(s)
          assigned to a job are all down, that does not happen. This results in
          the licenses being reserved indefinitely and the gang scheduler being
          left with a bad (old) job pointer that can result in various failure
          modes.
      bug 2867
    • CRAY - If trying to kill a step and you have NHC_NO_STEPS set run NHC · e956f297
      Danny Auble authored
      anyway to attempt to log the backtraces of the potentially
      unkillable processes.
  5. 13 Jul, 2016 1 commit
  6. 12 Jul, 2016 6 commits
  7. 11 Jul, 2016 1 commit
  8. 08 Jul, 2016 7 commits
  9. 07 Jul, 2016 6 commits
  10. 06 Jul, 2016 5 commits
  11. 05 Jul, 2016 2 commits
  12. 04 Jul, 2016 1 commit
  13. 02 Jul, 2016 3 commits
  14. 01 Jul, 2016 1 commit