1. 30 May, 2018 17 commits
  2. 24 May, 2018 1 commit
    • Notify srun and ctld when unkillable stepd exits · 956a808d
      Brian Christiansen authored
      Commits f18390e8 and eed76f85 modified the stepd so that, when it
      encountered an unkillable step timeout, it would simply exit. If the
      stepd is a batch step, it replies to the controller with a non-zero
      exit code, which drains the node. But if an srun allocation/step got
      into the unkillable step code, the stepd wouldn't let the waiting srun
      or controller know that the step was going away, leaving a hanging
      srun and job.
      
      This patch enables the stepd to notify the waiting sruns and the ctld
      that the stepd is done, and drains the node for srun'ed allocations
      and/or steps.
      
      Bug 5164
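The behavior change described in this commit message can be illustrated with a toy model. This is a minimal sketch, not Slurm code: the function name `on_unkillable_timeout`, the `patched` flag, and the action strings are all hypothetical and only mirror the wording of the message.

```python
def on_unkillable_timeout(step_kind, patched):
    """Return the actions a stepd takes before exiting (toy model).

    step_kind: "batch" or "srun" (hypothetical labels).
    patched:   True to model the behavior after this commit.
    """
    if step_kind == "batch":
        # Per the commit message: a batch step replies to the controller
        # with a non-zero exit code, which drains the node.
        return ["reply_ctld_nonzero_exit", "drain_node"]
    if step_kind == "srun":
        if not patched:
            # Before the patch nobody was told, so srun and the job hang.
            return []
        # After the patch the waiting srun and the controller are notified
        # and the node is drained.
        return ["notify_srun", "notify_ctld", "drain_node"]
    raise ValueError(f"unknown step kind: {step_kind!r}")
```

Comparing `on_unkillable_timeout("srun", patched=False)` with `on_unkillable_timeout("srun", patched=True)` shows the gap the patch closes: the unpatched srun path returns no actions at all.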
  3. 21 May, 2018 1 commit
  4. 19 May, 2018 3 commits
  5. 18 May, 2018 8 commits
  6. 17 May, 2018 1 commit
  7. 16 May, 2018 5 commits
  8. 15 May, 2018 4 commits
    • Add reboot node weight parameter · da8c8374
      Morris Jette authored
      Add node_features plugin function "node_features_p_reboot_weight()" to
      return the node weight to be used for a compute node that requires a
      reboot before use (e.g. to change the NUMA mode of a KNL node).
      Add NodeRebootWeight parameter to the knl.conf configuration file.
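The effect of a reboot weight on node selection can be sketched as follows. This is an illustrative model only, written in Python rather than Slurm's C: the `Node` class, `effective_weight`, `pick_order`, and the value `65533` are hypothetical, and the sketch assumes the reboot weight simply replaces the configured weight for nodes that need a reboot. It relies on Slurm's general rule that nodes with lower weight are allocated first.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the NodeRebootWeight value read from knl.conf.
NODE_REBOOT_WEIGHT = 65533


@dataclass
class Node:
    name: str
    weight: int         # configured node weight
    needs_reboot: bool  # e.g. the NUMA mode must change before use


def reboot_weight():
    """Toy analogue of node_features_p_reboot_weight()."""
    return NODE_REBOOT_WEIGHT


def effective_weight(node):
    # Assumption: a node that needs a reboot is weighted by the reboot
    # weight instead of its configured weight.
    return reboot_weight() if node.needs_reboot else node.weight


def pick_order(nodes):
    # Lower weight is preferred, so sort ascending by effective weight.
    return [n.name for n in sorted(nodes, key=effective_weight)]
```

With a large reboot weight, a node that must be rebooted sorts behind nodes that are usable as-is, even when its configured weight is lower.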
    • Merge branch 'slurm-17.11' · 57f54212
      Morris Jette authored
    • Make a test more robust · b1c2a6fb
      Morris Jette authored
      If ReturnToService=2 is configured, the test could generate an error
      when changing the node state to resume after setting it to down. The
      reason is that if the node communicates with slurmctld, its state is
      automatically changed from down to idle, and resuming an idle node
      triggers an error.
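The race described in this commit message can be reproduced with a small state model. This is a simplified sketch, not the actual test or slurmctld logic: the `Node` class, its method names, and `ResumeError` are hypothetical, and only the down-to-idle transition under ReturnToService=2 is modeled.

```python
class ResumeError(Exception):
    """Raised when resume is requested for a node that is already idle."""


class Node:
    def __init__(self, return_to_service):
        self.return_to_service = return_to_service
        self.state = "idle"

    def set_down(self):
        self.state = "down"

    def register(self):
        # The node communicates with slurmctld. With ReturnToService=2,
        # a down node is automatically returned to idle.
        if self.state == "down" and self.return_to_service == 2:
            self.state = "idle"

    def resume(self):
        # Resuming an idle node triggers an error.
        if self.state == "idle":
            raise ResumeError("node is already idle")
        self.state = "idle"
```

If `register()` happens to run between `set_down()` and `resume()` on a node created with `Node(2)`, the resume call raises `ResumeError`, which matches the intermittent failure the message describes. The sketch does not show how the test was changed, since the message does not say.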
    • Merge branch 'slurm-17.11' · 579f8ffd
      Alejandro Sanchez authored