1. 05 Oct, 2016 2 commits
  2. 04 Oct, 2016 1 commit
      add knl.conf parameter CapmcRetries · 5cb90497
      Morris Jette authored
      Add new knl.conf configuration parameter CapmcRetries.
      Modify capmc_suspend and capmc_resume to retry operations when the
        Cray State Manager is down.
      Add retry logic to node_features/knl_cray to handle the Cray State
        Manager being down.
      bug 3100
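The retry behaviour described in this commit can be sketched roughly as follows. This is a minimal Python model, not Slurm's C code: `run_with_retries`, `StateManagerDown`, `make_flaky_op`, and the `delay_s` parameter are hypothetical names, and treating `capmc_retries` as the number of extra attempts after the first failure is an assumption rather than the documented meaning of CapmcRetries.

```python
import time
from typing import Callable


class StateManagerDown(Exception):
    """Hypothetical error standing in for a capmc call that failed
    because the Cray State Manager was unavailable."""


def run_with_retries(op: Callable[[], str], capmc_retries: int,
                     delay_s: float = 0.0) -> str:
    """Call op(); on StateManagerDown, retry up to capmc_retries more times.

    Assumption: capmc_retries counts retries after the first attempt.
    """
    attempt = 0
    while True:
        try:
            return op()
        except StateManagerDown:
            if attempt >= capmc_retries:
                raise  # retries exhausted, surface the failure
            attempt += 1
            time.sleep(delay_s)  # back off before the next attempt


def make_flaky_op(failures: int) -> Callable[[], str]:
    """Build a stand-in operation that fails `failures` times, then succeeds."""
    remaining = [failures]

    def op() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise StateManagerDown("state manager unavailable")
        return "ok"

    return op
```

In this sketch, an operation that fails twice succeeds when `capmc_retries` is 4, while one that fails five times still raises when `capmc_retries` is 1.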
  3. 03 Oct, 2016 1 commit
  4. 30 Sep, 2016 4 commits
  5. 29 Sep, 2016 6 commits
  6. 28 Sep, 2016 1 commit
  7. 27 Sep, 2016 2 commits
  8. 26 Sep, 2016 1 commit
      Add salloc/sbatch/srun --priority=top option · 62b9884f
      Morris Jette authored
      Add salloc/sbatch/srun --priority option of "TOP" to set job priority to
          the highest possible value. This option is only available to Slurm operators
          and administrators.
      bug 3115
  9. 24 Sep, 2016 1 commit
  10. 23 Sep, 2016 1 commit
  11. 22 Sep, 2016 5 commits
  12. 21 Sep, 2016 4 commits
  13. 20 Sep, 2016 1 commit
  14. 17 Sep, 2016 1 commit
      Restore ability to manually power down nodes · da722a89
      Morris Jette authored
      Restore ability to manually power down nodes, broken in 15.08.12
      in commit b4904661
      
      The patch introduced in commit b4904661 (do not power down dead nodes) has a bad side effect. The added "(node_ptr->last_idle != 0)" condition prevents nodes from being powered down with the following command:
      
      scontrol update nodename=nX state=power_down
      
      because the state update function relies on zeroing the "last_idle" variable when a power_down is requested (see src/slurmctld/node_mgr.c, line 1589).
      
      Reverting commit b4904661 should solve the problem, but I will let you decide.
      
      Didier GAZEN
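The interaction described in this report can be modelled in a few lines. This is an illustrative Python sketch, not the code in src/slurmctld: the `Node` class, the function names, and the idle-time comparison are assumptions. Only two facts come from the commit message, namely that the guard is `last_idle != 0` and that a power_down request zeroes `last_idle`.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_idle: int  # time the node last became idle; 0 doubles as a marker


def request_power_down(node: Node) -> None:
    """Model the state update described above: a manual power_down
    request zeroes last_idle so the node looks idle for a long time."""
    node.last_idle = 0


def eligible_without_guard(node: Node, now: int, idle_time: int) -> bool:
    """Hypothetical idle test without the extra condition."""
    return now - node.last_idle > idle_time


def eligible_with_guard(node: Node, now: int, idle_time: int) -> bool:
    """Same test with the added (last_idle != 0) condition.
    A manually requested node has last_idle == 0, so it is never selected."""
    return node.last_idle != 0 and now - node.last_idle > idle_time
```

Under these assumptions, a node that has just received a power_down request passes the unguarded test but fails the guarded one, which matches the regression reported here.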
  15. 16 Sep, 2016 1 commit
      Update KNL modes for out-of-band reboot · 3a465f80
      Morris Jette authored
      node_features/knl_cray: If a node is rebooted outside of Slurm's direction,
          update its active features with current MCDRAM and NUMA mode information.
      bug 3071
  16. 15 Sep, 2016 2 commits
  17. 14 Sep, 2016 2 commits
  18. 09 Sep, 2016 3 commits
  19. 08 Sep, 2016 1 commit
      Restructure srun task_exit logic · 6b6d4e1a
      Morris Jette authored
      Restructure srun command locking in the task_exit processing logic to
        improve parallelism. This change reduces the time consumed by serial
        logic by two orders of magnitude.
      bug 3044