  1. 13 Jul, 2018 1 commit
  2. 15 Jun, 2018 1 commit
  3. 05 Jun, 2018 1 commit
  4. 27 Apr, 2018 1 commit
  5. 11 Apr, 2018 1 commit
  6. 15 Mar, 2018 1 commit
  7. 11 Dec, 2017 1 commit
    • CRAY - Switch to standard pid files on Cray systems. · 5342d979
      David Gloe authored
      Bug 4500
      
      The pid files in slurm.conf and the systemd service files must match,
      or systemd will time out looking for the wrong pid file. Currently,
      the Cray slurm.conf template has different pid files for slurmctld and
      slurmd than the service files.
      
      There's no reason for us to use these nonstandard pid files, and it
      will save us some headaches to switch over.
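As a rough illustration of the mismatch described in commit 5342d979, the Python sketch below compares the pid file named in slurm.conf with the PIDFile= line of a systemd unit. The helper names and the sample paths are hypothetical and are not taken from the Cray template or the shipped service files; only the SlurmctldPidFile/SlurmdPidFile keys and the systemd PIDFile= directive are real option names.

```python
import re


def slurm_conf_value(conf_text, key):
    """Return the value of `key` from slurm.conf-style text, or None."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"(\w+)\s*=\s*(\S+)", line)
        # slurm.conf keywords are case-insensitive
        if match and match.group(1).lower() == key.lower():
            return match.group(2)
    return None


def unit_pidfile(unit_text):
    """Return the PIDFile= value from systemd unit text, or None."""
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith("PIDFile="):
            return line[len("PIDFile="):].strip()
    return None


def pidfiles_match(conf_text, key, unit_text):
    """True only if both files name the same pid file for this daemon."""
    conf_pid = slurm_conf_value(conf_text, key)
    return conf_pid is not None and conf_pid == unit_pidfile(unit_text)


# Hypothetical sample contents, for illustration only.
SAMPLE_CONF = (
    "SlurmctldPidFile=/var/run/slurmctld.pid\n"
    "SlurmdPidFile=/var/spool/slurmd/slurmd.pid  # nonstandard path\n"
)
SAMPLE_CTLD_UNIT = "[Service]\nType=forking\nPIDFile=/var/run/slurmctld.pid\n"
SAMPLE_SLURMD_UNIT = "[Service]\nType=forking\nPIDFile=/var/run/slurmd.pid\n"
```

With these samples, the slurmctld pair matches and the slurmd pair does not, which is the situation in which systemd would wait on the wrong pid file.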
  8. 06 Dec, 2017 1 commit
    • Replace slurm_playbook.yaml.in with a version that auto-detects the Slurm version. · d54eb247
      David Gloe authored
      Due to the way Cray builds Slurm, the prefix and bindir paths include the
      Slurm version (/opt/slurm/<version>). This means every time we update to a
      new Slurm version we must update the Slurm ansible playbook. It also means
      that the slurm_playbook.yaml file must be built with Slurm to be used
      (it can't simply be copied directly).
      
      The attached patch updates the playbook to determine the version of Slurm
      to use from the module file, and hardcodes the sysconfdir setting we give
      in our Slurm installation guide. If a customer uses different paths, they
      can update the playbook to meet their needs.
      
      Bug 4360.
  9. 07 Nov, 2017 1 commit
  10. 16 Oct, 2017 1 commit
  11. 10 Oct, 2017 1 commit
  12. 09 Oct, 2017 4 commits
  13. 20 Sep, 2017 2 commits
  14. 15 Sep, 2017 1 commit
  15. 04 Aug, 2017 1 commit
  16. 29 Jun, 2017 1 commit
  17. 16 Jun, 2017 1 commit
  18. 25 May, 2017 1 commit
  19. 05 May, 2017 1 commit
  20. 21 Apr, 2017 2 commits
  21. 07 Mar, 2017 1 commit
    • capmc_resume not changing node state · cecaf222
      Morris Jette authored
      capmc_resume (Cray resume node script) - Do not disable changing a node's
          active features if SyscfgPath is configured in the knl.conf file.
      bug 3533
  22. 15 Feb, 2017 1 commit
  23. 06 Feb, 2017 1 commit
  24. 29 Jan, 2017 1 commit
    • task/cray configuration ordering bug · 0545f523
      Morris Jette authored
      CRAY systems only: TaskPlugin must list task/cgroup before task/cray in
          order for the cgroup files to be created before task/cray runs. Without
          this change, the task/cray plugin frequently produces errors about the
          "mems" file being missing. The errors don't seem consistent, so this
          probably involves a race condition. Note that NERSC uses this order
          today and I changed read_config.c to produce a fatal error if the order
          is reversed.
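The ordering rule from commit 0545f523 can be sketched in Python as follows. This is not the check added to read_config.c; the function name and the use of ValueError are assumptions made for illustration, and the sketch only objects when both plugins are listed in the reversed order.

```python
def check_task_plugin_order(task_plugin):
    """Validate a comma-separated TaskPlugin value.

    Raises ValueError if task/cray is listed before task/cgroup,
    mirroring the fatal error described in the commit message.
    Returns the parsed plugin list otherwise.
    """
    plugins = [p.strip() for p in task_plugin.split(",") if p.strip()]
    if "task/cgroup" in plugins and "task/cray" in plugins:
        if plugins.index("task/cgroup") > plugins.index("task/cray"):
            raise ValueError(
                "TaskPlugin must list task/cgroup before task/cray"
            )
    return plugins
```

For example, `check_task_plugin_order("task/cgroup,task/cray")` returns the parsed list, while `check_task_plugin_order("task/cray,task/cgroup")` raises.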
  25. 24 Jan, 2017 1 commit
  26. 23 Jan, 2017 1 commit
    • Add new knl.conf parameters to capmc drivers · 0eea2c3d
      Morris Jette authored
      Add new knl.conf parameters to the capmc_suspend and capmc_resume
        programs. They are not used by those programs, but we need to
        prevent an error if those new parameters are used.
  27. 18 Jan, 2017 1 commit
  28. 16 Dec, 2016 2 commits
  29. 30 Nov, 2016 1 commit
  30. 10 Nov, 2016 1 commit
  31. 25 Oct, 2016 1 commit
    • Print warning that task/cray must be listed before task/cgroup in TaskPlugin · c3266fca
      Tim Wickberg authored
      task/cray's _get_numa_nodes() function needs to run before task/cgroup
      cleans up the cgroup hierarchies, otherwise ALPS memory compaction will
      never run.
      
      Also move task_p_add_pid() outside the #ifdef HAVE_NATIVE_CRAY
      block so that the plugin will load (albeit without any functionality)
      on non-Cray systems for testing purposes.
      
      Revise documentation and provided slurm.conf templates as well.
      
      Bug 3154.
  32. 06 Oct, 2016 1 commit
  33. 05 Oct, 2016 1 commit
    • add knl.conf parameter CapmcRetries · 9218a40f
      Morris Jette authored
      Add new knl.conf configuration parameter CapmcRetries
      Modify capmc_suspend and capmc_resume to retry operations when
        Cray State Manager is down.
      Add retry logic to node_features/knl_cray to handle Cray State
        Manager being down.
      bug 3100
  34. 04 Oct, 2016 1 commit
    • add knl.conf parameter CapmcRetries · 5cb90497
      Morris Jette authored
      Add new knl.conf configuration parameter CapmcRetries
      Modify capmc_suspend and capmc_resume to retry operations when
        Cray State Manager is down.
      Add retry logic to node_features/knl_cray to handle Cray State
        Manager being down.
      bug 3100
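The retry behaviour described in the two CapmcRetries commits (bug 3100) might look roughly like the Python sketch below. It is not the code from capmc_suspend, capmc_resume or node_features/knl_cray. It assumes that the configured value counts additional attempts after the first failure, and it uses ConnectionError as a stand-in for whatever failure is reported when the Cray State Manager is down; the function name, the delay parameter and the injectable sleep are hypothetical.

```python
import time


def run_with_retries(operation, retries, delay_seconds=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `retries` extra attempts fail.

    `retries` plays the role of a CapmcRetries-style setting (assumed
    semantics: number of re-tries after the initial attempt).
    """
    attempt = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            if attempt >= retries:
                raise  # out of retries: surface the last failure
            attempt += 1
            sleep(delay_seconds)  # wait before trying again
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays; a caller would normally leave it at the default.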