1. 14 Mar, 2016 1 commit
  2. 11 Mar, 2016 2 commits
  3. 10 Mar, 2016 2 commits
  4. 09 Mar, 2016 2 commits
    • cray job requeue bug · fec5e03b
      Morris Jette authored
      Fix Cray NHC spawning on job requeue. Previous logic would leave nodes
      allocated to a requeued job as non-usable on job termination.
      
      Specifically, each job has a "cleaning/cleaned" flag. Once a job
      terminates, the cleaning flag is set; after the job's node health
      check (NHC) completes, the value is set to cleaned. If the job is requeued,
      on its second (or subsequent) termination, the select/cray plugin
      is called to launch the NHC. The plugin sees the "cleaned" flag
      already set, logs:
      error: select_p_job_fini: Cleaned flag already set for job 1283858, this should never happen
      and returns, never launching the NHC. Since the termination of the
      job NHC triggers releasing job resources (CPUs, memory, and GRES),
      those resources are never released for use by other jobs.
      
      Bug 2384
      fec5e03b
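
The failure mode described in this commit message can be modeled with a small state machine. This is a minimal sketch, not the select/cray plugin code: every name here (`clean_state`, `job_fini`, `nhc_complete`, `job_requeue`, `run_twice`) is hypothetical, and clearing the flag on requeue is only one possible remedy, assumed for illustration. The commit message does not say how the actual fix works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-job cleanup state; names are illustrative, not Slurm's. */
enum clean_state { CLEAN_NONE, CLEAN_CLEANING, CLEAN_CLEANED };

struct job {
    unsigned id;
    enum clean_state clean;
    bool resources_held;   /* CPUs, memory, GRES still allocated */
};

/* Job termination: start the NHC unless the job already looks cleaned. */
static bool job_fini(struct job *j)
{
    if (j->clean == CLEAN_CLEANED) {
        fprintf(stderr, "job %u: cleaned flag already set, NHC not launched\n",
                j->id);
        return false;
    }
    j->clean = CLEAN_CLEANING;
    return true;
}

/* NHC completion: mark the job cleaned and release its resources. */
static void nhc_complete(struct job *j)
{
    assert(j->clean == CLEAN_CLEANING);
    j->clean = CLEAN_CLEANED;
    j->resources_held = false;
}

/* Requeue: the job will hold resources again. reset_flag models an
 * assumed remedy (clearing the stale flag); it is not taken from Slurm. */
static void job_requeue(struct job *j, bool reset_flag)
{
    if (reset_flag)
        j->clean = CLEAN_NONE;
    j->resources_held = true;
}

/* Terminate, requeue, terminate again. Returns true if the job's
 * resources end up released after the second termination. */
static bool run_twice(bool reset_flag)
{
    struct job j = { 1u, CLEAN_NONE, true };

    if (job_fini(&j))
        nhc_complete(&j);
    job_requeue(&j, reset_flag);
    if (job_fini(&j))
        nhc_complete(&j);
    return !j.resources_held;
}
```

In this model, `run_twice(false)` leaves the resources held after the second termination because the stale "cleaned" state suppresses the NHC, while `run_twice(true)` releases them.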
    • Correctly parse nids in slurmconfgen_smw.py · 88ccc111
      David Gloe authored
      An error in slurmconfgen_smw.py caused it to parse the nic as the nid.
      On some systems those values differ, causing the generated slurm.conf file to
      be incorrect.
      
      Bug 2532.
      88ccc111
  5. 08 Mar, 2016 5 commits
  6. 07 Mar, 2016 1 commit
    • add additional tuning notes for mysql/mariadb · 49dc5d8d
      Tim Wickberg authored
      In particular, it seems that MariaDB has lowered the default for
      innodb_lock_wait_timeout, which can cause issues for the various
      rollup processes on systems with high job counts.
      49dc5d8d
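
A tuning note of this kind usually amounts to a server configuration fragment like the one below. This is a sketch only: the `[mysqld]` section placement and the value 900 are assumptions for illustration, not taken from the commit, and the right timeout depends on the site's job volume.

```
[mysqld]
# Illustrative value (assumption); raise the InnoDB lock wait timeout so
# long-running rollup transactions are less likely to time out.
innodb_lock_wait_timeout=900
```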
  7. 05 Mar, 2016 2 commits
  8. 04 Mar, 2016 2 commits
  9. 03 Mar, 2016 4 commits
  10. 02 Mar, 2016 3 commits
  11. 01 Mar, 2016 9 commits
  12. 29 Feb, 2016 1 commit
  13. 26 Feb, 2016 5 commits
  14. 25 Feb, 2016 1 commit
    • Add missing definition for val_to_char() · 344c74fc
      Tim Wickberg authored
      Since the function is inlined, the single definition let GCC build everything
      properly, but debug builds (which disable inlining) resulted in:
      slurmstepd: [465.0]: symbol lookup error:
      (trimmed path)/task_cgroup.so: undefined symbol: val_to_char
      when running srun --cpu_bind=v.
      
      task/affinity had this definition already, task/cgroup didn't.
      344c74fc
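
One common way an inlined function ends up as an undefined symbol in unoptimized builds is C99 inline semantics: an `inline` definition alone does not emit an external symbol, so a call that the compiler declines to inline has nothing to link against. The sketch below assumes that mechanism. Whether it matches the exact layout of the Slurm sources is an assumption, and the function body is illustrative rather than copied from Slurm.

```c
/* Inline definition: under C99 rules this alone does not emit an
 * external symbol, so a call that is not inlined may fail to resolve. */
inline char val_to_char(int v)
{
    if (v >= 0 && v < 10)
        return (char)('0' + v);         /* 0..9   -> '0'..'9' */
    if (v >= 10 && v < 16)
        return (char)('a' + (v - 10));  /* 10..15 -> 'a'..'f' */
    return '?';                         /* out of range (illustrative choice) */
}

/* Declaring the function with extern in the same translation unit makes
 * this unit provide the external definition that non-inlined calls need. */
extern inline char val_to_char(int v);
```

With the `extern` declaration present, the symbol exists whether or not the compiler inlines the call, so optimized and debug builds link the same way.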