1. 13 Nov, 2013 3 commits
    • Corrections to advanced reservation logic with overlapping jobs. · d6954b77
      Morris Jette authored
      The previous logic might have worked fine for core reservations or
      when there are sufficient idle nodes to use, but the
      select_g_resv_test() function clears the node bitmap for nodes that
      it cannot use, and the reservation create logic did not restore that
      bitmap after a failed resource selection attempt. This change restores
      the node bitmap on a failed call to select_g_resv_test() so we
      can add nodes to the bitmap of available nodes rather than having
      it repeatedly cleared.
      This change also adds some performance enhancements that I will
      extend in the next commit.
      d6954b77
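The restore-on-failure pattern described in d6954b77 can be sketched as follows. This is a minimal Python model, not Slurm's C code: `select_nodes` is a hypothetical stand-in for select_g_resv_test(), node bitmaps are modeled as sets, and the assumption that a failed selection clears the entire bitmap is made only for illustration.

```python
def select_nodes(avail, needed):
    """Hypothetical stand-in for select_g_resv_test().

    Mutates `avail` in place, clearing nodes it cannot use. In this toy
    model (an assumption), a failed attempt clears every node.
    """
    if len(avail) < needed:
        avail.clear()
        return None
    picked = set(sorted(avail)[:needed])
    avail.intersection_update(picked)
    return picked


def create_reservation(batches, needed, restore=True):
    """Grow the set of available nodes batch by batch until selection works."""
    avail = set()
    for batch in batches:
        avail |= batch
        saved = set(avail)  # snapshot before the destructive call
        picked = select_nodes(avail, needed)
        if picked is not None:
            return picked
        if restore:
            avail |= saved  # put back the nodes cleared by the failed call
    return None


batches = [{"n1", "n2"}, {"n3", "n4"}]
with_restore = create_reservation(batches, 3)
without_restore = create_reservation(batches, 3, restore=False)
```

With `restore=False` each batch is evaluated on its own and never accumulates, which mirrors the "repeatedly cleared" behavior the commit message describes.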
    • Update NEWS file for commit dc0c4e29. · 3643c8a9
      David Bigagli authored
      3643c8a9
    • Fix bug in job step allocation failing due to memory limit · 21ed817c
      Morris Jette authored
      This fixes a bug that occurs when the system is enforcing memory
      limits and a job that already has a step running on some of its
      nodes tries to start another step using some of those nodes. For
      example, with DefMemPerNode configured and the select plugin
      enforcing memory limits, try:
      salloc -N2 bash
      $ srun -N1 sleep 10&
      $ srun -N2 hostname
      Without this patch, the second srun would fail instead of pending.
      21ed817c
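The difference between failing and pending in 21ed817c can be illustrated with a small model. This is a sketch under stated assumptions, not Slurm's implementation: the function name, the per-node dictionaries, and the three return strings are all hypothetical. The idea is that a step should fail only if its memory request can never fit in the job's allocation, and should pend if the memory is merely in use by another step right now.

```python
def step_memory_decision(job_mem, used_mem, step_nodes, mem_per_node):
    """Decide what happens to a new step (toy model, names are assumptions).

    job_mem:      node -> memory allocated to the job on that node
    used_mem:     node -> memory currently held by running steps
    step_nodes:   nodes the new step wants to use
    mem_per_node: memory the new step needs on each node
    """
    # The request exceeds what the job owns: it can never be satisfied.
    if any(mem_per_node > job_mem[node] for node in step_nodes):
        return "fail"
    # The request fits the allocation but another step holds the memory.
    if any(used_mem.get(node, 0) + mem_per_node > job_mem[node]
           for node in step_nodes):
        return "pend"
    return "run"


job_mem = {"n1": 1000, "n2": 1000}
busy = {"n1": 1000}  # the first srun is still running on n1
second_step = step_memory_decision(job_mem, busy, ["n1", "n2"], 1000)
```

In the scenario from the commit message, the second step lands in the "pend" branch because the first step will eventually release its memory.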
  2. 09 Nov, 2013 1 commit
  3. 08 Nov, 2013 1 commit
  4. 05 Nov, 2013 1 commit
  5. 04 Nov, 2013 6 commits
  6. 03 Nov, 2013 1 commit
    • Enlarge max job array task ID to 32-bits · 494f6771
      jette authored
      The system cannot really handle larger job arrays without adding
      a job array data structure, but this puts some of the infrastructure
      in place now.
      494f6771
  7. 02 Nov, 2013 1 commit
  8. 01 Nov, 2013 3 commits
    • Fix for used_cpu_run_secs bad calculation · 0da4d951
      Morris Jette authored
      Add an argument to the priority plugin's priority_p_reconfig function
      to note when the association and QOS used_cpu_run_secs fields have
      been reset. Without this flag, we remove time on "scontrol setdebug"
      or "scontrol setdebugflags", which can result in used_cpu_run_secs
      going negative or otherwise getting bad values.
      Correction to logic added in commit 6d793189
      bug 423
      0da4d951
    • Fix for used_cpu_run_secs bad calculation · f247ff3a
      Morris Jette authored
      Add an argument to the priority plugin's priority_p_reconfig function
      to note when the association and QOS used_cpu_run_secs fields have
      been reset. Without this flag, we remove time on "scontrol setdebug"
      or "scontrol setdebugflags", which can result in used_cpu_run_secs
      going negative or otherwise getting bad values.
      Correction to logic added in commit 6d793189
      bug 423
      f247ff3a
    • sched/wiki, sched/wiki2 - Fix to allow job start · 1f2348ab
      Morris Jette authored
      Fix to work with the changed scheduling logic introduced in Slurm
      version 2.6.3, which prevented Maui/Moab from starting jobs.
      1f2348ab
  9. 31 Oct, 2013 1 commit
  10. 30 Oct, 2013 3 commits
  11. 29 Oct, 2013 3 commits
  12. 28 Oct, 2013 5 commits
  13. 25 Oct, 2013 4 commits
  14. 24 Oct, 2013 1 commit
    • Improve setting of job wait "Reason" field. · cf7ca59b
      Morris Jette authored
      Without this change, a job's wait reason of WAIT_PART_DOWN,
      WAIT_PART_INACTIVE, WAIT_PART_NODE_LIMIT, WAIT_PART_TIME_LIMIT, or
      WAIT_QOS_THRES would not be cleared when that reason no longer
      applied.
      cf7ca59b
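The behavior described in cf7ca59b amounts to re-checking a stored wait reason against the conditions that currently hold. The sketch below is a hedged illustration, not the actual Slurm logic: reasons are modeled as strings, `refresh_reason` and `active_blockers` are hypothetical names, and falling back to WAIT_NO_REASON is an assumption about what "cleared" means.

```python
# Reasons named in the commit message as previously going stale.
STALE_PRONE_REASONS = {
    "WAIT_PART_DOWN",
    "WAIT_PART_INACTIVE",
    "WAIT_PART_NODE_LIMIT",
    "WAIT_PART_TIME_LIMIT",
    "WAIT_QOS_THRES",
}


def refresh_reason(current_reason, active_blockers):
    """Clear a stale wait reason (toy model).

    active_blockers is the set of reasons that still apply to the job
    on this scheduling pass.
    """
    if current_reason in STALE_PRONE_REASONS and current_reason not in active_blockers:
        return "WAIT_NO_REASON"  # assumed "cleared" value
    return current_reason


# The partition came back up, so the old reason no longer applies.
after_partition_up = refresh_reason("WAIT_PART_DOWN", set())
```

Reasons outside the listed set are passed through unchanged in this sketch, on the assumption that they are maintained elsewhere.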
  15. 23 Oct, 2013 4 commits
  16. 22 Oct, 2013 2 commits
    • proctrack/cgroup - Fix for race condition · 260c5485
      Morris Jette authored
      Add cgroup create retry logic in case one step is starting at the
      same time as another step is ending and the logic to create
      and delete cgroups overlaps.
      bug 447
      260c5485
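The retry idea in 260c5485 can be sketched with ordinary directories standing in for cgroups. This is an illustration under assumptions, not the plugin's code: the function name, the retry count, and the injectable `mkdir` parameter are hypothetical, and the race is modeled as the parent directory disappearing between its creation and the creation of the step directory.

```python
import os


def create_step_dir(job_dir, step_name, retries=3, mkdir=os.mkdir):
    """Create job_dir/step_name, retrying if job_dir vanishes underneath us.

    A step that is ending may remove the (empty) job directory at the same
    moment a new step is being created inside it.
    """
    step_dir = os.path.join(job_dir, step_name)
    for _ in range(retries):
        os.makedirs(job_dir, exist_ok=True)  # (re)create the parent
        try:
            mkdir(step_dir)
            return step_dir
        except FileNotFoundError:
            # The parent was deleted by the other step; try again.
            continue
    raise RuntimeError("could not create %s after %d attempts" % (step_dir, retries))
```

The `mkdir` parameter exists only so a test can simulate the competing delete; a caller would normally leave it at its default.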
    • Problem allocating threads with GPUs · dab7fb02
      Morris Jette authored
      If a node has GRES and multiple threads per core, the select/cons_res
      plugin can get stuck in an infinite loop.
      See bug 475.
      Contributed by:
      PREVOST Ludovic
      NEC HPC Europe
      dab7fb02