  1. 13 Dec, 2013 1 commit
    • Fix slurmstepd race condition causing abort · be703c47
      Morris Jette authored
      Fix slurmstepd race condition when separate threads are reading and
      modifying the job's environment, which can result in the slurmstepd failing
      with an invalid memory reference. Observed at shutdown when trying
      to run the task epilog and trying to read the env var:
      SLURM_STEP_KILLED_MSG_NODE_ID
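      The general remedy for this kind of race is to serialize every read and
      write of the shared environment behind one lock. The following is a
      minimal Python sketch of that technique, not the slurmstepd patch
      itself; the LockedEnv class and its method names are hypothetical.

```python
import threading


class LockedEnv:
    """Hypothetical environment store: all access goes through one lock."""

    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._vars = dict(initial or {})

    def set(self, name, value):
        # Writers hold the lock so readers never see a half-updated table.
        with self._lock:
            self._vars[name] = value

    def unset(self, name):
        with self._lock:
            self._vars.pop(name, None)

    def get(self, name, default=None):
        # Readers take the same lock as writers.
        with self._lock:
            return self._vars.get(name, default)

    def count(self):
        with self._lock:
            return len(self._vars)
```

      A thread running the task epilog would call get() while another thread
      calls set() or unset(); because both paths take the same lock, neither
      can observe the table mid-modification.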
  2. 12 Dec, 2013 1 commit
    • slurmstepd variable initialization · 06b41cdc
      Morris Jette authored
      Without this patch, free() is called on a random memory location
      (i.e. whatever is on the stack), which can result in slurmstepd
      dying and a completed job not being purged in a timely fashion.
  3. 11 Dec, 2013 2 commits
  4. 09 Dec, 2013 2 commits
    • Modify squeue to support longer job ID values · 17f27007
      Morris Jette authored
      This is needed for job arrays with discontiguous task ID values
      (e.g. "123_[1,3,5,...99999]")
    • Improve sview support for job arrays · d998640f
      Morris Jette authored
      Previously job arrays were only listed with their native job ID
      (e.g. 123_0 listed as 123, 123_1 as 124, etc). Now lists the job ID
      using both formats (e.g. "123_1 (124)"). The same format is used
      for job step IDs (e.g. "123_1.2 (124.2)").
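      The combined display format described above can be sketched as follows.
      This is an illustrative Python helper, not sview's code; the function
      name and parameters are hypothetical.

```python
def format_job_id(array_job_id, task_id, native_job_id, step_id=None):
    """Render an array task as 'ARRAY_TASK (NATIVE)', optionally with a step."""
    if step_id is None:
        return f"{array_job_id}_{task_id} ({native_job_id})"
    # Job steps append '.STEP' to both forms of the ID.
    return f"{array_job_id}_{task_id}.{step_id} ({native_job_id}.{step_id})"


print(format_job_id(123, 1, 124))             # 123_1 (124)
print(format_job_id(123, 1, 124, step_id=2))  # 123_1.2 (124.2)
```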
  5. 08 Dec, 2013 1 commit
  6. 07 Dec, 2013 2 commits
  7. 06 Dec, 2013 2 commits
  8. 05 Dec, 2013 1 commit
  9. 04 Dec, 2013 1 commit
  10. 03 Dec, 2013 3 commits
    • Improve REQUEST_JOB_INFO_SINGLE RPC performance · 80d3b343
      Morris Jette authored
      Use hash function to locate job records for improved performance.
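      A hash lookup replaces a scan of the whole job list with a scan of one
      short bucket. The sketch below shows the idea in Python with a
      fixed-size table and chaining; the JobTable class, the table size, and
      the modulo hash are assumptions for illustration, not the slurmctld
      implementation.

```python
class JobTable:
    """Hypothetical fixed-size hash table mapping job IDs to job records."""

    def __init__(self, size=1024):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, job_id):
        # Simple modulo hash; colliding IDs share a bucket.
        return job_id % self.size

    def add(self, job_id, record):
        self.buckets[self._index(job_id)].append((job_id, record))

    def find(self, job_id):
        # Only the one bucket is searched, not every job record.
        for stored_id, record in self.buckets[self._index(job_id)]:
            if stored_id == job_id:
                return record
        return None
```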
    • Improve REQUEST_JOB_INFO_SINGLE RPC performance · 14bcfe58
      Morris Jette authored
      Change partition write lock to a read lock as we use a different
      mechanism for hidden partitions in getting individual jobs.
    • Correct job dependency string · 08265c03
      Morris Jette authored
      Correct logic returning remaining job dependencies in job information
      reported by scontrol and squeue. Eliminates vestigial descriptors with
      no job ID values (e.g. "afterany"). As dependencies are removed, the
      job ID values were removed from the strings, but not the descriptors.
      This eliminates both. It also checks the full job ID to make sure we do
      not remove "afterany:1234" when job "123" completes.
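      The corrected behavior can be sketched in Python as below. The sketch
      assumes a dependency string of comma-separated clauses of the form
      "descriptor:jobid[:jobid...]"; the function name is hypothetical and
      this is not the slurmctld code.

```python
def remove_dependency(dep_string, done_job_id):
    """Drop a completed job from a dependency string (illustrative only)."""
    done = str(done_job_id)
    kept = []
    for clause in dep_string.split(","):
        if not clause:
            continue
        descriptor, *job_ids = clause.split(":")
        if not job_ids:
            # Assumption: a descriptor that never carried IDs is left alone.
            kept.append(clause)
            continue
        # Compare whole IDs, so "123" never matches "1234".
        remaining = [job_id for job_id in job_ids if job_id != done]
        if remaining:
            kept.append(":".join([descriptor, *remaining]))
        # Otherwise the descriptor is dropped together with its last ID.
    return ",".join(kept)


print(remove_dependency("afterany:123:1234,afterok:123", 123))  # afterany:1234
```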
  11. 02 Dec, 2013 2 commits
  12. 29 Nov, 2013 2 commits
  13. 27 Nov, 2013 1 commit
  14. 26 Nov, 2013 1 commit
  15. 14 Nov, 2013 1 commit
  16. 13 Nov, 2013 1 commit
    • Corrections to advanced reservation logic with overlapping jobs. · d6954b77
      Morris Jette authored
      This might have worked fine for core reservations or when there
      are sufficient idle nodes to use, but the select_g_resv_test()
      function clears the node bitmap for nodes that it cannot use
      and the reservation create logic did not restore that bitmap
      after a failed resource selection attempt. This logic restores
      the node bitmap on a failed call to select_g_resv_test() so we
      can add nodes to the bitmap of available nodes rather than having
      it repeatedly cleared.
      The logic also adds some performance enhancements that I will
      add to in the next commit.
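      The save-and-restore pattern described above can be sketched in Python,
      using a set in place of the node bitmap. The function names, the
      select_test callback, and the retry loop are hypothetical stand-ins for
      the reservation create logic and select_g_resv_test(), not the actual
      code.

```python
def make_select_test(busy, needed):
    """Build a selection test that, like the original, edits its input."""
    def select_test(bitmap):
        bitmap.difference_update(busy)   # clear nodes that cannot be used
        if len(bitmap) < needed:
            bitmap.clear()               # a failed attempt leaves nothing
            return None
        return set(sorted(bitmap)[:needed])
    return select_test


def select_with_restore(candidates, more_nodes, select_test):
    """Retry selection, restoring the candidate set after each failure."""
    candidates = set(candidates)
    pending = list(more_nodes)
    while True:
        saved = set(candidates)          # snapshot before the destructive call
        picked = select_test(candidates)
        if picked is not None:
            return picked
        candidates = saved               # restore what the failed call cleared
        if not pending:
            return None
        candidates.add(pending.pop(0))   # grow the pool and try again
```

      Without the restore step, the first failed call would empty the set and
      the nodes added later would be the only candidates left.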
  17. 08 Nov, 2013 1 commit
  18. 05 Nov, 2013 1 commit
  19. 04 Nov, 2013 2 commits
  20. 01 Nov, 2013 2 commits
    • Fix for used_cpu_run_secs bad calculation · f247ff3a
      Morris Jette authored
      Add argument to priority plugin's priority_p_reconfig function to note
      when the association and QOS used_cpu_run_secs field has been reset.
      Without this flag, we remove time on "scontrol setdebug" or
      "scontrol setdebugflag", which can result in used_cpu_run_secs
      going negative or otherwise getting bad values.
      Correction to logic added in commit 6d793189
      bug 423
    • sched/wiki, sched/wiki2 - Fix to allow job start · 1f2348ab
      Morris Jette authored
      Fix to work with the scheduling logic change introduced in Slurm
      version 2.6.3, which prevented Maui/Moab from starting jobs.
  21. 29 Oct, 2013 3 commits
  22. 28 Oct, 2013 4 commits
  23. 25 Oct, 2013 3 commits