1. 02 Mar, 2012 7 commits
    • Mods in priority/multifactor for prio=1 · b223af49
      Morris Jette authored
      In SLURM version 2.4, we now schedule jobs at priority=1 and no longer treat
      it as a special case.
    • Cosmetic mods to priority logic · 0810353e
      Morris Jette authored
    • Merge branch 'slurm-2.3' · ec372e00
      Morris Jette authored
    • cray/srun wrapper, don't use aprun -q by default · ea9adc17
      Morris Jette authored
      In the cray/srun wrapper, only include the aprun "-q" option when the srun
      "--quiet" option is used.
    • Change a slurmd msg from info() to debug() · 73f915bf
      Morris Jette authored
    • Merge branch 'slurm-2.3' · c06064bc
      Morris Jette authored
    • Fix for possible SEGV · ed56303c
      Morris Jette authored
      Here's what seems to have happened:
      
      - A job was pending, waiting for resources.
      - slurm.conf was changed to remove some nodes, and a scontrol reconfigure was done.
      - As a result of the reconfigure, the pending job became non-runnable, due to "Requested node configuration is not available". The scheduler set the job state to JOB_FAILED and called delete_job_details.
      - scontrol reconfigure was done again.
      - read_slurm_conf called _restore_job_dependencies.
      - _restore_job_dependencies called build_feature_list for each job in the job list.
      - When build_feature_list tried to reference the now deleted job details for the failed job, it got a segmentation fault.
      
      The problem was reported by a customer on Slurm 2.2.7.  I have not been able to reproduce it on 2.4.0-pre3, although the relevant code looks the same. There may be a timing window. The attached patch attempts to fix the problem by adding a check to _restore_job_dependencies.  If the job state is JOB_FAILED, the job is skipped.
      
      Regards,
      Martin
      
      This is an alternative solution to bug316980fix.patch
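      The guard described above can be sketched in simplified C. This is a minimal
      illustration, not SLURM's actual code: the struct, field names, and the
      function restore_should_process() are hypothetical stand-ins for the real
      job_record and _restore_job_dependencies logic.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical, simplified job record; SLURM's real struct differs. */
      enum job_state { JOB_PENDING, JOB_FAILED };

      struct job_record {
          enum job_state state;
          const char *details;   /* NULL once delete_job_details() has freed it */
      };

      /* Return 1 if the dependency/feature rebuild should process this job.
       * Skipping JOB_FAILED jobs avoids dereferencing their freed details,
       * which is what caused the reported segmentation fault. */
      static int restore_should_process(const struct job_record *job)
      {
          if (job->state == JOB_FAILED)   /* details already deleted */
              return 0;
          if (job->details == NULL)       /* defensive NULL check */
              return 0;
          return 1;
      }

      int main(void)
      {
          struct job_record pending = { JOB_PENDING, "features=gpu" };
          struct job_record failed  = { JOB_FAILED,  NULL };

          assert(restore_should_process(&pending) == 1);
          assert(restore_should_process(&failed)  == 0);
          return 0;
      }
      ```
      
      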
  2. 01 Mar, 2012 1 commit
  3. 29 Feb, 2012 1 commit
  4. 28 Feb, 2012 7 commits
  5. 27 Feb, 2012 2 commits
  6. 25 Feb, 2012 1 commit
    • Print negative time as "INVALID" · 131ff55e
      Morris Jette authored
      If a time value to be printed (e.g. job run time) is negative then
      print the value as "INVALID" rather than with negative numbers
      (e.g. "-123--12:-12:-12").
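      The behavior above can be sketched as follows. This is an illustrative
      stand-in, assuming a D-HH:MM:SS layout; the function name secs_to_str()
      is hypothetical, not SLURM's actual formatting routine.

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /* Format a duration in seconds as D-HH:MM:SS, but emit "INVALID" for
       * negative values instead of nonsense like "-123--12:-12:-12". */
      static void secs_to_str(long secs, char *buf, size_t len)
      {
          if (secs < 0) {
              snprintf(buf, len, "INVALID");
              return;
          }
          long days = secs / 86400;
          long hrs  = (secs / 3600) % 24;
          long mins = (secs / 60) % 60;
          snprintf(buf, len, "%ld-%02ld:%02ld:%02ld", days, hrs, mins, secs % 60);
      }

      int main(void)
      {
          char buf[32];

          secs_to_str(-5, buf, sizeof buf);
          assert(strcmp(buf, "INVALID") == 0);

          secs_to_str(90061, buf, sizeof buf);  /* 1 day, 1 h, 1 min, 1 s */
          assert(strcmp(buf, "1-01:01:01") == 0);
          return 0;
      }
      ```
      
      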
  7. 24 Feb, 2012 19 commits
  8. 23 Feb, 2012 2 commits