1. 21 Nov, 2012 1 commit
    • Correct a bug in consecutive steps management due to asynchronous step completions · 4c97337d
      Matthieu Hautreux authored
      When using consecutive steps, it appears that in some cases the slurmstepd
      on the execution nodes takes longer to inform the controller of the completion
      of a step than it takes to request the following step.
      In that scenario, the controller can reject the step by returning the error code
      ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE, even though the step could be executed
      once all the former steps had finished correctly.
      
      This can be reproduced by launching consecutive steps and introducing delays in
      the SPANK epilog on the execution nodes.
      
      The behavior is changed to only defer the execution of the step, by returning
      ESLURM_NODES_BUSY, when the available nodes are not all idle once the former
      steps are taken into account.
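      The decision described in this commit message can be illustrated with a
      small model. This is only a sketch under stated assumptions, not the
      controller's actual code: `decide_step`, `StepDecision`, and the node sets
      are hypothetical names, and the rule that a request larger than the whole
      allocation is still rejected is an assumption made for the example.

```python
from enum import Enum


class StepDecision(Enum):
    """Possible outcomes of a step request in this toy model."""
    RUN = "run"
    DEFER = "ESLURM_NODES_BUSY"
    REJECT = "ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE"


def decide_step(nodes_needed, allocated_nodes, busy_nodes):
    """Hypothetical model of the post-fix behavior.

    allocated_nodes: set of node names in the job allocation.
    busy_nodes: nodes still held by former steps whose completion
                has not yet been reported to the controller.
    """
    idle_nodes = allocated_nodes - busy_nodes
    if nodes_needed <= len(idle_nodes):
        # Enough idle nodes right now: the step can start.
        return StepDecision.RUN
    if nodes_needed <= len(allocated_nodes):
        # The step would fit once former steps finish: defer, do not reject.
        return StepDecision.DEFER
    # Assumption: a request that can never fit is still rejected.
    return StepDecision.REJECT
```

      In this model, a two-node step requested while one of two allocated nodes
      is still held by a former step is deferred rather than rejected, which is
      the situation the commit message describes.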
  2. 20 Nov, 2012 2 commits
  3. 19 Nov, 2012 3 commits
  4. 09 Nov, 2012 1 commit
  5. 07 Nov, 2012 4 commits
  6. 05 Nov, 2012 2 commits
  7. 02 Nov, 2012 3 commits
  8. 26 Oct, 2012 1 commit
  9. 25 Oct, 2012 3 commits
  10. 24 Oct, 2012 1 commit
  11. 23 Oct, 2012 4 commits
  12. 22 Oct, 2012 1 commit
  13. 19 Oct, 2012 3 commits
  14. 18 Oct, 2012 9 commits
  15. 17 Oct, 2012 1 commit
  16. 16 Oct, 2012 1 commit