1. 08 Feb, 2017 3 commits
  2. 07 Feb, 2017 1 commit
  3. 03 Feb, 2017 1 commit
  4. 31 Jan, 2017 3 commits
  5. 30 Jan, 2017 3 commits
    • Fix regression from commits · a4c51165
      Danny Auble authored
      e3a7bdcc
      f9804256
      d72b13f2
      
      Reference bug 3366
      
      On a Bluegene system we rely on the prolog to take the job out of the configuring
      state.  These commits work well for systems that reboot the nodes where the prolog runs,
      but on a Bluegene that is the opposite of what is wanted :).  On a Bluegene, these
      commits pretty much prevent a batch job from ever being launched.
    • Clear job BeginTime reason · 0abbf727
      Morris Jette authored
      Clear a job's reason of "BeginTime" in a more timely fashion and/or prevent
          the job from being stuck in a PENDING state. There are multiple ways of
          clearing the reason, especially on a lightly loaded system, but the
          state can persist indefinitely on a heavily loaded system.
      bug 3368
    • will_run fix for job with begin time in past · f75abc9c
      Morris Jette authored
      Fix the logic for getting the expected start time of an existing job ID with
          an explicit begin time that is in the past. The previous logic compared
          that (past) begin time, rather than the current time, against the
          advanced reservations that would compete with the job.
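The idea behind the will_run fix in f75abc9c can be illustrated with a small sketch. This is not Slurm code: the function name `earliest_start`, the `(start, end)` reservation tuples, and the simplification that a job only needs a free start instant (ignoring its duration and node set) are all assumptions made for illustration. The point it shows is that a begin time in the past should be clamped to the current time before it is compared with competing reservations.

```python
def earliest_start(begin_time, now, reservations):
    """Hypothetical helper: earliest instant at or after max(begin_time, now)
    that does not fall inside any reservation window (start inclusive,
    end exclusive). Times are plain numbers, e.g. epoch seconds."""
    # A begin time that is already in the past no longer constrains the job,
    # so the search starts from the current time instead.
    candidate = max(begin_time, now)
    # Walk the windows in start order, pushing the candidate past any
    # window that currently contains it.
    for start, end in sorted(reservations):
        if start <= candidate < end:
            candidate = end
    return candidate


# The job asked to begin at t=100, but it is now t=1000.
reservations = [(50, 200), (900, 1100)]
result = earliest_start(100, 1000, reservations)
```

Starting the search from the stale begin time of 100 would have matched the long-expired (50, 200) window and reported 200, a time that is itself in the past. Clamping to `now` makes the (900, 1100) window the one that matters.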
  6. 29 Jan, 2017 4 commits
  7. 28 Jan, 2017 4 commits
  8. 27 Jan, 2017 2 commits
  9. 26 Jan, 2017 3 commits
  10. 25 Jan, 2017 10 commits
  11. 24 Jan, 2017 3 commits
  12. 23 Jan, 2017 3 commits
    • For batch step, reset job memory after node boot · 0277629b
      Morris Jette authored
      Reset a job's memory limit based upon what's available after node
        reboot, which can change on a KNL if the MCDRAM mode is changed
        on reboot.
    • Fix for backfill launch job with reboot · d72b13f2
      Morris Jette authored
      This bug was likely the root cause of bug 3366. If the backfill scheduler
        allocates resources for a batch job and a node reboot is required, the
        batch launch RPC would be sent to the agent. At that point, there is a
        race condition between the agent and the job_time_limit() function
        testing for boot completion. If the job_time_limit() function ran
        first, it would trigger a second launch RPC request being sent to
        the agent.
      bug 3366
    • Cleaner job configuring logic · f9804256
      Morris Jette authored
      Clean up logic to test if job is configuring
      bug 3366
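The duplicate-launch race described in d72b13f2 above (two code paths each queuing a batch launch RPC for the same job) is a general pattern that can be sketched outside Slurm. The example below is only an illustration of one common guard, a "queue at most once" flag protected by a lock. The class `BatchJob`, the method `queue_launch_once`, and the caller labels are hypothetical, and nothing here is claimed to be how the commit itself resolves the race.

```python
import threading


class BatchJob:
    """Hypothetical stand-in for a job record; not a Slurm structure."""

    def __init__(self):
        self._lock = threading.Lock()
        self._launch_queued = False
        self.launch_rpcs = []  # records which caller queued the launch

    def queue_launch_once(self, source):
        """Queue the launch RPC unless one was already queued.

        Returns True if this call queued the launch, False otherwise."""
        with self._lock:
            if self._launch_queued:
                # Another path got here first; do not send a second RPC.
                return False
            self._launch_queued = True
            self.launch_rpcs.append(source)
            return True


job = BatchJob()
first = job.queue_launch_once("backfill")         # scheduler path
second = job.queue_launch_once("job_time_limit")  # boot-completion path
```

Whichever path calls first wins, and the later call becomes a no-op, so the job sees exactly one launch request regardless of ordering.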