1. 02 Apr, 2014 2 commits
    • Minor tweak to scheduler cycle timing · 8fb863f9
      Morris Jette authored
      Decrease the maximum scheduler main loop run time from 10 secs to
      4 secs for improved performance.
      If running with sched/backfill, do not run through all jobs in the
      periodic scheduling loop, but only to the default depth. The
      backfill scheduler can go through more jobs anyway due to its
      ability to relinquish and recover locks.
      See bug 616
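      The change above is a policy tweak in slurmctld's main scheduling
      loop. Below is a minimal C sketch of that policy under stated
      assumptions: it is not Slurm's actual code, and MAX_SCHED_SECS,
      DEFAULT_QUEUE_DEPTH, job_t and try_sched are illustrative names.
      It caps the loop's wall-clock time and, when a backfill scheduler
      is also running, stops at a default queue depth.

      #include <stdbool.h>
      #include <stdio.h>
      #include <time.h>

      #define MAX_SCHED_SECS       4    /* was 10; lowered for responsiveness */
      #define DEFAULT_QUEUE_DEPTH  100  /* depth limit when backfill is active */

      typedef struct {
          int job_id;
      } job_t;

      /* Attempt to start one job; return true if resources were found. */
      static bool try_sched(const job_t *job)
      {
          printf("considering job %d\n", job->job_id);
          return false;   /* placeholder: no resources in this sketch */
      }

      static void main_sched_loop(job_t *queue, int queue_len, bool backfill_active)
      {
          time_t start = time(NULL);

          for (int i = 0; i < queue_len; i++) {
              /* Stop early so other RPCs are not locked out for too long. */
              if (time(NULL) - start >= MAX_SCHED_SECS)
                  break;
              /* With backfill, leave the deep queue walk to the backfill
               * thread, which can periodically release and reacquire locks. */
              if (backfill_active && i >= DEFAULT_QUEUE_DEPTH)
                  break;
              (void) try_sched(&queue[i]);
          }
      }

      int main(void)
      {
          job_t queue[] = { {101}, {102}, {103} };
          main_sched_loop(queue, 3, true);
          return 0;
      }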
    • launch/poe - fix network value · ad7100b8
      Morris Jette authored
      If a job step's network value was set by poe, either by executing
      poe directly or by srun launching poe, that value was not being
      propagated to the job step creation RPC and the network was not
      being set up for the proper protocol (e.g. mpi, lapi, pami, etc.).
      The previous logic would only work if the srun command line
      explicitly set the protocol using the --network option.
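      As a hedged illustration of this fix's intent (the types and
      functions below are hypothetical, not Slurm's real RPC structures):
      the network value chosen by poe should be carried into the step
      creation request whenever the srun command line did not set one
      with --network.

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      typedef struct {
          char *network;   /* protocol/network spec, e.g. "mpi" */
      } step_create_req_t;

      /* Prefer an explicit srun --network value; otherwise fall back to
       * the value poe requested, so the step RPC always carries it. */
      static void set_step_network(step_create_req_t *req,
                                   const char *srun_network,
                                   const char *poe_network)
      {
          const char *value = srun_network ? srun_network : poe_network;
          free(req->network);
          req->network = value ? strdup(value) : NULL;
      }

      int main(void)
      {
          step_create_req_t req = { NULL };
          /* poe set the protocol, srun did not: the poe value must win. */
          set_step_network(&req, NULL, "mpi");
          printf("network = %s\n", req.network ? req.network : "(none)");
          free(req.network);
          return 0;
      }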
  2. 31 Mar, 2014 2 commits
  3. 26 Mar, 2014 1 commit
  4. 25 Mar, 2014 1 commit
  5. 24 Mar, 2014 1 commit
    • job array dependency recovery fix · fca71890
      Morris Jette authored
      When slurmctld restarted, it would not recover dependencies on
      job array elements and would just discard the dependency. This
      corrects the parsing problem so the dependency is recovered. The old
      code would print a message like this and discard it:
      slurmctld: error: Invalid dependencies discarded for job 51: afterany:47_*
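      To make the parsing issue concrete, here is an illustrative sketch
      (a hypothetical helper, not the slurmctld parser) that accepts a
      dependency job specification such as "47_*", where an "_*" or
      "_<index>" suffix names a job array element, instead of rejecting
      it as invalid:

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      /* Parse "<jobid>", "<jobid>_<index>" or "<jobid>_*". Returns the
       * base job ID, or 0 on error. *array_task is set to the task index,
       * -1 for "_*" (every element), or -2 if no array suffix is present. */
      static unsigned long parse_dep_job(const char *spec, long *array_task)
      {
          char *end = NULL;
          unsigned long job_id = strtoul(spec, &end, 10);

          *array_task = -2;
          if (end == spec)
              return 0;               /* no job ID digits at all */
          if (*end == '\0')
              return job_id;          /* plain job ID */
          if (*end != '_')
              return 0;               /* unexpected trailing characters */
          end++;
          if (strcmp(end, "*") == 0) {
              *array_task = -1;       /* depend on every array element */
              return job_id;
          }
          *array_task = strtol(end, &end, 10);
          return (*end == '\0') ? job_id : 0;
      }

      int main(void)
      {
          long task;
          unsigned long id = parse_dep_job("47_*", &task);
          printf("job=%lu task=%ld\n", id, task);   /* job=47 task=-1 */
          return 0;
      }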
  6. 21 Mar, 2014 2 commits
    • NRT - Fix minor typos · 675b25ad
      Danny Auble authored
    • NRT - Fix issue with 1 node jobs. It turns out the network does need to · 440932df
      Danny Auble authored
      be set up for 1 node jobs.  Here are some of the reasons from IBM...
      
      1. PE expects it.
      2. For failover, if there was some challenge or difficulty with the
         shared-memory method of data transfer, the protocol stack might
         want to go through the adapter instead.
      3. For flexibility, the protocol stack might want to be able to transfer
         data using some variable combination of shared memory and adapter-based
         communication, and
      4. Possibly most important, for overall performance, it might be that
         bandwidth or efficiency (BW per CPU cycles) might be better using the
         adapter resources.  (An obvious case is large messages: it might
         require a lot fewer CPU cycles to program the DMA engines on the
         adapter to move data between tasks, rather than depend on the CPU
         to move the data with loads and stores, or page re-mapping -- and
         a DMA engine might actually move the data more quickly, if it's well
         integrated with the memory system, as it is in the P775 case.)
  7. 20 Mar, 2014 2 commits
  8. 19 Mar, 2014 2 commits
  9. 18 Mar, 2014 4 commits
  10. 17 Mar, 2014 1 commit
  11. 15 Mar, 2014 2 commits
  12. 14 Mar, 2014 3 commits
  13. 12 Mar, 2014 1 commit
  14. 11 Mar, 2014 6 commits
  15. 10 Mar, 2014 1 commit
  16. 08 Mar, 2014 2 commits
  17. 07 Mar, 2014 6 commits
  18. 06 Mar, 2014 1 commit