1. 09 Jan, 2013 3 commits
  2. 08 Jan, 2013 11 commits
  3. 07 Jan, 2013 1 commit
  4. 04 Jan, 2013 5 commits
    • Use local no-mem functions · 3a6bd336
      jette authored
      Make sure out-of-memory errors get logged properly for slurmctld in foreground
      
      Fix slurmd and slurmdbd to log out-of-memory errors to stdout in foreground
    • Use local no-mem functions · 5e1d0210
      jette authored
    • mpi/mvapich: Don't set MPIRUN_PROCESSES by default · fd5b0e56
      Mark A. Grondona authored
      The MPIRUN_PROCESSES variable set by the mpi/mvapich plugin is probably
      not needed for most, if not all, recent versions of mvapich.
      This environment variable also negatively affects job scalability
      since its length is proportional to the number of tasks in a job.
      In fact, for very large jobs, the increased environment size can
      lead to failures in execve(2).
      
      Since MPIRUN_PROCESSES *might* be required in some older versions of
      mvapich, this patch sets that variable only if
      SLURM_NEED_MVAPICH_MPIRUN_PROCESSES is set in the job's environment.
      (Thus, by default MPIRUN_PROCESSES is no longer set, but the old
      behavior may be restored by setting the environment variable above.)
    • jette's avatar
      b196f153
    • Fix logic in hostset_create for invalid input · 33cb1e40
      jette authored
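
The opt-in behavior described in commit fd5b0e56 can be illustrated with a minimal sketch. This is not the plugin's actual code: the helper name `maybe_set_mpirun_processes` and its return convention are hypothetical, and the sketch writes to the calling process's environment with POSIX `setenv`, whereas the real plugin works on the job's environment. Only the two variable names, MPIRUN_PROCESSES and SLURM_NEED_MVAPICH_MPIRUN_PROCESSES, come from the commit message.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for setenv() under strict ISO C modes */
#endif
#include <stdlib.h>

/*
 * Hypothetical helper (not part of Slurm): set MPIRUN_PROCESSES only when
 * the user has opted in through SLURM_NEED_MVAPICH_MPIRUN_PROCESSES.
 *
 * "value" is the task list the caller would otherwise always export; its
 * length grows with the number of tasks, which is why it is skipped by
 * default.
 *
 * Returns 1 if the variable was set, 0 if it was skipped, -1 on error.
 */
static int maybe_set_mpirun_processes(const char *value)
{
	/* Default case: the opt-in variable is absent, so set nothing. */
	if (getenv("SLURM_NEED_MVAPICH_MPIRUN_PROCESSES") == NULL)
		return 0;

	/* Opt-in case: restore the old behavior for older mvapich versions. */
	if (setenv("MPIRUN_PROCESSES", value, 1) != 0)
		return -1;

	return 1;
}
```

Under these assumptions, a caller would build the task list only when the function is going to use it, so that large jobs avoid both the cost of constructing the string and the risk of an oversized environment at execve(2).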
  5. 03 Jan, 2013 16 commits
  6. 02 Jan, 2013 1 commit
    • Revert commit b2c18ec1 · ac27d503
      Morris Jette authored
      The original patch works fine to avoid cancelling a job when all
      of its nodes go unresponsive, but I don't see any way to easily
      address nodes coming back into service. We want to cancel jobs
      that have some up nodes and some down nodes, but the nodes will
      come back into service individually rather than all at once.
  7. 31 Dec, 2012 1 commit
  8. 29 Dec, 2012 2 commits