1. 14 Jan, 2013 8 commits
    • select/cons_res plugin: CPU allocation logic fix · 1ef41ac9
      Morris Jette authored
      Correction to CPU allocation count logic for cores without hyperthreading.
    • Add SLURM_SRUN_REDUCE_TASK_EXIT_MSG environment variable · 96986199
      Hongjia Cao authored
      When a job launched directly with srun ends abnormally, there is a
      step-killed message (slurmd[cn123]: *** 1234.0 KILLED AT ... WITH
      SIGNAL 9 ***) from each node and/or a task-exit message
      (srun: error: task[0-1]: Terminated) for each node. For large-scale
      jobs these messages become tedious and bury the other error messages.
      The two attached patches (for slurm-2.5.1) introduce two environment
      variables to control the output of such messages:
      
      SLURM_STEP_KILLED_MSG_NODE_ID: if set, only the specified node will
      print the step-killed-message;
      
      SLURM_SRUN_REDUCE_TASK_EXIT_MSG: if set and non-zero, successive task
      exit messages with the same exit code will be printed only once.
    • Add SLURM_STEP_KILLED_MSG_NODE_ID environment variable · 232ab305
      Hongjia Cao authored
      When a job launched directly with srun ends abnormally, there is a
      step-killed message (slurmd[cn123]: *** 1234.0 KILLED AT ... WITH
      SIGNAL 9 ***) from each node and/or a task-exit message
      (srun: error: task[0-1]: Terminated) for each node. For large-scale
      jobs these messages become tedious and bury the other error messages.
      The two attached patches (for slurm-2.5.1) introduce two environment
      variables to control the output of such messages:
      
      SLURM_STEP_KILLED_MSG_NODE_ID: if set, only the specified node will
      print the step-killed-message;
      
      SLURM_SRUN_REDUCE_TASK_EXIT_MSG: if set and non-zero, successive task
      exit messages with the same exit code will be printed only once.
    • Merge branch 'slurm-2.5' · fef33d8d
      Morris Jette authored
    • Add debugging hint to MPI guide for MPICH2 · dd8c22c7
      Morris Jette authored
    • Fix bug in accounting_storage/pgsql · 667cbf15
      Yair Yarom authored
    • 08cfbf0a
      Morris Jette authored
    • Revision of gres topology bug fix · e9c216c4
      Morris Jette authored
  2. 11 Jan, 2013 10 commits
  3. 10 Jan, 2013 15 commits
  4. 09 Jan, 2013 7 commits
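
The two environment variables described in commits 96986199 and 232ab305 (14 Jan) can be illustrated with a small sketch. This is not the Slurm implementation, which is written in C and is not shown here. The function names, the message format, and the treatment of a non-integer value as "unset" are all assumptions made for illustration. The sketch also assumes that the node ID is an integer index and that "successive" means consecutive in arrival order.

```python
import os


def _env_int(env, name):
    """Return the variable as an int, or None if unset or not an integer.

    Treating a non-integer value as unset is an assumption of this sketch.
    """
    raw = env.get(name)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


def should_print_step_killed(node_id, env=os.environ):
    """Hypothetical helper: decide whether this node prints the step-killed message.

    If SLURM_STEP_KILLED_MSG_NODE_ID is unset, every node prints (the
    behaviour the commit message describes as the problem). If it is set,
    only the node whose ID matches prints.
    """
    wanted = _env_int(env, "SLURM_STEP_KILLED_MSG_NODE_ID")
    if wanted is None:
        return True
    return node_id == wanted


def reduce_task_exit_msgs(exit_events, env=os.environ):
    """Hypothetical helper: build the task-exit lines that would be printed.

    exit_events is a list of (task_id, exit_code) pairs in arrival order.
    When SLURM_SRUN_REDUCE_TASK_EXIT_MSG is set and non-zero, a message is
    skipped if its exit code equals that of the immediately preceding one.
    """
    enabled = bool(_env_int(env, "SLURM_SRUN_REDUCE_TASK_EXIT_MSG"))
    lines = []
    last_code = None
    for task_id, code in exit_events:
        if enabled and code == last_code:
            continue  # same exit code as the previous message: print once only
        lines.append(f"task {task_id}: exit code {code}")  # illustrative format
        last_code = code
    return lines


if __name__ == "__main__":
    events = [(0, 143), (1, 143), (2, 1), (3, 143)]
    env = {"SLURM_SRUN_REDUCE_TASK_EXIT_MSG": "1"}
    for line in reduce_task_exit_msgs(events, env=env):
        print(line)
```

Passing `env` as a plain dictionary keeps the sketch testable without changing the real process environment. In actual use the variables would be exported in the shell before invoking srun.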