1. 01 Sep, 2011 2 commits
  2. 31 Aug, 2011 2 commits
  3. 25 Aug, 2011 5 commits
  4. 24 Aug, 2011 5 commits
  5. 23 Aug, 2011 1 commit
  6. 22 Aug, 2011 2 commits
  7. 19 Aug, 2011 1 commit
      Treat duplicate switch name in topology.conf as fatal error · d2a30013
      Morris Jette authored
      One of our testers created an illegal topology.conf file.
      
      He has a config you probably wouldn't see in production, but can see in
      testing when you are sometimes given a collection of miscellaneous
      resources.
      
                |-- nodes
      switch1 --|
                |-- switch2 -- nodes
      
      He tried the topology.conf file below. Switch s1 is defined twice. Slurm
      accepted this config, but wouldn't allocate nodes from both switches to
      one job.
      
      SwitchName=s1 Nodes=xna[14-26]
      SwitchName=s2 Nodes=xna[41-43]
      SwitchName=s1 Switches=s2
      
      I believe slurm shouldn't allow the second definition of switch s1. The
      attached patch checks for duplicate switch names.
      Patch from Rod Schultz, Bull.
      d2a30013
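The check described in this commit can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual patch; the function name `find_duplicate_switches` is hypothetical, and the parsing assumes only the simple `Key=Value` layout shown in the example file above.

```python
def find_duplicate_switches(conf_text):
    """Return switch names defined more than once, in order of first repeat."""
    seen = set()
    duplicates = []
    for raw_line in conf_text.splitlines():
        # Assumption: '#' starts a comment; blank lines are ignored.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        for token in line.split():
            key, _, value = token.partition("=")
            if key.lower() == "switchname":
                if value in seen and value not in duplicates:
                    duplicates.append(value)
                seen.add(value)
    return duplicates


conf = """\
SwitchName=s1 Nodes=xna[14-26]
SwitchName=s2 Nodes=xna[41-43]
SwitchName=s1 Switches=s2
"""
print(find_duplicate_switches(conf))  # prints ['s1']
```

A caller that wants the behavior in the commit title would treat any non-empty result as a fatal configuration error rather than continuing with an ambiguous topology.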
  8. 17 Aug, 2011 1 commit
  9. 16 Aug, 2011 1 commit
  10. 12 Aug, 2011 2 commits
  11. 11 Aug, 2011 2 commits
  12. 10 Aug, 2011 3 commits
  13. 09 Aug, 2011 3 commits
      Cray srun wrapper, map --share and --exclusive options · 08538cb8
      Morris Jette authored
      This change applies only to Cray systems and only when using the srun
      wrapper for aprun. Map --exclusive to -F exclusive and --share to
      -F share. Note this does not consider the partition's Shared
      configuration, so it is an imperfect mapping of options.
      08538cb8
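The option mapping in this commit might look roughly like the sketch below. It is a Python illustration only, not the wrapper's real code; `map_srun_to_aprun` is a hypothetical name, and passing all other arguments through unchanged is a simplifying assumption (the real wrapper translates many more options).

```python
# Mapping stated in the commit message: srun flag -> aprun access-mode option.
ACCESS_MODE_MAP = {
    "--exclusive": ["-F", "exclusive"],
    "--share": ["-F", "share"],
}


def map_srun_to_aprun(srun_args):
    """Translate the two access-mode flags; leave every other argument as-is."""
    aprun_args = []
    for arg in srun_args:
        # Assumption: unrecognized arguments are copied through untouched.
        aprun_args.extend(ACCESS_MODE_MAP.get(arg, [arg]))
    return aprun_args


print(map_srun_to_aprun(["--exclusive", "-n", "4"]))
# prints ['-F', 'exclusive', '-n', '4']
```

As the commit notes, the partition's Shared setting is not consulted here, so the translation depends only on the flags given on the command line.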
      Cray DOWN node will be treated as transient condition · 493aa97a
      Morris Jette authored
      A node DOWN to ALPS will be marked DOWN to SLURM only after reaching
      SlurmdTimeout. In the interim, the node state will be NO_RESPOND. This
      change makes SLURM handling of the node DOWN state more consistent
      with ALPS. This change affects only Cray systems.
      493aa97a
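The transient handling described in this commit can be sketched as a small decision function. This is a Python illustration under stated assumptions, not Slurm's implementation: `resolve_node_state` and its parameters are hypothetical, states are plain strings, and treating "reaching SlurmdTimeout" as an inclusive comparison is a guess.

```python
def resolve_node_state(current_state, alps_reports_down, seconds_down, slurmd_timeout):
    """Pick the state SLURM would show for a node, given what ALPS reports."""
    if not alps_reports_down:
        # ALPS has no complaint, so the existing state is kept.
        return current_state
    if seconds_down >= slurmd_timeout:
        # Assumption: the timeout boundary is inclusive.
        return "DOWN"
    # Still inside the SlurmdTimeout window: treat the outage as transient.
    return "NO_RESPOND"


print(resolve_node_state("IDLE", True, 30, 300))   # prints NO_RESPOND
print(resolve_node_state("IDLE", True, 300, 300))  # prints DOWN
```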
      Fix node state acctg for cray. · acfa9aca
      Morris Jette authored
      Fix the node state accounting to be consistent with the node state
      set by ALPS.
      acfa9aca
  14. 05 Aug, 2011 2 commits
  15. 04 Aug, 2011 2 commits
      Require SchedulerTimeSlice be at least 5 secs · c9b0eafe
      Morris Jette authored
      Require SchedulerTimeSlice configuration parameter to be at least 5 seconds
      to avoid thrashing slurmd daemon.
      Addresses Cray bug 774692
      c9b0eafe
      Job step now gets all of job's GRES by default · 1078426e
      Morris Jette authored
      Change in GRES behavior for job steps: A job step's default generic
          resource allocation will be set to that of the job. If a job step's --gres
          value is set to "none" then none of the generic resources which have been
          allocated to the job will be allocated to the job step.
      Add srun environment variable SLURM_STEP_GRES to set the default --gres
          value for a job step.
      1078426e
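The new default can be pictured with the short Python sketch below. It is illustrative only: `resolve_step_gres` is a hypothetical helper, GRES specifications are treated as opaque strings, and the precedence shown (explicit `--gres`, then `SLURM_STEP_GRES`, then the job's allocation) is an assumption drawn from the commit message rather than from the code.

```python
def resolve_step_gres(job_gres, step_gres_option=None, environ=None):
    """Return the GRES string a job step would receive ('' means none)."""
    env = environ if environ is not None else {}
    # Assumption: an explicit --gres value wins over the environment default.
    requested = step_gres_option
    if requested is None:
        requested = env.get("SLURM_STEP_GRES")
    if requested is None:
        # New default from this commit: the step inherits the job's GRES.
        return job_gres
    if requested == "none":
        # "none" gives the step none of the job's generic resources.
        return ""
    return requested


print(resolve_step_gres("gpu:2"))                                      # prints gpu:2
print(repr(resolve_step_gres("gpu:2", "none")))                        # prints ''
print(resolve_step_gres("gpu:2", None, {"SLURM_STEP_GRES": "gpu:1"}))  # prints gpu:1
```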
  16. 03 Aug, 2011 2 commits
  17. 02 Aug, 2011 2 commits
  18. 01 Aug, 2011 2 commits