1. 29 Mar, 2011 8 commits
    • Patch #29: Provides an /etc/sysconfig/slurm for Cray systems. · e1df3d89
      Moe Jette authored
                   We have installed the same file this morning on all
                   our systems (including a non-Cray cluster which also
                   is SuSe based). I have verified that the limits get
                   picked up by looking at /proc/$(pidof slurmd)/limits.
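
The check described above (reading /proc/$(pidof slurmd)/limits) can be scripted. The following is a minimal sketch, not part of the patch: the function name `parse_proc_limits` and the sample text are hypothetical, and the parser assumes that columns are separated by at least two spaces, which holds for typical values but not for very long numeric limits.

```python
import re

def parse_proc_limits(text):
    """Parse the text of a /proc/<pid>/limits file into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Assumption: columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3:
            continue  # blank or unexpected line
        name, soft, hard = fields[:3]
        units = fields[3] if len(fields) > 3 else ""  # some rows have no units
        limits[name] = (soft, hard, units)
    return limits

# Made-up sample in the layout of /proc/<pid>/limits (not taken from a real system).
SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
Max nice priority         0                    0
"""

if __name__ == "__main__":
    for name, (soft, hard, units) in parse_proc_limits(SAMPLE).items():
        print(f"{name}: soft={soft} hard={hard} {units}".rstrip())
```

On a live node, the text would come from reading `/proc/<pid>/limits` for the slurmd process ID instead of `SAMPLE`.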
      
      select/cray: override ulimits on SuSe-based systems
      
      This provides a sample /etc/sysconfig/slurm file to override ulimits on SuSe-based
      systems such as Cray.
      
      Since SLURM respects limits configured by the system administrator, and since
      Cray/SuSe systems (in contrast to Debian-based systems) do not automatically
      exempt processes owned by the super-user from pam_limits configured in
      /etc/security/limits.conf, it can (and did) happen on Cray systems that such
      limits interfere prematurely and counter-intuitively with slurmd on frontend
      nodes.
      
      The provided file overrides limits, using sensible defaults which have
      been inspired by the defaults set for processes owned by user root 
    • Patch #27: Converts Cray-specific #PBS directives in sbatch. I · df13b7f3
      Moe Jette authored
                  did this in preparation for the migration from PBS
                  which will start next week.
      
      sbatch: support mpp.* PBS variants
      
      This adds support for Cray-specific PBS directives:
       * mppwidth: Task width (corresponds to --ntasks). This is not
                   directly mapped; it depends on the other parameters.
       * mppmem:   Memory in units of k/m/g. Default unit is Mbyte, kbyte units
                   are rounded up to the next Mbyte. Actual amount depends on
                   mppnppn.
       * mppdepth: Task depth, maps into --cpus-per-task.
       * mppnppn:  Processing elements per node, maps into --ntasks-per-node.
       * mppnodes: Nodelist. In contrast to PBS, requires nid%05u prefix, i.e.
                   the comma-separated list contains single entries nid%05u
                   and/or ranges nid%05u-nid%05u.
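
The unit handling for mppmem and the nid%05u naming for mppnodes can be illustrated with a short sketch. This is not the sbatch implementation: the helper names `mppmem_to_mb` and `nid_name` are hypothetical, and the sketch assumes single-letter, case-insensitive suffixes and 1 Mbyte = 1024 kbyte. The dependence of the actual amount on mppnppn is not modelled.

```python
import re

# Directives that the list above describes as direct mappings.
DIRECT_MAPPINGS = {
    "mppdepth": "--cpus-per-task",
    "mppnppn": "--ntasks-per-node",
}

def mppmem_to_mb(value):
    """Convert an mppmem value such as '2g', '1500k' or '512' to whole Mbytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised mppmem value: {value!r}")
    number = int(match.group(1))
    unit = match.group(2) or "m"  # default unit is Mbyte
    if unit == "k":
        # kbyte values are rounded up to the next Mbyte (assuming 1024 kbyte per Mbyte)
        return (number + 1023) // 1024
    if unit == "g":
        return number * 1024
    return number

def nid_name(node_id):
    """Format a numeric node ID in the nid%05u style required for mppnodes."""
    return f"nid{node_id:05d}"

if __name__ == "__main__":
    print(mppmem_to_mb("1500k"), mppmem_to_mb("2g"), mppmem_to_mb("512"))
    print(",".join([nid_name(12), f"{nid_name(20)}-{nid_name(23)}"]))
```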
    • Patch #26: Display negative priority rather than large unsigned · be59fd49
      Moe Jette authored
                  value (due to uint32_t conversion) in sprio. Helpful
                  when fine-tuning weight parameters.
      
      sprio: print overall priority value even if it is less than 0
      
      With some combinations of component values and low weight factors, it can happen that the
      priority computed by the priority/multifactor plugin lies below 0 (and would be rounded
      up to 2).
      
      When this condition happens, the negative values are difficult to interpret and can give
      the wrong impression that the resulting priority is very large (due to the conversion
      into a large unsigned number). 
      
      In our tests we found it more helpful to display the negative priority value: since a
      user can know that SLURM does not use negative values, seeing the absolute value gives
      a better indication of how much weight to add to the other factors so that the overall
      priority centers around 0.
      
      Before:
        JOBID     USER   PRIORITY        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS   NICE
         9968   sukysj       8955        218          0        236          0          0  -8500
        10065   amsmax 4294957826          9          0        340          0          0   9821
        10066   amsmax 4294957826          9          0        340          0          0   9821
        10067   amsmax 4294957826          9          0        340          0          0   9821
        10068   amsmax 4294957826          9          0        340          0          0   9821
        10069   amsmax 4294957826          9          0        340          0          0   9821
        10070   amsmax 4294957826          9          0        340          0          0   9821
      
      After:
        JOBID     USER   PRIORITY        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS   NICE
         9968   sukysj       8955        218          0        236          0          0  -8500
        10065   amsmax      -9470          9          0        340          0          0   9821
        10066   amsmax      -9470          9          0        340          0          0   9821
        10067   amsmax      -9470          9          0        340          0          0   9821
        10068   amsmax      -9470          9          0        340          0          0   9821
        10069   amsmax      -9470          9          0        340          0          0   9821
        10070   amsmax      -9470          9          0        340          0          0   9821
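
The relationship between the two PRIORITY columns above is plain 32-bit wraparound arithmetic. The sketch below only reproduces that arithmetic and is not how sprio is implemented. The helper names `as_uint32` and `as_int32` are hypothetical.

```python
def as_uint32(value):
    """Reinterpret an integer as an unsigned 32-bit value (what the 'Before' output shows)."""
    return value & 0xFFFFFFFF

def as_int32(value):
    """Reinterpret an integer as a signed 32-bit value (what the 'After' output shows)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

if __name__ == "__main__":
    print(as_uint32(-9470))      # the large value from the 'Before' table
    print(as_int32(4294957826))  # the negative value from the 'After' table
```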
    • Patch #25: Skip sprio display of jobs whose priority has been · e2cc4b2e
      Moe Jette authored
                  set directly (since the priority factor fields are 0).
      
      priority/multifactor: skip jobs whose priority has been set directly
      
      This avoids displaying "house numbers" in sprio if the priority has been
      set directly, as in the following example for aghasemi (whose group is a
      "bottom-feeder" with a fixed priority of 10):
      
      palu> squeue
      JOBID  USER     ACCOUNT           NAME PARTITION ST REASON     START_TIME           TIME  TIME_LEFT NODES   PRIORITY
      6971   robinson g13               cp2k       day PD Resources  2011-03-16T13:09     0:00      40:00    35      10327
      6983   rpopescu s190              bash       day PD Resources  N/A                  0:00    1:00:00     1       8254
      6958   aghasemi s142         poslow007       day PD Priority   2011-03-16T15:28     0:00    1:00:00   108         10
      
      palu> sprio
       JOBID     USER   PRIORITY        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS   NICE
        6958 aghasemi      10000          0          0          0          0          0 -10000
        6964 rpopescu       8353         71          0         56          0          0  -8225
        6971 robinson      10327         63          0       1988          0          0  -8276
        ...
    • Patch #24: Typos (please note it also contains my own ones), · 22200093
      Moe Jette authored
                  this is ongoing, whenever I see something, I add it
                  to such a patch.
    • Moe Jette's avatar
      ac70eb56
    • fix for setting error state · 59b4fb21
      Danny Auble authored
  2. 28 Mar, 2011 7 commits
  3. 27 Mar, 2011 3 commits
  4. 26 Mar, 2011 15 commits
  5. 25 Mar, 2011 7 commits