1. 13 Dec, 2013 3 commits
  2. 12 Dec, 2013 5 commits
  3. 11 Dec, 2013 3 commits
  4. 10 Dec, 2013 3 commits
  5. 09 Dec, 2013 2 commits
    • Modify squeue to support longer job ID values · 17f27007
      Morris Jette authored
      This is needed for job arrays with discontiguous task ID values
      (e.g. "123_[1,3,5,...99999]")
    • Improve sview support for job arrays · d998640f
      Morris Jette authored
      Previously job arrays were only listed with their native job ID
      (e.g. 123_0 listed as 123, 123_1 as 124, etc.). Now the job ID is
      listed in both formats (e.g. "123_1 (124)"). The same format is used
      for job step IDs (e.g. "123_1.2 (124.2)").
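The combined display format described in the sview commit can be sketched as follows. This is only an illustration of the string layout; the function names and parameters are hypothetical and are not taken from the sview source.

```c
#include <stdio.h>

/* Hypothetical helper (not sview's code): render an array task as
 * "<array_job_id>_<task_id> (<native_job_id>)". */
int format_array_job_id(char *buf, size_t len, unsigned int array_job_id,
                        unsigned int task_id, unsigned int job_id)
{
    return snprintf(buf, len, "%u_%u (%u)", array_job_id, task_id, job_id);
}

/* Hypothetical helper: the same layout for a job step,
 * "<array_job_id>_<task_id>.<step_id> (<native_job_id>.<step_id>)". */
int format_array_step_id(char *buf, size_t len, unsigned int array_job_id,
                         unsigned int task_id, unsigned int job_id,
                         unsigned int step_id)
{
    return snprintf(buf, len, "%u_%u.%u (%u.%u)",
                    array_job_id, task_id, step_id, job_id, step_id);
}
```

Called with array job 123, task 1, native job ID 124 and step 2, these produce the two strings quoted in the commit message.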
  6. 08 Dec, 2013 2 commits
    • Describe previous commit in NEWS · b19bd476
      jette authored
    • Fix for dynamic changes to GRES · b9fe6815
      jette authored
      If the GRES is associated with specific files AND
      the GRES count is reset using scontrol AND
      the slurmd is restarted either without a gres.conf file or with a count
      and no specific files AND
      the GRES count is then increased using scontrol,
      then the GRES bitmap will not match its count.
      
      This fixes the root cause of the mismatch between bitmap size and GRES
      count and should render the rebuilding of the bitmap unnecessary.
      The rebuilding was handled in the following commits:
      commit ec4df3bf
      commit 1712d619
  7. 07 Dec, 2013 2 commits
  8. 06 Dec, 2013 5 commits
    • Improve hwloc support for various processors · ac5d734b
      Jason Bacon authored
      Using CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2392.04-MHz 686-class CPU)
        Origin = "GenuineIntel"  Id = 0xf27  Family = f  Model = 2  Stepping = 7
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      
      It's also using an older version of hwloc (1.3.1) and I have not yet tested it with a newer one, but since 0 and -1 are legitimate return values for hwloc_get_nbobjs_by_type(), I think they should be handled in any case.
      
      From the hwloc_get_nbobjs_by_type() man page:
      
      static inline int hwloc_get_nbobjs_by_type (hwloc_topology_t topology,
             hwloc_obj_type_t type) [static]
             Returns the width of level type type. If no object for that type
             exists, 0 is returned. If there are several levels with objects of that
             type, -1 is returned.
      
      I'm attaching a smarter patch that handles both 0 and -1 return values for both CORE and SOCKET.  It logs a warning if it has to fudge a 0 return code and bails out with a helpful error message for -1, which I have no idea how to handle.  At least people won't have to waste time tracking down the problem this way.
      
      Happy Friday,
      
          Jason
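The handling described in the hwloc commit message above (warn and fall back on 0, fail with a clear message on -1) might look roughly like this. It is a sketch, not the patch itself: the helper name is hypothetical, and substituting a count of 1 when hwloc reports 0 is an assumption, since the message does not say which value is used.

```c
#include <stdio.h>

/* Hypothetical helper, not the actual Slurm patch. "count" is the value
 * returned by hwloc_get_nbobjs_by_type() for the object type named by
 * "type_name" (e.g. "CORE" or "SOCKET").
 * Returns a usable count (>= 1), or -1 if the count cannot be determined. */
int normalize_obj_count(int count, const char *type_name)
{
    if (count == 0) {
        /* No object of this type exists: warn and assume one.
         * The fallback value of 1 is an assumption of this sketch. */
        fprintf(stderr, "warning: hwloc reports no %s objects, assuming 1\n",
                type_name);
        return 1;
    }
    if (count < 0) {
        /* Several levels contain objects of this type: give up loudly. */
        fprintf(stderr, "error: hwloc reports %s objects at several levels; "
                "cannot determine the count\n", type_name);
        return -1;
    }
    return count;
}
```

A caller might pass the result of `hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE)` as `count` and abort node configuration when the helper returns -1.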
    • Added ApbasilTimeout parameter to the cray.conf · 270f696e
      Trofinoff Stephen authored
      This adds a mechanism to kill a hung apbasil command
    • Fix bad print · 1712d619
      Morris Jette authored
      error introduced in commit ec4df3bf
    • Fix for hwloc returning zero core count · ec4df3bf
      Jason Bacon authored
    • Fix for gres count change · 4e56260f
      Morris Jette authored
      An abort has been reported if the node's gres count differs from
      its bitmap size. This has been induced by changing the count manually
      (e.g. "scontrol update nodename=tux123 gres=gpu:4"). I have not
      been able to reproduce this problem, but this will resize the
      bitmap in order to avoid the assert failure.
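The idea of resizing a bitmap so that its size matches a changed resource count can be illustrated with a toy bitmap. This is a minimal sketch under stated assumptions: `toy_bitmap_t` and `toy_bitmap_resize()` are hypothetical names for illustration and are not Slurm's bitstring implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Toy bitmap for illustration only. Invariant: bits at positions
 * >= nbits in the last byte are always zero. */
typedef struct {
    size_t nbits;
    unsigned char *bits;
} toy_bitmap_t;

/* Resize the bitmap to hold exactly "nbits" bits, keeping the bits that
 * still fit and clearing any new ones.
 * Returns 0 on success, -1 if memory allocation fails. */
int toy_bitmap_resize(toy_bitmap_t *bm, size_t nbits)
{
    size_t old_bytes = (bm->nbits + 7) / 8;
    size_t new_bytes = (nbits + 7) / 8;
    unsigned char *p = realloc(bm->bits, new_bytes ? new_bytes : 1);

    if (!p)
        return -1;
    if (new_bytes > old_bytes)
        memset(p + old_bytes, 0, new_bytes - old_bytes);
    /* When shrinking, drop stale bits past the new end of the bitmap. */
    if (new_bytes && (nbits % 8))
        p[new_bytes - 1] &= (unsigned char)((1u << (nbits % 8)) - 1);
    bm->bits = p;
    bm->nbits = nbits;
    return 0;
}
```

Code that later compares the bitmap size against the count (for example in an assertion) would call such a resize first whenever the two disagree.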
  9. 05 Dec, 2013 2 commits
  10. 04 Dec, 2013 2 commits
  11. 03 Dec, 2013 4 commits
  12. 02 Dec, 2013 3 commits
  13. 29 Nov, 2013 4 commits
    • Improve test failure clean up · b1699285
      Morris Jette authored
    • Rewrite of cgroup locking · 13aa9184
      Morris Jette authored
      There was already cgroup locking in the version 14.03 code base
      using different variable names and slightly different logic from
      that in commit 3f6d9e36.
      This commit is a variant of that commit in order to make the logic
      in version 2.6 match that of our next release (logic which is
      already pretty well tested).
      bug 447
    • proctrack/cgroup - Add lock preventing race condition · 3f6d9e36
      Morris Jette authored
      proctrack/cgroup - Add locking to prevent a race condition where one job step
      is ending for a user or job at the same time another job step is starting
      and the user or job container is deleted from under the starting job step.
      bug 447
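One common way to serialize container creation and deletion between processes on Linux is an advisory file lock taken with flock(). The sketch below illustrates that general technique only; whether the commit uses flock() is an assumption, and the function names and lock path are hypothetical.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: take an exclusive advisory lock on "lock_path",
 * creating the file if needed. Blocks until the lock is available.
 * Returns the file descriptor holding the lock, or -1 on error. */
int container_lock(const char *lock_path)
{
    int fd = open(lock_path, O_RDONLY | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: release a lock obtained from container_lock(). */
void container_unlock(int fd)
{
    if (fd >= 0) {
        flock(fd, LOCK_UN);
        close(fd);
    }
}
```

A starting step would hold the lock while it creates or attaches to the user or job container, and an ending step would hold it while it checks whether the container is empty and removes it, so the two cannot interleave.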
    • Remove redundant variables · d704c747
      Morris Jette authored
      This eliminates some now redundant arrays and variable copying
      introduced in commit 74d1a4b4
      bug 525