  1. 05 Jun, 2015 1 commit
  2. 04 Jun, 2015 2 commits
  3. 03 Jun, 2015 1 commit
    • switch/cray: Refine PMI_CRAY_NO_SMP_ENV set · ef66b2eb
      Morris Jette authored
      switch/cray: Refine logic to set PMI_CRAY_NO_SMP_ENV environment variable.
      Rather than testing for the task distribution option, test the actual
      task IDs to see if they are monotonically increasing across all nodes.
      Based upon idea from Brian Gilmer (Cray).
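The test described in this commit message can be sketched as follows. This is an illustrative Python sketch, not Slurm's C implementation: the function name `task_ids_monotonic` and the per-node list layout are assumptions made for the example. The commit message does not say what the plugin does with the result, so the sketch only returns the boolean.

```python
from typing import Sequence


def task_ids_monotonic(tids_by_node: Sequence[Sequence[int]]) -> bool:
    """Return True if task IDs strictly increase when nodes are walked in order.

    tids_by_node[i] holds the global task IDs placed on node i
    (a hypothetical layout chosen for this example).
    """
    last = -1  # task IDs are assumed to be non-negative
    for node_tids in tids_by_node:
        for tid in node_tids:
            if tid <= last:
                return False
            last = tid
    return True


# Block-style layout: node 0 runs tasks 0-1, node 1 runs tasks 2-3.
block_layout = [[0, 1], [2, 3]]
# Cyclic-style layout: tasks alternate between the two nodes.
cyclic_layout = [[0, 2], [1, 3]]
```

Inspecting the task IDs directly gives the same answer for any distribution option that happens to produce an in-order layout, which is the point of no longer testing the option itself.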
  4. 02 Jun, 2015 3 commits
  5. 01 Jun, 2015 1 commit
  6. 30 May, 2015 1 commit
  7. 29 May, 2015 5 commits
  8. 28 May, 2015 1 commit
  9. 27 May, 2015 1 commit
    • Map job --mem-per-cpu=0 to --mem=0. · 33c77302
      Morris Jette authored
      However, --mem=0 now reflects the appropriate amount of memory in the
      system, whereas --mem-per-cpu=0 has not changed. This allows all of the
      memory to be allocated in a cgroup without being "consumed", so it
      remains available to other jobs running on the same host.
      Eric Martin, Washington University School of Medicine
  10. 26 May, 2015 1 commit
  11. 22 May, 2015 1 commit
  12. 21 May, 2015 1 commit
  13. 20 May, 2015 2 commits
  14. 19 May, 2015 1 commit
  15. 16 May, 2015 1 commit
  16. 15 May, 2015 2 commits
  17. 14 May, 2015 2 commits
  18. 13 May, 2015 3 commits
  19. 12 May, 2015 1 commit
  20. 11 May, 2015 1 commit
    • Purge old step data on job requeue · beecc7b0
      Morris Jette authored
      Make sure that old step data is purged when a job is requeued.
      Without this logic, if a job terminates abnormally, old step
      data may be left in slurmctld. If the job is then requeued and
      started on a different node, referencing that old job step data
      can result in abnormal events. One specific failure mode: if
      the job is requeued on a node with a different number of cores
      and the step-terminated RPC arrives later, the job and step
      bitmaps of allocated cores can differ in size, generating an
      abort.
      bug 1660
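The failure mode in this commit message can be modelled roughly as follows. This is a simplified Python sketch, not slurmctld's data structures: `JobRecord`, `StepRecord`, `requeue`, and `step_terminated` are hypothetical names, and bitmaps are modelled as plain lists of booleans.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StepRecord:
    step_id: int
    core_bitmap: List[bool]  # sized to the job's cores when the step started


@dataclass
class JobRecord:
    job_id: int
    core_bitmap: List[bool]
    steps: List[StepRecord] = field(default_factory=list)


def requeue(job: JobRecord, new_core_count: int) -> None:
    """Requeue a job, dropping every step record left from the previous run."""
    job.steps.clear()  # the purge: a late message can no longer match a stale step
    job.core_bitmap = [False] * new_core_count


def step_terminated(job: JobRecord, step_id: int) -> bool:
    """Handle a (possibly late) step-terminated message.

    Returns True if a matching step record was found and removed.
    """
    for step in job.steps:
        if step.step_id == step_id:
            # Combining bitmaps of different sizes is the abort described above.
            assert len(step.core_bitmap) == len(job.core_bitmap)
            job.steps.remove(step)
            return True
    return False  # unknown step: nothing to reconcile
```

With the purge in place, a step-terminated message that arrives after the requeue finds no matching record, so the size comparison is never reached with mismatched bitmaps.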
  21. 08 May, 2015 3 commits
  22. 07 May, 2015 1 commit
  23. 06 May, 2015 1 commit
  24. 05 May, 2015 1 commit
  25. 01 May, 2015 1 commit
  26. 30 Apr, 2015 1 commit
    • Change slurmctld agent timeout · 98e08216
      Morris Jette authored
      In the slurmctld communication agent, set the thread timeout to the
      configured value of MessageTimeout or 30 seconds, whichever is larger,
      rather than a fixed 30 seconds.
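The timeout rule stated in this commit message amounts to taking a maximum. A minimal Python sketch, with the constant and function names chosen for illustration only (they are not Slurm identifiers):

```python
# Previous fixed thread timeout, in seconds, per the commit message.
MIN_AGENT_TIMEOUT = 30


def agent_thread_timeout(message_timeout: int) -> int:
    """Return the agent thread timeout for a configured MessageTimeout (seconds)."""
    return max(message_timeout, MIN_AGENT_TIMEOUT)
```

A site with MessageTimeout set to 60 would get a 60-second thread timeout under this rule, while a site with a value below 30 would keep the 30-second floor.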