1. 20 Apr, 2016 2 commits
    • Support the intel_pstate scaling driver · a4f35c45
      Janne Blomqvist authored
      I noticed that the CpuFreqDef config option was only partially implemented. The value was parsed, but then never used. So I took the liberty of re-purposing it to mean sort of the opposite, namely the frequency governor to use when running a job step in case the job doesn't explicitly provide any --cpu-freq option.
      
      I also changed the default of the CpuFreqGovernors option to be "ondemand,performance", since ondemand isn't available with the intel_pstate driver.
      
      Otherwise the patch should be relatively straightforward and only changes a few minor things here and there.
      a4f35c45
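The fallback described in this commit message can be pictured with a minimal sketch. It is not Slurm's implementation: the function name `pick_governor` and its parameters are hypothetical, it handles governor names only (not the numeric frequencies that `--cpu-freq` can also take), and returning `None` for a governor outside the allowed list is an assumption made for illustration.

```python
def pick_governor(job_cpu_freq, cpu_freq_def, allowed_governors):
    """Return the governor to apply for a job step, or None to change nothing.

    job_cpu_freq      -- governor named by the job's --cpu-freq option, or None
    cpu_freq_def      -- CpuFreqDef value from the configuration, or None
    allowed_governors -- comma-separated CpuFreqGovernors value
    """
    # The job's explicit request wins; CpuFreqDef is only the fallback.
    requested = job_cpu_freq if job_cpu_freq is not None else cpu_freq_def
    if requested is None:
        return None

    wanted = requested.lower()
    allowed = {g.strip().lower() for g in allowed_governors.split(",")}

    # Assumption: a governor outside the allowed list is simply not applied.
    return wanted if wanted in allowed else None
```

With `CpuFreqDef` set to `performance` and no `--cpu-freq` on the job, the sketch picks `performance`; an explicit `--cpu-freq=ondemand` overrides it.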
    • Tim Wickberg's avatar
      836decb1
  2. 15 Apr, 2016 1 commit
    • Network topology option · bd42eaf7
      Morris Jette authored
      Add TopologyParam option of "TopoOptional" to optimize network topology
          only for jobs requesting it.
      bug 2567
      bd42eaf7
  3. 14 Apr, 2016 8 commits
  4. 13 Apr, 2016 7 commits
  5. 12 Apr, 2016 3 commits
  6. 11 Apr, 2016 5 commits
  7. 09 Apr, 2016 1 commit
    • backfill scheduling enhancement · e62a9270
      Morris Jette authored
      When determining when a pending job will be able to start, rather
        than testing after removing each running job and trying to schedule
        the pending jobs, remove multiple jobs that all end about the
        same time before testing. This reduces the number of calls to
        the job placement logic, which is time consuming.
      e62a9270
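The batching idea in this commit message can be illustrated with a short sketch. It is not the backfill scheduler's code: the function name `group_by_end_time` and the `window` parameter are hypothetical, and the commit does not say how close "about the same time" has to be, so the window size is an assumption.

```python
def group_by_end_time(end_times, window):
    """Group job end times so each group starts within `window` seconds
    of its earliest member.

    Each group stands for a set of running jobs that would be removed
    together before the (expensive) job placement test is run once.
    """
    groups = []
    for end in sorted(end_times):
        # Join the current group if this job ends close enough to the
        # group's earliest end time; otherwise start a new group.
        if groups and end - groups[-1][0] <= window:
            groups[-1].append(end)
        else:
            groups.append([end])
    return groups
```

For five running jobs ending at 100, 105, 110, 300 and 302 seconds and a 30 second window, the sketch yields two groups, so the placement logic would be called twice instead of five times.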
  8. 08 Apr, 2016 1 commit
  9. 07 Apr, 2016 2 commits
  10. 06 Apr, 2016 7 commits
  11. 05 Apr, 2016 1 commit
    • Fix backfill scheduler race condition · d8b18ff8
      Morris Jette authored
      Fix backfill scheduler race condition that could cause invalid pointer in
          select/cons_res plugin. Bug introduced in 15.08.9, commit:
          efd9d35e
      
      The scenario is as follows:
      1. Backfill scheduler is running, then releases locks
      2. Main scheduling loop starts a job "A"
      3. Backfill scheduler resumes, finds job "A" in its queue and
         resets its partition pointer.
      4. Job "A" completes and tries to remove resource allocation record
         from select/cons_res data structure, but fails to find it because
         it is looking in the table for the wrong partition.
      5. Job "A" record gets purged from slurmctld
      6. Select/cons_res plugin attempts to operate on resource allocation
         data structure, finds pointer into the now purged data structure
         of job "A" and aborts or gets SEGV
      Bug 2603
      d8b18ff8
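The failure sequence in this commit message can be modelled with a toy sketch. It is not the select/cons_res data structure: the `Job` class, the `allocate` and `release` helpers, the per-partition dictionary, and the partition names `batch` and `debug` are all hypothetical. Python has no dangling pointers, so the stale record left behind stands in for the invalid pointer described in step 6.

```python
class Job:
    def __init__(self, name, partition):
        self.name = name
        self.partition = partition


def allocate(table, job):
    """Record the job's allocation under its current partition."""
    table.setdefault(job.partition, {})[job.name] = job


def release(table, job):
    """Remove the job's record, looking only under job.partition.

    Returns False when no record is found there.
    """
    return table.get(job.partition, {}).pop(job.name, None) is not None


table = {}
job_a = Job("A", "batch")

allocate(table, job_a)             # step 2: main scheduler starts job "A"
job_a.partition = "debug"          # step 3: backfill resets the partition pointer
released = release(table, job_a)   # step 4: lookup uses the wrong partition
stale = "A" in table["batch"]      # step 6: the old record is still in the table
```

After these steps `released` is `False` and `stale` is `True`: the allocation record outlives the job, which is the state in which the real plugin later follows a pointer to a purged job record.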
  12. 04 Apr, 2016 2 commits