1. 21 Aug, 2017 2 commits
    • Print numbers using exponential format as needed · c125759d
      Isaac Hartung authored
      Print numbers using exponential format if required to fit in the allocated
      field width. The sacctmgr and sshare commands are affected.
      Bug 1749
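The fallback described in this commit can be illustrated with a minimal sketch. This is not the Slurm implementation: `format_fit` and its parameters are hypothetical names chosen for illustration, and the rule shown here (try plain decimal first, then exponential notation with the largest precision that fits) is an assumption about one reasonable way to do it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from Slurm): format `value` into `buf`
 * using at most `width` characters. Plain decimal is tried first; if it
 * is too wide, exponential notation is tried with decreasing precision.
 * Returns the number of characters written, or -1 if nothing fits.
 */
int format_fit(char *buf, size_t buflen, unsigned long long value, int width)
{
    assert(buf != NULL && width > 0 && buflen > (size_t)width);

    /* snprintf returns the length the full output would need. */
    int n = snprintf(buf, buflen, "%llu", value);
    if (n >= 0 && n <= width)
        return n;

    /* Fall back to exponential form, dropping mantissa digits until it fits. */
    for (int prec = width; prec >= 0; prec--) {
        n = snprintf(buf, buflen, "%.*e", prec, (double)value);
        if (n >= 0 && n <= width)
            return n;
    }

    buf[0] = '\0';
    return -1;
}
```

With an 8-character field, the value 123456789012 does not fit as a decimal and would be printed in exponential form with two mantissa digits after the point; with a 10-character field, 12345 is printed unchanged.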
    • select/cons_res - fix bug with Dragonfly and --switches count timeout · 46c0919d
      Alejandro Sanchez authored
      Given a configuration where TopologyParam includes the Dragonfly option, if a
      job requested a --switches count, the count timeout specified by either
      the job request or the max_switch_wait SchedulerParameters option was not
      respected. This was because the leaf_switch_count variable was not incremented
      in the _eval_nodes_dfly() function when needed, as is done in _eval_nodes_topo(),
      the latter being an execution path that already waits correctly for the
      switch count timeout.
      
      Bug 4056
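Why a counter that is never incremented defeats the timeout can be shown with a simplified model. This sketch is not the select/cons_res source: the struct, both function names, and the exact wait rule are assumptions made for illustration. Only the variable name `leaf_switch_count` comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a leaf switch in a candidate allocation. */
struct leaf_switch {
    int nodes_picked;   /* nodes selected under this switch */
};

/* Count the leaf switches that contribute at least one node. */
int count_leaf_switches(const struct leaf_switch *sw, size_t n)
{
    int leaf_switch_count = 0;

    for (size_t i = 0; i < n; i++) {
        if (sw[i].nodes_picked > 0)
            leaf_switch_count++;   /* the kind of increment the commit says was missing */
    }
    return leaf_switch_count;
}

/*
 * Assumed rule: keep the job waiting while the allocation spans more
 * switches than requested and the wait limit has not yet been reached.
 */
bool should_wait_for_switches(int leaf_switch_count, int req_switches,
                              long waited_secs, long max_wait_secs)
{
    assert(leaf_switch_count >= 0 && waited_secs >= 0);

    if (req_switches <= 0)
        return false;   /* no --switches request */
    if (leaf_switch_count <= req_switches)
        return false;   /* allocation already satisfies the request */
    return waited_secs < max_wait_secs;
}
```

In this model, a `leaf_switch_count` stuck at zero always compares as satisfying the request, so the job never waits, which matches the symptom described above.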
  2. 18 Aug, 2017 2 commits
  3. 17 Aug, 2017 1 commit
  4. 16 Aug, 2017 1 commit
  5. 15 Aug, 2017 4 commits
  6. 14 Aug, 2017 3 commits
  7. 12 Aug, 2017 1 commit
  8. 11 Aug, 2017 5 commits
  9. 10 Aug, 2017 2 commits
  10. 07 Aug, 2017 2 commits
  11. 04 Aug, 2017 6 commits
  12. 03 Aug, 2017 1 commit
    • pack job step I/O race condition fix · 71a34f56
      Morris Jette authored
      Fix I/O race condition on step termination for srun launching multiple
      pack job groups. Without this change, application output might be
      lost and/or the srun command might hang after some tasks exit.
  13. 02 Aug, 2017 4 commits
  14. 01 Aug, 2017 3 commits
  15. 31 Jul, 2017 1 commit
  16. 28 Jul, 2017 2 commits
    • Fix issue with an alternate munge key when communicating on a persistent connection · 591dc036
      Danny Auble authored
      
      Bug 4009
    • jobcomp/elasticsearch - save state on REQUEST_CONTROL. · 8944b77a
      Alejandro Sanchez authored
      jobcomp/elasticsearch saves and loads its state to and from elasticsearch_state.
      Since the jobcomp API isn't designed with save/load state operations, the plugin's
      _save_state() isn't extern and isn't available outside the plugin itself,
      so it is tightly coupled to the fini() function. This state doesn't follow the
      same execution path as the rest of Slurm's state, which is all independently
      scheduled in save_all_state(). So we save it manually here on an RPC
      of type REQUEST_CONTROL.
      
      With this change, when the Primary ctld issues a REQUEST_CONTROL to the Backup,
      which is currently in controller mode, the Backup saves the state, and when
      the Primary assumes control again it can process the saved pending jobs. The
      reverse case was already handled: when the Primary is running
      in controller mode and the Backup issues a REQUEST_CONTROL, the Primary
      shuts down, and on breaking out of the while(1) loop in the ctld main() function
      there was already a g_slurm_jobcomp_fini() call in place.
      
      Bug 3908
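The handover described in this commit can be modeled with a small sketch. It is not the slurmctld code: the enum, the struct, and both functions are hypothetical stand-ins, and the in-memory counters only imitate writing pending records to a state file. The idea it illustrates is that the controller giving up control runs the plugin's save path before relinquishing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message types; names are illustrative only. */
enum msg_type { MSG_PING, MSG_REQUEST_CONTROL };

/* Hypothetical stand-in for the plugin's pending job completion records. */
struct jobcomp_state {
    int pending_jobs;   /* records not yet delivered to the backend */
    int saved_jobs;     /* records written to the state file */
};

/* Stand-in for the plugin's fini path, which is what reaches the private save routine. */
static void jobcomp_fini(struct jobcomp_state *st)
{
    st->saved_jobs = st->pending_jobs;
}

/*
 * Handle one message on the controller that currently holds control.
 * Returns true if this controller must relinquish control.
 */
bool handle_msg(enum msg_type type, struct jobcomp_state *st)
{
    assert(st != NULL);

    if (type == MSG_REQUEST_CONTROL) {
        jobcomp_fini(st);   /* save pending records before handing over */
        return true;
    }
    return false;
}
```

In this model an ordinary message leaves the saved state untouched, while a control request saves every pending record first, so the controller that takes over can load and process them.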