1. 27 Jul, 2017 2 commits
    • Fix bug when tracking multiple simultaneous spawned ping cycles · f7463ef5
      Alejandro Sanchez authored
      When more than 1 ping cycle is spawned simultaneously (for instance
      REQUEST_PING + REQUEST_NODE_REGISTRATION_STATUS for the selected nodes),
      we do not track a separate ping_start time for each cycle. When ping_begin()
      is called, the information about the previous ping cycle is lost. Then when
      ping_end() is called for the first of the two cycles, we set ping_start=0,
      which is incorrectly used to see if the last cycle ran for more than
      PING_TIMEOUT seconds (100s), thus incorrectly triggering the:
      
       error("Node ping apparently hung, many nodes may be DOWN or configured "
             "SlurmdTimeout should be increased");
      
      Bug 3914
    • Tim Shaw's avatar
      04b431b4
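The ping-cycle problem described in commit f7463ef5 can be illustrated with a small sketch. This is not the actual Slurm patch: the names `cycle_begin`, `cycle_end`, `cycle_hung`, `active_cycles`, and `oldest_start` are hypothetical, and only the 100-second `PING_TIMEOUT` comes from the commit message. The idea shown is one possible approach: count the cycles in flight and clear the start time only when the last one ends, so that finishing one cycle cannot zero the timestamp another cycle still depends on.

```c
#include <stdbool.h>
#include <time.h>

#define PING_TIMEOUT 100 /* seconds, as stated in the commit message */

/* Hypothetical state; these names are illustrative, not Slurm's. */
static int    active_cycles = 0; /* ping cycles currently in flight */
static time_t oldest_start  = 0; /* start time of the oldest active cycle */

/* Record the start of a ping cycle. */
void cycle_begin(time_t now)
{
    if (active_cycles++ == 0)
        oldest_start = now; /* only the first active cycle sets the start */
}

/* Record the end of a ping cycle. */
void cycle_end(void)
{
    if (active_cycles > 0 && --active_cycles == 0)
        oldest_start = 0;   /* clear only when no cycle remains */
}

/* Report whether the active cycles have run longer than PING_TIMEOUT. */
bool cycle_hung(time_t now)
{
    if (active_cycles == 0)
        return false;       /* nothing running, so nothing can hang */
    return difftime(now, oldest_start) > PING_TIMEOUT;
}
```

A reference count is the simplest option but is conservative: after the first cycle ends, the remaining cycle is still measured from the older start time. Tracking a start time per cycle would be more precise at the cost of more bookkeeping.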
  2. 26 Jul, 2017 5 commits
  3. 25 Jul, 2017 1 commit
  4. 24 Jul, 2017 3 commits
  5. 21 Jul, 2017 3 commits
  6. 19 Jul, 2017 4 commits
  7. 18 Jul, 2017 1 commit
    • Fix issue with multiple jobs from an array failing to start. · b40bd8d3
      Dominik Bartkiewicz authored
      By removing the real locks we can get into a race condition where the
      prolog starts and finishes before we get here, and then we end up waiting
      forever.
      
      Making the mutex static seemed to help in many cases, but didn't
      completely close the window.  Changing slurm_cond_wait to
      slurm_cond_timedwait fixed the scenario where we would hit the window,
      without degrading the performance the original commit provides.
      
      There were also spots where, if the job or step didn't exist, the
      condition variable wouldn't be signaled, providing another place where
      this could get stuck and never start the job.
      
      Fix regression from commit 52ce3ff0
      
      Bug 3977
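The timed-wait approach described in commit b40bd8d3 can be sketched with plain POSIX threads. This is an illustration under assumptions, not the Slurm code: `prolog_lock`, `prolog_cond`, `prolog_done`, `mark_prolog_done`, and `wait_for_prolog` are hypothetical names, and `pthread_cond_timedwait` stands in for the `slurm_cond_timedwait` wrapper named in the message. The waiter re-checks its predicate at least once per second, so a signal sent before the wait began cannot leave it blocked forever.

```c
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical shared state; the names are illustrative only. */
static pthread_mutex_t prolog_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  prolog_cond = PTHREAD_COND_INITIALIZER;
static bool prolog_done = false;

/* Called by the thread that finishes the prolog. */
void mark_prolog_done(void)
{
    pthread_mutex_lock(&prolog_lock);
    prolog_done = true;
    pthread_cond_broadcast(&prolog_cond);
    pthread_mutex_unlock(&prolog_lock);
}

/*
 * Wait up to max_wait_sec seconds for the prolog to finish.
 * Returns true if the predicate became true, false on timeout.
 */
bool wait_for_prolog(int max_wait_sec)
{
    time_t give_up = time(NULL) + max_wait_sec;
    bool done;

    pthread_mutex_lock(&prolog_lock);
    while (!prolog_done && time(NULL) < give_up) {
        struct timespec wake;

        /* Wake after about a second even if no signal arrives. */
        wake.tv_sec  = time(NULL) + 1;
        wake.tv_nsec = 0;
        pthread_cond_timedwait(&prolog_cond, &prolog_lock, &wake);
    }
    done = prolog_done;
    pthread_mutex_unlock(&prolog_lock);
    return done;
}
```

The same loop also covers the case the message mentions where no signal is ever sent: the waiter eventually returns false instead of hanging.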
  8. 14 Jul, 2017 3 commits
    • Fix example code to actually work. · 5733505a
      Tim Shaw authored
      Code provided by Ole Nielsen <Ole.H.Nielsen@fysik.dtu.dk>
      
      Bug 3985
    • Fix whitespace, no code change. · dc6f910b
      Danny Auble authored
    • Fix issue with whole gres not being printed out with Slurm tools. · 028bf3e1
      Danny Auble authored
      This is a regression from commit fec995e0.
      
      It turns out using tok here was erroneous for situations where the gres
      had a type and name and potentially a count (e.g. network:gigabit:1).
      
      _get_gres_req_cnt() alters the incoming char *config, which is what tok
      pointed to.  So when we printed it back to the requested string it only
      contained what came before the first ':'.  Since we skip over the first
      char anyway, there was no need to \0 it out; I just kept track of the
      character the \0 replaced for the number portion and put it back when we
      are done copying it.
      
      Related to bug 3521
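The save-and-restore technique described in commit 028bf3e1 can be sketched as follows. This is a hypothetical helper, not `_get_gres_req_cnt()` itself: the name `split_gres`, its signature, and the -1 return value are assumptions made for illustration. It temporarily terminates the string at the separator in front of the count, copies the name portion, then puts the original character back so the caller's string is left intact.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split a gres spec such as "network:gigabit:1" into
 * its name ("network:gigabit") and count (1) without leaving the caller's
 * string truncated. Returns the count, or -1 if there is no numeric suffix.
 */
long split_gres(char *spec, char *name_out, size_t name_len)
{
    char *sep = strrchr(spec, ':');
    char *end = NULL;
    char saved;
    long count;

    if (sep)
        count = strtol(sep + 1, &end, 10);
    if (!sep || end == sep + 1 || *end != '\0') {
        /* The last field is a type or name, not a number. */
        snprintf(name_out, name_len, "%s", spec);
        return -1;
    }

    saved = *sep;   /* remember the character we are about to overwrite */
    *sep = '\0';    /* spec now reads as the name portion only */
    snprintf(name_out, name_len, "%s", spec);
    *sep = saved;   /* restore it so spec is unchanged for the caller */
    return count;
}
```

Restoring the overwritten character avoids duplicating the string just to parse it, at the cost of requiring a writable buffer.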
  9. 13 Jul, 2017 7 commits
  10. 10 Jul, 2017 1 commit
  11. 07 Jul, 2017 5 commits
  12. 06 Jul, 2017 1 commit
  13. 05 Jul, 2017 4 commits