1. 10 Sep, 2018 1 commit
  2. 06 Sep, 2018 1 commit
  3. 31 Aug, 2018 1 commit
  4. 30 Aug, 2018 1 commit
  5. 29 Aug, 2018 2 commits
  6. 27 Aug, 2018 2 commits
  7. 22 Aug, 2018 2 commits
    • Fix pmi2 to build with gcc 8.0+. · f8dda518
      Danny Auble authored
      Bug 5608
      
      Tim approved
    • Fix segfault when starting ctld w/out dbd · eab9f405
      Brian Christiansen authored
      If the dbd comes up after a job array has been submitted to the
      controller, the controller calls _update_job_tres() which calls
      assoc_mgr_set_tres_cnt_array() which allocates memory for the job's
      tres_alloc_cnt. The job array gets scheduled, but job_array_split()
      doesn't NULL out the pending job's tres_alloc_cnt, so both the array
      task and the pending array job point to the same memory. The array
      task calls job_set_alloc_tres(), which frees the running job's
      tres_alloc_cnt, leaving the pending array job pointing to freed
      memory. When the array splits again, the new array task tries to free
      tres_alloc_cnt in job_set_alloc_tres() and segfaults.
      
      Bug 5604
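
The aliasing described in this commit message can be illustrated with a minimal sketch. The `job_rec_t` struct and the `array_split()` and `set_alloc_tres()` helpers below are simplified, hypothetical stand-ins, not Slurm's actual job record or functions; which record is the copy is also an assumption made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a job record (hypothetical, not Slurm's struct). */
typedef struct {
	uint32_t job_id;
	uint64_t *tres_alloc_cnt;	/* heap buffer owned by this record */
} job_rec_t;

/*
 * Split off a new pending record from an existing one. The struct
 * assignment is a shallow copy, so without the NULL assignment both
 * records would own (and later free) the same buffer.
 */
static job_rec_t *array_split(const job_rec_t *job)
{
	job_rec_t *pend = malloc(sizeof(*pend));

	if (!pend)
		return NULL;
	*pend = *job;			/* pointers are now aliased */
	pend->tres_alloc_cnt = NULL;	/* break the alias */
	return pend;
}

/* Replace a record's counters; free(NULL) is a harmless no-op. */
static void set_alloc_tres(job_rec_t *job, size_t cnt)
{
	free(job->tres_alloc_cnt);
	job->tres_alloc_cnt = calloc(cnt, sizeof(uint64_t));
}
```

With the alias broken, calling `set_alloc_tres()` on either record frees only that record's own buffer.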
  8. 21 Aug, 2018 2 commits
  9. 20 Aug, 2018 1 commit
    • Do not truncate MySQL database name at 33 characters. · 15c19c03
      Michael Hinton authored
      MySQL permits up to 64-character database names, but Slurm was truncating
      at 33 characters. If we exceed this limit, let the mysql_query fail and
      give the admin a chance to sort it out, rather than truncating and then
      failing to query against the un-truncated name later on.
      
      While here correct the fatal() message.
      
      Bug 5586.
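
One way to avoid silent truncation of this kind is to size the query buffer from the name itself instead of using a fixed-length array. The sketch below is only an illustration of that idea: `build_create_db_query()` is a hypothetical helper, and the statement text is an assumption rather than the query Slurm issues.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build a "create database" statement without truncating db_name.
 * Hypothetical helper: the caller frees the result and lets the server
 * reject a name that exceeds its own limit.
 */
static char *build_create_db_query(const char *db_name)
{
	static const char fmt[] = "create database `%s`";
	int len = snprintf(NULL, 0, fmt, db_name);	/* measure first */
	char *query;

	if (len < 0)
		return NULL;
	query = malloc((size_t)len + 1);
	if (!query)
		return NULL;
	snprintf(query, (size_t)len + 1, fmt, db_name);
	return query;
}
```

Because the name is never shortened, a later query against the configured name refers to the same database that was created, and an over-long name surfaces as an error from the server.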
  10. 18 Aug, 2018 1 commit
  11. 16 Aug, 2018 5 commits
  12. 15 Aug, 2018 2 commits
  13. 14 Aug, 2018 3 commits
  14. 13 Aug, 2018 3 commits
  15. 11 Aug, 2018 5 commits
  16. 10 Aug, 2018 1 commit
  17. 09 Aug, 2018 2 commits
  18. 07 Aug, 2018 3 commits
  19. 06 Aug, 2018 1 commit
    • Modify slurm_send_only_node_msg() to catch issues with socket. · 06582da8
      Tim Wickberg authored
      There are subtle issues involved in treating a TCP transmission
      as a unidirectional message delivery layer.
      
      The original code path looks like: connect(), write(), close().
      But Linux handles the write() and close() asynchronously behind the
      scenes, and does not block until that write() has been ACK'd by the
      remote end. So the write() and close() may succeed, even with data
      still in flight. A communication error - and message loss - would
      have been silently ignored, leading to unreliable message transmission.
      
      Worse yet, one side of the connection would believe it sent the message,
      while the receive side swears it never saw the packets. This leads to
      infrequent and yet seemingly impossible data loss, and a very tough
      bug to chase down.
      
      This teardown code tries to force the connection to shut down in an
      orderly manner, giving Slurm a chance to catch a connection problem
      and the upstream calling path an opportunity to retransmit.
      
      This teardown code is based on an approach described in Section 7.5
      of "UNIX Network Programming" Volume 1 (Third Edition), specifically
      the subsection regarding SO_LINGER. (That subsection also covers why
      SO_LINGER is not sufficient to prevent this issue.)
      
      Bug 5164.
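
A minimal sketch of an orderly teardown in the spirit of this commit message: half-close the sending side with shutdown(), then wait for the peer's end-of-file before calling close(). The `orderly_close()` helper and its timeout parameter are hypothetical, and the code Slurm actually uses in slurm_send_only_node_msg() may differ.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Half-close fd, then wait up to timeout_ms per poll for the peer to
 * close its side. Returns 0 if EOF was seen, -1 on timeout or error.
 * The descriptor is always closed before returning.
 */
static int orderly_close(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char buf[64];
	int rc = -1;

	if (shutdown(fd, SHUT_WR) < 0) {	/* send FIN, keep reading */
		close(fd);
		return -1;
	}

	for (;;) {
		int n = poll(&pfd, 1, timeout_ms);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)			/* timeout or poll error */
			break;

		ssize_t r = read(fd, buf, sizeof(buf));

		if (r == 0) {			/* EOF: peer closed its side */
			rc = 0;
			break;
		}
		if (r < 0 && errno != EINTR)	/* e.g. connection reset */
			break;
		/* r > 0: discard unexpected data and keep waiting */
	}

	close(fd);
	return rc;
}
```

A caller would replace a bare close() after write() with `orderly_close(fd, timeout_ms)` and treat a -1 return as a signal to retransmit the message.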
  20. 04 Aug, 2018 1 commit