1. 01 Jun, 2017 5 commits
    • Handle file deletion for purge_old_job() in a separate thread. · b9719be2
      Tim Wickberg authored
      File deletion can be slow, especially when StateSaveLocation is on
      NFS or other network filesystems. Since purge_old_job() holds all
      the slurmctld write locks, this is especially performance sensitive.
      
      Moving this to an independent thread lets the slower filesystem
      cleanup happen without owning these locks. purge_old_job() now
      only queues the purged job ids in the purge_list.
      
      A race with the job id potentially wrapping around again is already
      prevented by _dup_job_file_test() in get_next_job_id().
      
      Bug 3763.
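The hand-off described in this commit message can be sketched as a small producer/consumer queue. This is only an illustration under stated assumptions, not the code from the commit: `purge_enqueue`, `purge_worker`, `purge_stop`, `state_dir`, and the `job.<id>.script` file name are all hypothetical stand-ins, and the real slurmctld locking and file layout are not shown.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* All names below are hypothetical; this is not Slurm's actual code. */
typedef struct purge_node {
    uint32_t job_id;
    struct purge_node *next;
} purge_node_t;

static pthread_mutex_t purge_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t purge_cond = PTHREAD_COND_INITIALIZER;
static purge_node_t *purge_head = NULL, *purge_tail = NULL;
static int purge_shutdown = 0;
static const char *state_dir = "/tmp"; /* stand-in for StateSaveLocation */

/* Producer side: cheap enough to call while the controller locks are
 * held, because it only appends to an in-memory list (no file I/O). */
int purge_enqueue(uint32_t job_id)
{
    purge_node_t *n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->job_id = job_id;
    n->next = NULL;

    pthread_mutex_lock(&purge_lock);
    if (purge_tail)
        purge_tail->next = n;
    else
        purge_head = n;
    purge_tail = n;
    pthread_cond_signal(&purge_cond);
    pthread_mutex_unlock(&purge_lock);
    return 0;
}

/* Consumer side: runs in its own thread and performs the slow unlink()
 * holding no lock at all. Drains the queue before honoring shutdown. */
void *purge_worker(void *arg)
{
    char path[4096];
    (void) arg;

    for (;;) {
        pthread_mutex_lock(&purge_lock);
        while (!purge_head && !purge_shutdown)
            pthread_cond_wait(&purge_cond, &purge_lock);
        purge_node_t *n = purge_head;
        if (!n) { /* shutdown requested and queue is empty */
            pthread_mutex_unlock(&purge_lock);
            return NULL;
        }
        purge_head = n->next;
        if (!purge_head)
            purge_tail = NULL;
        pthread_mutex_unlock(&purge_lock);

        /* Hypothetical file name; errors are ignored in this sketch. */
        snprintf(path, sizeof(path), "%s/job.%u.script",
                 state_dir, (unsigned) n->job_id);
        unlink(path);
        free(n);
    }
}

/* Ask the worker to exit once the queue has been drained. */
void purge_stop(void)
{
    pthread_mutex_lock(&purge_lock);
    purge_shutdown = 1;
    pthread_cond_broadcast(&purge_cond);
    pthread_mutex_unlock(&purge_lock);
}
```

The design point is that the caller pays only for a list append under a short-lived mutex, while filesystem latency is absorbed by the worker thread.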
    • Make _delete_job_details a static function. · ce2cd1b2
      Tim Wickberg authored
      Only called from _list_delete_job once the MinJobAge has
      passed.
    • Remove timeout code from job_purge_old. · 843e5d38
      Tim Wickberg authored
      This will need to be handled differently. The timeout can
      lead to the purge process falling further and further behind
      on high throughput systems if the number of job scripts that
      can be deleted within a second is lower than the job submission
      and completion rate of the cluster, eventually leading to
      the MaxJobCount limit being reached.
      
      Bug 3763.
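The backlog argument above is simple arithmetic, sketched here under simplifying assumptions: constant rates, a backlog that starts empty, and no active jobs counted against the limit. The function and parameter names are hypothetical and do not come from Slurm.

```c
/* Number of purge passes until the backlog of purgeable job records
 * reaches max_job_count, or -1 if the purge keeps up.
 * Assumes constant rates and an initially empty backlog. */
long passes_until_limit(long completed_per_pass, long purged_per_pass,
                        long max_job_count)
{
    if (completed_per_pass <= purged_per_pass)
        return -1; /* purge keeps pace; backlog never grows */

    long growth = completed_per_pass - purged_per_pass;
    /* Ceiling division: first pass at which backlog >= limit. */
    return (max_job_count + growth - 1) / growth;
}
```

For instance, if 150 jobs complete per pass but the timeout only allows 100 deletions, the backlog grows by 50 records per pass, so a limit of 10000 records would be reached after 200 passes.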
    • Better commit from last · cff4e661
      Danny Auble authored
  2. 31 May, 2017 6 commits
  3. 30 May, 2017 6 commits
  4. 26 May, 2017 6 commits
  5. 25 May, 2017 13 commits
  6. 24 May, 2017 4 commits