1. 30 Sep, 2019 3 commits
  2. 26 Sep, 2019 3 commits
  3. 25 Sep, 2019 1 commit
    • Fix scancel --full for proctrack/cgroups · 4dfb3ad6
      Albert Gil authored
      The signaling of the batch step and the handling of the flags are now done
      entirely in _kill_all_active_steps() in slurmd and _handle_signal_container()
      in stepd, to ensure that:
      - if KILL_JOB_BATCH then only batch container is signaled
      - if KILL_FULL_JOB then batch script and its children are also signaled
      - if both of the above then only the batch script and its children are signaled
      
      We no longer rely on proctrack_g_signal() to handle the batch step
      signaling, so it works the same for all proctrack plugins.
      
      This commit also includes minor related fixes in other code that handles
      these signaling flags, and documentation improvements.
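
The flag rules listed above can be sketched as a small decision function. This is a minimal illustration, not the code in slurmd or stepd: the `SKETCH_*` bit values, the `TARGET_*` masks and `sketch_signal_targets()` are hypothetical names, the sketch does not distinguish the batch container from the batch script and its children, and the behaviour when neither flag is set is an assumption that the commit message does not state.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; the real flag
 * definitions live in Slurm's headers. */
#define SKETCH_KILL_JOB_BATCH 0x1u
#define SKETCH_KILL_FULL_JOB  0x2u

/* Hypothetical targets a signal request can reach. */
#define TARGET_STEPS 0x1u /* regular job steps */
#define TARGET_BATCH 0x2u /* batch script and its children */

/* Map the signaling flags to the set of targets to signal. */
unsigned sketch_signal_targets(uint16_t flags)
{
	/* KILL_JOB_BATCH, alone or combined with KILL_FULL_JOB,
	 * restricts the signal to the batch side. */
	if (flags & SKETCH_KILL_JOB_BATCH)
		return TARGET_BATCH;

	/* KILL_FULL_JOB alone: the batch script and its children are
	 * signaled in addition to the steps. */
	if (flags & SKETCH_KILL_FULL_JOB)
		return TARGET_STEPS | TARGET_BATCH;

	/* Assumption: with neither flag, only the steps are signaled. */
	return TARGET_STEPS;
}
```

Keeping the decision in one place that does not depend on the proctrack plugin is what makes the behaviour uniform across plugins.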
      
      Bug 7282
  4. 23 Sep, 2019 1 commit
  5. 20 Sep, 2019 2 commits
  6. 16 Sep, 2019 1 commit
  7. 12 Sep, 2019 3 commits
  8. 10 Sep, 2019 1 commit
  9. 06 Sep, 2019 2 commits
  10. 04 Sep, 2019 3 commits
  11. 03 Sep, 2019 4 commits
  12. 29 Aug, 2019 4 commits
  13. 28 Aug, 2019 1 commit
    • Don't update [min|max]_exit_code on job array task requeue. · 0e42eb87
      Alejandro Sanchez authored
      Only do so once the task actually finishes. Otherwise, a requeued task
      could set an incorrect max_exit_code even if it completes with exit code 0
      after re-running, leading to problems with, e.g., other jobs that have an
      afterok dependency on the array and rely on the incorrectly set
      max_exit_code.
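
A minimal sketch of the idea, assuming a simplified per-array record: the type `sketch_array_t`, its fields, and `sketch_task_ended()` are hypothetical and only mirror the names used in the message. The exit-code range is updated only when a task really finishes, never when it is requeued.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-array bookkeeping. */
typedef struct {
	bool     any_finished;  /* has any task finished for good yet? */
	uint32_t min_exit_code;
	uint32_t max_exit_code;
} sketch_array_t;

/* Record the end of one task run. A requeued run is ignored because
 * the task will run again and report its final exit code later. */
void sketch_task_ended(sketch_array_t *array, uint32_t exit_code,
		       bool requeued)
{
	if (requeued)
		return;

	if (!array->any_finished) {
		array->min_exit_code = exit_code;
		array->max_exit_code = exit_code;
		array->any_finished = true;
		return;
	}

	if (exit_code < array->min_exit_code)
		array->min_exit_code = exit_code;
	if (exit_code > array->max_exit_code)
		array->max_exit_code = exit_code;
}
```

With this rule, a task that fails with exit code 1, is requeued, and then completes with exit code 0 leaves max_exit_code at 0, so an afterok dependency on the array is not blocked by the discarded run.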
      
      Bug 7552.
  14. 23 Aug, 2019 1 commit
  15. 20 Aug, 2019 2 commits
    • Handle situation where a slurmctld tries to communicate with slurmdbd more... · af7b4531
      Danny Auble authored
      Handle situation where a slurmctld tries to communicate with slurmdbd more than once at the same time.
      
      The slurmdbd/slurmctld connection can get hung up somehow. If the
      slurmctld is restarted, a new connection is made alongside the old one.
      When the old connection gets unwedged, it clears out the registration of
      the slurmctld, so no updates are sent to that slurmctld.
      
      This commit checks for old connections when a registration message
      comes in. If we find one, we print an error, set rem_port = 0, and
      remove it from the list. That way, when the old connection gets unwedged,
      we just close the socket instead of removing the registration.
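
The registration check described above could look roughly like the following. This is a sketch under stated assumptions, not slurmdbd's code: `sketch_conn_t`, `sketch_registry_t` and the `sketch_*` functions are hypothetical, only `rem_port` is a name taken from the message, and the caller is assumed to pass a connection that is not already in the list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_CONNS 8

/* Hypothetical connection record. */
typedef struct {
	const char *cluster; /* cluster whose slurmctld owns the connection */
	int rem_port;        /* 0 marks a connection that was superseded */
} sketch_conn_t;

/* Hypothetical list of registered connections. */
typedef struct {
	sketch_conn_t *conns[SKETCH_MAX_CONNS];
	size_t count;
} sketch_registry_t;

/* Drop conn from the list, if present, by moving the last entry
 * into its slot. */
static bool sketch_remove(sketch_registry_t *reg, const sketch_conn_t *conn)
{
	for (size_t i = 0; i < reg->count; i++) {
		if (reg->conns[i] == conn) {
			reg->conns[i] = reg->conns[--reg->count];
			return true;
		}
	}
	return false;
}

/* Handle a registration message: invalidate any older connection for
 * the same cluster, then add the new one. Returns the number of stale
 * connections found, or -1 if the list is full. */
int sketch_register(sketch_registry_t *reg, sketch_conn_t *conn)
{
	int stale = 0;
	size_t i = 0;

	while (i < reg->count) {
		sketch_conn_t *old = reg->conns[i];
		if (strcmp(old->cluster, conn->cluster) == 0) {
			old->rem_port = 0;       /* mark as superseded */
			sketch_remove(reg, old); /* slot i now holds another entry */
			stale++;
		} else {
			i++;
		}
	}

	if (reg->count == SKETCH_MAX_CONNS)
		return -1;
	reg->conns[reg->count++] = conn;
	return stale;
}

/* Handle a connection closing. Returns true only if a live
 * registration was removed; a superseded connection (rem_port == 0)
 * would just have its socket closed. */
bool sketch_close(sketch_registry_t *reg, sketch_conn_t *conn)
{
	if (conn->rem_port == 0)
		return false;
	return sketch_remove(reg, conn);
}
```

Marking the old connection instead of closing it immediately means the late close path can tell it was superseded and leave the new registration alone.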
      
      Bug 5213
    • Fix NEWS entry for the previous commit a04eea2e. · d0729247
      Alejandro Sanchez authored
      Bug 7360.
  16. 19 Aug, 2019 2 commits
  17. 16 Aug, 2019 1 commit
  18. 15 Aug, 2019 1 commit
  19. 14 Aug, 2019 4 commits
    • COMPLETING nodes available immediately for job will-run test · 0666db61
      Morris Jette authored
      Consider jobs in COMPLETING state as being available immediately for
      a job will-run evaluation. This assumes the completion will happen
      very soon after the test is run.
      
      Bug 6769
    • Avoid select plugin resource usage underflow from duplicate job free · 2dd1f448
      Morris Jette authored
      All of the select plugins were performing a duplicate resource free
      for jobs in completing state when performing a will-run test for
      new jobs. This would frequently result in underflow messages.
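
To illustrate the failure mode, here is a minimal sketch of how a second release of the same job's resources would underflow an unsigned usage counter, together with one possible guard. The types and `sketch_release()` are hypothetical, and the guard is an assumption for illustration, not necessarily how the select plugins were fixed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-node usage counter. */
typedef struct {
	uint32_t alloc_cpus;
} sketch_node_t;

/* Hypothetical job record; released is set once resources are freed. */
typedef struct {
	uint32_t cpus;
	bool     released;
} sketch_job_t;

/* Release a job's CPUs from a node at most once. Returns true if the
 * counter was changed, false if the job had already been released. */
bool sketch_release(sketch_node_t *node, sketch_job_t *job)
{
	if (job->released)
		return false; /* duplicate free: skip it */

	if (node->alloc_cpus < job->cpus)
		node->alloc_cpus = 0; /* would underflow: clamp instead */
	else
		node->alloc_cpus -= job->cpus;

	job->released = true;
	return true;
}
```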
      
      Bug 6769
    • Enforce use of spank_option_getopt(). · 0ddeb0ed
      Marshall Garey authored
      Building off the previous commit, spank_option_getopt() is now valid in
      more functions than before, so we document and enforce from where
      spank_option_getopt() can safely be called and return ESPANK_NOT_AVAIL
      if it is called from any invalid SPANK context.
      
      Bug 7065.
    • Add spank options to cache in remote callback. · dca0de85
      Marshall Garey authored
      When spank option callbacks are called, the options are added to a cache
      in memory so that spank_option_getopt() can retrieve the options when
      called. However, this was only happening when callbacks were called from
      the local context, so we make sure that the options are added to the
      cache when the callbacks are called from the remote context as well.
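
The caching behaviour can be sketched as follows. This is a simplified model, not the SPANK implementation: every `sketch_*` name and the fixed-size cache are hypothetical. What it shows is that the callback path stores the option regardless of whether it runs in the local or the remote context, so that a later lookup finds it.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_CACHE_MAX 16

/* Hypothetical stand-ins for the local and remote SPANK contexts. */
typedef enum { SKETCH_CTX_LOCAL, SKETCH_CTX_REMOTE } sketch_ctx_t;

typedef struct {
	const char *name;
	const char *optarg;
} sketch_opt_t;

typedef struct {
	sketch_opt_t entries[SKETCH_CACHE_MAX];
	size_t count;
} sketch_cache_t;

/* Called whenever an option callback runs. The option is cached in
 * every context. Returns 0 on success, -1 if the cache is full. */
int sketch_option_callback(sketch_cache_t *cache, sketch_ctx_t ctx,
			   const char *name, const char *optarg)
{
	(void) ctx; /* the context does not decide whether we cache */

	/* Update the entry if the option was already seen. */
	for (size_t i = 0; i < cache->count; i++) {
		if (strcmp(cache->entries[i].name, name) == 0) {
			cache->entries[i].optarg = optarg;
			return 0;
		}
	}

	if (cache->count == SKETCH_CACHE_MAX)
		return -1;
	cache->entries[cache->count].name = name;
	cache->entries[cache->count].optarg = optarg;
	cache->count++;
	return 0;
}

/* Look up a cached option; returns its argument or NULL if the
 * option was never cached. */
const char *sketch_option_getopt(const sketch_cache_t *cache,
				 const char *name)
{
	for (size_t i = 0; i < cache->count; i++) {
		if (strcmp(cache->entries[i].name, name) == 0)
			return cache->entries[i].optarg;
	}
	return NULL;
}
```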
      
      Bug 7065.