1. 08 Nov, 2019 2 commits
    • Fix issues with --gpu-bind while using cgroups · 5b13fbb3
      Michael Hinton authored
      CUDA_VISIBLE_DEVICES was not being set to the correct GPU indexes when
      cgroups were being used. These issues were exhibited with at least the
      map_gpu and mask_gpu binding options.
      
      The issue was that usable_gres is a bitmask of GRESs in the step's
      cgroup, but bit_test() was looking at bit i, which is the index of the
      global gres_list (not constrained by cgroups).
      
      Bug 7509
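The index mismatch described above can be sketched as follows. This is an illustration only, not the actual patch: `mask_test`, `select_usable`, and the `in_cgroup` array are hypothetical names, and a plain `uint64_t` stands in for Slurm's bitstring type. The point is that the mask must be tested with a counter that advances only for devices inside the cgroup, not with the global list index `i`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitstring test; not Slurm's bit_test(). */
static int mask_test(uint64_t mask, unsigned bit)
{
	return (int)((mask >> bit) & 1u);
}

/*
 * in_cgroup[i] is nonzero when global device i belongs to the step's cgroup.
 * usable is indexed by position among the cgroup's devices (0..k-1),
 * NOT by the global index i.
 * Writes the global indexes of the usable devices to out; returns the count.
 */
static size_t select_usable(const int *in_cgroup, size_t n_global,
			    uint64_t usable, size_t *out)
{
	size_t local = 0, n_out = 0;

	assert(in_cgroup && out && n_global <= 64);
	for (size_t i = 0; i < n_global; i++) {
		if (!in_cgroup[i])
			continue;	/* device is outside the cgroup */
		if (mask_test(usable, (unsigned)local))	/* local, not i */
			out[n_out++] = i;
		local++;
	}
	return n_out;
}
```

With four global devices of which only devices 1 and 3 are in the cgroup, a mask of `0x2` refers to the second cgroup device, which is global device 3. Testing bit `i` instead would wrongly pick device 1 and miss device 3.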
    • Fix regression on update from older versions with DefMemPerCPU · 6abe1e75
      Felip Moll authored
      In 19.05, the JOB_MEM_SET flag was added, along with a conditional check
      on this flag that changed pn_min_memory when validating job limits.
      After an upgrade, jobs left pending (PD) by an earlier version did not
      have this flag, so their memory was set incorrectly when their limits
      were checked before starting. This patch fixes the issue by adding the
      flag to jobs from an older protocol version when loading the state
      files.
      
      Bug 8011
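The state-load fixup described above could look roughly like the sketch below. The constant values, the `SKETCH_` names, and `fixup_flags` are all hypothetical and chosen only for illustration; they are not Slurm's real protocol version numbers, flag bits, or function names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values; Slurm's real constants differ. */
#define SKETCH_PROTO_19_05 ((uint16_t)2)
#define SKETCH_JOB_MEM_SET ((uint32_t)1 << 0)

/*
 * Returns the job flags to use after loading a saved job record.
 * Records written by a protocol older than 19.05 predate the flag,
 * so it is added here to keep the later limit check from rewriting
 * the job's memory.
 */
static uint32_t fixup_flags(uint32_t flags, uint16_t proto_version)
{
	if (proto_version < SKETCH_PROTO_19_05)
		flags |= SKETCH_JOB_MEM_SET;
	return flags;
}
```

Records saved by 19.05 or later pass through unchanged, so the fixup only affects jobs carried over an upgrade.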
  2. 07 Nov, 2019 1 commit
    • Allow coordinators to delete users. · 0d579734
      Marshall Garey authored
      Previously, coordinators could delete specific associations, but could
      not delete users. Allow coordinators to delete users if the users are
      only part of accounts that the coordinator is over.
      
      Bug 7413.
  3. 31 Oct, 2019 5 commits
  4. 29 Oct, 2019 1 commit
  5. 28 Oct, 2019 2 commits
  6. 25 Oct, 2019 2 commits
    • Enforce PART_NODES if only Partition is specified · c8ce5a53
      Albert Gil authored
      Bug 7490
    • Avoid abort in dev-build · fe945037
      Marshall Garey authored
      If not enforcing QOS, it's possible to submit a job without a qos. If
      submitting such a job to multiple partitions where at least one has a
      qos, slurmctld would abort in a development build. A non-development
      build didn't segfault only because _find_qos_part doesn't dereference
      the NULL pointer. Prevent the abort.
      
      Bug 7171
  7. 24 Oct, 2019 1 commit
  8. 23 Oct, 2019 1 commit
  9. 22 Oct, 2019 2 commits
    • Fix abort initializing a configuration without acct_gather.conf. · a301635f
      Gavin Howard authored
      The previous logic called s_p_hashtbl_create() to create the hashtable
      only when the file acct_gather.conf could be successfully stat()'d. This
      led to a subsequent attempt to pack the non-created hashtable into a
      buffer, which triggered the abort.
      
      This change creates the hashtable unconditionally, whether or not the
      file exists.
      
      Bug 7893.
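The shape of that fix can be sketched as below: allocate the table first, and make only the parsing step depend on the file being present. `struct conf_tbl` and `load_conf` are hypothetical names for illustration; the real code uses s_p_hashtbl_create() and Slurm's own parser, which are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the parsed-configuration hashtable. */
struct conf_tbl {
	int from_file;	/* nonzero when the config file was found */
};

/*
 * Always returns a valid table, so later code that packs the table
 * never sees a missing one. Only the parsing step is conditional.
 */
static struct conf_tbl *load_conf(const char *path)
{
	struct stat st;
	struct conf_tbl *tbl = calloc(1, sizeof(*tbl));

	assert(tbl != NULL);
	if (path && stat(path, &st) == 0) {
		/* The file exists: this is where it would be parsed. */
		tbl->from_file = 1;
	}
	return tbl;
}
```

A caller can then pack or inspect the returned table without first checking whether the configuration file was present.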
    • auth/munge - truncate FQDN to shortname for AllocNodes. · 50eaa012
      Michael Hinton authored
      gethostbyaddr() can potentially return a fully-qualified domain name,
      which breaks backwards compatibility with the shortname AllocNodes
      expected pre 19.05.
      
      Bug 7653.
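Truncating a fully-qualified name to its short form amounts to cutting the string at the first dot. A minimal sketch, assuming the caller owns a writable copy of the hostname; `to_shortname` is a hypothetical helper, not the function used in the patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Truncates host in place at the first '.', if there is one.
 * A name that is already short is left unchanged.
 * Assumes host holds a hostname, not a dotted IP address string.
 */
static void to_shortname(char *host)
{
	char *dot;

	assert(host != NULL);
	dot = strchr(host, '.');
	if (dot)
		*dot = '\0';
}
```

For example, `node01.cluster.example.com` becomes `node01`, which is the form an AllocNodes list written for a pre-19.05 setup would expect.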
  10. 21 Oct, 2019 2 commits
  11. 18 Oct, 2019 1 commit
  12. 16 Oct, 2019 2 commits
  13. 15 Oct, 2019 2 commits
  14. 11 Oct, 2019 1 commit
  15. 09 Oct, 2019 1 commit
  16. 08 Oct, 2019 2 commits
  17. 07 Oct, 2019 1 commit
  18. 04 Oct, 2019 3 commits
  19. 03 Oct, 2019 2 commits
  20. 02 Oct, 2019 3 commits
  21. 01 Oct, 2019 3 commits