1. 15 Jan, 2016 1 commit
  2. 14 Jan, 2016 2 commits
    • Rework group caching to work better in environments with enumeration disabled. · 48a4cdf8
      Janne Blomqvist authored
      The initgroups()/getgrouplist() caching in slurmd no longer requires enumeration; instead, individual entries are cached when first needed. This cache is always enabled, so the CacheGroups configuration setting has been removed. The time that each cache entry is considered valid is determined by the GroupUpdateTime configuration parameter. scontrol reconfig will purge the cache. The default value for the GroupUpdateForce configuration parameter has changed, because systems where /etc/group contains all the groups, rather than an external system such as NIS or LDAP, are nowadays probably the exception rather than the rule.
      
      For slurmctld, the group cache still uses enumeration, but this is needed only to take care of special situations like multiple groups with the same GID. With enumeration disabled, group caching still works otherwise. validate_groups() does a little more optional work in order to handle the case where the user p...
    • Avoid slurmstepd abort if malloc fails for accounting · 360fb080
      Morris Jette authored
      If a node is out of memory, then the malloc that slurmstepd performs
        periodically may fail, killing slurmstepd and orphaning its
        processes.
      bug 2341
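The on-demand group caching described in commit 48a4cdf8 above can be illustrated with a small sketch. This is not slurmd's implementation (which is in C); it is a minimal Python model, assuming a per-entry validity period that plays the role of GroupUpdateTime and a `purge()` method that plays the role of `scontrol reconfig`. The class name `GroupCache`, the injected `lookup` callable, and the `clock` parameter are all hypothetical names chosen for illustration.

```python
import time


class GroupCache:
    """Hypothetical on-demand cache of per-user group lists.

    Entries are filled the first time a user is looked up (no enumeration
    of all groups) and are considered valid for ttl_seconds.
    """

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # callable (user, gid) -> iterable of GIDs
        self._ttl = ttl_seconds    # stands in for GroupUpdateTime
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # (user, gid) -> (fetch_time, groups)

    def get(self, user, gid):
        now = self._clock()
        key = (user, gid)
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # entry is still valid
        # First use, or the entry expired: query the backend for this user only.
        groups = list(self._lookup(user, gid))
        self._entries[key] = (now, groups)
        return groups

    def purge(self):
        """Drop every entry (the role 'scontrol reconfig' plays in the commit)."""
        self._entries.clear()
```

On a Unix system, the `lookup` argument could plausibly be Python's `os.getgrouplist`, which resolves a single user's groups without enumerating the whole group database.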
  3. 13 Jan, 2016 2 commits
    • backfill scheduling with group limits fix · 3ee1632f
      Morris Jette authored
      Backfill scheduling fix: If a job can't be started due to a "group" resource
          limit, rather than reserve resources for it when the next job ends, don't
          reserve any resources for it. The problem with the original logic is that
          if a lot of resources are reserved for such pending jobs, then jobs further
          down the queue may be deferred when they really can and should be started. An
          ideal solution would track all of the TRES resources through time as jobs
          start and end, but we don't have that logic in the backfill scheduler and
          don't want that extra overhead in the backfill scheduler.
      bugs 2326 and 2282
    • Add more partition info to "scontrol write config" · f428705b
      Alejandro Sanchez authored
      bug 2303
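The effect of the backfill change in commit 3ee1632f above can be shown with a toy model. This sketch is heavily simplified and is not the Slurm backfill scheduler: it has no time dimension, so a "reservation" is modeled as holding back currently free nodes rather than reserving resources at the time the next job ends. The `Job` class, the `hits_group_limit` flag, and the `reserve_for_limited` switch (old behavior when true, new behavior when false) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int
    hits_group_limit: bool = False  # blocked by a "group" resource limit


def backfill(queue, free_nodes, reserve_for_limited=False):
    """Return the names of jobs started from a priority-ordered queue."""
    started = []
    reserved = 0  # nodes held back for pending jobs ahead in the queue
    for job in queue:
        usable = free_nodes - reserved
        if job.hits_group_limit:
            # Old logic reserved resources for the job anyway; the fix
            # skips it without reserving anything.
            if reserve_for_limited:
                reserved += min(job.nodes, usable)
            continue
        if job.nodes <= usable:
            started.append(job.name)
            free_nodes -= job.nodes
        else:
            # An ordinary pending job still holds back what it can.
            reserved += min(job.nodes, usable)
    return started


queue = [Job("a", 4, hits_group_limit=True), Job("b", 2), Job("c", 2)]
old = backfill(queue, free_nodes=4, reserve_for_limited=True)
new = backfill(queue, free_nodes=4, reserve_for_limited=False)
```

With the old behavior, job "a" ties up all four nodes and neither "b" nor "c" starts; with the fix, "a" is skipped and both smaller jobs start.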
  4. 12 Jan, 2016 5 commits
  5. 11 Jan, 2016 6 commits
  6. 08 Jan, 2016 2 commits
    • Remove Sun Constellation. · bf04fa4d
      Tim Wickberg authored
      Update NEWS file for final removal of Sun Constellation, Elan, and
      IBM Federation (replaced by the switch/nrt plugin). Clean up documentation
      and a few outstanding ifdef blocks. Unless you were defining
      HAVE_SUN_CONST there are no functional changes.
    • Change slurmstepd to initialize authentication before task launch. · 870273ca
      Tim Wickberg authored
      Otherwise, upgrading Slurm on a compute node while tasks are running
      will cause a plugin mismatch, because slurmstepd previously did not load
      the library until task completion. Bug 2319.
  7. 07 Jan, 2016 7 commits
  8. 06 Jan, 2016 8 commits
  9. 05 Jan, 2016 3 commits
  10. 04 Jan, 2016 4 commits
    • Fix for job state reason · 65bb07dc
      Morris Jette authored
      Set a job's reason to "Priority" when a higher-priority job in that partition
          (or reservation) cannot start, rather than leaving the reason set to
          "Resources".
      bug 2285
    • Support partition-specific memory allocation tracking · fc1636f3
      Morris Jette authored
      The partition-specific SelectTypeParameters parameter can now be used to
          change the memory allocation tracking specification in the global
          SelectTypeParameters configuration parameter. Supported partition-specific
          values are CR_Core, CR_Core_Memory, CR_Socket and CR_Socket_Memory. If the
          global SelectTypeParameters value includes memory allocation management and
          the partition-specific value does not, then memory allocation management for
          that partition will NOT be supported (i.e. memory can be over-allocated).
          Likewise the global SelectTypeParameters might not include memory management
          while the partition-specific value does.
      bug 2239
    • Danny Auble's avatar
      5f6e70e0
    • Danny Auble's avatar
      66cd9cbc
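The override rule described in commit fc1636f3 above (a partition-specific SelectTypeParameters value decides whether memory is tracked for that partition, regardless of the global value) can be sketched as a small decision function. This is an illustrative model only, not Slurm code: the function name `tracks_memory` is hypothetical, and the assumption that a value implies memory management exactly when one of its comma-separated tokens ends in `_Memory` is a simplification made for this example.

```python
# Partition-specific values listed as supported in the commit message.
SUPPORTED_PARTITION_VALUES = {
    "CR_Core",
    "CR_Core_Memory",
    "CR_Socket",
    "CR_Socket_Memory",
}


def tracks_memory(global_value, partition_value=None):
    """Return True if memory allocation would be tracked for a partition.

    Assumption for this sketch: a value enables memory management when any
    of its comma-separated tokens ends with "_Memory".
    """
    if partition_value is None:
        effective = global_value       # no override: the global value applies
    else:
        if partition_value not in SUPPORTED_PARTITION_VALUES:
            raise ValueError(f"unsupported partition value: {partition_value}")
        effective = partition_value    # the partition value takes precedence
    return any(tok.strip().endswith("_Memory") for tok in effective.split(","))
```

For instance, a global value of `CR_Core_Memory` combined with a partition value of `CR_Core` yields no memory tracking for that partition (so memory can be over-allocated there), while a global `CR_Core` combined with a partition `CR_Socket_Memory` yields tracking for that partition only.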