1. 04 Jan, 2018 2 commits
    • task/cgroup - clarify messages when job/step memory[+swap] limit is hit. · 38e15bf7
      Alejandro Sanchez authored
      There are out-of-memory conditions where spikes in memory usage hit the
      configured limit. When this happens (failcnt > 0), the kernel might be
      able to reclaim unused pages, so the process can continue without the
      OOM killer actually killing it. This may or may not cause a problem for
      the application, so the message is reworded to make that clear.
      
      A separate bug will track the potential addition of a new feature to
      distinguish a memory limit being hit from the OOM killer actually
      killing the process. A notifier can be registered through the
      cgroup.event_control control file, so that the application is
      notified through an eventfd when the OOM killer actually kills the
      process.
      
      Bug 3820.
    • Docs - Add a reference to the Slurm Support How-To guide on the PMIx website. · bbfd1890
      Alejandro Sanchez authored
      Link published to the slurm-users list by Ralph Castain:
      https://pmix.github.io/pmix/how-to
  2. 03 Jan, 2018 13 commits
  3. 02 Jan, 2018 4 commits
  4. 28 Dec, 2017 3 commits
  5. 22 Dec, 2017 5 commits
  6. 20 Dec, 2017 13 commits