    node_features/knl_cray: add UME monitoring · 0c596661
    Morris Jette authored
    Add logic to monitor Uncorrectable Memory Errors (UME) and notify
      active jobs, since they may keep running for a while after an
      error occurs. This copies the logic from knl_generic to knl_cray.
      Cray systems may get a different UME monitoring system in the
      future. The original knl_generic development is in commit 56ff27da.
    bug 3341