    Do not defer slurmd node registration if HealthCheckProgram fails · b31fa177
    Alejandro Sanchez authored
    The deferral was introduced in bug 2504 (commit 7fb0c981) and bug 2643
    (commit 988edf12), respectively.
    
    The reasoning is that sysadmins who see nodes with Reason "Not Responding"
    while still being able to ping or access them manually end up confused.
    That reason should only be set if the node is truly not responding, not if
    the HealthCheckProgram execution failed or returned a non-zero exit code.
    In that case, the program itself would take the appropriate actions, such
    as draining the node and setting an appropriate Reason.
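
    As an illustration only (not part of this change), a health check script
    that takes those actions itself might look like the sketch below. The
    check_node() probe and its reason strings are hypothetical, and the sketch
    assumes the Slurm node name matches the short hostname; the only Slurm
    interface used is "scontrol update NodeName=... State=DRAIN Reason=...".

```python
import socket
import subprocess
import sys


def build_drain_command(node, reason):
    """Return the scontrol argument list that drains `node` with `reason`."""
    return [
        "scontrol", "update",
        f"NodeName={node}",
        "State=DRAIN",
        f"Reason={reason}",
    ]


def check_node():
    """Hypothetical probe: return None if healthy, else a short reason string."""
    return None  # replace with site-specific checks (disk, memory, mounts, ...)


def main():
    # Assumption: the Slurm node name equals the short hostname.
    node = socket.gethostname().split(".")[0]
    reason = check_node()
    if reason is not None:
        # The health check drains the node and records why, so the node
        # carries a meaningful Reason instead of "Not Responding".
        subprocess.run(build_drain_command(node, reason), check=False)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```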
    
    Bug 3931