Backup slurmctld treats duplicate job state file as fatal
If the backup slurmctld assumes primary status, do NOT purge any job state files (batch script and environment files). Instead, treat any attempt to re-use one of them as a fatal error. Such an attempt may indicate that multiple primary slurmctld daemons are active (e.g. both backup and primary are functioning as primary, a split-brain condition).
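
The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual slurmctld code: the function name `create_job_state_file`, the `backup_took_over` flag, and the result codes are all hypothetical. The idea is that after a takeover the file is created with `O_EXCL`, so an existing file is reported as a duplicate instead of being silently overwritten. The caller would then treat the duplicate as fatal.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of Slurm's API. */
typedef enum {
    STATE_FILE_CREATED,   /* file was created (or truncated) successfully */
    STATE_FILE_DUPLICATE, /* file already existed after a backup takeover */
    STATE_FILE_ERROR      /* any other failure, see errno */
} state_file_rc_t;

/*
 * Create a job state file (e.g. a batch script or environment file).
 * backup_took_over is an assumed flag that is true when this daemon is
 * the backup controller acting as primary.
 */
state_file_rc_t create_job_state_file(const char *path, bool backup_took_over)
{
    int flags = O_CREAT | O_WRONLY;

    if (backup_took_over)
        flags |= O_EXCL;  /* never re-use an existing file after takeover */
    else
        flags |= O_TRUNC; /* normal primary: a stale file may be replaced */

    int fd = open(path, flags, 0600);
    if (fd < 0) {
        if (backup_took_over && errno == EEXIST)
            return STATE_FILE_DUPLICATE; /* caller should treat as fatal */
        return STATE_FILE_ERROR;
    }
    close(fd);
    return STATE_FILE_CREATED;
}
```

In this sketch the function only reports the duplicate. A real daemon would presumably log the job and file involved and then exit, since continuing could mean two controllers writing the same state.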