Validate job GRES on slurmctld restart
Add a function that checks whether a job's GRES allocation is still valid after slurmctld restarts or is reconfigured. Specifically, if a job allocation includes GRES associated with specific files (the File option in gres.conf), kill the job if the GRES count has changed. For example, the job was allocated GPUs and the slurmctld configuration was then changed to remove those GPUs, so the job and node GRES bitmaps are now different sizes.
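The check described above can be sketched roughly as follows. This is a minimal illustration only, not the code in this commit: the struct names, field names, and the function job_gres_still_valid() are hypothetical stand-ins for Slurm's real GRES state structures. It also assumes that GRES without associated files are tracked by count only and therefore pass the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT Slurm's actual structures. */
typedef struct {
	uint32_t job_id;
	bool     gres_has_file; /* GRES is tied to specific device files */
	size_t   gres_bit_cnt;  /* size of the job's GRES bitmap on the node */
} job_gres_t;

typedef struct {
	size_t gres_cnt_config; /* GRES count in the current configuration */
} node_gres_t;

/*
 * Return true if the job's GRES allocation still matches the node's
 * current configuration, false if the job should be killed.
 */
bool job_gres_still_valid(const job_gres_t *job, const node_gres_t *node)
{
	assert(job != NULL);
	assert(node != NULL);

	/* Assumption: without files there is no per-device bitmap to go stale. */
	if (!job->gres_has_file)
		return true;

	/* A size mismatch means the GRES count changed under the job. */
	return job->gres_bit_cnt == node->gres_cnt_config;
}
```

In this sketch, a caller would run the check for each node in a job's allocation once the saved state has been reloaded, and kill the job on the first false result.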