improve slurmd job record keeping
Recover the list of running jobs when slurmd restarts. This job list is used to determine when a job suspend can take place. This patch also adds job suspend retry logic, since a job suspended immediately after launch can (briefly) return an error indicating that it is not ready yet.
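The retry behavior described above can be sketched roughly as follows. This is an illustration only, not the code in this patch: the return codes, the `suspend_fn` callback type, `suspend_with_retry`, and the `fake_job`/`fake_suspend` test stub are all hypothetical names, and the attempt count and delay are arbitrary assumptions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for nanosleep() */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical return codes; these are NOT Slurm's actual error values. */
enum { SUSPEND_OK = 0, SUSPEND_NOT_READY = 1, SUSPEND_FAILED = 2 };

/* Hypothetical callback that attempts a single suspend of one job. */
typedef int (*suspend_fn)(unsigned job_id, void *ctx);

/*
 * Try to suspend a job, retrying only while the job reports that it is
 * not ready yet (for example, because it was launched moments ago).
 * Any other result, success or hard failure, is returned immediately.
 * If every attempt reports "not ready", SUSPEND_NOT_READY is returned.
 */
static int suspend_with_retry(suspend_fn try_suspend, unsigned job_id,
                              void *ctx, int max_attempts, long delay_ms)
{
    int rc = SUSPEND_NOT_READY;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = try_suspend(job_id, ctx);
        if (rc != SUSPEND_NOT_READY)
            return rc;

        /* Pause briefly before the next attempt, but not after the last. */
        if (attempt + 1 < max_attempts) {
            struct timespec ts;
            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);
        }
    }
    return rc;
}

/* Stub standing in for a freshly launched job: it reports "not ready"
 * for the first calls_until_ready calls, then succeeds. */
struct fake_job {
    int calls_until_ready;
    int calls;
};

static int fake_suspend(unsigned job_id, void *ctx)
{
    struct fake_job *job = ctx;

    (void)job_id;   /* unused in the stub */
    job->calls++;
    return (job->calls <= job->calls_until_ready) ? SUSPEND_NOT_READY
                                                  : SUSPEND_OK;
}
```

With a stub that needs two calls before it is ready, `suspend_with_retry(fake_suspend, 42, &job, 5, 1)` succeeds on the third attempt. Bounding the number of attempts keeps a job that never becomes ready from blocking the caller indefinitely.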