Correct job node_cnt value for job completion plugin
When using the jobcomp/script interface, we have noticed that the NODECNT environment variable is off by one when logging completed jobs in the NODE_FAIL state (though the NODELIST is correct). This appears to be because in many places job_completion_logger() is called after deallocate_nodes(), which appears to decrement job->node_cnt for DOWN nodes.

If job_completion_logger() only called the job completion plugin, then I would guess that it might be safe to move this call ahead of deallocate_nodes(). However, it seems that job_completion_logger() also does a fair amount of accounting work (?), so perhaps that would need to be split out first?

There is also the possibility that this is working as designed, though if so a well-placed comment in the code would be appreciated. If the decreased node count is intended, though, should the DOWN nodes also be removed from the job's NODELIST?

- Mark Grondona
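To illustrate the suspected ordering problem, here is a minimal toy model in C. It is not Slurm's actual code: struct toy_job and every toy_* function are hypothetical stand-ins, and the decrement in toy_deallocate_nodes() simply assumes the behaviour described above (one decrement of node_cnt per DOWN node).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a job record; NOT Slurm's real structure. */
struct toy_job {
    int node_cnt;           /* nodes currently counted against the job */
    const bool *node_down;  /* node_down[i] is true if node i is DOWN */
    size_t n_nodes;         /* number of nodes originally allocated */
};

/* Assumed side effect: each DOWN node is subtracted from node_cnt. */
static void toy_deallocate_nodes(struct toy_job *job)
{
    for (size_t i = 0; i < job->n_nodes; i++) {
        if (job->node_down[i])
            job->node_cnt--;
    }
}

/* Models the value a completion script would see as NODECNT. */
static int toy_logged_node_cnt(const struct toy_job *job)
{
    return job->node_cnt;
}

/* Ordering as reported: deallocate first, then log. */
static int toy_log_after_dealloc(struct toy_job job)
{
    toy_deallocate_nodes(&job);
    return toy_logged_node_cnt(&job);
}

/* Possible alternative: log first, then deallocate. */
static int toy_log_before_dealloc(struct toy_job job)
{
    int logged = toy_logged_node_cnt(&job);
    toy_deallocate_nodes(&job);
    return logged;
}
```

In this model, a four-node job with one DOWN node logs a count of 3 when deallocation runs first and 4 when logging runs first, which matches the off-by-one we see in NODECNT. Whether the real deallocate_nodes() behaves this way in every call path is exactly what would need confirming.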