Fix both socket-[un]constrained GRES allocation issues.

Do not assume that these sock_gres_t pointers always exist:

bits_by_sock
bits_by_sock[s]

If they don't, that means there are no GRES constrained to the current
iteration's socket `s`, and so the logic shouldn't allocate the current
iteration GRES `g`.

Analogously, do not assume that the bits_any_sock sock_gres_t member pointer
is always valid. If it isn't, it means there are no socket-unconstrained GRES
available to satisfy the job request, so the logic should not allocate the
current iteration GRES `g`.

Otherwise, job/node struct members holding GRES allocation information would
end up being incorrect, leading to improper allocations and also to errors
logged in the slurmctld log at deallocation time like:

error: gres/gpu: job <X> dealloc node <Y> GRES count underflow (0 < 1)

Bug 7827
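
For illustration only, the pointer checks described above can be sketched as follows. This is not the code from the patch: sketch_sock_gres_t is a hypothetical, simplified stand-in for sock_gres_t (plain 64-bit masks are assumed in place of Slurm's bitmap type), and the helper names and the sock_cnt bound are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for sock_gres_t. Only the two members
 * named in the commit message are modeled, each as a 64-bit mask where
 * bit g set means "GRES g is available".
 */
typedef struct {
	uint64_t *bits_any_sock;  /* socket-unconstrained GRES, may be NULL */
	uint64_t **bits_by_sock;  /* per-socket GRES, may be NULL */
	int sock_cnt;             /* assumed length of bits_by_sock */
} sketch_sock_gres_t;

/*
 * Return true only if GRES g is constrained to socket s.
 * A missing bits_by_sock or bits_by_sock[s] means "nothing to allocate
 * here", so the caller must skip GRES g instead of dereferencing NULL.
 */
static bool sketch_gres_on_sock(const sketch_sock_gres_t *sg, int s, int g)
{
	assert(sg != NULL);
	assert(g >= 0 && g < 64);

	if (!sg->bits_by_sock || s < 0 || s >= sg->sock_cnt ||
	    !sg->bits_by_sock[s])
		return false;

	return (*sg->bits_by_sock[s] >> g) & 1u;
}

/*
 * Return true only if GRES g is available without a socket constraint.
 * A NULL bits_any_sock means no socket-unconstrained GRES exist.
 */
static bool sketch_gres_any_sock(const sketch_sock_gres_t *sg, int g)
{
	assert(sg != NULL);
	assert(g >= 0 && g < 64);

	if (!sg->bits_any_sock)
		return false;

	return (*sg->bits_any_sock >> g) & 1u;
}
```

In this sketch a caller would update the job/node allocation counters only when one of the helpers returns true, which is the condition the message says must hold to avoid the count underflow at deallocation time.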