Commit 2f3f0bc7 authored by Morris Jette

Fix GRES underflow error

If GRES are associated with specific CPUs and a job allocation includes
GRES that are not associated with the specific CPUs allocated to the
job, then an underflow error results when the job is deallocated. To
reproduce:

gres.conf:
Name=gpu File=/dev/tty0 CPUs=0-5
Name=gpu File=/dev/tty1 CPUs=6-11
Name=gpu File=/dev/tty2 CPUs=12-17
Name=gpu File=/dev/tty3 CPUs=18-23

Then
$ srun --gres=gpu:2 -N1 --ntasks-per-node=2 hostname

In slurmctld log file:
error: gres/gpu: job 695 dealloc node smd1 topo gres count underflow

Logic modified to increment the count based upon the specific GRES
actually allocated, ignoring the associated CPUs (too late to consider
that after the GRES are picked).
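
The mismatch described above can be illustrated with a small toy model.
This is only a sketch, not the slurmctld code: the names TopoEntry,
job_alloc, job_dealloc and run are hypothetical, the CPU-overlap check in
the "old" path is an assumption about the previous behaviour, and the
CPUs and GPUs given to the job are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TopoEntry:
    """One gres.conf line: GRES devices tied to a CPU range (toy model)."""
    gres_ids: set
    cpus: set
    alloc_cnt: int = 0


def make_node():
    # Mirrors the four gres.conf lines: GPU i is tied to CPUs 6*i .. 6*i+5.
    return [TopoEntry({i}, set(range(6 * i, 6 * i + 6))) for i in range(4)]


def job_alloc(node, job_gres, job_cpus, check_cpus):
    """Increment per-topology counts for the GRES given to the job."""
    for topo in node:
        if check_cpus and not (topo.cpus & job_cpus):
            # Assumed old behaviour: skip GRES whose CPUs the job lacks.
            continue
        topo.alloc_cnt += len(topo.gres_ids & job_gres)


def job_dealloc(node, job_gres):
    """Decrement the counts; return how many entries would underflow."""
    underflows = 0
    for topo in node:
        cnt = len(topo.gres_ids & job_gres)
        if topo.alloc_cnt >= cnt:
            topo.alloc_cnt -= cnt
        else:
            underflows += 1  # corresponds to the logged underflow error
            topo.alloc_cnt = 0
    return underflows


def run(check_cpus):
    node = make_node()
    job_gres = {0, 1}  # assumption: GPUs 0 and 1 were picked
    job_cpus = {0, 1}  # assumption: both tasks run on CPUs tied to GPU 0
    job_alloc(node, job_gres, job_cpus, check_cpus)
    return job_dealloc(node, job_gres)
```

In this model run(True) returns 1, because GPU 1 was never counted at
allocation time but is still decremented at deallocation, while
run(False), which counts every GRES actually allocated, returns 0.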
parent 7dedc6cf