-
Morris Jette authored
1. I submit a first job that uses 1 GPU: $ srun --gres gpu:1 --pty bash $ echo $CUDA_VISIBLE_DEVICES 0 2. while the first one is still running, a 2-GPU job asking for 1 task per node waits (and I don't really understand why): $ srun --ntasks-per-node=1 --gres=gpu:2 --pty bash srun: job 2390816 queued and waiting for resources 3. whereas a 2-GPU job requesting 1 core per socket (so just 1 socket) actually gets GPUs allocated from two different sockets! $ srun -n 1 --cores-per-socket=1 --gres=gpu:2 -p testk --pty bash $ echo $CUDA_VISIBLE_DEVICES 1,2 With this change #2 works the same way as #3. bug 1725
e1a00772
To find the state of this project's repository at the time of any of these versions, check out the tags.