Fix scheduling inconsistency with GRES (e1a00772) · Commits · Manuel G. Marciani / ces_slurm_simulator

Commit e1a00772 authored Jun 09, 2015 by Morris Jette

Fix scheduling inconsistency with GRES

1. I submit a first job that uses 1 GPU:
$ srun --gres gpu:1 --pty bash
$ echo $CUDA_VISIBLE_DEVICES
0

2. While the first job is still running, a 2-GPU job asking for 1 task per node
waits (and I don't really understand why):
$ srun --ntasks-per-node=1 --gres=gpu:2 --pty bash
srun: job 2390816 queued and waiting for resources

3. By contrast, a 2-GPU job requesting 1 core per socket (so just 1 socket)
actually gets GPUs allocated from two different sockets!
$ srun -n 1 --cores-per-socket=1 --gres=gpu:2 -p testk --pty bash
$ echo $CUDA_VISIBLE_DEVICES
1,2

With this change, job #2 is scheduled the same way as job #3.
Bug 1725
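
When re-running the reproduction above, it can help to check how many GPUs a job shell actually received rather than reading CUDA_VISIBLE_DEVICES by eye. The following is a minimal sketch, not part of this commit: the helper name count_visible_gpus is hypothetical, and it assumes the variable holds a plain comma-separated list of device indices, as in the outputs shown above.

```shell
# Hypothetical helper (not part of Slurm): count the devices named in a
# CUDA_VISIBLE_DEVICES-style string. Assumes a plain comma-separated list.
count_visible_gpus() {
    devices="$1"
    # An empty or unset value means no devices are visible.
    if [ -z "$devices" ]; then
        echo 0
        return
    fi
    # Split on commas by temporarily changing IFS, then count the fields.
    old_ifs="$IFS"
    IFS=','
    set -- $devices
    IFS="$old_ifs"
    echo "$#"
}
```

Inside a job shell, `count_visible_gpus "$CUDA_VISIBLE_DEVICES"` prints 1 for the value `0` from the first job and 2 for the value `1,2` from the third job.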

parent 5f337d38
