Commit 2c7c5459 authored by Dorian Krause, committed by Morris Jette

Fix use-after-free in srun

This commit fixes a bug in the multi-prog handling. When running
salloc -N 2 srun -O --multi-prog mp.conf where mp.conf reads

0-192 true

srun can crash. valgrind reports:

==6857== Invalid read of size 4
==6857==    at 0x45938D: bit_realloc (bitstring.c:189)
==6857==    by 0x5977A9: _update_task_mask (multi_prog.c:335)
==6857==    by 0x597A5E: _validate_ranks (multi_prog.c:403)
==6857==    by 0x597D1E: verify_multi_name (multi_prog.c:469)
==6857==    by 0x6E7B4BE: launch_p_handle_multi_prog_verify (launch_slurm.c:453)
==6857==    by 0x58A25D: launch_g_handle_multi_prog_verify (launch.c:493)
==6857==    by 0x58E556: _opt_args (opt.c:1927)
==6857==    by 0x58A3B9: initialize_and_process_args (opt.c:270)
==6857==    by 0x591F82: init_srun (srun_job.c:459)
==6857==    by 0x427E70: srun (srun.c:193)
==6857==    by 0x428E23: main (srun.wrapper.c:17)
==6857==  Address 0x5ace440 is 16 bytes inside a block of size 28 free'd
==6857==    at 0x4C2BB4A: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6857==    by 0x446886: slurm_xrealloc (xmalloc.c:139)
==6857==    by 0x45944C: bit_realloc (bitstring.c:191)
==6857==    by 0x5977A9: _update_task_mask (multi_prog.c:335)
==6857==    by 0x597A5E: _validate_ranks (multi_prog.c:403)
==6857==    by 0x597D1E: verify_multi_name (multi_prog.c:469)
==6857==    by 0x6E7B4BE: launch_p_handle_multi_prog_verify (launch_slurm.c:453)
==6857==    by 0x58A25D: launch_g_handle_multi_prog_verify (launch.c:493)
==6857==    by 0x58E556: _opt_args (opt.c:1927)
==6857==    by 0x58A3B9: initialize_and_process_args (opt.c:270)
==6857==    by 0x591F82: init_srun (srun_job.c:459)
==6857==    by 0x427E70: srun (srun.c:193)
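
The trace above shows bit_realloc reading from a block that an earlier bit_realloc call (via realloc) had already released. One common way this pattern arises is a caller that keeps using its old pointer after a helper has reallocated the buffer. The following is a minimal sketch of the safe pattern, not the actual Slurm change: mask_t, mask_alloc, mask_grow, mask_set and mask_test are hypothetical names invented for illustration and are unrelated to Slurm's bitstr_t API. The helper takes a pointer to the caller's pointer so that the address returned by realloc always reaches the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable bit mask for illustration; NOT Slurm's bitstr_t. */
typedef struct {
    size_t nbits;           /* number of valid bits */
    unsigned char bits[];   /* flexible array member holding the bits */
} mask_t;

/* Allocate a zeroed mask with room for nbits bits; NULL on failure. */
mask_t *mask_alloc(size_t nbits)
{
    mask_t *m = calloc(1, sizeof(*m) + (nbits + 7) / 8);
    if (m)
        m->nbits = nbits;
    return m;
}

/*
 * Grow *mp to hold at least nbits bits. realloc may move the block,
 * so the new address is written back through mp. On failure the old
 * block is left untouched and *mp stays valid.
 */
int mask_grow(mask_t **mp, size_t nbits)
{
    assert(mp && *mp);
    mask_t *m = *mp;
    if (nbits <= m->nbits)
        return 0;

    size_t old_bytes = (m->nbits + 7) / 8;
    size_t new_bytes = (nbits + 7) / 8;
    mask_t *tmp = realloc(m, sizeof(*tmp) + new_bytes);
    if (!tmp)
        return -1;

    /* Zero only the newly added bytes. */
    memset(tmp->bits + old_bytes, 0, new_bytes - old_bytes);
    tmp->nbits = nbits;
    *mp = tmp;              /* the caller now sees the (possibly moved) block */
    return 0;
}

/* Set one bit, growing the mask first when the bit is out of range. */
int mask_set(mask_t **mp, size_t bit)
{
    if (bit >= (*mp)->nbits && mask_grow(mp, bit + 1) != 0)
        return -1;
    (*mp)->bits[bit / 8] |= (unsigned char)(1u << (bit % 8));
    return 0;
}

/* Return 1 if the bit is set, 0 if it is clear or out of range. */
int mask_test(const mask_t *m, size_t bit)
{
    return bit < m->nbits && ((m->bits[bit / 8] >> (bit % 8)) & 1);
}
```

A caller would hold a single mask_t * and always pass its address, e.g. mask_set(&m, rank) for each rank in a range such as 0-192. A signature that took the mask by value would leave the caller's copy dangling whenever realloc moved the block.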
parent 8712a254