Problem when using srun --uid in conjunction with --jobid (patch included)
Hi,

With slurm 2.3.2 (or 2.3.3), I encounter the following error when trying, as root, to launch a command attached to a user's running job, even if I use the --uid=<user> option:

    sila@suse112:~> squeue
    JOBID PARTITION NAME     USER STATE   TIME TIMELIMIT NODES CPUS NODELIST(REASON)
    551   debug     mysleep. sila RUNNING 0:02 UNLIMITED 1     1    n1

    root@suse112:~ # srun --jobid=551 hostname
    srun: error: Unable to create job step: Access/permission denied    <--normal behaviour
    root@suse112:~ # srun --jobid=551 --uid=sila hostname
    srun: error: Unable to create job step: Invalid user id    <--problem

After increasing slurmctld verbosity, the log file displays the following error:

    slurmctld: debug2: Processing RPC: REQUEST_JOB_ALLOCATION_INFO_LITE from uid=0
    slurmctld: debug: _slurm_rpc_job_alloc_info_lite JobId=551 NodeList=n1 usec=1442
    slurmctld: debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
    slurmctld: error: Security violation, JOB_STEP_CREATE RPC from uid=0 to run as uid 1001

This error is raised in the function _slurm_rpc_job_step_create (src/slurmctld/proc_req.c).

Here's my patch to prevent the command from failing (but I'm not sure that there are no side effects):