- 03 Feb, 2012 1 commit
-
-
Morris Jette authored
Fix for srun running within an existing allocation with the --exclude option and a --nnodes count small enough to remove more nodes.
> salloc -N 8
salloc: Granted job allocation 1000008
> srun -N 2 -n 2 --exclude=tux3 hostname
srun: error: Unable to create job step: Requested node configuration is not available
Patch from Phil Eckert, LLNL.
-
- 02 Feb, 2012 1 commit
-
-
Morris Jette authored
Fix bug in step task distribution when nodes are not configured in numeric order. Patch from Hongjia Cao, NUDT.
-
- 01 Feb, 2012 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Fix bug in which a requeued batch job is scheduled to run on a different node zero, but attempts job launch on the old node zero, causing fatal error "Invalid host_index -1 for job #"
-
Morris Jette authored
-
Morris Jette authored
Avoid slurmctld abort due to bad pointer when setting an advanced reservation MAINT flag if it contains no nodes (only licenses).
-
- 31 Jan, 2012 6 commits
-
-
Danny Auble authored
blocks are in an error state.
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Didier GAZEN authored
Hi,
With slurm 2.3.2 (or 2.3.3), I encounter the following error when trying to launch, as root, a command attached to a running user's job, even if I use the --uid=<user> option:

sila@suse112:~> squeue
 JOBID PARTITION     NAME  USER   STATE  TIME TIMELIMIT NODES CPUS NODELIST(REASON)
   551     debug mysleep.  sila RUNNING  0:02 UNLIMITED     1    1 n1
root@suse112:~ # srun --jobid=551 hostname
srun: error: Unable to create job step: Access/permission denied   <-- normal behaviour
root@suse112:~ # srun --jobid=551 --uid=sila hostname
srun: error: Unable to create job step: Invalid user id            <-- problem

By increasing slurmctld verbosity, the log file displays the following error:

slurmctld: debug2: Processing RPC: REQUEST_JOB_ALLOCATION_INFO_LITE from uid=0
slurmctld: debug:  _slurm_rpc_job_alloc_info_lite JobId=551 NodeList=n1 usec=1442
slurmctld: debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
slurmctld: error: Security violation, JOB_STEP_CREATE RPC from uid=0 to run as uid 1001

which occurs in the function _slurm_rpc_job_step_create (src/slurmctld/proc_req.c). Here's my patch to prevent the command from failing (but I'm not sure that there are no side effects):
-
Danny Auble authored
to give a correct priority on the first decay cycle after a restart of the slurmctld. Patch from Martin Perry, Bull.
-
- 27 Jan, 2012 7 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
Lucero Palau.
-
Morris Jette authored
-
Morris Jette authored
Patch from Mark Nelson
-
Morris Jette authored
This patch was previously applied to SLURM v2.4 and is being back-ported due to problems being reported in SLURM v2.3. Original commit is here https://github.com/SchedMD/slurm/commit/4c0eea7b8c20ccb1cacad51838a1ea8257cc637d
-
Matthieu Hautreux authored
-
- 25 Jan, 2012 1 commit
-
-
Morris Jette authored
Set DEFAULT flag in partition structure when slurmctld reads the configuration file. Patch from Rémi Palancher. Note the flag is set when the information is sent via RPC for sinfo.
-
- 24 Jan, 2012 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Do not test for the CUDA environment variable on Cray systems
-
- 23 Jan, 2012 3 commits
-
-
Morris Jette authored
needed for test24.1
-
Morris Jette authored
-
Philip D. Eckert authored
Moe,

Here it is. I have added a subroutine to env.c to unset the user's environment and then called it from sbatch in main. I also removed the comment from the sbatch man page indicating that it wasn't working the same for a regular user as it did for Moab. It should now be functionally the same.

I think there is still a difference between how sbatch functions with an environment from a file and how it functioned when Moab execve'd the environment. However, I'm not sure what it would be at this point.

Again, sorry for so many iterations....

Phil
-
- 22 Jan, 2012 1 commit
-
-
Philip D. Eckert authored
Moe,

After doing more extensive testing, I came to realize that we had made a bad basic assumption. We believed that the user's environment should only be what was sent in the file via the --export-file option. However, that broke the previous behavior, especially in regard to Moab jobs. It also caused the SLURM-defined environment variables to be lost as well.

This patch will enable the correct behavior for Moab on top of SLURM when using the --export-file option, but the behavior is less than perfect when using it stand-alone with sbatch. When using the option with sbatch as a user, the file environment is read in, and then when the env_array_merge is made, some variables may get overwritten. This is good for the SLURM and MPI variables, but not so good for others. The problem is that reconciling two sources of environment is very problematic. I also added a caveat in the man page.

I made changes in my branch of SchedMD SLURM for 2.3; here is the patch.

Phil
-
- 20 Jan, 2012 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
Fix for possible invalid memory reference in slurmctld in job dependency logic. Patch from Carles Fenoy (Barcelona Supercomputer Center).
-
- 19 Jan, 2012 5 commits
-
-
Danny Auble authored
all jobs would be returned even if the flag was set. Patch from Bill Brophy, Bull.
-
Morris Jette authored
-
Morris Jette authored
We replaced references to "pipe" with a more generic "file descriptor". We also replaced a while loop in env.c with a for loop.
-
Morris Jette authored
-
Morris Jette authored
-
- 18 Jan, 2012 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Fix bug in --switch option with topology resulting in bad switch count use. Patch from Alejandro Lucero Palau (Barcelona Supercomputer Center).
-
- 17 Jan, 2012 1 commit
-
-
Morris Jette authored
-