- 24 Feb, 2012 6 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Fixes for jobs with long argument lists
-
- 23 Feb, 2012 1 commit
-
-
Danny Auble authored
-
- 22 Feb, 2012 4 commits
-
-
Pär Andersson authored
Replace two xstrcat() calls per argument with a single xmalloc() call. This significantly speeds up handling of REQUEST_JOB_INFO RPCs when some jobs have long argument lists.
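A minimal sketch of the idea behind this change, under one reading of the message: size the buffer once, then copy each argument into it, instead of growing the string twice per argument. It uses plain `malloc()` rather than SLURM's `xmalloc()`, and the name `join_args` is hypothetical, not taken from the SLURM source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join argv[0..argc-1] with single spaces using one
 * allocation. Returns a malloc'd, NUL-terminated string that the caller
 * must free, or NULL if the allocation fails. */
char *join_args(uint32_t argc, char *const argv[])
{
    size_t total = 1;                      /* room for the trailing NUL */
    uint32_t i;

    assert(argc == 0 || argv != NULL);

    /* First pass: measure, so the buffer is allocated exactly once. */
    for (i = 0; i < argc; i++)
        total += strlen(argv[i]) + 1;      /* argument plus one separator */

    char *buf = malloc(total);
    if (buf == NULL)
        return NULL;

    /* Second pass: copy each argument to the current write position. */
    char *p = buf;
    for (i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]);
        if (i > 0)
            *p++ = ' ';
        memcpy(p, argv[i], len);
        p += len;
    }
    *p = '\0';
    return buf;
}
```

Repeated concatenation typically rescans or regrows the destination on every call, so its cost grows faster than the total argument length; the two-pass version touches each byte a fixed number of times.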
-
Pär Andersson authored
-
Pär Andersson authored
Change argc from uint16_t to uint32_t in slurmctld and slurmstepd. The rest of the code already uses uint32_t for argc.
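The message does not say what failure the narrower type caused. Assuming the concern is silent truncation when a 32-bit count passes through a 16-bit field, the following sketch shows the effect. The function names are hypothetical and do not appear in SLURM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a 16-bit field would hold after being
 * assigned a 32-bit argument count (conversion is modulo 65536). */
uint16_t store_argc_16(uint32_t argc)
{
    return (uint16_t)argc;
}

/* Returns 1 if the count survives a round trip through a 16-bit field. */
int argc_fits_in_16(uint32_t argc)
{
    return store_argc_16(argc) == argc;
}

/* Boundary cases: 65535 is the largest count a uint16_t can represent. */
void argc_width_demo(void)
{
    assert(store_argc_16(65535) == 65535);
    assert(store_argc_16(65536) == 0);
    assert(!argc_fits_in_16(65536));
}
```

Using one width everywhere removes the need for any such conversion between components.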
-
Pär Andersson authored
-
- 21 Feb, 2012 1 commit
-
-
jette authored
Fixes a bunch of warnings of this type: "warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body"
-
- 20 Feb, 2012 2 commits
-
-
jette authored
In the current version of the slurm initscript, a stop action returns a non-zero exit code, because the slurmstatus exit code is used directly and the daemons are stopped. Ensure that when called from slurmstop, the slurmstatus error code is reversed to correctly match the expected error code of the stop stage. Port of v2.4 commit a09bffa5, authored by Matthieu Hautreux.
-
jette authored
Patch from Aleksej Saushev.
-
- 06 Feb, 2012 3 commits
-
-
Danny Auble authored
c0a7a7a4
-
Danny Auble authored
openpty(3) is a convenience function in BSD and glibc that internally calls the equivalent of the following (in pseudocode):
    int masterfd = open("/dev/ptmx", flags);
    grantpt(masterfd);
    unlockpt(masterfd);
    int slavefd = open(slave, O_RDWR|O_NOCTTY);
On Linux, with some combinations of glibc and kernel (in this case glibc-2.14/Linux-3.1), the equivalent of grantpt(3) was failing in slurmstepd with EPERM, because the allocated pty was owned by root instead of by the user running the slurm job.
From the POSIX description of grantpt: "The grantpt() function shall change the mode and ownership of the slave pseudo-terminal device... The user ID of the slave shall be set to the real UID of the calling process..." http://pubs.opengroup.org/onlinepubs/007904875/functions/grantpt.html
This means that for POSIX compliance, the real user id of slurmstepd must be that of the user executing the SLURM job at the time openpty(3) is called. Unfortunately, the real user id of slurmstepd at this point is still root, and only the effective uid is set to the user.
This patch is a work-around that uses the (non-portable) setresuid(2) system call to reset the real and effective uids of the slurmstepd process to the job user, while keeping the saved uid of root. After the openpty(3) call, the previous credentials are reestablished using the same call.
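The work-around described above can be sketched as follows. This is not the actual slurmstepd patch: the function name `open_pty_as_user` is hypothetical, and the sketch spells out the posix_openpt/grantpt/unlockpt sequence instead of calling openpty(3), so that the step affected by the real uid is visible. It assumes Linux with glibc, where `getresuid()` and `setresuid()` are available under `_GNU_SOURCE`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a pty pair while the real and effective uids
 * are temporarily set to job_uid, then restore the previous credentials.
 * On success returns 0 and stores both descriptors; on failure returns -1
 * and leaves both descriptors set to -1. */
int open_pty_as_user(uid_t job_uid, int *master_fd, int *slave_fd)
{
    uid_t ruid, euid, suid;
    int rc = -1;

    assert(master_fd != NULL && slave_fd != NULL);
    *master_fd = -1;
    *slave_fd = -1;

    /* Remember the current real, effective and saved uids. */
    if (getresuid(&ruid, &euid, &suid) < 0)
        return -1;

    /* Real and effective uid become the job user; (uid_t)-1 leaves the
     * saved uid untouched so the old credentials can be restored. */
    if (setresuid(job_uid, job_uid, (uid_t)-1) < 0)
        return -1;

    int mfd = posix_openpt(O_RDWR | O_NOCTTY);
    if (mfd >= 0 && grantpt(mfd) == 0 && unlockpt(mfd) == 0) {
        const char *name = ptsname(mfd);
        int sfd = (name != NULL) ? open(name, O_RDWR | O_NOCTTY) : -1;
        if (sfd >= 0) {
            *master_fd = mfd;
            *slave_fd = sfd;
            rc = 0;
        }
    }
    if (rc != 0 && mfd >= 0)
        close(mfd);

    /* Reestablish the previous credentials with the same call. */
    if (setresuid(ruid, euid, suid) < 0) {
        if (rc == 0) {
            close(*master_fd);
            close(*slave_fd);
            *master_fd = -1;
            *slave_fd = -1;
        }
        return -1;
    }
    return rc;
}
```

A caller in the situation the message describes (real uid root, effective uid the job user, saved uid root) would pass the job user's uid. The restore step is permitted there because root is still the saved uid.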
-
Danny Auble authored
-
- 03 Feb, 2012 1 commit
-
-
Morris Jette authored
Fix for srun running within an existing allocation with the --exclude option and a --nnodes count small enough to remove more nodes.
    > salloc -N 8
    salloc: Granted job allocation 1000008
    > srun -N 2 -n 2 --exclude=tux3 hostname
    srun: error: Unable to create job step: Requested node configuration is not available
Patch from Phil Eckert, LLNL.
-
- 02 Feb, 2012 1 commit
-
-
Morris Jette authored
Fix bug in step task distribution when nodes are not configured in numeric order. Patch from Hongjia Cao, NUDT.
-
- 01 Feb, 2012 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Fix bug when a requeued batch job is scheduled to run on a different node zero, but attempts job launch on the old node zero, causing the fatal error "Invalid host_index -1 for job #"
-
Morris Jette authored
-
Morris Jette authored
Avoid slurmctld abort due to bad pointer when setting an advanced reservation MAINT flag if it contains no nodes (only licenses).
-
- 31 Jan, 2012 6 commits
-
-
Danny Auble authored
blocks are in an error state.
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Didier GAZEN authored
Hi,
With slurm 2.3.2 (or 2.3.3), I encounter the following error when trying to launch, as root, a command attached to a running user's job, even if I use the --uid=<user> option:
    sila@suse112:~> squeue
    JOBID PARTITION NAME     USER STATE   TIME TIMELIMIT NODES CPUS NODELIST(REASON)
    551   debug     mysleep. sila RUNNING 0:02 UNLIMITED 1     1    n1
    root@suse112:~ # srun --jobid=551 hostname
    srun: error: Unable to create job step: Access/permission denied   <--normal behaviour
    root@suse112:~ # srun --jobid=551 --uid=sila hostname
    srun: error: Unable to create job step: Invalid user id   <--problem
By increasing slurmctld verbosity, the log file displays the following error:
    slurmctld: debug2: Processing RPC: REQUEST_JOB_ALLOCATION_INFO_LITE from uid=0
    slurmctld: debug: _slurm_rpc_job_alloc_info_lite JobId=551 NodeList=n1 usec=1442
    slurmctld: debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
    slurmctld: error: Security violation, JOB_STEP_CREATE RPC from uid=0 to run as uid 1001
which occurs in the function _slurm_rpc_job_step_create (src/slurmctld/proc_req.c).
Here's my patch to prevent the command from failing (but I'm not sure that there are no side effects) :
-
Danny Auble authored
to give a correct priority on the first decay cycle after a restart of the slurmctld. Patch from Martin Perry, Bull.
-
- 27 Jan, 2012 7 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
Lucero Palau.
-
Morris Jette authored
-
Morris Jette authored
Patch from Mark Nelson
-
Morris Jette authored
This patch was previously applied to SLURM v2.4 and is being back-ported due to problems being reported in SLURM v2.3. The original commit is here: https://github.com/SchedMD/slurm/commit/4c0eea7b8c20ccb1cacad51838a1ea8257cc637d
-
Matthieu Hautreux authored
-
- 25 Jan, 2012 1 commit
-
-
Morris Jette authored
Set DEFAULT flag in partition structure when slurmctld reads the configuration file. Patch from Rémi Palancher. Note the flag is set when the information is sent via RPC for sinfo.
-
- 24 Jan, 2012 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-