```
Usage: srun [OPTIONS...] executable [args...]

parallel run options:
  -n, --nprocs=nprocs         number of processes to run
  -c, --cpus=ncpus            number of cpus required per process
  -N, --nodes=nnodes          number of nodes on which to run
  -p, --partition=partition   partition requested
  -I, --immediate             exit if resources are not immediately available
  -O, --overcommit            overcommit resources
  -l, --label-output          prepend task number to lines of stdout/err
  -m, --distribution=(block|cyclic)
                              distribution method for tasks
  -B, --base-node=hostname    start allocation at base node
  -J, --job-name=jobname      name of job
  -o, --output=out            location of stdout redirection
  -i, --input=in              location of stdin redirection
  -e, --error=err             location of stderr redirection
  -v, --verbose               verbose operation
  -d, --debug                 enable debug

allocate only:
  -A, --allocate              allocate resources and spawn a shell

attach to running job:
  -a, --attach=id             attach to running job with job id = id

constraint options:
      --mincpus=n             cpus per node
      --mem=MB                minimum amount of real memory
      --vmem=MB               minimum amount of virtual memory
      --tmp=MB                minimum amount of temp disk
  -C, --constraint=constraint list
                              specify a list of constraints
      --contiguous            demand a contiguous range of nodes
  -w, --nodelist=host1,host2,...
                              request a specific list of hosts

Help options:
  -?, --help                  Show this help message
      --usage                 Display brief usage message
```
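A typical invocation names the resources followed by the program to run. As an illustrative sketch (here `hostname` simply stands in for any executable):

```sh
# Request 2 nodes and run 4 processes across them.
srun -N 2 -n 4 hostname
```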
Environment Variable | Option |
---|---|
SLURM_NPROCS | -n, --nprocs=n |
SLURM_CPUS_PER_TASK | -c, --cpus=n |
SLURM_NNODES | -N, --nodes=n |
SLURM_PARTITION | -p, --partition=partition |
SLURM_STDOUTMODE | -o, --output=out |
SLURM_STDINMODE | -i, --input=in |
SLURM_STDERRMODE | -e, --error=err |
SLURM_DISTRIBUTION | -m, --distribution=(block|cyclic) |
SLURM_DEBUG | -d, --debug |
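The environment variables provide defaults which the corresponding command line options override. A minimal sketch, assuming a hypothetical executable ./mycode:

```sh
# Defaults taken from the environment; equivalent to `srun -N 2 -n 8 ./mycode`.
export SLURM_NNODES=2
export SLURM_NPROCS=8
srun ./mycode

# A command line option overrides its environment variable: 16 processes here.
srun -n 16 ./mycode
```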
**-n, --nprocs=nprocs**
Request that srun allocate and initiate nprocs processes. The number of processes per node may be controlled with the -c and -N options. The default is one process.
**-c, --cpus=ncpus**
Request that ncpus cpus be allocated per process. This is useful if the job will be multithreaded and more than one cpu is required for optimal performance. The default is one cpu per process.
**-N, --nodes=nnodes**
Request that nnodes nodes be allocated to this job. The default is to allocate enough nodes to provide one cpu per process.
**-p, --partition=N**
Request that nodes be allocated from partition N. N should be a numeric argument. The partition numbers are assigned by the slurm administrator. The default partition is partition 0 (zero).
**-I, --immediate**
srun will exit if resources are not immediately available. By default, the immediate option is off, and srun will block until resources become available.
**-O, --overcommit**
By default, specifying the -n and -N options such that more than one process is allocated to a cpu is an error. The overcommit option allows this behavior.
**-l, --label-output**
Request that a task id be prepended to each line of stdout and stderr during a run.
**-m, --distribution=(block|cyclic)**
Change the way in which the nprocs processes are distributed over the nnodes nodes. For block distribution, the processes are allocated in order to the cpus on a node. For cyclic distribution, the processes are distributed in a round-robin fashion across the allocated nodes. The default distribution type is block.
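For instance, with four processes on two nodes the two policies place tasks as sketched below (the placement comments are illustrative, not srun output; ./mycode is a hypothetical executable):

```sh
# block:  node0 gets tasks {0,1}, node1 gets tasks {2,3}
srun -N 2 -n 4 -m block ./mycode

# cyclic: node0 gets tasks {0,2}, node1 gets tasks {1,3}
srun -N 2 -n 4 -m cyclic ./mycode
```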
**-B, --base-node=hostname**
Request a specific node to be the first node in the allocation. The default is "any."
**-J, --job-name=jobname**
Name the job. The default is an empty name.
**-o, --output=out**
Change how stdout is redirected. By default, the stdout of all processes is redirected to srun's stdout. If a filename is specified, all stdout will be redirected to this file. If the filename ends in a '%' character, each task will create a separate file for stdout named filename.[task_id], where task_id is the task number of the process.
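A sketch of the two forms of stdout redirection, assuming a hypothetical executable ./mycode:

```sh
# All tasks' stdout collected in a single file:
srun -n 4 -o run.out ./mycode

# Filename ending in '%': a separate stdout file per task,
# named according to the filename.[task_id] rule described above.
srun -n 4 -o run% ./mycode
```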
**-i, --input=in**
Change how stdin is redirected. By default, stdin is redirected from srun to task 0. stdin may be redirected from a file, or from a different file per task using the naming scheme described above for the -o option.
**-e, --error=err**
Change how stderr is redirected. By default, stderr is redirected to the same place as stdout. Thus, if stdout and stderr should both go to the same file, --output is the only option that needs to be specified. The --error option allows stderr and stdout to be redirected to different locations. The argument takes the same form as that of the -o option.
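For example, to separate the two streams (the filenames and executable are hypothetical):

```sh
# stdout and stderr both go to run.log:
srun -n 2 -o run.log ./mycode

# stdout and stderr go to different files:
srun -n 2 -o run.out -e run.err ./mycode
```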
**-v, --verbose**
Increase the verbosity of srun. Multiple -v options will increase the output further.
**-d, --debug**
Put srun into debug mode.
**-A, --allocate**
Allocate resources and spawn a subshell which has access to these resources. This allows multiple runs on the same set of nodes with the same number of processes in each run. It is an error to specify both --allocate and a command to run.
**-a, --attach=id**
Attach to a currently running job. The running job must be detached. Reattaching to a running job will cause stdout and stderr to be redirected to srun and will allow signals to be forwarded to the remote tasks.
**-C, --constraint=constraint list**
Specify a list of constraints. Constraints are typically a comma separated list of "variable=value" pairs, such as "ncpus=2,mem=1024", which will constrain the list of nodes considered for the job to those that have the requested attributes.
**-w, --nodelist=host1,host2,...**
Request that the job be run on a specific list of nodes. The nodelist is a comma separated list of hostnames. Lists of consecutive hosts may be specified in range form if the cluster naming convention allows this. For example, the nodelist "host1,host2,host3" may be specified as "host[1-3]". See more in "Hostname Ranges" below.
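For example, the following two requests are equivalent (quoting the range protects the brackets from shell expansion; ./mycode is a stand-in executable):

```sh
srun -w host1,host2,host3 -n 3 ./mycode
srun -w "host[1-3]" -n 3 ./mycode
```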
**--contiguous**
Only allow the job to run on a contiguous range of hosts.
Once srun has processed the user options, it generates a node
allocation request, unless it is running within an environment that
already has nodes allocated to it (see --allocate). srun
then forwards this request to the slurm job manager. If the request
cannot be met immediately, srun will block and wait for
the resources to become available, unless the --immediate option is
specified, in which case srun will terminate.
Once the appropriate resources have been allocated, srun
will start all processes on the assigned nodes. Once all processes
are running, stdout and stderr will be displayed and stdin will be
forwarded to process 0, unless these defaults have been changed
with --output, --input, or --error. All signals except for SIGQUIT
and SIGKILL will be forwarded to all remote processes. srun
will terminate once all remote processes have exited. The
exit status of srun will represent the maximum exit status of
the remote processes.
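A sketch of checking that status from the shell (./mycode is hypothetical):

```sh
srun -n 4 ./mycode
echo $?   # the maximum exit status among the 4 remote processes
```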
If allocate mode is specified via --allocate, no remote processes
are started when the node allocation is complete. Instead, srun
will spawn a subshell that will have access to the allocated resources.
Thus, subsequent invocations of srun within the subshell
will run across the nodes allocated with --allocate. If any of the
node allocation options (-n, -c, -N) are specified from within the
subshell, it will be assumed that a new allocation is being requested
and srun will allocate a new set of nodes. Resources allocated
with --allocate will be released when the subshell exits.
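A sketch of an allocate-mode session (the program names are hypothetical):

```sh
srun -A -N 4          # allocate 4 nodes and spawn a subshell
srun -n 8 ./step1     # runs on the 4 allocated nodes
srun -n 8 ./step2     # reuses the same allocation
exit                  # exiting the subshell releases the nodes
```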
If I/O is not to be redirected from/to a terminal then srun will, by
default, put itself into the "background." To accomplish this, srun
will run a copy of itself on the first of the allocated nodes for the
job then terminate. The new srun task will then initiate the rest of the
processes and manage I/O redirection, etc.
In order to "reattach" stdout, stderr, and signal forwarding to a "backgrounded" job, you may run srun with the --attach=jid option. This will reattach your current terminal to the running job. Normally, no other options are valid with --attach.

You may also need to reattach to a job if the node you are on during an srun session goes down. In this case, slurm will automatically "background" all active srun sessions on the failed node, sending their output to a file in the current working directory of the program. To regain control of the srun session, simply reattach to the job. Note that jobs that are receiving stdin from a terminal cannot be "backgrounded."
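For example, to reattach the current terminal to a backgrounded job (the job id 42 is hypothetical):

```sh
srun --attach=42
```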
Last Modified December 21, 2001
Maintained by Moe Jette jette1@llnl.gov