Overview
Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant,
and highly scalable cluster management and job scheduling system for large and
small Linux clusters. Components include machine status, partition management,
job management, scheduling, and stream copy modules. SLURM requires no kernel
modifications for its operation and is relatively self-contained.
An overview of the components and their interactions is available in
a separate document, SLURM: Simple Linux Utility for
Resource Management [PDF].
SLURM is written in the C language and uses a GNU autoconf configuration
engine. While initially written for Linux, other UNIX-like operating systems should
be easy porting targets. Code should adhere to the
Linux kernel coding style. (Some components of SLURM have been taken from
various sources. Some of these components are written in C++ or do not conform
to the Linux kernel coding style. However, new code written for SLURM should
follow these standards.)
Many of these modules have been built and tested on a variety of Unix computers
including Red Hat Linux, IBM's AIX, Sun's Solaris, and Compaq's Tru64. The only
module at this time that is operating system dependent is src/slurmd/read_proc.c.
We will be porting and testing on additional platforms in future releases.
Plugins
To make the use of different infrastructures possible, SLURM uses a general
purpose plugin mechanism. A SLURM plugin is a dynamically linked code object that
is loaded explicitly at run time by the SLURM libraries. It provides a customized
implementation of a well-defined API connected to tasks such as authentication,
interconnect fabric, task scheduling, etc. A common set of functions is defined
for all plugins of a particular variety. When a SLURM
daemon is initiated, it reads the configuration file to determine which of the
available plugins should be used. A plugin developer's
guide is available with general information about plugins. Most plugin
types also have their own documentation available, such as
SLURM Authentication Plugin API and
SLURM Job Completion Logging API.
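The loading step itself relies on the standard dynamic linker. The fragment
below is a minimal sketch of that pattern using POSIX dlopen/dlsym; the plugin
file name and the slurm_auth_verify symbol are illustrative assumptions, not
the actual SLURM internals (see the plugin developer's guide for the real
interface).

    #include <dlfcn.h>
    #include <stdio.h>

    /* Illustrative only: real symbol names and plugin locations are
     * defined by the SLURM plugin developer's guide. */
    typedef int (*verify_fn_t)(void *credential);

    int main(void)
    {
        /* Hypothetical plugin path, for demonstration purposes. */
        void *handle = dlopen("./auth_example.so", RTLD_NOW);
        if (handle == NULL) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }

        /* Resolve one entry point of the plugin's well-defined API. */
        verify_fn_t verify = (verify_fn_t) dlsym(handle, "slurm_auth_verify");
        if (verify == NULL) {
            fprintf(stderr, "dlsym failed: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }

        /* Call the customized implementation through the resolved pointer. */
        int rc = verify(NULL);
        printf("verify returned %d\n", rc);

        dlclose(handle);
        return 0;
    }

Such a loader would be linked with the dynamic-linking library (e.g. -ldl
on Linux).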
Directory Structure
The contents of the SLURM directory structure will be described below in increasing
detail as the structure is descended. The top level directory contains the scripts
and tools required to build the entire SLURM system. It also contains a variety
of subdirectories for each type of file.
General build tools/files include: acinclude.m4, autogen.sh,
configure.ac, Makefile.am, Make-rpm.mk, META, README,
slurm.spec.in, and the contents of the auxdir directory. autoconf
and make commands are used to build and install
SLURM in an automated fashion. NOTE: autoconf
version 2.52 or higher is required to build SLURM. Execute
autoconf -V to check your version number.
The build process is described in the README file.
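As a hedged illustration of that process, a typical sequence might look like
the following; the exact steps are those given in the README, and the
installation prefix shown is only an example.

    # Check the autoconf version first (2.52 or higher is required).
    autoconf -V

    # Typical autoconf-based build and install sequence.
    ./autogen.sh
    ./configure --prefix=/usr/local/slurm
    make
    make install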
Copyright and disclaimer information are in the files COPYING and DISCLAIMER.
All of the top-level subdirectories are described below.
auxdir: Used for building SLURM.
doc: Documentation, including man pages.
etc: Sample configuration files.
slurm: Header files for API use. These files must be installed. Placing
these header files in this location makes for better code portability.
src: Contains all source code and header files not in the "slurm"
subdirectory described above.
testsuite: DejaGnu is used as a testing framework and all of its files
are here.
Documentation
All of the documentation is in the subdirectory doc. Man pages for the
APIs, configuration file, commands, and daemons are in doc/man. Various
documents suitable for public consumption are in doc/html. Overall SLURM
design documents including various figures are in doc/pubdesign. Various
design documents (many of which are dated) can be found in doc/slides and
doc/txt. A survey of available resource managers as of 2001 is in doc/survey.
Source Code
Functions are divided into several categories, each in its own subdirectory.
The details of each directory's contents are provided below. The directories are
as follows:
api: Application Program Interfaces into the SLURM code. Used to send
and retrieve SLURM information from the central manager. These are the
functions user applications might utilize (see the sketch following
this list).
common: General purpose functions for widespread use throughout SLURM.
plugins: Plugin functions for various infrastructures. A separate
subdirectory is used for each plugin class:
auth for user authentication,
checkpoint for system-initiated checkpoint
and restart of user jobs,
jobcomp for job completion logging,
sched for job scheduling,
select for a job's node selection,
switch for switch (interconnect) specific functions,
etc.
scancel: User command to cancel (or signal) a job or job step.
scontrol: Administrator tool to manage SLURM.
sinfo: User command to get information on SLURM nodes and partitions.
slurmctld: SLURM central manager daemon code.
slurmd: SLURM daemon code to manage the compute server nodes, including
the execution of user applications.
smap: User command to view the layout of nodes, partitions, and jobs.
This is particularly valuable on systems like Blue Gene, which has a
three-dimensional torus topology.
squeue: User command to get information on SLURM jobs and job steps.
srun: User command to submit a job, get an allocation, and/or initiate
a parallel job step.
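As a hedged sketch of how the api functions might be used, the program below
asks the central manager for the current job list and prints it.
slurm_load_jobs, slurm_print_job_info_msg, and slurm_free_job_info_msg come
from the installed <slurm/slurm.h> header, though their exact signatures can
differ between SLURM releases; consult the man pages in doc/man for the
authoritative interface.

    #include <stdio.h>
    #include <time.h>
    #include <slurm/slurm.h>

    int main(void)
    {
        job_info_msg_t *jobs = NULL;

        /* Ask the central manager (slurmctld) for all job records.
         * An update_time of zero requests the full current set. */
        if (slurm_load_jobs((time_t) 0, &jobs, 0) != 0) {
            slurm_perror("slurm_load_jobs");
            return 1;
        }

        /* Print every job record, one per line. */
        slurm_print_job_info_msg(stdout, jobs, 1);

        slurm_free_job_info_msg(jobs);
        return 0;
    }

Such a program would typically be compiled against the installed headers and
linked with the SLURM library (for example, -lslurm).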
Configuration
Configuration files are included in the etc subdirectory. slurm.conf.example
includes a description of all configuration options and default settings. See
doc/man/man5/slurm.conf.5 for more details.
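As a hedged illustration only, a minimal configuration might resemble the
fragment below; the hostnames and node counts are invented, and
slurm.conf.example together with the slurm.conf.5 man page remain the
authoritative references for option names and defaults.

    # Hypothetical minimal slurm.conf fragment (all names are examples).
    ControlMachine=control0            # node running slurmctld
    AuthType=auth/munge                # authentication plugin to load
    NodeName=node[0-15] Procs=2 State=UNKNOWN
    PartitionName=debug Nodes=node[0-15] Default=YES State=UP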
init.d.slurm
is a script that determines which SLURM daemon(s) should execute on any node based
upon the configuration file contents. It will also manage these daemons: starting,
signalling, restarting, and stopping them.
Test Suite
The testsuite files use a DejaGnu framework for testing. These tests
are very limited in scope. We also have a set of Expect SLURM tests available
as a separate distribution. These tests are executed after SLURM has been installed
and the daemons initiated. About 110 test scripts exercise all SLURM commands
and options, including stress tests. Get these tests from
ftp://ftp.llnl.gov/pub/linux/slurm-qa
Tricks of the Trade
You can make a single node appear to SLURM as a Linux cluster by manually
defining HAVE_FRONT_END to have a non-zero value in the file config.h.
All (fake) nodes should be defined in the slurm.conf file.
These nodes should be configured with a single NodeAddr value
indicating the node on which the single slurmd daemon
executes. Initiate one slurmd and one
slurmctld daemon. Do not initiate too many
simultaneous job steps, or the one slurmd daemon
executing them all will be overloaded.
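A hedged sketch of such a configuration follows; the fake node names and the
use of localhost as the NodeAddr are illustrative assumptions.

    # Hypothetical emulated-cluster fragment: many fake nodes share one
    # NodeAddr, so the single slurmd daemon serves them all.
    NodeName=fake[0-15] NodeAddr=localhost Procs=1 State=UNKNOWN
    PartitionName=debug Nodes=fake[0-15] Default=YES State=UP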