
SLURM: A Highly Scalable Resource Manager
SLURM is an open-source resource manager designed for Linux clusters of all
sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive
access to resources (compute nodes) to users for some duration of time so they
can perform work. Second, it provides a framework for starting, executing, and
monitoring work (typically a parallel job) on a set of allocated nodes. Finally,
it arbitrates conflicting requests for resources by managing a queue of pending
work.
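The three functions above can be illustrated with a toy model. This is a minimal sketch for intuition only: `ToyResourceManager`, `Job`, and every method name are hypothetical and are not part of SLURM's API, and the strict first-in, first-out policy with exclusive node allocation is an assumption chosen for simplicity, not a description of how SLURM actually schedules.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical job record: a name and the number of nodes it needs.
    name: str
    nodes_needed: int
    allocated: list = field(default_factory=list)


class ToyResourceManager:
    """Hypothetical toy model; not SLURM's API or scheduling algorithm."""

    def __init__(self, node_names):
        self.free = list(node_names)  # nodes not allocated to any job
        self.pending = deque()        # FIFO queue of jobs waiting for nodes
        self.running = {}             # job name -> Job currently holding nodes

    def submit(self, job):
        """Queue the job, then start whatever fits, in arrival order."""
        self.pending.append(job)
        self._schedule()

    def release(self, name):
        """Return a finished job's nodes to the free pool and reschedule."""
        job = self.running.pop(name)
        self.free.extend(job.allocated)
        job.allocated = []
        self._schedule()

    def _schedule(self):
        # Strict FIFO (an assumption): stop at the first job that does not fit.
        while self.pending and self.pending[0].nodes_needed <= len(self.free):
            job = self.pending.popleft()
            # Exclusive allocation: each node goes to exactly one job.
            job.allocated = [self.free.pop(0) for _ in range(job.nodes_needed)]
            self.running[job.name] = job


rm = ToyResourceManager(["n1", "n2", "n3", "n4"])
rm.submit(Job("a", 3))  # fits, so it starts immediately on three nodes
rm.submit(Job("b", 2))  # only one node is free, so it waits in the queue
rm.release("a")         # freeing a's nodes lets b start
```

In this sketch, `submit` and `release` cover allocation, `running` stands in for the job-launch and monitoring framework, and `pending` is the queue that arbitrates conflicting requests.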
SLURM is not a sophisticated batch system, but it does provide an Application
Programming Interface (API) for integration with external schedulers such as the
Maui Scheduler. While other resource managers do exist, SLURM is unique in
several respects:
- Its source code is freely available under the GNU
General Public License.
- It is designed to operate in a heterogeneous cluster with up to thousands
of nodes.
- It is portable: it is written in C and uses a GNU autoconf configuration engine.
While it was initially written for Linux, other UNIX-like operating systems should
be easy porting targets. A plugin mechanism exists to support various interconnects,
authentication mechanisms, schedulers, etc.
- SLURM is highly tolerant of system failures, including failure of the node
executing its control functions.
- It is simple enough for the motivated end user to understand its source and
add functionality.