    reservations (e.g. specific different core count for each node).
 -- Added slurmctld/dynalloc plugin for MapReduce+ support.
 -- Added "DynAllocPort" configuration parameter.
 -- Added partition parameter of SelectTypeParameters to override system-wide
    value.
 -- Added cr_type to partition_info data structure.
 -- Added allocated memory to node information available (within the existing
    select_nodeinfo field of the node_info_t data structure). Added Allocated
    Memory to node information displayed by sview and scontrol commands.
 -- Make sched/backfill the default scheduling plugin rather than sched/builtin
    (FIFO).
 -- Added support for a job having different priorities in different partitions.
 -- Added new SchedulerParameters configuration parameter of "bf_continue"
    which permits the backfill scheduler to continue considering jobs for
    backfill scheduling after yielding locks even if new jobs have been
    submitted. This can result in lower priority jobs being backfill
    scheduled instead of newly arrived higher priority jobs, but will permit
    more queued jobs to be considered for backfill scheduling.
 -- Added support to purge reservation records from accounting.
 -- Cray - Add support for Basil 1.3
* Changes in SLURM 2.6.0pre1
============================
 -- Add "state" field to job step information reported by scontrol.
 -- Notify srun to retry step creation upon completion of other job steps
    rather than polling. This results in much faster throughput for job step
    execution with --exclusive option.
 -- Added "ResvEpilog" and "ResvProlog" configuration parameters to execute a
    program at the beginning and end of each reservation.
 -- Added "slurm_load_job_user" function. This is a variation of
    "slurm_load_jobs", but accepts a user ID argument, potentially resulting
    in substantial performance improvement for "squeue --user=ID"
 -- Added "slurm_load_node_single" function. This is a variation of
    "slurm_load_nodes", but accepts a node name argument, potentially resulting
    in substantial performance improvement for "sinfo --nodes=NAME".
 -- Added "HealthCheckNodeState" configuration parameter to identify node
    states on which HealthCheckProgram should be executed.
 -- Remove sacct --dump --formatted-dump options which were deprecated in
    2.5.
 -- Added support for job arrays (phase 1 of effort). See "man sbatch" option
    -a/--array for details.
 -- Add new AccountingStorageEnforce options of 'nojobs' and 'nosteps' which will
    allow the use of accounting features like associations, qos and limits but
    not keep track of jobs or steps in accounting.
 -- Cray - Add new cray.conf parameter of "AlpsEngine" to specify the
    communication protocol to be used for ALPS/BASIL.
 -- select/cons_res plugin: Correction to CPU allocation count logic for
    cores without hyperthreading.
 -- Added new SelectTypeParameter value of "CR_ALLOCATE_FULL_SOCKET".
 -- Added PriorityFlags value of "TICKET_BASED" and merged priority/multifactor2
    plugin into priority/multifactor plugin.
 -- Add "KeepAliveTime" configuration parameter controlling how long sockets
    used for srun/slurmstepd communications are kept alive after disconnect.
 -- Added SLURM_SUBMIT_HOST to salloc, sbatch and srun job environment.
 -- Added SLURM_ARRAY_TASK_ID to environment of job array.
 -- Added squeue --array/-r option to optimize output for job arrays.
 -- Added "SlurmctldPlugstack" configuration parameter for generic stack of
    slurmctld daemon plugins.
 -- Removed contribs/arrayrun tool. Use native support for job arrays.
 -- Modify default installation locations for RPMs to match "make install":
    _prefix /usr/local
    _slurm_sysconfdir %{_prefix}/etc/slurm
    _mandir %{_prefix}/share/man
    _infodir %{_prefix}/share/info
 -- Add acct_gather_energy/ipmi which works off freeipmi for energy gathering
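
The job array entries above (sbatch -a/--array and SLURM_ARRAY_TASK_ID) can
be illustrated with a minimal sketch of a task script that picks its own
input from the task ID. The file names and the assumption that the array
range starts at 0 (e.g. "sbatch --array=0-2") are hypothetical and only for
illustration.

```python
import os


def pick_input(inputs, env=os.environ):
    """Return the input assigned to this job array task.

    Assumes the array indices start at 0 and cover len(inputs) tasks
    (an assumption made for this sketch, not a Slurm requirement).
    """
    # SLURM_ARRAY_TASK_ID is placed in the environment of each array task.
    task_id = int(env["SLURM_ARRAY_TASK_ID"])
    return inputs[task_id]


if __name__ == "__main__":
    # Hypothetical input files, one per array task.
    files = ["sample_a.dat", "sample_b.dat", "sample_c.dat"]
    if "SLURM_ARRAY_TASK_ID" in os.environ:
        print(pick_input(files))
```

Passing the environment as an argument keeps the lookup testable outside of
a running job.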
* Changes in Slurm 2.5.8
========================
 -- Fix for slurmctld segfault on NULL front-end reason field.
 -- Avoid gres step allocation errors when a job shrinks in size due to either
    down nodes or explicit resizing. Generated slurmctld errors of this type:
    "step_test ... gres_bit_alloc is NULL"
 -- Fix bug that would leak memory and over-write the AllowGroups field on
    "scontrol reconfig" when AllowNodes is manually changed using scontrol.
 -- Get html/man files to install in correct places with rpms.
 -- Remove --program-prefix from spec file since it appears to be added by
    default and appeared to break other things.
 -- Updated the automake min version in autogen.sh to be correct.
 -- Select/cons_res - Correct total CPU count allocated to a job with
    --exclusive and --cpus-per-task options
 -- switch/nrt - Don't allocate network resources unless job step has 2+ nodes.
 -- select/cons_res - Avoid extraneous "oversubscribe" error messages.
 -- Reorder get config logic to avoid deadlock.
 -- Enforce QOS MaxCPUsMin limit when job submission contains no user-specified
    time limit.
 -- EpilogSlurmctld pthread is passed required arguments rather than a pointer
    to the job record, which under some conditions could be purged and result
    in an invalid memory reference.
* Changes in Slurm 2.5.7
========================
 -- Fix for linking to the select/cray plugin to not give warning about
    undefined variable.
 -- Add missing symbols to the xlator.h
 -- Avoid placing pending jobs in AdminHold state due to backfill scheduler
    interactions with advanced reservation.
 -- Accounting - make average by task not cpu.
 -- CRAY - Change logging of transient ALPS errors from error() to debug().
 -- POE - Correct logic to support poe option "-euidevice sn_all" and
    "-euidevice sn_single".
 -- Accounting - Fix minor initialization error.
 -- POE - Correct logic to support srun network instances count with POE.
 -- POE - With the srun --launch-cmd option, report proper task count when
    the --cpus-per-task option is used without the --ntasks option.
 -- POE - Fix logic binding tasks to CPUs.
 -- sview - Fix race condition where new information could have slipped past
    the node tab and we didn't notice.
 -- Accounting - Fix an invalid memory read when slurmctld sends data about
    start job to slurmdbd.
 -- If a prolog or epilog failure occurs, drain the node rather than setting it
    down and killing all of its jobs.
 -- Priority/multifactor - Avoid underflow in half-life calculation.
 -- POE - pack missing variable to allow fanout (more than 32 nodes)
 -- Prevent clearing reason field for pending jobs. This bug was introduced in
    v2.5.5 (see "Reject job at submit time ...").
 -- BGQ - Fix issue with preemption on sub-block jobs where a job would kill
    all preemptable jobs on the midplane instead of just the ones it needed to.
 -- switch/nrt - Validate dynamic window allocation size.
 -- BGQ - When --geo is requested do not impose the default conn_types.
 -- CRAY - Support CLE 4.2.0
 -- RebootNode logic - Defers (rather than forgets) reboot request with job
    running on the node within a reservation.
 -- switch/nrt - Correct network_id use logic. Correct support for user sn_all
    and sn_single options.
 -- sched/backfill - Modify logic to reduce overhead under heavy load.
 -- Fix job step allocation with --exclusive and --hostlist option.
 -- Select/cons_res - Fix bug resulting in error of "cons_res: sync loop not
    progressing, holding job #"
 -- checkpoint/blcr - Reset max_nodes from zero to NO_VAL on job restart.
 -- launch/poe - Fix for hostlist file support with repeated host names.
 -- priority/multifactor2 - Prevent possible divide by zero.
 -- srun - Don't check for executable if --test-only flag is used.
 -- energy - On a single node only use the last task for gathering energy,
    since we currently track energy usage only per step (not per task).
    Otherwise we get double the energy.
* Changes in Slurm 2.5.6
========================
 -- Gres fix for requeued jobs.
 -- Gres accounting - Fix regression in 2.5.5 for keeping track of gres
    requested and allocated.

* Changes in Slurm 2.5.5
========================
 -- Fix for sacctmgr add qos to handle the 'flags' option.
 -- Export SLURM_ environment variables from sbatch, even if "--export"
    option does not explicitly list them.
 -- If node is in more than one partition, correct counting of allocated CPUs.
 -- If step requests more CPUs than possible in specified node count of job
    allocation then return ESLURM_TOO_MANY_REQUESTED_CPUS rather than
    ESLURM_NODES_BUSY and retrying.
 -- CRAY - Fix SLURM_TASKS_PER_NODE to be set correctly.
 -- Accounting - more checks for strings with a possible `'` in it.
 -- sreport - Fix by adding planned down time to utilization reports.
 -- Do not report an error when sstat identifies job steps terminated during
    its execution, but log using debug type message.
 -- Select/cons_res - Permit node removed from job by going down to be returned
    to service and re-used by another job.
 -- Select/cons_res - Tighter packing of job allocations on sockets.
 -- SlurmDBD - fix to allow user root along with the slurm user to register a
    cluster.
 -- Select/cons_res - Fix for support of consecutive node option.
 -- Select/cray - Modify build to enable direct use of libslurm library.
 -- Bug fixes related to job step allocation logic.
 -- Cray - Disable enforcement of MaxTasksPerNode, which is not applicable
    with launch/aprun.
 -- Accounting - When rolling up data from past usage ignore "idle" time from
    a reservation when it has the "Ignore_Jobs" flag set.  Since jobs could run
    outside of the reservation on its nodes, without this you could have
    double time.
 -- Accounting - Minor fix to avoid reuse of variable erroneously.
 -- Reject job at submit time if the node count is invalid. Previously such a
    job submitted to a DOWN partition would be queued.
 -- Purge vestigial job scripts when the slurmd cold starts or slurmstepd
    terminates abnormally.
 -- Add support for FreeBSD.
 -- Add sanity check for NULL cluster names trying to register.
 -- BGQ - Push action 'D' info to scontrol for admins.
 -- Reset a job's reason from PartitionDown when the partition is set up.
 -- BGQ - Handle issue where blocks would have a pending job on them and
    while it was free cnodes would go into software error and kill the job.
 -- BGQ - Fix issue where if for some reason we are freeing a block with
    a pending job on it we don't kill the job.
 -- BGQ - Fix race condition where a job could have been removed from a block
    without it still existing there.  This is extremely rare.
 -- BGQ - Fix for when a step completes in Slurm before the runjob_mux notifies
    the slurmctld there were software errors on some nodes.
 -- BGQ - Fix issue on state recover if block states are not around
    and when reading in state from DB2 we find a block that can't be created.
    You can now do a clean start to rid the bad block.
 -- Modify slurmdbd to retransmit to slurmctld daemon if it is not responding.
 -- BLUEGENE - Fix issue where when doing backfill preemptable jobs were
    never looked at to determine eligibility of backfillable job.
 -- Cray/BlueGene - Disable srun --pty option unless LaunchType=launch/slurm.
 -- CRAY - Fix sanity check for systems with more than 32 cores per node.
 -- CRAY - Remove other objects from MySQL query that are available from
    the XML.
 -- BLUEGENE - Set the geometry of a job when a block is picked and the job
    isn't a sub-block job.
 -- Cray - avoid check of macro versions of CLE for version 5.0.
 -- CRAY - Fix memory issue with reading in the cray.conf file.
 -- CRAY - If hostlist is given with srun make sure the node count is the same
    as the hosts given.
 -- CRAY - If task count specified, but no tasks-per-node, then set the tasks
    per node in the BASIL reservation request.
 -- CRAY - fix issue with --mem option not giving correct amount of memory
    per cpu.
 -- CRAY - Fix if srun --mem is given outside an allocation to set the
    APRUN_DEFAULT_MEMORY env var for aprun.  This scenario will not display
    the option when used with --launch-cmd.
 -- Change sview to use GMutex instead of GStaticMutex
 -- CRAY - set APRUN_DEFAULT_MEMORY instead of CRAY_AUTO_APRUN_OPTIONS
 -- sview - fix issue where if a partition was completely in one state the
    cpu count would not be reflected correctly.
 -- BGQ - fix for handling half rack system in STATIC or OVERLAP mode to
    implicitly create full system block.
 -- CRAY - Dynamically create BASIL XML buffer to resize as needed.
 -- Fix checking if QOS limit MaxCPUMinsPJ is set along with DenyOnLimit to
    deny the job instead of holding it.
 -- Make sure on systems that use a different launcher than launch/slurm not
    to attempt to signal tasks on the frontend node.
 -- Cray - when a step is requested count other steps running on nodes in the
    allocation as taking up the entire node instead of just part of the node
    allocated.  And always enforce exclusive on a step request.
 -- Cray - display correct nodelist, node/cpu count on steps.
* Changes in Slurm 2.5.4
========================
 -- Fix bug in PrologSlurmctld use that would block job steps until node
    responds.
 -- CRAY - If a partition has MinNodes=0 and a batch job doesn't request nodes
    put the allocation to 1 instead of 0 which prevents the allocation from
    happening.
 -- Better debug when the database is down and using the --cluster option in
    the user commands.
 -- When asking for job states with sacct, default to 'now' instead of midnight
 -- Fix for handling a test-only job or immediate job that fails while being
    built.
 -- Comment out all of the logic in the job_submit/defaults plugin. The logic
    is only an example and not meant for actual use.
 -- Eliminate configuration file 4096 character line limitation.
 -- More robust logic for tree message forward
 -- BGQ - When cnodes fail in a timeout fashion correctly look up parent
    midplane.
 -- Correct sinfo "%c" (node's CPU count) output value for Bluegene systems.
 -- Backfill - Responsiveness improvements for systems with large numbers of jobs
    (>5000) and using the SchedulerParameters option bf_max_job_user.
 -- slurmstepd: ensure that IO redirection openings from/to files correctly
    handle interruption
 -- BGQ - Able to handle when midplanes go into Hardware::SoftwareFailure
 -- GRES - Correct tracking of specific resources used after slurmctld restart.
    Counts would previously go negative as jobs terminate and decrement from
    a base value of zero.
 -- Fix for priority/multifactor2 plugin to not assert when configured with
    --enable-debug.
 -- Select/cons_res - If the job request specified --ntasks-per-socket and the
    allocation is using cores, then pack the tasks onto the sockets up to the
    specified value.
 -- BGQ - If a cnode goes into an 'error' state and the block containing the
    cnode does not have a job running on it do not resume the block.
 -- BGQ - Handle blocks that don't free themselves in a reasonable time better.
 -- BGQ - Fix for signaling steps when allocation ends before step.
 -- Fix for backfill scheduling logic with job preemption; starts more jobs.
 -- xcgroup - remove bugs with EINTR management in write calls
 -- jobacct_gather - fix total values to not always == the max values.
 -- Fix for handling node registration messages from older versions without
    energy data.
 -- BGQ - Allow user to request full dimensional mesh.
 -- sdiag command - Correction to jobs started value reported.
 -- Prevent slurmctld assert when invalid change to reservation with running
    jobs is made.
 -- BGQ - If signal is NODE_FAIL allow forward even if job is completing
    and timeout in the runjob_mux trying to send in this situation.
 -- BGQ - More robust checking for correct node, task, and ntasks-per-node
    options in srun, and push that logic to salloc and sbatch.
 -- GRES topology bug in core selection logic fixed.
 -- Fix to handle init.d script for querying status and not return 1 on
    success.
* Changes in SLURM 2.5.3
========================
 -- Gres/gpu plugin - If no GPUs requested, set CUDA_VISIBLE_DEVICES=NoDevFiles.
    This bug was introduced in 2.5.2 for the case where a GPU count was
    configured, but without device files.
 -- task/affinity plugin - Fix bug in CPU masks for some processors.
 -- Modify sacct command to get format from SACCT_FORMAT environment variable.
 -- BGQ - Changed order of library inclusions and fixed incorrect declaration
    to compile correctly on newer compilers
 -- Fix for not building sview if glib exists on a system but not the gtk libs.
 -- BGQ - Fix for handling a job cleanup on a small block if the job has long
    since left the system.
 -- Fix race condition in job dependency logic which can result in invalid
    memory reference.
* Changes in SLURM 2.5.2
========================
 -- Fix advanced reservation recovery logic when upgrading from version 2.4.
 -- BLUEGENE - fix for QOS/Association node limits.
 -- Add missing "safe" flag from print of AccountingStorageEnforce option.
 -- Fix logic to optimize GRES topology with respect to allocated CPUs.
 -- Add job_submit/all_partitions plugin to set a job's default partition
    to ALL available partitions in the cluster.
 -- Modify switch/nrt logic to permit build without libnrt.so library.
 -- Handle srun task launch failure without duplicate error messages or abort.
 -- Fix bug in QoS limits enforcement when slurmctld restarts and user not yet
    added to the QOS list.
 -- Fix issue where sjstat and sjobexitmod were installed in 2 different RPMs.
 -- Fix for job request of multiple partitions in which some partitions lack
    nodes with required features.
 -- Permit a job to use a QOS they do not have access to if an administrator
    manually set the job's QOS (previously the job would be rejected).
 -- Make more variables available to job_submit/lua plugin: slurm.MEM_PER_CPU,
    slurm.NO_VAL, etc.
 -- Fix topology/tree logic when nodes defined in slurm.conf get re-ordered.
 -- In select/cons_res, correct logic to allocate whole sockets to jobs. Work
    by Magnus Jonsson, Umea University.
 -- In select/cons_res, correct logic when job removed from only some nodes.
 -- Avoid apparent kernel bug in 2.6.32 which apparently is solved in
    at least 3.5.0.  This avoids a stack overflow when running jobs on
    more than 120k nodes.
 -- BLUEGENE - If we made a block that isn't runnable because of an overlapping
    block, destroy it correctly.
 -- Switch/nrt - Dynamically load libnrt.so from within the plugin as needed.
    This eliminates the need for libnrt.so on the head node.
 -- BLUEGENE - Fix in reservation logic that could cause abort.
* Changes in SLURM 2.5.1
========================
 -- Correction to hostlist sorting for hostnames that contain two numeric
    components and the first numeric component has various sizes (e.g.
    "rack9blade1" should come before "rack10blade1")
 -- BGQ - Only poll on initialized blocks instead of calling getBlocks on
    each block independently.
 -- Fix of task/affinity plugin logic for Power7 processors having hyper-
    threading disabled (cpu mask has gaps).
 -- Fix of job priority ordering with sched/builtin and priority/multifactor.
    Patch from Chris Read.
 -- CRAY - Fix for setting up the aprun for a large job (+2000 nodes).
 -- Fix for race condition related to compute node boot resulting in node being
    set down with reason of "Node <name> unexpectedly rebooted"
 -- RAPL - Fix for handling errors when opening msr files.
 -- BGQ - Fix for salloc/sbatch to do the correct allocation when asking for
    -N1 -n#.
 -- BGQ - in emulation make it so we can pretend to run large jobs (>64k nodes)
 -- BLUEGENE - Correct method to update conn_type of a job.
 -- BLUEGENE - Fix issue with preemption when needing to preempt multiple jobs
    to make one job run.
 -- Fixed issue where if an srun dies inside of an allocation abnormally it
    would have also killed the allocation.
 -- FRONTEND - fixed issue where if a system's nodes weren't defined in the
    slurm.conf with NodeAddr's signals going to a step could be handled
    incorrectly.
 -- If sched/backfill starts a job with a QOS having NO_RESERVE and no job
    time limit, start it with the partition time limit (or one year if the
    partition has no time limit) rather than NO_VAL (140 year time limit).
 -- Alter hostlist logic to allocate large grid dynamically instead of on
    stack.
 -- Change RPC version checks to support version 2.5 slurmctld with version 2.4
    slurmd daemons.
 -- Correct core reservation logic for use with select/serial plugin.
 -- Exit scontrol command on stdin EOF.
 -- Disable job --exclusive option with select/serial plugin.
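
The hostlist sorting correction above ("rack9blade1" before "rack10blade1")
describes what is often called natural ordering: numeric components are
compared as numbers rather than character by character. A minimal sketch of
that ordering follows; it is an illustration of the idea, not the hostlist
code used by Slurm, and the function name is hypothetical.

```python
import re


def natural_key(hostname):
    """Build a sort key that compares numeric components as integers."""
    # re.split with a capturing group alternates text and digit runs:
    # even indices hold text (possibly empty), odd indices hold digits.
    parts = re.split(r"(\d+)", hostname)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


hosts = ["rack10blade1", "rack9blade10", "rack9blade2", "rack9blade1"]
print(sorted(hosts, key=natural_key))
```

A plain string sort would place "rack10blade1" first because "1" sorts
before "9"; converting the digit runs to integers avoids that.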
* Changes in SLURM 2.5.0
========================
 -- Add DenyOnLimit flag for QOS to deny jobs at submission time if they
    request resources that reach a 'Max' limit.
 -- Permit SlurmUser or operator to change QOS of non-pending jobs (e.g.
    running jobs).
 -- BGQ - move initial poll to beginning of realtime interaction, which will
    also cause it to run if the realtime server ever goes away.
* Changes in SLURM 2.5.0-rc2
============================
 -- Modify sbcast logic to survive slurmd daemon restart while a file
    transmission is in progress.
 -- Add retry logic to munge encode/decode calls. This is needed if the munge
    daemon is under very heavy load (e.g. with 1000 slurmd daemons per compute
    node).
 -- Add launch and acct_gather_energy plugins to RPMs.
 -- Restore support for srun "--mpi=list" option.
 -- CRAY - Introduce step accounting for a Cray.
 -- Modify srun to abandon I/O 60 seconds after the last task ends. Otherwise
    an aborted slurmstepd can cause the srun process to hang indefinitely.
 -- ENERGY - RAPL - alter code to close open files (and only open them once
    where needed)
 -- If the PrologSlurmctld fails, then requeue the job an indefinite number
    of times instead of only one time.
* Changes in SLURM 2.5.0-rc1
============================
 -- Added Prolog and Epilog Guide (web page). Based upon work by Jason Sollom,
    Cray Inc. and used by permission.
 -- Restore gang scheduling functionality. Preemptor was not being scheduled.
    Fix for bugzilla #3.
 -- Add "cpu_load" to node information. Populate CPULOAD in node information
    reported to Moab cluster manager.
 -- Preempt jobs only when insufficient idle resources exist to start job,
    regardless of the node weight.
 -- Added priority/multifactor2 plugin based upon ticket distribution system.
    Work by Janne Blomqvist, Aalto University.
 -- Add SLURM_NODELIST to environment variables available to Prolog and Epilog.
 -- Permit reservations to allow or deny access by account and/or user.
 -- Add ReconfigFlags value of KeepPartState. See "man slurm.conf" for details.
 -- Modify the task/cgroup plugin adding a task_pre_launch_priv function and
    move slurmstepd outside of the step's cgroup. Work by Matthieu Hautreux.
 -- Intel MIC processor support added using gres/mic plugin. BIG thanks to
    Olli-Pekka Lehto, CSC-IT Center for Science Ltd.
 -- Accounting - Change empty jobacctinfo structs to not actually be used.
    Instead of putting 0's into the database, we put NO_VALS and have sacct
    figure out jobacct_gather wasn't used.
 -- Cray - Prevent calling basil_confirm more than once per job using a flag.
 -- Fix bug with topology/tree and job with min-max node count. Now try to
    get max node count rather than minimizing leaf switches used.
 -- Add AccountingStorageEnforce=safe option to provide method to avoid jobs
    launching that wouldn't be able to run to completion because of a
    GrpCPUMins limit.
 -- Add support for RFC 5424 timestamps in logfiles. Disable with configuration
    option of "--disable-rfc5424time". By Janne Blomqvist, Aalto University.
 -- CRAY - Replace srun.pl with launch/aprun plugin to use srun to wrap the
    aprun process instead of a perl script.
 -- srun - Rename --runjob-opts to --launcher-opts to be used on systems other
    than BGQ.
 -- Added new DebugFlags - Energy for AcctGatherEnergy plugins.
 -- start deprecation of sacct --dump --fdump
 -- BGQ - added --verbose=OFF when srun --quiet is used
 -- Added acct_gather_energy/rapl plugin to record power consumption by job.
    Work by Yiannis Georgiou, Martin Perry, et al., Bull
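
For the RFC 5424 timestamp entry above, the following minimal sketch shows
what a timestamp in that style looks like: date and time joined by "T",
fractional seconds, and an explicit UTC offset. The millisecond precision is
an assumption for illustration; the entry above does not state the precision
written to the log files.

```python
from datetime import datetime, timedelta, timezone


def rfc5424_timestamp(moment):
    """Format an aware datetime in RFC 5424 style with milliseconds."""
    if moment.tzinfo is None:
        # An explicit offset is part of the format, so refuse naive values.
        raise ValueError("an explicit time zone offset is required")
    return moment.isoformat(timespec="milliseconds")


# Fixed example value with a +01:00 offset.
example = datetime(2012, 11, 1, 12, 34, 56, 789000,
                   tzinfo=timezone(timedelta(hours=1)))
print(rfc5424_timestamp(example))  # 2012-11-01T12:34:56.789+01:00
```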

* Changes in SLURM 2.5.0.pre3
=============================
 -- Add Google search to all web pages.
 -- Add sinfo -T option to print reservation information. Work by Bill Brophy,
    Bull.
 -- Force slurmd exit after 2 minute wait, even if threads are hung.
 -- Change node_req field in struct job_resources from 8 to 32 bits so we can
    run more than 256 jobs per node.
 -- sched/backfill: Improve accuracy of expected job start with respect to
    reservations.
 -- sinfo partition field size will be set to the length of the longest
    partition name by default.
 -- Make it so the parse_time will return a valid 0 if given epoch time and
    set errno == ESLURM_INVALID_TIME_VALUE on error instead.
 -- Correct srun --no-alloc logic when node count exceeds node list or task
    count is not a multiple of the node count. Work by Hongjia Cao, NUDT.
 -- Completed integration with IBM Parallel Environment including POE and IBM's
    NRT switch library.

* Changes in SLURM 2.5.0.pre2
=============================
 -- When running with multiple slurmd daemons per node, enable specifying a
    range of ports on a single line of the node configuration in slurm.conf.
 -- Add reservation flag of Part_Nodes to allocate all nodes in a partition to
    a reservation and automatically change the reservation when nodes are
    added to or removed from the reservation. Based upon work by
    Bill Brophy, Bull.
 -- Add support for advanced reservation for specific cores rather than whole
    nodes. Current limitations: homogeneous cluster, nodes idle when reservation
    created, and no more than one reservation per node. Code is still under
    development. Work by Alejandro Lucero Palau, et al., BSC.
 -- Add DebugFlag of Switch to log switch plugin details.
 -- Correct job node_cnt value in job completion plugin when job fails due to
    down node. Previously was too low by one.
 -- Add new srun option --cpu-freq to enable user control over the job's CPU
    frequency and thus its power consumption. NOTE: cpu frequency is not
    currently preserved for jobs being suspended and later resumed. Work by
    Don Albert, Bull.
 -- Add node configuration information about "boards" and optimize task
    placement on minimum number of boards. Work by Rod Schultz, Bull.
* Changes in SLURM 2.5.0.pre1
=============================
 -- Add new output to "scontrol show configuration" of LicensesUsed. Output is
    "name:used/total"
 -- Changed jobacct_gather plugin infrastructure to be cleaner and easier to
    maintain.
 -- Change license option count separator from "*" to ":" for consistency with
    the gres option (e.g. "--licenses=foo:2 --gres=gpu:2"). The "*" will still
    be accepted, but is no longer documented.
 -- Permit more than 100 jobs to be scheduled per node (new limit is 250
 -- Restructure of srun code to allow outside programs to utilize existing
    logic.
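
The license count separator change above ("*" to ":", with "*" still
accepted) can be sketched as a small parser that takes either form. This is
an illustration only, not the option parsing done by Slurm; the function
name is hypothetical, and treating a bare name as a count of one is an
assumption of this sketch.

```python
def parse_licenses(spec):
    """Parse a string such as "foo:2,bar*3,baz" into {name: count}."""
    result = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Accept the documented ":" and the legacy "*" separator.
        for sep in (":", "*"):
            if sep in item:
                name, count = item.split(sep, 1)
                result[name] = int(count)
                break
        else:
            # Assumption: a bare name requests a single license.
            result[item] = 1
    return result


print(parse_licenses("foo:2,bar*3,baz"))
```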
* Changes in SLURM 2.4.6
========================
 -- Correct WillRun authentication logic when issued for non-job owner.
 -- BGQ - fix memory leak
 -- BGQ - Fix to check block for action 'D' if it also has nodes in error.
* Changes in SLURM 2.4.5
========================
 -- Cray - On job kill request, send SIGCONT, SIGTERM, wait KillWait and send
    SIGKILL. Previously just sent SIGKILL to tasks.
 -- BGQ - Fix issue when running srun outside of an allocation and only
    specifying the number of tasks and not the number of nodes.
 -- BGQ - validate correct ntasks_per_node
 -- BGQ - when srun -Q is given make runjob be quiet
 -- Modify use of OOM (out of memory protection) for Linux 2.6.36 kernel
    or later. NOTE: If you were setting the environment variable
    SLURMSTEPD_OOM_ADJ=-17, it should be set to -1000 for Linux 2.6.36 kernel
    or later.
 -- BGQ - Fix to make job step timeout actually happen when done from within an
    allocation.
 -- Reset node MAINT state flag when a reservation's nodes or flags change.
 -- Accounting - Fix issue where QOS usage was being zeroed out on a
    slurmctld restart.
 -- BGQ - Add 64 tasks per node as a valid option for srun when used with
    overcommit.
 -- BLUEGENE - With Dynamic layout mode - Fix issue where if a larger block
    was already in error and isn't deallocating and underlying hardware goes
    bad one could get overlapping blocks in error making the code assert when
    a new job request comes in.
 -- BGQ - handle pending actions on a block better when trying to deallocate it.
 -- Accounting - Fixed issue where if nodenames have changed on a system and
    you query against that with -N and -E you will get all jobs during that
    time instead of only the ones running on -N.
 -- BGP - Fix for HTC mode
 -- Accounting - If a job start message fails to the SlurmDBD reset the db_inx
    so it gets sent again.  This isn't a major problem since the start will
    happen when the job ends, but this does make things cleaner.
 -- If an salloc is waiting for an allocation to happen and is canceled by the
    user mark the state canceled instead of completed.
 -- Fix issue in accounting if a user puts a '\' in their job name.
 -- Accounting - Fix for if asking for users or accounts that were deleted
    with associations get the deleted associations as well.
 -- BGQ - Handle shared blocks that need to be removed and have jobs running
    on them.  This should only happen in extreme conditions.
 -- Fix inconsistency for hostlists that have more than 1 range.
 -- BGQ - Add mutex around recovery for the Real Time server to avoid hitting
    DB2 so hard.
 -- BGQ - If an allocation exists on a block that has a 'D' action on it fail
    job on future step creation attempts.
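
The SLURMSTEPD_OOM_ADJ note above (-17 becoming -1000 on Linux 2.6.36 or
later) reflects the kernel moving from the legacy oom_adj range of -17..15
to the oom_score_adj range of -1000..1000. The sketch below assumes a simple
linear scaling between the two ranges to show why -17 corresponds to -1000;
it is an illustration, not code taken from Slurm, and the function name is
hypothetical.

```python
OOM_DISABLE = -17         # lowest legacy oom_adj value (never kill)
OOM_ADJUST_MAX = 15       # highest legacy oom_adj value
OOM_SCORE_ADJ_MAX = 1000  # highest oom_score_adj value


def oom_adj_to_score_adj(oom_adj):
    """Scale a legacy oom_adj value into the oom_score_adj range."""
    if not OOM_DISABLE <= oom_adj <= OOM_ADJUST_MAX:
        raise ValueError("oom_adj must be between -17 and 15")
    if oom_adj == OOM_ADJUST_MAX:
        # The top of the old range maps to the top of the new one.
        return OOM_SCORE_ADJ_MAX
    # Linear scaling, truncated toward zero as C integer division would.
    return int(oom_adj * OOM_SCORE_ADJ_MAX / -OOM_DISABLE)


print(oom_adj_to_score_adj(-17))  # -1000
```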
* Changes in SLURM 2.4.4
========================
 -- BGQ - minor fix to make build work in emulated mode.
 -- BGQ - Fix if large block goes into error and the next highest priority jobs
    are planning on using the block.  Previously it would fail those jobs
    erroneously.
 -- BGQ - Fix issue when a cnode going to an error (not SoftwareError) state
    with a job running or trying to run on it.
 -- Execute slurm_spank_job_epilog when there is no system Epilog configured.
 -- Fix for srun --test-only to work correctly with timelimits
 -- BGQ - If a job goes away while still trying to free it up in the
    database, and the job is running on a small block make sure we free up
    the correct node count.
 -- BGQ - Logic added to make sure a job has finished on a block before it is
    purged from the system if its front-end node goes down.
 -- Modify strigger so that a filter option of "--user=0" is supported.
 -- Correct --mem-per-cpu logic for core or socket allocations with multiple
    threads per core.
 -- Fix for older systems (glibc < 2.4) to use euidaccess() instead of eaccess().
 -- BLUEGENE - Do not alter a pending job's node count when changing its
 -- BGQ - Add functionality to make it so we track the actions on a block.
    This is needed for when a free request is added to a block but there are
    jobs finishing up so we don't start new jobs on the block since they will
    fail on start.
 -- BGQ - Fixed InactiveLimit to work correctly to avoid scenarios where a
    user's pending allocation was started with srun and then for some reason
    the slurmctld was brought down and while it was down the srun was removed.
 -- Fixed InactiveLimit math to work correctly
 -- BGQ - Add logic to make it so blocks can't use a midplane with a nodeboard
    in error for passthrough.
 -- BGQ - Make it so if a nodeboard goes in error any block using that midplane
    for passthrough gets removed on a dynamic system.
 -- BGQ - Fix for printing realtime server debug correctly.
 -- BGQ - Cleaner handling of cnode failures when reported through the runjob
    interface instead of through the normal method.
 -- smap - spread node information across multiple lines for larger systems.
 -- Cray - Defer salloc until after PrologSlurmctld completes.
 -- Correction to slurmdbd communications failure handling logic, incorrect
    error codes returned in some cases.
* Changes in SLURM 2.4.3
========================
 -- Accounting - Fix so complete 32 bit numbers can be put in for a priority.
 -- cgroups - Fix so that if the initial directory is non-existent, SLURM
    creates it correctly.  Previously the errno wasn't being checked correctly.
 -- BGQ - fixed srun when only requesting a task count and not a node count
    to operate the same way salloc or sbatch did and assign a task per cpu
    by default instead of task per node.
 -- Fix salloc --gid to work correctly.  Reported by Brian Gilmer
 -- BGQ - fix smap to set the correct default MloaderImage
 -- BLUEGENE - updated documentation.
 -- Close the batch job's environment file when it contains no data to avoid
    leaking file descriptors.
 -- Fix sbcast's credential to last till the end of a job instead of the
    previous 20 minute time limit.  The previous behavior would fail for
    large files 20 minutes into the transfer.
 -- Return ESLURM_NODES_BUSY rather than ESLURM_NODE_NOT_AVAIL error on job
    submit when required nodes are up, but completing a job or in exclusive
    job allocation.
 -- Add HWLOC_FLAGS so linking to libslurm works correctly
 -- BGQ - If using backfill and a shared block is running at least one job
    and a job comes through backfill and can fit on the block without ending
    jobs don't set an end_time for the running jobs since they don't need to
    end to start the job.
 -- Initialize bind_verbose when using task/cgroup.
 -- BGQ - Fix for handling backfill much better when sharing blocks.
 -- BGQ - Fix for making small blocks on first pass if not sharing blocks.
 -- BLUEGENE - Remove force of default conn_type instead of leaving NAV
    when none are requested.  The Block allocator sets it up temporarily so
    this isn't needed.
 -- BLUEGENE - Fix deadlock issue when dealing with bad hardware if using
    static blocks.
 -- Fix to mysql plugin during rollup to only query suspended table when jobs
    reported some suspended time.
 -- Fix compile with glibc 2.16 (Kacper Kowalik)
 -- BGQ - fix for deadlock where a block has error on it and all jobs
    running on it are preemptable by scheduling job.
 -- proctrack/cgroup: Exclude internal threads from "scontrol list pids".
    Patch from Matthieu Hautreux, CEA.
 -- Memory leak fixed for select/linear when preempting jobs.
 -- When updating the begin time of a job, update the eligible time in
    accounting as well.
 -- BGQ - make it so you can signal steps when signaling the job allocation.
 -- BGQ - Remove extra overhead if a large block has many cnode failures.
 -- Priority/Multifactor - Fix issue with age factor when a job is estimated to
    start in the future but is able to run now.
 -- CRAY - update to work with ALPS 5.1
 -- BGQ - Handle issue of speed and mutexes when polling instead of using the
    realtime server.
 -- BGQ - Fix minor sorting issue with sview when sorting by midplanes.
 -- Accounting - Fix for handling per user max node/cpus limits on a QOS
    correctly for current job.
 -- Update documentation for -/+= when updating a reservation's
    users/accounts/flags
 -- Update pam module to work if using aliases on nodes instead of actual
    host names.
 -- Correction to task layout logic in select/cons_res for job with minimum
    and maximum node count.
 -- BGQ - Put final poll after realtime comes back into service to avoid
    having the realtime server go down over and over again while waiting
    for the poll to finish.
 -- task/cgroup/memory - ensure that ConstrainSwapSpace=no is correctly
    handled. Work by Matthieu Hautreux, CEA.
 -- CRAY - Fix for sacct -N option to work correctly
 -- CRAY - Update documentation to describe installation from rpm instead
    of the previous piecemeal method.
 -- Fix sacct to work with QOS' that have previously been deleted.
 -- Added all available limits to the output of sacctmgr list qos
* Changes in SLURM 2.4.2
========================
 -- BLUEGENE - Correct potential deadlock issue when hardware goes bad and
    there are jobs running on that hardware.
 -- If job is submitted to more than one partition, its partition pointer can
    be set to an invalid value. This can result in the count of CPUs allocated
    on a node being bad, resulting in over- or under-allocation of its CPUs.
    Patch by Carles Fenoy, BSC.
 -- Fix bug in task layout with select/cons_res plugin and --ntasks-per-node
    option. Patch by Martin Perry, Bull.
 -- BLUEGENE - remove race condition where if a block is removed while waiting
    for a job to finish on it the number of unused cpus wasn't updated
    correctly.
 -- BGQ - make sure we have a valid block when creating or finishing a step
    allocation.
 -- BLUEGENE - If a large block (> 1 midplane) is in error and underlying
    hardware is marked bad remove the larger block and create a block over
    just the bad hardware making the other hardware available to run on.
 -- BLUEGENE - Handle job completion correctly if an admin removes a block
    where other blocks on an overlapping midplane are running jobs.
 -- BLUEGENE - correctly remove running jobs when freeing a block.
 -- BGQ - correct logic to place multiple (< 1 midplane) steps inside a
    multi midplane block allocation.
 -- BGQ - Make it possible for a multi midplane allocation to run on more
    than 1 midplane but not the entire allocation.
 -- BGL - Fix for syncing users on block from Tim Wickberg
 -- Fix initialization of protocol_version for some messages to make sure it
    is always set when sending or receiving a message.
 -- Reset backfilled job counter only when explicitly cleared using scontrol.
    Patch from Alejandro Lucero Palau, BSC.
 -- BLUEGENE - Fix for handling blocks when a larger block will not free and
    while it is attempting to free underlying hardware is marked in error
    making small blocks overlapping with the freeing block.  This only
    applies to dynamic layout mode.
 -- Cray and BlueGene - Do not treat lack of usable front-end nodes when
    slurmctld daemon starts as a fatal error. Also preserve correct front-end
    node for jobs when there is more than one front-end node and the slurmctld
    daemon restarts.
 -- Correct parsing of srun/sbatch input/output/error file names so that only
    the name "none" is mapped to /dev/null and not any file name starting
    with "none" (e.g. "none.o").
 -- BGQ - added version string to the load of the runjob_mux plugin to verify
    the current plugin has been loaded when using runjob_mux_refresh_config
 -- CGROUPS - Use system mount/umount function calls instead of doing fork
    exec of mount/umount from Janne Blomqvist.
 -- BLUEGENE - correct start time setup when no jobs are blocking the way
    from Mark Nelson
 -- Fixed sacct --state=S query to return information about suspended jobs
    current or in the past.
 -- FRONTEND - Made error warning more apparent if a frontend node isn't
    configured correctly.
 -- BGQ - update documentation about runjob_mux_refresh_config which works
    correctly as of IBM driver V1R1M1 efix 008.
* Changes in SLURM 2.4.1
========================
 -- Fix bug for job state change from 2.3 -> 2.4; job state can now be
    preserved correctly when transitioning.  This also applies for
    2.4.0 -> 2.4.1, no state will be lost. (Thanks to Carles Fenoy)

* Changes in SLURM 2.4.0
========================
 -- Cray - Improve support for zero compute node resource allocations.
    Partition used can now be configured with no nodes.
 -- BGQ - make it so srun -i<taskid> works correctly.
 -- Fix parse_uint32/16 to complain if a non-digit is given.
 -- Add SUBMITHOST to job state passed to Moab via sched/wiki2. Patch by Jon
    Bringhurst (LANL).
 -- BGQ - Fix issue when running with AllowSubBlockAllocations=Yes without
    compiling with --enable-debug
 -- Modify scontrol to require "-dd" option to report batch job's script. Patch
    from Don Albert, Bull.
 -- Modify SchedulerParameters option to match documentation: "bf_res="
    changed to "bf_resolution=". Patch from Rod Schultz, Bull.
 -- Fix bug that clears job pending reason field. Patch from Don Lipari, LLNL.
 -- In etc/init.d/slurm move check for scontrol after sourcing
    /etc/sysconfig/slurm. Patch from Andy Wettstein, University of Chicago.
 -- Fix in scheduling logic that can delay jobs with min/max node counts.
 -- BGQ - fix issue where if a step uses the entire allocation and then
    the next step in the allocation only uses part of the allocation it gets
    the correct cnodes.
 -- BGQ - Fix checking for IO on a block with new IBM driver V1R1M1; the
    previous function didn't always work correctly.
 -- BGQ - Fix issue when a nodeboard goes down and you want to combine blocks
    to make a larger small block and are running with sub-blocks.
 -- BLUEGENE - Better logic for making small blocks around bad nodeboard/card.
 -- BGQ - When using an old IBM driver cnodes that go into error because of
    a job kill timeout aren't always reported to the system.  This is now
    handled by the runjob_mux plugin.
 -- BGQ - Added information on how to setup the runjob_mux to run as SlurmUser.
 -- Improve memory consumption on step layouts with high task count.
 -- BGQ - quieter debug when the real time server comes back but there are
    still messages we find when we poll but haven't given it back to the real
    time yet.
 -- BGQ - fix for if a request comes in smaller than the smallest block and
    we must use a small block instead of a shared midplane block.
 -- Fix issues on large jobs (>64k tasks) to have the correct counter type when
    packing the step layout structure.
 -- BGQ - fix issue where if a user was asking for tasks and ntasks-per-node
    but not node count the node count is correctly figured out.
 -- Move logic to always use the 1st alphanumeric node as the batch host for
    batch jobs.
 -- BLUEGENE - fix race condition where if a nodeboard/card goes down at the
    same time a block is destroyed and that block just happens to be the
    smallest overlapping block over the bad hardware.
 -- Fix bug when querying accounting looking for a job node size.
 -- BLUEGENE - fix possible race condition if cleaning up a block and the
    removal of the job on the block failed.
 -- BLUEGENE - fix issue if a cable was in an error state make it so we can
    check if a block is still makable if the cable wasn't in error.
 -- Put nodes names in alphabetic order in node table.
 -- If preempted job should have a grace time and preempt mode is not cancel
    but job is going to be canceled because it is interactive or other reason
    it now receives the grace time.
 -- BGQ - Modified documents to explain new plugin_flags needed in bg.properties
    in order for the runjob_mux to run correctly.
 -- BGQ - change linking from libslurm.o to libslurmhelper.la to avoid warning.

* Changes in SLURM 2.4.0.rc1
=============================
 -- Improve task binding logic by making fuller use of HWLOC library,
    especially with respect to Opteron 6000 series processors. Work contributed
    by Komoto Masahiro.
 -- Add new configuration parameter PriorityFlags, based upon work by
    Carles Fenoy (Barcelona Supercomputer Center).
 -- Modify the step completion RPC between slurmd and slurmstepd in order to
    eliminate a possible deadlock. Based on work by Matthieu Hautreux, CEA.
 -- Change the owner of slurmctld and slurmdbd log files to the appropriate
    user. Without this change the files will be created by and owned by the
    user starting the daemons (likely user root).
 -- Reorganize the slurmstepd logic in order to better support NFS and
    Kerberos credentials via the AUKS plugin. Work by Matthieu Hautreux, CEA.
 -- Fix bug in allocating GRES that are associated with specific CPUs. In some
    cases the code allocated first available GRES to job instead of allocating
    GRES accessible to the specific CPUs allocated to the job.
 -- spank: Add callbacks in slurmd: slurm_spank_slurmd_{init,exit}
    and job epilog/prolog: slurm_spank_job_{prolog,epilog}
 -- spank: Add spank_option_getopt() function to api
 -- Change resolution of switch wait time from minutes to seconds.
 -- Added CrpCPUMins to the output of sshare -l for those using hard limit
    accounting.  Work contributed by Mark Nelson.
 -- Added mpi/pmi2 plugin for complete support of pmi2 including acquiring
    additional resources for newly launched tasks. Contributed by Hongjia Cao,
    NUDT.
 -- BGQ - fixed issue where if a user asked for a specific node count and more
    tasks than possible without overcommit the request would be allowed on more
    nodes than requested.
 -- Add support for new SchedulerParameters of bf_max_job_user, maximum number
    of jobs to attempt backfilling per user. Work by Bjørn-Helge Mevik,
    University of Oslo.
 -- BLUEGENE - fixed issue where MaxNodes limit on a partition only limited
    larger than midplane jobs.
 -- Added cpu_run_min to the output of sshare --long.  Work contributed by
    Mark Nelson.
 -- BGQ - allow regular users to resolve Rack-Midplane to AXYZ coords.
 -- Add sinfo output format option of "%R" for partition name without "*"
    appended for default partition.
 -- Cray - Add support for zero compute node resource allocation to run batch
    script on front-end node with no ALPS reservation. Useful for pre- or post-
    processing.
 -- Support for cyclic distribution of cpus in task/cgroup plugin from Martin
    Perry, Bull.
 -- GrpMEM limit for QOSes and associations added. Patch from Bjørn-Helge Mevik,
    University of Oslo.
 -- Various performance improvements for up to 500% higher throughput depending
    upon configuration. Work supported by the Oak Ridge National Laboratory
    Extreme Scale Systems Center.
 -- Added jobacct_gather/cgroup plugin.  It is not advised to use this in
    production as it isn't currently complete and doesn't provide an equivalent
    substitution for jobacct_gather/linux yet. Work by Martin Perry, Bull.
* Changes in SLURM 2.4.0.pre4
=============================
 -- Add logic to cache GPU file information (bitmap index mapping to device
    file number) in the slurmd daemon and transfer that information to the
    slurmstepd whenever a job step is initiated. This is needed to set the
    appropriate CUDA_VISIBLE_DEVICES environment variable value when the
    devices are not in strict numeric order (e.g. some GPUs are skipped).
    Based upon work by Nicolas Bigaouette.
 -- BGQ - Remove ability to make a sub-block with a geometry with one or more
    of its dimensions of length 3.  There is a limitation in the IBM I/O
    subsystem that is problematic with multiple sub-blocks with a dimension
    of length 3, so we disallow their creation.  This means that if you ask
    the system for an allocation of 12 c-nodes you will be given 16.  If this
    is ever fixed in BGQ you can remove this patch.
 -- BLUEGENE - Better handling blocks that go into error state or deallocate
    while jobs are running on them.
 -- BGQ - fix for handling mix of steps running at same time some of which
    are full allocation jobs, and others that are smaller.
 -- BGQ - fix for core dump after running multiple sub-block jobs on static
    blocks.
 -- BGQ - fixed sync issue where if a job finishes in SLURM but not in mmcs
    for a long time after the SLURM job has been flushed from the system
    we don't have to worry about rebooting the block to sync the system.
 -- BGQ - In scontrol/sview node counts are now displayed with
    CnodeCount/CnodeErrCount to point out that there are cnodes in an error
    state on the block.  Draining the block and having it reboot when all jobs
    are gone will clear up the cnodes in Software Failure.
 -- Change default SchedulerParameters max_switch_wait field value from 60 to
    300 seconds.
 -- BGQ - catch errors from the kill option of the runjob client.
 -- BLUEGENE - make it so the epilog runs until slurmctld tells it the job is
    gone.  Previously it had a timelimit which has proven to not be the right
    thing.
 -- FRONTEND - fix issue where if a compute node was in a down state and
    an admin updates the node to idle/resume the compute nodes will go
    instantly to idle instead of idle* which means no response.
 -- Fix regression in 2.4.0.pre3 where number of submitted jobs limit wasn't
    being honored for QOS.
 -- Cray - Enable logging of BASIL communications with environment variables.
    Set XML_LOG to enable logging. Set XML_LOG_LOC to specify path to log file
    or "SLURM" to write to SlurmctldLogFile or unset for "slurm_basil_xml.log".
    Patch from Steve Trofinoff, CSCS.
 -- FRONTEND - if a front end unexpectedly reboots kill all jobs but don't
    mark front end node down.
 -- FRONTEND - don't down a front end node if you have an epilog error
 -- BLUEGENE - if a job has an epilog error don't down the midplane it was
    running on.
 -- BGQ - added new DebugFlag (NoRealTime) for only printing debug from
    state change while the realtime server is running.
 -- Fix multi-cluster mode with sview starting on a non-bluegene cluster going
    to a bluegene cluster.
 -- BLUEGENE - ability to show Rack Midplane name of midplanes in sview and
    scontrol.
* Changes in SLURM 2.4.0.pre3
=============================
 -- Let a job be submitted even if it exceeds a QOS limit. Job will be left
    in a pending state until the QOS limit or job parameters change. Patch by
    Phil Eckert, LLNL.
 -- Add sacct support for the option "--name". Work by Yuri D'Elia, Center for
    Biomedicine, EURAC Research, Italy.
 -- BGQ - handle preemption.
 -- Add an srun shepherd process to cancel a job and/or step if the srun process
    is killed abnormally (e.g. SIGKILL).
 -- BGQ - handle deadlock issue when a nodeboard goes into an error state.
 -- BGQ - more thorough handling of blocks with multiple jobs running on them.
 -- Fix man2html process to compile in the build directory instead of the
    source dir.
 -- Behavior of srun --multi-prog modified so that any program arguments
    specified on the command line will be appended to the program arguments
    specified in the program configuration file.
 -- Add new command, sdiag, which reports a variety of job scheduling
    statistics. Based upon work by Alejandro Lucero Palau, BSC.
 -- BLUEGENE - Added DefaultConnType to the bluegene.conf file.  This makes it
    so you can specify any connection type you would like (TORUS or MESH) as
    the default in dynamic mode.  Previously it always defaulted to TORUS.
 -- Made squeue -n and -w options more consistent with salloc, sbatch, srun,
    and scancel. Patch by Don Lipari, LLNL.
 -- Have sacctmgr remove user records when no associations exist for that user.
 -- Several header file changes for clean build with NetBSD. Patches from
    Aleksej Saushev.
 -- Fix for possible deadlock in accounting logic: Avoid calling
    jobacct_gather_g_getinfo() until there is data to read from the socket.
 -- Fix race condition that could generate "job_cnt_comp underflow" errors on
    front-end architectures.
 -- BGQ - Fix issue where a system with missing cables could cause core dump.
* Changes in SLURM 2.4.0.pre2
=============================
 -- CRAY - Add support for GPU memory allocation using SLURM GRES (Generic
    RESource) support. Work by Steve Trofinoff, CSCS.
 -- Add support for job allocations with multiple job constraint counts. For
    example: salloc -C "[rack1*2&rack2*4]" ... will allocate the job 2 nodes
    from rack1 and 4 nodes from rack2. Support for only a single constraint
    name has been added to job step support.
 -- BGQ - Remove old method for marking cnodes down.
 -- BGQ - Remove BGP images from view in sview.
 -- BGQ - print out failed cnodes in scontrol show nodes.
 -- BGQ - Add srun option of "--runjob-opts" to pass options to the runjob
    command.
 -- FRONTEND - handle step launch failure better.
 -- BGQ - Added a mutex to protect the now changing ba_system pointers.
 -- BGQ - added new functionality for sub-block allocations - no preemption
    for this yet though.
 -- Add --name option to squeue to filter output by job name. Patch from Yuri
    D'Elia.
 -- BGQ - Added linking to runjob client library which gives support to totalview
    to use srun instead of runjob.
 -- Add numeric range checks to scontrol update options. Patch from Phil
    Eckert, LLNL.
 -- Add ReconfigFlags configuration option to control actions of "scontrol
    reconfig". Patch from Don Albert, Bull.
 -- BGQ - handle reboots with multiple jobs running on a block.
 -- BGQ - Add message handler thread to forward signals to runjob process.
* Changes in SLURM 2.4.0.pre1
=============================
 -- BGQ - use the ba_geo_tables to figure out the blocks instead of the old
    algorithm.  This improves timing in the worst cases and simplifies the code
    greatly.
 -- BLUEGENE - Change to output tools labels from BP to Midplane
    (i.e. BP List -> MidplaneList).
 -- BLUEGENE - read MPs and BPs from the bluegene.conf
 -- Modify srun's SIGINT handling logic timer (two SIGINTs within one second) to
    be based on a microsecond rather than second timer.
 -- Modify advance reservation to accept multiple specific block sizes rather
    than a single node count.
 -- Permit administrator to change a job's QOS to any value without validating
    the job's owner has permission to use that QOS. Based upon patch by Phil
    Eckert (LLNL).
 -- Add trigger flag for a permanent trigger. The trigger will NOT be purged
    after an event occurs, but only when explicitly deleted.
 -- Interpret a reservation with Nodes=ALL and a Partition specification as
    reserving all nodes within the specified partition rather than all nodes
    on the system. Based upon patch by Phil Eckert (LLNL).
 -- Add the ability to reboot all compute nodes after they become idle. The
    RebootProgram configuration parameter must be set and an authorized user
    must execute the command "scontrol reboot_nodes". Patch from Andriy
    Grytsenko (Massive Solutions Limited).
 -- Modify slurmdbd.conf parsing to accept DebugLevel strings (quiet, fatal,
    info, etc.) in addition to numeric values. The parsing of slurm.conf was
    modified in the same fashion for SlurmctldDebug and SlurmdDebug values.
    The output of sview and "scontrol show config" was also modified to report
    those values as strings rather than numeric values.
 -- Changed default value of StateSaveLocation configuration parameter from
    /tmp to /var/spool.
 -- Prevent an association from being deleted if it has any jobs in running,
    pending or suspended state. Previous code prevented this only for running
    jobs.
 -- If a job can not run due to QOS or association limits, then do not cancel
    the job, but leave it pending in a system held state (priority = 1). The
    job will run when its limits or the QOS/association limits change. Based
    upon a patch by Phil Eckert (LLNL).
 -- BGQ - Added logic to keep track of cnodes in an error state inside of a
    booted block.
 -- Added the ability to update a node's NodeAddr and NodeHostName with
    scontrol. Also enable setting a node's state to "future" using scontrol.
 -- Add a node state flag of CLOUD and save/restore NodeAddr and NodeHostName
    information for nodes with a flag of CLOUD.
 -- Cray: Add support for job reservations with node IDs that are not in
    numeric order. Fix for Bugzilla #5.
 -- BGQ - Fix issue with smap -R
 -- Fix association limit support for jobs queued for multiple partitions.
 -- BLUEGENE - fix issue for sub-midplane systems to create a full system
    block correctly.
 -- BLUEGENE - Added option to the bluegene.conf to indicate that you are
    running on a sub-midplane system.
 -- Added the UserID of the user issuing the RPC to the job_submit/lua
    functions.
 -- Fixed issue where, if a job ended with ESLURMD_UID_NOT_FOUND and
    ESLURMD_GID_NOT_FOUND, slurm would be a little overzealous in treating
    a missing GID or UID as a fatal error.
 -- If job time limit exceeds partition maximum, but job's minimum time limit
    does not, set job's time limit to partition maximum at allocation time.
* Changes in SLURM 2.3.6
========================
 -- Fix DefMemPerCPU for partition definitions.
 -- Fix to create a reservation with licenses and no nodes.
 -- Fix issue with assoc_mgr if a bad state file is given and the database
    isn't up at the time the slurmctld starts, not running the
    priority/multifactor plugin, and then the database is started up later.
 -- Gres: If a gres has a count of one and an associated file then when doing