This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.4.0.pre4
=============================
 -- Add logic to cache GPU file information (bitmap index mapping to device
    file number) in the slurmd daemon and transfer that information to the
    slurmstepd whenever a job step is initiated. This is needed to set the
    appropriate CUDA_VISIBLE_DEVICES environment variable value when the
    devices are not in strict numeric order (e.g. some GPUs are skipped).
    Based upon work by Nicolas Bigaouette.
 -- BGQ - Remove ability to make a sub-block with a geometry with one or more
    of its dimensions of length 3.  There is a limitation in the IBM I/O
    subsystem that is problematic with multiple sub-blocks with a dimension
    of length 3, so we disallow their creation.  This means that if you ask
    the system for an allocation of 12 c-nodes you will be given 16.  If this
    is ever fixed in BGQ you can remove this patch.
 -- BLUEGENE - Better handling of blocks that go into error state or deallocate
    while jobs are running on them.
 -- BGQ - fix for handling mix of steps running at same time some of which
    are full allocation jobs, and others that are smaller.
 -- BGQ - fix for core dump after running multiple sub-block jobs on static
    blocks.
 -- BGQ - fixed sync issue where if a job finishes in SLURM but not in mmcs
    for a long time after the SLURM job has been flushed from the system
    we don't have to worry about rebooting the block to sync the system.
 -- BGQ - In scontrol/sview node counts are now displayed with
    CnodeCount/CnodeErrCount to point out that there are cnodes in an error
    state on the block.  Draining the block and having it reboot when all jobs
    are gone will clear up the cnodes in Software Failure.
 -- Change default SchedulerParameters max_switch_wait field value from 60 to
    300 seconds.
 -- BGQ - catch errors from the kill option of the runjob client.
 -- BLUEGENE - make it so the epilog runs until slurmctld tells it the job is
    gone.  Previously it had a timelimit which has proven to not be the right
    thing.
 -- FRONTEND - fix issue where if a compute node was in a down state and
    an admin updates the node to idle/resume the compute nodes will go
    instantly to idle instead of idle* which means no response.
 -- Fix regression in 2.4.0.pre3 where number of submitted jobs limit wasn't
    being honored for QOS.
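
The GPU file caching entry above can be illustrated with a minimal Python
sketch. It maps allocated bitmap indexes to device file numbers when building
a CUDA_VISIBLE_DEVICES value. The function name, the /dev/nvidiaN naming and
the sample data are assumptions made for illustration; this is not the
slurmd/slurmstepd code.

```python
import re


def cuda_visible_devices(device_files, allocated_indexes):
    """Build a CUDA_VISIBLE_DEVICES string for the allocated GPUs.

    device_files: device paths in bitmap-index order (hypothetical input).
    allocated_indexes: bitmap indexes allocated to the job step.
    """
    numbers = []
    for index in allocated_indexes:
        # The device file number is the trailing integer of the path.
        match = re.search(r"(\d+)$", device_files[index])
        if match is None:
            raise ValueError(f"no device number in {device_files[index]!r}")
        numbers.append(match.group(1))
    return ",".join(numbers)


# Hypothetical node where GPU 1 is skipped, so bitmap index 1 is device file 2.
devices = ["/dev/nvidia0", "/dev/nvidia2", "/dev/nvidia3"]
print(cuda_visible_devices(devices, [1, 2]))  # prints 2,3
```

Using the bitmap indexes directly would give "1,2" here, which names a device
file that does not exist on this hypothetical node.
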
* Changes in SLURM 2.4.0.pre3
=============================
 -- Let a job be submitted even if it exceeds a QOS limit. Job will be left
    in a pending state until the QOS limit or job parameters change. Patch by
    Phil Eckert, LLNL.
 -- Add sacct support for the option "--name". Work by Yuri D'Elia, Center for
    Biomedicine, EURAC Research, Italy.
 -- BGQ - handle preemption.
 -- Add an srun shepherd process to cancel a job and/or step if the srun
    process is killed abnormally (e.g. SIGKILL).
 -- BGQ - handle deadlock issue when a nodeboard goes into an error state.
 -- BGQ - more thorough handling of blocks with multiple jobs running on them.
 -- Fix man2html process to compile in the build directory instead of the
    source dir.
 -- Behavior of srun --multi-prog modified so that any program arguments
    specified on the command line will be appended to the program arguments
    specified in the program configuration file.
 -- Add new command, sdiag, which reports a variety of job scheduling
    statistics. Based upon work by Alejandro Lucero Palau, BSC.
 -- BLUEGENE - Added DefaultConnType to the bluegene.conf file.  This makes it
    so you can specify any connection type you would like (TORUS or MESH) as
    the default in dynamic mode.  Previously it always defaulted to TORUS.
 -- Made squeue -n and -w options more consistent with salloc, sbatch, srun,
    and scancel. Patch by Don Lipari, LLNL.
 -- Have sacctmgr remove user records when no associations exist for that user.
 -- Several header file changes for clean build with NetBSD. Patches from
    Aleksej Saushev.
 -- Fix for possible deadlock in accounting logic: Avoid calling
    jobacct_gather_g_getinfo() until there is data to read from the socket.
 -- Fix race condition that could generate "job_cnt_comp underflow" errors on
    front-end architectures.
 -- BGQ - Fix issue where a system with missing cables could cause core dump.
* Changes in SLURM 2.4.0.pre2
=============================
 -- CRAY - Add support for GPU memory allocation using SLURM GRES (Generic
    RESource) support. Work by Steve Trofinoff, CSCS.
 -- Add support for job allocations with multiple job constraint counts. For
    example: salloc -C "[rack1*2&rack2*4]" ... will allocate the job 2 nodes
    from rack1 and 4 nodes from rack2. Support for only a single constraint
    name has been added to job step support.
 -- BGQ - Remove old method for marking cnodes down.
 -- BGQ - Remove BGP images from view in sview.
 -- BGQ - print out failed cnodes in scontrol show nodes.
 -- BGQ - Add srun option of "--runjob-opts" to pass options to the runjob
    command.
 -- FRONTEND - handle step launch failure better.
 -- BGQ - Added a mutex to protect the now changing ba_system pointers.
 -- BGQ - added new functionality for sub-block allocations - no preemption
    for this yet though.
 -- Add --name option to squeue to filter output by job name. Patch from Yuri
    D'Elia.
 -- BGQ - Added linking to runjob client library which gives support to totalview
    to use srun instead of runjob.
 -- Add numeric range checks to scontrol update options. Patch from Phil
    Eckert, LLNL.
 -- Add ReconfigFlags configuration option to control actions of "scontrol
    reconfig". Patch from Don Albert, Bull.
 -- BGQ - handle reboots with multiple jobs running on a block.
 -- BGQ - Add message handler thread to forward signals to runjob process.
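
The constraint count syntax shown in the salloc example above can be read as a
list of name*count terms joined by "&". The following is a minimal Python
sketch of that reading. The function name is hypothetical and the sketch
handles only the bracketed form from the example, not the full constraint
grammar accepted by SLURM.

```python
def parse_constraint_counts(spec):
    """Parse a spec such as "[rack1*2&rack2*4]" into (name, count) pairs."""
    if not (spec.startswith("[") and spec.endswith("]")):
        raise ValueError("expected a bracketed constraint list")
    pairs = []
    for term in spec[1:-1].split("&"):
        # Each term is a feature name, a "*" and a node count.
        name, sep, count = term.partition("*")
        if not name or not sep or not count.isdigit():
            raise ValueError(f"malformed term: {term!r}")
        pairs.append((name, int(count)))
    return pairs


print(parse_constraint_counts("[rack1*2&rack2*4]"))
# prints [('rack1', 2), ('rack2', 4)]
```
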
* Changes in SLURM 2.4.0.pre1
=============================
 -- BGQ - use the ba_geo_tables to figure out the blocks instead of the old
    algorithm.  This improves timing in the worst cases and simplifies the code
    greatly.
 -- BLUEGENE - Change to output tools labels from BP to Midplane
    (i.e. BP List -> MidplaneList).
 -- BLUEGENE - read MPs and BPs from the bluegene.conf
 -- Modify srun's SIGINT handling logic timer (two SIGINTs within one second) to
    be based on a microsecond rather than a second timer.
 -- Modify advance reservation to accept multiple specific block sizes rather
    than a single node count.
 -- Permit administrator to change a job's QOS to any value without validating
    the job's owner has permission to use that QOS. Based upon patch by Phil
    Eckert (LLNL).
 -- Add trigger flag for a permanent trigger. The trigger will NOT be purged
    after an event occurs, but only when explicitly deleted.
 -- Interpret a reservation with Nodes=ALL and a Partition specification as
    reserving all nodes within the specified partition rather than all nodes
    on the system. Based upon patch by Phil Eckert (LLNL).
 -- Add the ability to reboot all compute nodes after they become idle. The
    RebootProgram configuration parameter must be set and an authorized user
    must execute the command "scontrol reboot_nodes". Patch from Andriy
    Grytsenko (Massive Solutions Limited).
 -- Modify slurmdbd.conf parsing to accept DebugLevel strings (quiet, fatal,
    info, etc.) in addition to numeric values. The parsing of slurm.conf was
    modified in the same fashion for SlurmctldDebug and SlurmdDebug values.
    The output of sview and "scontrol show config" was also modified to report
    those values as strings rather than numeric values.
 -- Changed default value of StateSaveLocation configuration parameter from
    /tmp to /var/spool.
 -- Prevent an association from being deleted if it has any jobs in running,
    pending or suspended state. Previous code prevented this only for running
    jobs.
 -- If a job can not run due to QOS or association limits, then do not cancel
    the job, but leave it pending in a system held state (priority = 1). The
    job will run when its limits or the QOS/association limits change. Based
    upon a patch by Phil Eckert (LLNL).
 -- BGQ - Added logic to keep track of cnodes in an error state inside of a
    booted block.
 -- Added the ability to update a node's NodeAddr and NodeHostName with
    scontrol. Also enable setting a node's state to "future" using scontrol.
 -- Add a node state flag of CLOUD and save/restore NodeAddr and NodeHostName
    information for nodes with a flag of CLOUD.
 -- Cray: Add support for job reservations with node IDs that are not in
    numeric order. Fix for Bugzilla #5.
 -- BGQ - Fix issue with smap -R
 -- Fix association limit support for jobs queued for multiple partitions.
 -- BLUEGENE - fix issue for sub-midplane systems to create a full system
    block correctly.
 -- BLUEGENE - Added option to the bluegene.conf to tell you are running on
    a sub midplane system.
 -- Added the UserID of the user issuing the RPC to the job_submit/lua
    functions.
 -- Fixed issue where, if a job ended with ESLURMD_UID_NOT_FOUND or
    ESLURMD_GID_NOT_FOUND, slurm would be a little overzealous in treating
    a missing GID or UID as a fatal error.
 -- If job time limit exceeds partition maximum, but job's minimum time limit
    does not, set job's time limit to partition maximum at allocation time.
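
The time limit rule in the last entry above can be sketched in Python as
follows. The function name, the use of minutes as the unit, and the None
return value for a job that cannot fit are assumptions for illustration only;
they are not taken from the slurmctld code.

```python
def effective_time_limit(time_limit, time_min, partition_max):
    """Return the time limit to apply at allocation time.

    All values are in minutes (an assumption). time_min is the job's
    minimum time limit, or None if the job did not specify one.
    Returns None when the job cannot fit within the partition maximum.
    """
    if time_limit <= partition_max:
        return time_limit
    if time_min is not None and time_min <= partition_max:
        # The limit is too large but the minimum fits: clamp to the maximum.
        return partition_max
    return None


print(effective_time_limit(120, 30, 60))  # prints 60
print(effective_time_limit(45, None, 60))  # prints 45
```
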
* Changes in SLURM 2.3.4
========================
 -- Set DEFAULT flag in partition structure when slurmctld reads the
    configuration file. Patch from Rémi Palancher.
 -- Fix for possible deadlock in accounting logic: Avoid calling
    jobacct_gather_g_getinfo() until there is data to read from the socket.
 -- Fix typo in accounting when using reservations. Patch from Alejandro
    Lucero Palau.
 -- Fix to the multifactor priority plugin to calculate effective usage earlier
    to give a correct priority on the first decay cycle after a restart of the
    slurmctld. Patch from Martin Perry, Bull.
 -- Permit user root to run a job step for any job as any user. Patch from
    Didier Gazen, Laboratoire d'Aerologie.
 -- BLUEGENE - fix for not allowing jobs if all midplanes are drained and all
    blocks are in an error state.
 -- Avoid slurmctld abort due to bad pointer when setting an advanced
    reservation MAINT flag if it contains no nodes (only licenses).
 -- Fix bug when requeued batch job is scheduled to run on a different node
    zero, but attempts job launch on old node zero.
 -- Fix bug in step task distribution when nodes are not configured in numeric
    order. Patch from Hongjia Cao, NUDT.
 -- Fix for srun allocating running within existing allocation with --exclude
    option and --nnodes count small enough to remove more nodes. Patch from
    Phil Eckert, LLNL.
 -- Work around to handle certain combinations of glibc/kernel
    (i.e. glibc-2.14/Linux-3.1) to correctly open the pty of the slurmstepd
    as the job user. Patch from Mark Grondona, LLNL.
 -- Modify linking to include "-ldl" only when needed. Patch from Aleksej
    Saushev.
 -- Fix smap regression to display nodes that are drained or down correctly.
 -- Several bug fixes and performance improvements related to batch
    scripts containing very large numbers of arguments. Patches from Par
    Andersson, NSC.
 -- Fixed extremely hard to reproduce threading issue in assoc_mgr.
 -- Correct "scontrol show daemons" output if there is more than one
    ControlMachine configured.
 -- Add node read lock where needed in slurmctld/agent code.
 -- Added test for LUA library named "liblua5.1.so.0" in addition to
    "liblua5.1.so" as needed by Debian. Patch by Remi Palancher.
 -- Added partition default_time field to job_submit LUA plugin. Patch by
    Remi Palancher.
 -- Fix bug in cray/srun wrapper stdin/out/err file handling.
 -- In cray/srun wrapper, only include aprun "-q" option when srun "--quiet"
    option is used.
 -- BLUEGENE - fix issue where if a small block was in error it could hold up
    the queue when trying to place a larger than midplane job.

* Changes in SLURM 2.3.3
========================
 -- Fix task/cgroup plugin error when used with GRES. Patch by Alexander
    Bersenev (Institute of Mathematics and Mechanics, Russia).
 -- Permit pending job exceeding a partition limit to run if its QOS flag is
    modified to permit the partition limit to be exceeded. Patch from Bill
    Brophy, Bull.
 -- BLUEGENE - Fixed preemption issue.
 -- sacct search for jobs using filtering was ignoring wckey filter.
 -- Fixed issue with QOS preemption when adding new QOS.
 -- Fixed issue with comment field being used in a job finishing before it
    starts in accounting.
 -- Add slashes in front of derived exit code when modifying a job.
 -- Handle numeric suffix of "T" for terabyte units. Patch from John Thiltges,
    University of Nebraska-Lincoln.
 -- Prevent resetting a held job's priority when updating other job parameters.
    Patch from Alejandro Lucero Palau, BSC.
 -- Improve logic to import a user's environment. Needed with --get-user-env
    option used with Moab. Patch from Mark Grondona, LLNL.
 -- Fix bug in sview layout if node count less than configured grid_x_width.
 -- Modify PAM module to prefer to use SLURM library with same major release
    number that it was built with.
 -- Permit gres count configuration of zero.
 -- Fix race condition where sbcast command can result in deadlock of slurmd
    daemon. Patch by Don Albert, Bull.
 -- Fix bug in srun --multi-prog configuration file to avoid printing duplicate
    record error when "*" is used at the end of the file for the task ID.
 -- Let operators see reservation data even if "PrivateData=reservations" flag
    is set in slurm.conf. Patch from Don Albert, Bull.
 -- Added new sbatch option "--export-file" as needed for latest version of
    Moab. Patch from Phil Eckert, LLNL.
 -- Fix for sacct printing CPUTime(RAW) where the value is greater than a
    32-bit number.
 -- Fix bug in --switch option with topology resulting in bad switch count use.
    Patch from Alejandro Lucero Palau (Barcelona Supercomputer Center).
 -- Fix PrivateFlags bug when using Priority Multifactor plugin.  If using sprio
    all jobs would be returned even if the flag was set.
    Patch from Bill Brophy, Bull.
 -- Fix for possible invalid memory reference in slurmctld in job dependency
    logic. Patch from Carles Fenoy (Barcelona Supercomputer Center).
* Changes in SLURM 2.3.2
========================
 -- Add configure option of "--without-rpath" which builds SLURM tools without
    the rpath option, which will work if Munge and BlueGene libraries are in
    the default library search path and make system updates easier.
 -- Fixed issue where if a job ended with ESLURMD_UID_NOT_FOUND and
    ESLURMD_GID_NOT_FOUND where slurm would be a little over zealous
    in treating missing a GID or UID as a fatal error.
 -- Backfill scheduling - Add SchedulerParameters configuration parameter of
    "bf_res" to control the resolution in the backfill scheduler's data about
    when jobs begin and end. Default value is 60 seconds (used to be 1 second).
 -- Cray - Remove the "family" specification from the GPU reservation request.
 -- Updated set_oomadj.c, replacing deprecated oom_adj reference with
    oom_score_adj
 -- Fix resource allocation bug, generic resources allocation was ignoring the
    job's ntasks_per_node and cpus_per_task parameters. Patch from Carles
    Fenoy, BSC.
 -- Avoid orphan job step if slurmctld is down when a job step completes.
 -- Fix Lua link order, patch from Pär Andersson, NSC.
 -- Set SLURM_CPUS_PER_TASK=1 when user specifies --cpus-per-task=1.
 -- Fix for fatal error managing GRES. Patch by Carles Fenoy, BSC.
 -- Fixed race condition when using the DBD in accounting where if a job
    wasn't started at the time the eligible message was sent but started
    before the db_index was returned information like start time would be lost.
 -- Fix issue in accounting where normalized shares could be updated
    incorrectly when getting fairshare from the parent.
 -- Fixed case where associations are not enforced but QOS support is wanted,
    so that a default QOS on the cluster is filled in correctly.
 -- Fix in select/cons_res for "fatal: cons_res: sync loop not progressing"
    with some configurations and job option combinations.
 -- BLUEGENE - Fixed issue with handling HTC modes and rebooting.
* Changes in SLURM 2.3.1
========================
 -- Do not remove the backup slurmctld's pid file when it assumes control, only
    when it actually shuts down. Patch from Andriy Grytsenko (Massive Solutions
    Limited).
 -- Avoid clearing a job's reason from JobHeldAdmin or JobHeldUser when it is
    otherwise updated using scontrol or sview commands. Patch based upon work
    by Phil Eckert (LLNL).
 -- BLUEGENE - Fix for if changing the defined blocks in the bluegene.conf and
    jobs happen to be running on blocks not in the new config.
 -- Many cosmetic modifications to eliminate warning messages from GCC version
    4.6 compiler.
 -- Fix for sview reservation tab when finding correct reservation.
 -- Fix for handling QOS limits per user on a reconfig of the slurmctld.
 -- Do not treat the absence of a gres.conf file as a fatal error on systems
    configured with GRES, but set GRES counts to zero.
 -- BLUEGENE - Update correctly the state in the reason of a block if an
    admin sets the state to error.
 -- BLUEGENE - handle reason of blocks in error more correctly between
    restarts of the slurmctld.
 -- BLUEGENE - Fix minor potential memory leak when setting block error reason.
 -- BLUEGENE - Fix if running in Static/Overlap mode and full system block
    is in an error state, won't deny jobs.
 -- Fix for accounting where your cluster isn't numbered in counting order
    (i.e. 1-9,0 instead of 0-9).  The bug would cause 'sacct -N nodename' to
    not give correct results on these systems.
 -- Fix to GRES allocation logic when resources are associated with specific
    CPUs on a node. Patch from Steve Trofinoff, CSCS.
 -- Fix bugs in sched/backfill with respect to QOS reservation support and job
    time limits. Patch from Alejandro Lucero Palau (Barcelona Supercomputer
    Center).
 -- BGQ - fix to set up corner correctly for sub block jobs.
 -- Major re-write of the CPU Management User and Administrator Guide (web
    page) by Martin Perry, Bull.
 -- BLUEGENE - If removing blocks from system that once existed cleanup of old
    block happens correctly now.
 -- Prevent slurmctld crashing with configuration of MaxMemPerCPU=0.
 -- Prevent job hold by operator or account coordinator of his own job from
    being an Administrator Hold rather than User Hold by default.
 -- Cray - Fix for srun.pl parsing to avoid adding spaces between option and
    argument (e.g. "-N2" parsed properly without changing to "-N 2").
 -- Major updates to cgroup support by Mark Grondona (LLNL) and Matthieu
    Hautreux (CEA) and Sam Lang. Fixes timing problems with respect to the
    task_epilog. Allows cgroup mount point to be configurable. Added new
    configuration parameters MaxRAMPercent and MaxSwapPercent. Allow cgroup
    configuration parameters that are percentages to be floating point.
 -- Fixed issue where sview wasn't displaying correct nice value for jobs.
 -- Fixed issue where sview wasn't displaying correct min memory per node/cpu
    value for jobs.
 -- Disable some SelectTypeParameters for select/linear that aren't compatible.
 -- Move slurm_select_init to proper place to avoid loading multiple select
    plugins in the slurmd.
 -- BGQ - Include runjob_plugin.so in the bluegene rpm.
 -- Report correct job "Reason" if needed nodes are DOWN, DRAINED, or
    NOT_RESPONDING, "Resources" rather than "PartitionNodeLimit".
 -- BLUEGENE - Fixed issues with running on a sub-midplane system.
 -- Added some missing calls to allow older versions of SLURM to talk to newer.
 -- BGQ - allow steps to be run.
 -- Do not attempt to run HealthCheckProgram on powered down nodes. Patch from
    Ramiro Alba, Centre Tecnològic de Transferència de Calor, Spain.
* Changes in SLURM 2.3.0-2
==========================
 -- Fix for memory issue inside sview.
 -- Fix issue where if a job was pending and the slurmctld was restarted a
    variable wasn't initialized in the job structure making it so that job
    wouldn't run.
* Changes in SLURM 2.3.0
========================
 -- BLUEGENE - make sure we only set the jobinfo_select start_loc on a job
    when we are on a small block, not a regular one.
 -- BGQ - fix issue where not copying the correct amount of memory.
 -- BLUEGENE - fix clean start if jobs were running when the slurmctld was
    shutdown and then the system size changed.  This would probably only happen
    if you were emulating a system.
 -- Fix sview for calling a cray system from a non-cray system to get the
    correct geometry of the system.
 -- BLUEGENE - fix to correctly import previous version of block state file.
 -- BLUEGENE - handle loading better when doing a clean start with static
    blocks.
 -- Add sinfo format and sort option "%n" for NodeHostName and "%o" for
    NodeAddr.
 -- If a job is deferred due to partition limits, then re-test those limits
    after a partition is modified. Patch from Don Lipari.
 -- Fix bug which would crash slurmctld if job's owner (not root) tries to clear
    a job's licenses by setting value to "".
 -- Cosmetic fix for printing out debug info in the priority plugin.
 -- In sview when switching from a bluegene machine to a regular linux cluster
    and vice versa the node->base partition lists will be displayed if setup
    in your .slurm/sviewrc file.
 -- BLUEGENE - Fix for creating full system static block on a BGQ system.
 -- BLUEGENE - Fix deadlock issue if toggling between Dynamic and Static block
    allocation with jobs running on blocks that don't exist in the static
    setup.
 -- BLUEGENE - Modify code to only give HTC states to BGP systems and not
    allow them on Q systems.
 -- BLUEGENE - Make it possible for an admin to define multiple dimension
    conn_types in a block definition.
 -- BGQ - Alter tools to output multiple dimensional conn_type.
* Changes in SLURM 2.3.0.rc2
============================
 -- With sched/wiki or sched/wiki2 (Maui or Moab scheduler), ensure that a
    requeued job's priority is reset to zero.
 -- BLUEGENE - fix to run steps correctly in a BGL/P emulated system.
 -- Fixed issue where if there was a network issue between the slurmctld and
    the DBD where both remained up but were disconnected the slurmctld would
    get registered again with the DBD.
 -- Fixed issue where if the DBD connection from the ctld goes away because of
    a POLLERR the dbd_fail callback is called.
 -- BLUEGENE - Fix to smap command-line mode display.
 -- Change in GRES behavior for job steps: A job step's default generic
    resource allocation will be set to that of the job. If a job step's --gres
    value is set to "none" then none of the generic resources which have been
    allocated to the job will be allocated to the job step.
 -- Add srun environment value of SLURM_STEP_GRES to set default --gres value
    for a job step.
 -- Require SchedulerTimeSlice configuration parameter to be at least 5 seconds
    to avoid thrashing slurmd daemon.
 -- Cray - Fix to make nodes state in accounting consistent with state set by
    ALPS.
 -- Cray - A node DOWN to ALPS will be marked DOWN to SLURM only after reaching
    SlurmdTimeout. In the interim, the node state will be NO_RESPOND. This
    change makes SLURM handling of the node DOWN state more consistent with
    ALPS. This change affects only Cray systems.
 -- Cray - Fix to work with 4.0.* instead of just 4.0.0
 -- Cray - Modify srun/aprun wrapper to map --exclusive to -F exclusive and
    --share to -F share. Note this does not consider the partition's Shared
    configuration, so it is an imperfect mapping of options.
 -- BLUEGENE - Added notice in the print config to tell if you are emulated
    or not.
 -- BLUEGENE - Fix job step scalability issue with large task count.
 -- BGQ - Improved c-node selection when asked for a sub-block job that
    cannot fit into the available shape.
 -- BLUEGENE - Modify "scontrol show step" to show  I/O nodes (BGL and BGP) or
    c-nodes (BGQ) allocated to each step. Change field name from "Nodes=" to
    "BP_List=".
 -- Code cleanup on step request to get the correct select_jobinfo.
 -- Memory leak fixed for rolling up accounting with down clusters.
 -- BGQ - fix issue where if first job step is the entire block and then the
    next parallel step is run on a sub block, SLURM won't over subscribe cnodes.
 -- Treat duplicate switch name in topology.conf as fatal error. Patch from Rod
    Schultz, Bull
 -- Minor update to documentation describing the AllowGroups option for a
    partition in the slurm.conf.
 -- Fix problem with _job_create() when not using qos's.  It makes
    _job_create() consistent with similar logic in select_nodes().
 -- GrpCPURunMins in a QOS fleshed out.
 -- Fix for squeue -t "CONFIGURING" to actually work.
 -- CRAY - Add cray.conf parameter of SyncTimeout, maximum time to defer job
    scheduling if SLURM node or job state are out of synchronization with ALPS.
 -- If salloc was run as interactive, with job control, reset the foreground
    process group of the terminal to the process group of the parent pid before
    exiting. Patch from Don Albert, Bull.
 -- BGQ - set up the corner of a sub block correctly based on a relative
    position in the block instead of absolute.
 -- BGQ - make sure the recently added select_jobinfo of a step launch request
    isn't sent to the slurmd where environment variables would be overwritten
    incorrectly.

* Changes in SLURM 2.3.0.rc1
============================
 -- NOTE THERE HAVE BEEN NEW FIELDS ADDED TO THE JOB AND PARTITION STATE SAVE
    FILES AND RPCS. PENDING AND RUNNING JOBS WILL BE LOST WHEN UPGRADING FROM
    EARLIER VERSION 2.3 PRE-RELEASES AND RPCS WILL NOT WORK WITH EARLIER
    VERSIONS.
 -- select/cray: Add support for Accelerator information including model and
    memory options.
 -- Cray systems: Add support to suspend/resume salloc command to ensure that
    aprun does not get initiated when the job is suspended. Processes suspended
    and resumed are determined by using process group ID and parent process ID,
    so some processes may be missed. Since salloc runs as a normal user, its
    ability to identify processes associated with a job is limited.
 -- Cray systems: Modify smap and sview to display all nodes even if multiple
    nodes exist at each coordinate.
 -- Improve efficiency of select/linear plugin with topology/tree plugin
    configured. Patch by Andriy Grytsenko (Massive Solutions Limited).
 -- For front-end architectures on which job steps are run (emulated Cray and
    BlueGene systems only), fix bug that would free memory still in use.
 -- Add squeue support to display a job's license information. Patch by Andy
    Roosen (University of Delaware).
 -- Add flag to the select APIs for job suspend/resume indicating if the action
    is for gang scheduling or an explicit job suspend/resume by the user. Only
    an explicit job suspend/resume will reset the job's priority and make
    resources exclusively held by the job available to other jobs.
 -- Fix possible invalid memory reference in sched/backfill. Patch by Andriy
    Grytsenko (Massive Solutions Limited).
 -- Add select_jobinfo to the task launch RPC. Based upon patch by Andriy
    Grytsenko (Massive Solutions Limited).
 -- Add DefMemPerCPU/Node and MaxMemPerCPU/Node to partition configuration.
    This improves flexibility when gang scheduling only specific partitions.
 -- Added new enums to print out when a job is held by a QOS instead of an
    association limit.
 -- Enhancements to sched/backfill performance with select/cons_res plugin.
    Patch from Bjørn-Helge Mevik, University of Oslo.
 -- Correct job run time reported by smap for suspended jobs.
 -- Improve job preemption logic to avoid preempting more jobs than needed.
 -- Add contribs/arrayrun tool providing support for job arrays. Contributed by
    Bjørn-Helge Mevik, University of Oslo. NOTE: Not currently packaged as RPM
    and manual file editing is required.
 -- When suspending a job, wait 2 seconds instead of 1 second between sending
    SIGTSTP and SIGSTOP. Some MPI implementations were not stopping within the
    1 second delay.
 -- Add support for managing devices based upon Linux cgroup container. Based
    upon patch by Yiannis Georgiou, Bull.
 -- Fix memory buffering bug if an AllowGroups parameter of a partition has 100
    or more users. Patch by Andriy Grytsenko (Massive Solutions Limited).
 -- Fix bug in generic resource tracking of gres associated with specific CPUs.
    Resources were being over-allocated.
 -- On systems with front-end nodes (IBM BlueGene and Cray) limit batch jobs to
    only one CPU of these shared resources.
 -- Set SLURM_MEM_PER_CPU or SLURM_MEM_PER_NODE environment variables for both
    interactive (salloc) and batch jobs if the job has a memory limit. For Cray
    systems also set CRAY_AUTO_APRUN_OPTIONS environment variable with the
    memory limit.
 -- Fix bug in select/cons_res task distribution logic when tasks-per-node=0.
    Patch from Rod Schultz, Bull.
 -- Restore node configuration information (CPUs, memory, etc.) for powered
    down nodes when slurmctld daemon restarts rather than waiting for the node
    to be restored to service and getting the information from the node (NOTE:
    Only relevant if FastSchedule=0).
 -- For Cray systems with the srun2aprun wrapper, rebuild the srun man page
    identifying the srun options which are valid on that system.
 -- BlueGene: Permit users to specify a separate connection type for each
    dimension (e.g. "--conn-type=torus,mesh,torus").
 -- Add the ability for a user to limit the number of leaf switches in a job's
    allocation using the --switch option of salloc, sbatch and srun. There is
    also a new SchedulerParameters value of max_switch_wait, which a SLURM
    administrator can use to set a maximum job delay and prevent a user job
    from blocking lower priority jobs for too long. Based on work by Rod
    Schultz, Bull.
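
The job suspend sequence described above (SIGTSTP, a 2 second wait, then
SIGSTOP) can be sketched in Python as shown below. The function name, the
per-process signalling and the injectable send/sleep parameters are
assumptions for illustration; how SLURM itself identifies and signals a job's
processes is not shown here.

```python
import os
import signal
import time

SUSPEND_DELAY = 2  # seconds between SIGTSTP and SIGSTOP, per the entry above


def suspend_processes(pids, send=os.kill, sleep=time.sleep):
    """Send SIGTSTP to every pid, wait, then send SIGSTOP.

    send and sleep default to os.kill and time.sleep; they are parameters
    only so that the sequence can be exercised without real processes.
    """
    for pid in pids:
        send(pid, signal.SIGTSTP)  # polite request, can be caught
    sleep(SUSPEND_DELAY)           # give the processes time to stop
    for pid in pids:
        send(pid, signal.SIGSTOP)  # cannot be caught or ignored


# Dry run with stand-ins that only record what would happen.
log = []
suspend_processes(
    [101, 102],
    send=lambda pid, sig: log.append((pid, signal.Signals(sig).name)),
    sleep=lambda seconds: log.append(("sleep", seconds)),
)
print(log)
```

The delay matters because a process that handles SIGTSTP needs time to reach a
safe point before the uncatchable SIGSTOP arrives.
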
* Changes in SLURM 2.3.0.pre6
=============================
 -- NOTE: THERE HAS BEEN A NEW FIELD ADDED TO THE CONFIGURATION RESPONSE RPC
    AS SHOWN BY "SCONTROL SHOW CONFIG". THIS FUNCTION WILL ONLY WORK WHEN THE
    SERVER AND CLIENT ARE BOTH RUNNING SLURM VERSION 2.3.0.pre6
 -- Modify job expansion logic to support licenses, generic resources, and
    currently running job steps.
 -- Added an rpath if using the --with-munge option of configure.
 -- Add support for multiple sets of DEFAULT node, partition, and frontend
    specifications in slurm.conf so that default values can be changed multiple
    times as the configuration file is read.
 -- BLUEGENE - Improved logic to place small blocks in free space before freeing
    larger blocks.
 -- Add optional argument to srun's --kill-on-bad-exit so that user can set
    its value to zero and override a SLURM configuration parameter of
    KillOnBadExit.
 -- Fix bug in GraceTime support for preempted jobs that prevented proper
    operation when more than one job was being preempted. Based on patch from
    Bill Brophy, Bull.
 -- Fix for running sview from a non-bluegene cluster to a bluegene cluster.
    Regression from pre5.
 -- If job's TMPDIR environment is not set or is not usable, reset to "/tmp".
    Patch from Andriy Grytsenko (Massive Solutions Limited).
 -- Remove logic for defunct RPC: DBD_GET_JOBS.
 -- Propagate DebugFlag changes by scontrol to the plugins.
 -- Improve accuracy of REQUEST_JOB_WILL_RUN start time with respect to higher
    priority pending jobs.
 -- Add -R/--reservation option to squeue command as a job filter.
 -- Add scancel support for --clusters option.
 -- Note that scontrol and sprio can only support a single cluster at one time.
 -- Add support to salloc for a new environment variable SALLOC_KILL_CMD.
 -- Add scontrol ability to increment or decrement a job or step time limit.
 -- Add support for SLURM_TIME_FORMAT environment variable to control time
    stamp output format. Work by Gerrit Renker, CSCS.
 -- Fix error handling in mvapich plugin that could cause srun to enter an
    infinite loop under rare circumstances.
 -- Add support for multiple task plugins. Patch from Andriy Grytsenko (Massive
    Solutions Limited).
 -- Addition of per-user node/cpu limits for QOS's. Patch from Aaron Knister,
    UMBC.
 -- Fix logic for multiple job resize operations.
 -- BLUEGENE - many fixes to make things work correctly on a L/P system.
 -- Fix bug in layout of job step with --nodelist option plus node count. Old
    code could allocate too few nodes.
* Changes in SLURM 2.3.0.pre5
=============================
 -- NOTE: THERE HAS BEEN A NEW FIELD ADDED TO THE JOB STATE FILE. UPGRADES FROM
    VERSION 2.3.0-PRE4 WILL RESULT IN LOST JOBS UNLESS THE "orig_dependency"
    FIELD IS REMOVED FROM JOB STATE SAVE/RESTORE LOGIC. ON CRAY SYSTEMS A NEW
    "confirm_cookie" FIELD WAS ADDED AND HAS THE SAME EFFECT OF DISABLING JOB
    STATE RESTORE.
 -- BLUEGENE - Improve speed of start up when removing blocks at the beginning.
 -- Correct init.d/slurm status to have non-zero exit code if ANY Slurm
    daemon that should be running on the node is not running. Patch from Rod
    Schultz, Bull.
 -- Improve accuracy of response to "srun --test-only jobid=#".
 -- Fix bug in front-end configurations which reports job_cnt_comp underflow
    errors after slurmctld restarts.
 -- Eliminate "error from _trigger_slurmctld_event in backup.c" due to lack of
    event triggers.
 -- Fix logic in BackupController to properly recover front-end node state and
    avoid purging active jobs.
 -- Added man pages to html pages and the new cpu_management.html page.
    Submitted by Martin Perry / Rod Schultz, Bull.
 -- Job dependency information will only show the currently active dependencies
    rather than the original dependencies. From Dan Rusak, Bull.
 -- Add RPCs to get the SPANK environment variables from the slurmctld daemon.
    Patch from Andrej N. Gritsenko.
 -- Updated plugins/task/cgroup/task_cgroup_cpuset.c to support newer
    HWLOC_API_VERSION.
 -- Do not build select/bluegene plugin if C++ compiler is not installed.
 -- Add new configure option --with-srun2aprun to build an srun command
    which is a wrapper over Cray's aprun command and supports many srun
    options. Without this option, the srun command will advise the user
    to use the aprun command.
 -- Change container ID supported by proctrack plugin from 32-bit to 64-bit.
 -- Added contribs/cray/libalps_test_programs.tar.gz with tools to validate
    SLURM's logic used to support Cray systems.
 -- Create RPM for srun command that is a wrapper for the Cray/ALPS aprun
    command. Dependent upon .rpmmacros parameter of "%_with_srun2aprun".
 -- Add configuration parameter MaxStepCount to limit effect of bad batch
    scripts.
 -- Moving to github
 -- Fix for handling a 2.3 system talking to a 2.2 slurmctld.
 -- Add contribs/lua/job_submit.license.lua script. Update job_submit and Lua
    related documentation.
 -- Test if _make_batch_script() is called with a NULL script.
 -- Increase hostlist support from 24k to 64k nodes.
 -- Renamed the Accounting Storage database's "DerivedExitString" job field to
    "Comment".  Provided backward compatible support for "DerivedExitString" in
    the sacctmgr tool.
 -- Added the ability to save the job's comment field to the Accounting
    Storage db (to the formerly named, "DerivedExitString" job field).  This
    behavior is enabled by a new slurm.conf parameter:
    AccountingStoreJobComment.
 -- Fix srun to handle signals correctly when waiting for a step creation.
 -- Preserve the last job ID across slurmctld daemon restarts even if the job
    state file can not be fully recovered.
 -- Made the hostlist functions able to handle dimensions of any size,
    regardless of the number of dimensions in the cluster.
* Changes in SLURM 2.3.0.pre4
=============================
 -- Add GraceTime to Partition and QOS data structures. Preempted jobs will be
    given this time interval before termination. Work by Bill Brophy, Bull.
 -- Add the ability for scontrol and sview to modify slurmctld DebugFlags
    values.
 -- Various Cray-specific patches:
    - Fix a bug in distinguishing XT from XE.
    - Avoids problems with empty nodenames on Cray.
    - Check whether ALPS is hanging on to nodes, which happens if ALPS has not
      yet cleaned up the node partition.
    - Stops select/cray from clobbering node_ptr->reason.
    - Perform 'safe' release of ALPS reservations using inventory and apkill.
    - Compile-time sanity check for the apbasil and apkill files.
    - Changes error handling in do_basil_release() (called by
      select_g_job_fini()).
    - Warn that salloc --no-shell option is not supported on Cray systems.
 -- Add a reservation flag of "License_Only". If set, then jobs using the
    reservation may use the licenses associated with it plus any compute nodes.
    Otherwise the job is limited to the compute nodes associated with the
    reservation.
 -- Change slurm.conf node configuration parameter from "Procs" to "CPUs".
    Both parameters will be supported for now.
 -- BLUEGENE - fix for when user requests only midplane names with no count at
    job submission time to process the node count correctly.
 -- Fix job step resource allocation problem when both node and tasks counts
    are specified. New logic selects nodes with larger CPU counts as needed.
 -- BGQ - make it so srun wraps runjob (still under construction, but works
    for most cases)
 -- Permit a job's QOS and Comment field to both change in a single RPC. This
    was previously disabled since Moab stored the QOS within the Comment field.
 -- Add support for jobs to expand in size. Submit additional batch job with
    the option "--dependency=expand:<jobid>". See web page "faq.html#job_size"
    for details. Restrictions to be removed in the future.
 -- Added --with-alps-emulation to configure, and also an optional cray.conf
    to setup alps location and database information.
 -- Modify PMI data types from 16-bits to 32-bits in order to support MPICH2
    jobs with more than 65,536 tasks. Patch from Hongjia Cao, NUDT.
 -- Set slurmd's soft process CPU limit equal to its hard limit and notify the
    user if the limit is not infinite.
 -- Added proctrack/cgroup and task/cgroup plugins from Matthieu Hautreux, CEA.
 -- Fix slurmctld restart logic that could leave nodes in UNKNOWN state for a
    longer time than necessary after restart.
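
The PMI data type entry above comes down to field width: an unsigned 16-bit
field holds values 0 through 65,535, so a larger count wraps around. The
Python sketch below demonstrates the wraparound with ctypes. It is an
illustration only, not SLURM's PMI code, and the helper names are hypothetical.

```python
import ctypes

UINT16_MAX = 2**16 - 1  # largest value an unsigned 16-bit field can hold
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def store_u16(value):
    """Value read back after storing `value` in an unsigned 16-bit field."""
    return ctypes.c_uint16(value).value  # ctypes truncates silently


def store_u32(value):
    """Value read back after storing `value` in an unsigned 32-bit field."""
    return ctypes.c_uint32(value).value


# A task count of 70,000 does not survive a 16-bit field but fits in 32 bits.
print(store_u16(70000), store_u32(70000))
```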
* Changes in SLURM 2.3.0.pre3
=============================
 -- BGQ - Appears to work correctly in emulation mode, no sub blocks just yet.
 -- Minor typos fixed
 -- Various bug fixes for Cray systems.
 -- Fix bug in which setting a compute node to idle state failed to set the
    system's up_node_bitmap.
 -- BLUEGENE - code reorder
 -- BLUEGENE - Now only one select plugin for all Bluegene systems.
 -- Modify srun to set the SLURM_JOB_NAME environment variable when srun is
    used to create a new job allocation. Not set when srun is used to create a
    job step within an existing job allocation.
 -- Modify init.d/slurm script to start multiple slurmd daemons per compute
    node if so configured. Patch from Matthieu Hautreux, CEA.
 -- Change license data structure counters from uint16_t to uint32_t to support
    larger license counts.
* Changes in SLURM 2.3.0.pre2
=============================
 -- Log a job's requeue or cancellation due to preemption to that job's stderr:
    "*** JOB 65547 CANCELLED AT 2011-01-21T12:59:33 DUE TO PREEMPTION ***".
 -- Added new job termination state of JOB_PREEMPTED, "PR" or "PREEMPTED" to
    indicate job termination was due to preemption.
 -- Optimize advanced reservations resource selection for computer topology.
    The logic has been added to select/linear and select/cons_res, but will
    not be enabled until the other select plugins are modified.
 -- Remove checkpoint/xlch plugin.
 -- Disable deletion of partitions that have unfinished jobs (pending,
    running or suspended states). Patch from Martin Perry, BULL.
 -- In sview, disable the sorting of node records by name at startup for
    clusters over 1000 nodes. Users can enable this by selecting the "Name"
    tab. This change dramatically improves scalability of sview.
 -- Report error when trying to change a node's state from scontrol for Cray
 -- Do not attempt to read the batch script for non-batch jobs. This patch
    eliminates some inappropriate error messages.
 -- Preserve NodeHostName when reordering nodes due to system topology.
 -- On Cray/ALPS systems do node inventory before scheduling jobs.
 -- Disable some salloc options on Cray systems.
 -- Disable scontrol's wait_job command on Cray systems.
 -- Disable srun command on native Cray/ALPS systems.
 -- Updated configure option "--enable-cray-emulation" (still under
    development) to emulate a Cray XT/XE system, and auto-detect a real
    Cray XT/XE system (removed no longer needed --enable-cray configure
    option).  Building on native Cray systems requires the
    cray-MySQL-devel-enterprise rpm and expat XML parser library/headers.
* Changes in SLURM 2.3.0.pre1
=============================
 -- When a slurmctld closes its connection to the database, its registered
    host and port are now removed.
 -- Added slurmdbd.conf flag TrackSlurmctldDown. If set, idle resources on a
    cluster are marked as down when a slurmctld disconnects or is no longer
    reachable.
 -- Added support for more than one front-end node to run slurmd on
    architectures where the slurmd does not execute on the compute nodes
    (e.g. BlueGene). New configuration parameters FrontendNode and FrontendAddr
    added. See "man slurm.conf" for more information.
 -- With the scontrol show job command when using the --details option, show
    a batch job's script.
 -- Add ability to create reservations or partitions and submit batch jobs
    using sview. Also add the ability to delete reservations and partitions.
 -- Added new configuration parameter MaxJobId. Once reached, restart job ID
    values at FirstJobId.
 -- When restarting slurmctld with priority/basic, increment all job priorities
    so the highest job priority becomes TOP_PRIORITY.
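
The MaxJobId entry above describes a wraparound of the job ID sequence. A
minimal Python sketch of that behaviour follows. It is not the slurmctld
implementation: the function name next_job_id is hypothetical, MaxJobId is
assumed here to be the last usable value, and a real controller would also
need to skip IDs that are still in use.

```python
def next_job_id(current, first_job_id, max_job_id):
    """Return the job ID that follows `current`.

    Assumption for illustration: max_job_id is inclusive, and the sequence
    restarts at first_job_id once it has been reached.
    """
    candidate = current + 1
    if candidate > max_job_id:
        return first_job_id
    return candidate


# With FirstJobId=1 and MaxJobId=10, the ID after 10 is 1 again.
print(next_job_id(10, first_job_id=1, max_job_id=10))
```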
* Changes in SLURM 2.2.8
========================
 -- Prevent background salloc disconnecting terminal at termination. Patch by
    Don Albert, Bull.
 -- Fixed issue where preempt mode is skipped when creating a QOS. Patch by
    Bill Brophy, Bull.
 -- Fixed documentation (html) for PriorityUsageResetPeriod to match that in the
    man pages. Patch by Nancy Kritkausky, Bull.

* Changes in SLURM 2.2.7
========================
 -- Eliminate zombie process created if salloc exits with stopped child
    process. Patch from Gerrit Renker, CSCS.
 -- With default configuration on non-Cray systems, enable salloc to be
    spawned as a background process. Based upon work by Don Albert (Bull) and
    Gerrit Renker (CSCS).
 -- Fixed regression from 2.2.4 in accounting where an inherited limit
    would not be set correctly in the added child association.
 -- Fixed issue with accounting when asking for jobs with a hostlist.
 -- Avoid clearing a node's Arch, OS, BootTime and SlurmdStartTime when
    "scontrol reconfig" is run. Patch from Martin Perry, Bull.

* Changes in SLURM 2.2.6
========================
 -- Fix display of account coordinators with sacctmgr. Previously it was
    possible to show deleted accounts. Only a cosmetic issue, since the
    accounts are already deleted and have no associations.
 -- Prevent opaque ncurses WINDOW struct on OS X 10.6.
 -- Fix issue with accounting when using PrivateData=jobs: users were not
    able to view their own jobs unless they were admins or coordinators.
 -- Fix bug in node state if slurmctld is restarted while nodes are in the
    process of being powered up. Patch from Andriy Grytsenko.
 -- Change maximum batch script size from 128k to 4M.
 -- Get slurmd -f option working. Patch from Andriy Grytsenko.
 -- Fix for linking problem on OSX. Patches from Jon Bringhurst (LANL) and
    Tyler Strickland.
 -- Reset a job's priority to zero (suspended) when Moab requeues the job.
    Patch from Par Andersson, NSC.
 -- When enforcing accounting, fix polling for unknown uids for users after
    the slurmctld started.  Previously one would have to issue a reconfigure
    to the slurmctld to have it look for new uids.
 -- BLUEGENE - Fix issue where accounting wasn't updated correctly when a
    block that had gone into an error state was resumed.
 -- Synchronize power-save module better with scheduler. Patch from
    Andriy Grytsenko (Massive Solutions Limited).
 -- Avoid SEGV in association logic with user=NULL. Patch from
    Andriy Grytsenko (Massive Solutions Limited).
 -- Fixed issue in accounting where it was possible for a new
    association/wckey to be set incorrectly as a default if the new object
    was added after an original default object already existed.  Previously
    the slurmctld would need to be restarted to fix the issue.
 -- Updated the Normalized Usage section in priority_multifactor.shtml.
 -- Disable use of SQUEUE_FORMAT env var if squeue -l, -o, or -s option is
    used. Patch from Aaron Knister (UMBC).

* Changes in SLURM 2.2.5
========================
 -- Correct init.d/slurm status to have non-zero exit code if ANY Slurm
    daemon that should be running on the node is not running. Patch from Rod
    Schultz, Bull.
 -- Improve accuracy of response to "srun --test-only jobid=#".
 -- Correct logic to properly support --ntasks-per-node option in the
    select/cons_res plugin. Patch from Rod Schultz, Bull.
 -- Fix bug in select/cons_res with respect to generic resource (gres)
    scheduling which prevented some jobs from starting as soon as possible.
 -- Fix memory leak in select/cons_res when backfill scheduling generic
    resources (gres).
 -- Fix for when configuring a node with more resources than in real life
    and using task/affinity.
 -- Fix so slurmctld will pack correctly 2.1 step information. (Only needed if
    a 2.1 client is talking to a 2.2 slurmctld.)
 -- Set powered down node's state to IDLE+POWER after slurmctld restart instead
    of leaving in UNKNOWN+POWER. Patch from Andrej Gritsenko.
 -- Fix bug where srun's executable is not on its current search path, but
    can be found in the user's default search path. Modify slurmstepd to find
    the executable. Patch from Andrej Gritsenko.
 -- Make sview display correct cpu count for steps.
 -- BLUEGENE - when running in overlap mode make sure to check the connection
    type so you can create overlapping blocks on the exact same nodes with
    different connection types (i.e. one torus, one mesh).
 -- Fix memory leak if MPI ports are reserved (for OpenMPI) and srun's
    --resv-ports option is used.
 -- Fix some anomalies in select/cons_res task layout when using the
    --cpus-per-task option. Patch from Martin Perry, Bull.
 -- Improve backfill scheduling logic when job specifies --ntasks-per-node and
    --mem-per-cpu options on a heterogeneous cluster. Patch from Bjorn-Helge
    Mevik, University of Oslo.
 -- Print warning message if srun specifies --cpus-per-task larger than used
    to create job allocation.
 -- Fix issue when changing a user's name in accounting while using wckeys.
    The change would execute correctly, but a bad memory copy would crash the
    DBD.  No information would be lost or corrupted, but you would need to
    restart the DBD.
* Changes in SLURM 2.2.4
========================
 -- For batch jobs for which the Prolog fails, substitute the job ID for any
    "%j" in the job's output or error file specification.
 -- Add licenses field to the sview reservation information.
 -- BLUEGENE - Fix for handling extremely overloaded system on Dynamic system
    dealing with starting jobs on overlapping blocks.  Previous fallout
    was job would be requeued.  (happens very rarely)
 -- In accounting_storage/filetxt plugin, substitute spaces within job names,
    step names, and account names with an underscore to ensure proper parsing.
 -- When building contribs/perlapi ignore both INSTALL_BASE and PERL_MM_OPT.
    Use PREFIX instead to avoid build errors from multiple installation
    specifications.
 -- Add job_submit/cnode plugin to support resource reservations of less than
    a full midplane on BlueGene computers. Treat cnodes as licenses which can
    be reserved and are consumed by jobs. This reservation mechanism for less
    than an entire midplane is still under development.
 -- Clear a job's "reason" field when a held job is released.
 -- When releasing a held job, calculate a new priority for it rather than
    just setting the priority to 1.
 -- Fix for sview started on a non-bluegene system to pick colors correctly
    when talking to a real bluegene system.
 -- Improve sched/backfill's expected start time calculation.
 -- Prevent abort of sacctmgr for dump command with invalid (or no) filename.
 -- Improve handling of job updates when using limits in accounting, and
    updating jobs as a non-admin user.
 -- Fix for "squeue --states=all" option. Bug would show no jobs.
 -- Schedule jobs with reservations before those without reservations.
 -- Fix squeue/scancel to query correctly against accounts of different case.
 -- Abort an srun command when its associated job gets aborted due to a
    dependency that can not be satisfied.
 -- In jobcomp plugins, report start time of zero if pending job is cancelled.
    Previously the expected start time might be reported.
 -- Fixed sacctmgr man to state correct variables.
 -- Select nodes based upon their Weight when job allocation requests include
    a constraint field with a count (e.g. "srun --constraint=gpu*2 -N4 a.out").
 -- Add support for user names that are entirely numeric and do not treat them
    as UID values. Patch from Dennis Leepow.
 -- Patch to un/pack double values properly if negative value.  Patch from
 -- Do not reset a job's priority when requeued or suspended.
 -- Fix problem that could let new jobs start on a node in DRAINED state.
 -- Fix cosmetic sacctmgr issue where if the user you are trying to add
    doesn't exist in the /etc/passwd file and the account you are trying
    to add them to doesn't exist it would print (null) instead of the bad
    account name.
 -- Fix associations/qos so that when a previously deleted object is added
    back, the object is cleared of all old limits.
 -- BLUEGENE - Added back a lock when creating dynamic blocks to be more thread
    safe on larger systems with heavy load.
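
Two entries in this release describe simple string substitutions: replacing
"%j" with the job ID in output file specifications, and replacing spaces with
underscores in the accounting_storage/filetxt plugin. The Python sketch below
illustrates both. The function names are hypothetical, this is not the SLURM
implementation, and any other filename specifiers are left untouched.

```python
def expand_job_filename(pattern, job_id):
    """Replace every "%j" in an output/error file pattern with the job ID."""
    return pattern.replace("%j", str(job_id))


def sanitize_name(name):
    """Replace spaces with underscores so a space-delimited record parses."""
    return name.replace(" ", "_")


print(expand_job_filename("slurm-%j.out", 65547))
print(sanitize_name("my test job"))
```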

* Changes in SLURM 2.2.3
========================
 -- Update srun, salloc, and sbatch man page description of --distribution
    option. Patches from Rod Schultz, Bull.
 -- Applied patch from Martin Perry to fix "Incorrect results for task/affinity
    block second distribution and cpus-per-task > 1" bug.
 -- Avoid setting a job's eligible time while held (priority == 0).
 -- Substantial performance improvement to backfill scheduling. Patch from
    Bjorn-Helge Mevik, University of Oslo.
 -- Make timeout for communications to the slurmctld be based upon the
    MessageTimeout configuration parameter rather than always 3 seconds.
    Patch from Matthieu Hautreux, CEA.
 -- Add new scontrol option of "show aliases" to report every NodeName that is
    associated with a given NodeHostName when running multiple slurmd daemons
    per compute node (typically used for testing purposes). Patch from
    Matthieu Hautreux, CEA.
 -- Fix for handling job names with a "'" in the name within MySQL accounting.
    Patch from Gerrit Renker, CSCS.
 -- Modify condition under which salloc execution is delayed until moved to
    the foreground. Patch from Gerrit Renker, CSCS.
	Job control for interactive salloc sessions: only if ...
	a) input is from a terminal (stdin has valid termios attributes),
	b) controlling terminal exists (non-negative tpgid),
	c) salloc is not run in allocation-only (--no-shell) mode,
	d) salloc runs in its own process group (true in interactive
	   shells that support job control),
	e) salloc has been configured at compile-time to support background
	   execution and is not currently in the background process group.
 -- Abort salloc if no controlling terminal and --no-shell option is not used
    ("setsid salloc ..." is disabled). Patch from Gerrit Renker, CSCS.
 -- Fix to gang scheduling logic which could cause jobs to not be suspended
    or resumed when appropriate.
 -- Applied patch from Martin Perry to fix "Slurmd abort when using task
    affinity with plane distribution" bug.
 -- Applied patch from Yiannis Georgiou to fix "Problem with cpu binding to
    sockets option" behaviour. This change causes "--cpu_bind=sockets" to bind
    tasks only to the CPUs on each socket allocated to the job rather than all
    CPUs on each socket.
 -- Advance daily or weekly reservations immediately after termination to avoid
    having a job start that runs into the reservation when later advanced.
 -- Fix for enabling users to change their own default account, wckey, or QOS.
 -- BLUEGENE - If using OVERLAP mode fixed issue with multiple overlapping
    blocks in error mode.
 -- Fix for sacctmgr to display correctly default accounts.
 -- scancel -s SIGKILL will always send the RPC to the slurmctld rather than
    the slurmd daemon(s). This ensures that tasks in the process of getting
    spawned are killed.
 -- BLUEGENE - If using OVERLAP mode fixed issue with jobs getting denied
    at submit if the only option for their job was overlapping a block in
    error state.
* Changes in SLURM 2.2.2
========================
 -- Correct logic to set correct job hold state (admin or user) when setting
    the job's priority using scontrol's "update jobid=..." rather than its
 -- Modify squeue to report unset --mincores, --minthreads or --extra-node-info
    values as "*" rather than 65534. Patch from Rod Schultz, BULL.
 -- Report the StartTime of a job as "Unknown" rather than the year 2106 if its
    expected start time was too far in the future for the backfill scheduler
    to compute.
 -- Prevent a pending job reason field from inappropriately being set to
    "Priority".
 -- In sched/backfill, for jobs having QOS_FLAG_NO_RESERVE set, don't
    consider the job's time limit when attempting to backfill schedule. The job
    will just be preempted as needed at any time.
 -- Eliminated a bug in sbatch when no valid target clusters are specified.
 -- When explicitly sending a signal to a job with the scancel command and that
    job is in a pending state, then send the request directly to the slurmctld
    daemon and do not attempt to send the request to slurmd daemons, which are
    not running the job anyway.
 -- In slurmctld, properly set the up_node_bitmap when setting a node's state
    to IDLE (in case the previous node state was DOWN).
 -- Fix smap to process block midplane names correctly when on a bluegene
    system.
 -- Fix smap to once again print out the Letter 'ID' for each line of a block/
    partition view.
 -- Corrected the NOTES section of the scancel man page
 -- Fix for accounting_storage/mysql plugin to correctly query cluster based
    transactions.
 -- Fix issue when updating database for clusters that were previously deleted
    before upgrade to 2.2 database.
 -- BLUEGENE - Handle mesh torus check better in dynamic mode.
 -- BLUEGENE - Fixed race condition when freeing block, most likely only would
    happen in emulation.
 -- Fix for calculating used QOS limits correctly on a slurmctld reconfig.
 -- BLUEGENE - Fix for bad conn-type set when running small blocks in HTC mode.
 -- If salloc's --no-shell option is used, then do not attempt to preserve the
    terminal's state.
 -- Add new SLURM configure time parameter of --disable-salloc-background. If
    set, then salloc can only execute in the foreground. If started in the
    background, then a message will be printed and the job allocation halted
    until brought into the foreground.
    NOTE: THIS IS A CHANGE IN DEFAULT SALLOC BEHAVIOR FROM V2.2.1, BUT IS
    CONSISTENT WITH V2.1 AND EARLIER.
 -- Added the Multi-Cluster Operation web page.
 -- Removed remnant code for enforcing max sockets/cores/threads in the
    cons_res plugin (see last item in 2.1.0-pre5).  This was responsible
    for a bug reported by Rod Schultz.
 -- BLUEGENE - Set correct env vars for HTC mode on a P system to get correct
    block.
 -- Correct RunTime reported by "scontrol show job" for pending jobs.
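
The odd values mentioned in two entries above (65534 and the year 2106) are
what unsigned integer limits look like when printed directly. The largest
32-bit unsigned count of seconds since 1970 falls in 2106, and 65534 is 0xFFFE,
one below the 16-bit maximum, which is assumed here to be the "no value"
marker. The Python sketch below shows both. The function format_field is
hypothetical and is not squeue's code.

```python
from datetime import datetime, timedelta, timezone

UINT32_MAX = 2**32 - 1
NO_VAL_16 = 0xFFFE  # 65534; assumed "unset" marker for a 16-bit field

# Latest moment representable as unsigned 32-bit seconds since the Unix epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
latest = epoch + timedelta(seconds=UINT32_MAX)


def format_field(value):
    """Print an unset 16-bit field as "*" instead of its raw number."""
    return "*" if value == NO_VAL_16 else str(value)


print(latest.isoformat())
print(format_field(65534), format_field(4))
```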
* Changes in SLURM 2.2.1
========================
 -- Fix setting derived exit code correctly for jobs that happen to have the
    same jobid.
 -- Better checking for time overflow when rolling up in accounting.
 -- Add scancel --reservation option to cancel all jobs associated with a
 -- Treat reservation with no nodes like one that starts later (let jobs of any
    size get queued and do not block any pending jobs).
 -- Fix bug in gang scheduling logic that would temporarily resume too many jobs
    after a job completed.
 -- Change srun message about job step being deferred due to SlurmctldProlog
    running to be more clear and only print when --verbose option is used.
 -- Made it so you could remove the hold on jobs with sview by setting the
    priority to infinite.
 -- BLUEGENE - better checking small blocks in dynamic mode whether a full
    midplane job could run or not.
 -- Decrease the maximum sleep time between srun job step creation retry
    attempts from 60 seconds to 29 seconds. This should eliminate a possible
    synchronization problem with gang scheduling that could result in job
    step creation requests only occurring when a job is suspended.