Newer
Older
* Modify step create logic so that call components of a heterogeneous job
launched by a single srun command have the same step ID value.
-- Modify output of "--mpi=list" to avoid duplicates for version numbers in
mpi/pmix plugin names.
-- Allow nodes to be rebooted while in a maintenance reservation.
-- Show nodes as down even when nodes are in a maintenance reservation.
-- Harden the slurmctld HA stack to mitigate certain split-brain issues.
-- Work for heterogeneous job support (complete solution in v17.11):
* Add burst buffer support.
* Remove srun's --mpi-combine option (always combined).
* Add SchedulerParameters configuration option "enable_hetero_steps" to
enable job steps that span multiple components of a heterogeneous job.
Disabled by default as most MPI implementations and Slurm configurations
are not currently supported. Limitation to be removed in Slurm version
18.08.
* Synchronize application launch across multiple components with debugger.
* Modify slurm_kill_job_step() to cancel all components of a heterogeneous
job step (used by MPI).
* Set SLURM_JOB_NUM_NODES environment variable as needed by MVAPICH.
* Base time limit upon the time that the latest job component is available
(after all nodes in all components booted and ready for use).
-- Add cluster name to smail tool email header.
-- Speedup arbitrary distribution algorithm.
-- Modify "srun --mpi=list" output to match valid option input by removing the
"mpi/" prefix on each line of output.
-- Automatically set the reservation's partition for the job if not the
cluster default.
-- mpi/pmi2 plugin - vestigial pointer could be referenced at shutdown with
invalid memory reference resulting.
Dominik Bartkiewicz
committed
-- Fix to _is_gres_cnt_zero() return false for improper input string
-- Cleanup all pthread_create calls and replace with new slurm_thread_create
macro.
-- Removed obsolete MPI plugins. Remaining options are openmpi, pmi2, pmix.
-- Removed obsolete checkpoint/poe plugin.
-- Process spank environment variable options before processing spank command
line options. Spank plugins should be able to handle option callbacks being
called multiple times.
-- Add support for specialized cores with task/affinity plugin (previously
only supported with task/cgroup plugin).
-- Add "TaskPluginParam=SlurmdOffSpec" option that will prevent the Slurm
compute node daemons (slurmd and slurmstepd) from executing on specialized
cores.
-- CRAY - Make native mode default, use --disable-native-cray to use ALPS
instead of native Slurm.
-- Add ability to prevent suspension of some count of nodes in a specified
range using the SuspendExcNodes configuration parameter.
-- Add SLURM_WCKEY to PrologSlurmctld and EpilogSlurmctld environment.
-- Return user response string in response to successful job allocation request
not only on failure. Set in LUA using function 'slurm.user_msg("STRING")'.
-- Add 'scontrol write batch_script <jobid>' command to retrieve the batch
script for a given job.
-- Remove option to display the batch script as part of 'scontrol show job'.
-- On native Cray system the configured RebootProgram is executed on on the
head node by the slurmctld daemon rather than by the slurmd daemons on the
compute nodes. The "capmc_resume" program from "contribs/cray" can be used.
-- Modify "scontrol top" command to accept a comma separated list of job IDs
as an argument rather than a single job ID.
-- Add MemorySwappiness value to cgroup.conf.
-- Add new "billing" TRES which allows jobs to be limited based on the job's
billable TRES calculated by the job's partition's TRESBillingWeights.
-- sbatch - force line-buffered output so 'sbatch -W' returns the jobid
over a piped output immediately.
-- Regular user use of "scontrol top" command is now diabled. Use the
configuration parameter "SchedulerParameters=enable_user_top" to enable
that functionality. The configuration parameter
"SchedulerParameters=disable_user_top" will be silently ignored.
-- Add -TALL to sreport.
-- Removed unused SlurmdPlugstack option and associated framework.
-- Correct logic for line continuation in srun --multi-prog file.
-- Add DBD Agent queue size to sdiag output.
-- Add running job count to sdiag output.
-- Print unix timestamps next to ASCII timestamps in sdiag output.
-- In a job allocation spanning KNL and non-KNL nodes and requiring a reboot,
do not attempt to set default NUMA or MCDRAM modes on non-KNL nodes.
-- Change default to let pending jobs run outside of reservation after
reservation is gone to put jobs in held state. Added NO_HOLD_JOBS_AFTER_END
reservation flag to use old default.
-- When creating a reservation, validate the CoreCnt specification matches
-- When creating a reservation, correct logic to ignoring job allocations on
request.
-- Deprecate BLCR plugin, and do not build by default.
-- Change sreport report titles from "Use" to "Usage"
* Changes in Slurm 17.11.0pre2
==============================
-- Initial work for heterogeneous job support (complete solution in v17.11):
* Modified salloc, sbatch and srun commands to parse command line, job
script and environment variables to recognize requests for heterogeneous
jobs. Same commands also modified to set environment variables describing
each component of the heterogeneous job.
* Modified job allocate, batch job submit and job "will-run" requests to
pass a list of job specifications and get a list of responses.
* Modify slurmctld daemon to process a heterogeneous job request and create
multiple job records as needed.
* Added new fields to job record: pack_job_id, pack_job_offset and
pack_job_set (set of job IDs). Added to slurmctld state save/restore
logic and job information reported.
* Display new job fields in "scontrol show job" output.
* Modify squeue command to display heterogeneous job records using "#+#"
format. The squeue --job=# output lists all components of a heterogeneous
job.
* Modify scancel logic to cancel all components of a heterogeneous job with
a single request/RPC.
* Configuration parameter DebugFlags value of "HeteroJobs" added.
* Job requeue and suspend/resume modified to operate on all components of
a heterogeneous job with a single request/RPC.
* New web page added to describe heterogeneous jobs.
* Descriptions of new API added to man pages.
* Modified email notifications to only operate on the first job component.
* Purge heterogeneous job records at the same time and not by individual
* Modified logic for heterogeneous jobs submitted to multiple clusters
("--clusters=...") so the job will be routed to the cluster that is
expected to start all components earliest.
* Modified srun to create multiple job steps for heterogeneous job
allocations.
* Modified launch plugin to accept a pointer to job step options structure
rather than work from a single/common data structure.
-- Improve backfill scheduling algorithm with respect to starting jobs as soon
as possible while avoiding advanced reservations.
-- Add URG as an option to 'scancel --signal'.
Dominik Bartkiewicz
committed
-- Check if the buffer returned from slurm_persist_msg_pack() isn't NULL.
-- Modify all daemons to re-open log files on receipt of SIGUSR2 signal. This
is much than using SIGHUP to re-read the configuration file and rebuild
various tables.
-- Add PrivateData=events configuration parameter
-- Work for heterogeneous job support (complete solution in v17.11):
* Add pointer to job option structure to job_step_create_allocation()
* Parallelize task launch for heterogeneous job allocations (initial work).
* Make packjobid, packjoboffset, and packjobidset fields available in squeue
output.
* Modify smap command to display heterogeneous job records using "#+#"
format.
* Add srun --pack-group and --mpi-combine options to control job step
launch behaviour (not fully implemented).
* Add pack job component ID to srun --label output (e.g. "P0 1:" for
job component 0 and task 1).
* jobcomp/elasticsearch: Add pack_job_id and pack_job_offset fields.
* sview: Modified to display pack job information.
* Major re-write of task state container logic to support for list of
containers rather than one container per srun command.
* Add srun pack job environment variables when performing job allocation.
-- Set Reason=dependency over Reason=JobArrayTaskLimit for pending jobs.
-- Add slurm.conf configuration parameters SlurmctldSyslogDebug and
SlurmdSyslogDebug to control which messages from the slurmctld and slurmd
daemons get written to syslog.
-- Add slurmdbd.conf configuration parameter DebugLevelSyslog to control which
messages from the slurmdbd daemon get written to syslog.
-- Fix handling of GroupUpdateForce option.
-- Work for heterogeneous job support (complete solution in v17.11):
* Add support to sched/backfill for concurrent allocation of all pack job
components including support of --time-min option.
* Defer initiation of a heterogeneous job until a components can be started
at the same time, taking into consideration association and QOS limits
for the job as a whole.
* Perform limit check on heterogeneous job as a whole at submit time to
reject jobs that will never be able to run.
* Add pack_job_id and pack_job_offset to accounting database.
* Modified sacct to accept pack job ID specification using "#+#" notation.
* Modified sstat to accept pack job ID specification using "#+#" notation.
-- Clear a job's "wait reason" value of BeginTime" after that time has passed.
Previously a readon of "BeginTime" could be reported long after the job's
requested begin time had passed.
-- Split group_info in slurm_ctl_conf_t into group_force and group_time.
-- Work for heterogeneous job support (complete solution in v17.11):
* Fix I/O race condition on step termination for srun launching multiple
pack job groups.
* If prolog is running when attempting to signal a step, then return EAGAIN
and retry rather than simply returning SLURM_ERROR and aborting.
* Modify launch/slurm plugin to signal all components of a pack job rather
than just the one (modify to use a list of step context records).
* Add logic to support srun --mpi-combine option.
* Set up debugger data structures.
* Disable cancellation of individual component while the job is pending.
* Modify scontrol job hold/release and update to operate with heterogeneous
job id specification (e.g. "scontrol hold 123+4").
* If srun lacks application specification for some component, the next one
specified will be used for earlier components.
* Changes in Slurm 17.11.0pre1
==============================
-- Interpet all format options in output/error file to log prolog errors. Prior
logic only supported "%j" (job ID) option.
Danny Auble
committed
-- Add the configure option --with-shared-libslurm which will link to
libslurm.so instead of libslurm.o thus reducing the footprint of all the
binaries.
-- In switch plugin, added plugin_id symbol to plugins and wrapped
switch_jobinfo_t with dynamic_plugin_data_t in interface calls in
order to pass switch information between clusters with different switch
types.
-- Switch naming of acct_gather_infiniband to acct_gather_interconnect
-- Make it so you can "stack" the interconnect plugins.
-- Add a last_sched_eval timestamp to record when a job was last evaluated
by the main scheduler or backfill.
-- Add scancel "--hurry" option to avoid staging out any burst buffer data.
-- Simplify the sched plugin interface.
-- Add new advanced reservation flags of "weekday" (repeat on each weekday;
Monday through Friday) and "weekend" (repeat on each weekend day; Saturday
and Sunday).
-- Add new advanced reservation flag of "flex", which permits jobs requesting
the reservation to begin prior to the reservation's start time and use
resources inside or outside of the reservation. A typical use case is to
prevent jobs not explicitly requesting the reservation from using those
reserved resources rather than forcing jobs requesting the reservation to
use those resources in the time frame reserved.
-- Node "OS" field expanded from "sysname" to "sysname release version" (e.g.
change from "Linux" to
"Linux 4.8.0-28-generic #28-Ubuntu SMP Sat Feb 8 09:15:00 UTC 2017").
-- jobcomp/elasticsearch - Add "job_name" and "wc_key" fields to stored
information.
-- jobcomp/filetxt - Add ArrayJobId, ArrayTaskId, ReservationName, Gres,
Account, QOS, WcKey, Cluster, SubmitTime, EligibleTime, DerivedExitCode and
ExitCode.
-- scontrol modified to report core IDs for reservation containing individual
cores.
-- MYSQL - Get rid of table join during rollup which speeds up the process
dramatically on large job/step tables.
-- Add ability to define features on clusters for directing federated jobs to
different clusters.
-- Add new RPC to process multiple federation RPCs in a single communication.
-- Modify slurm_load_jobs() function to load job information from all clusters
in a federation.
-- Add squeue --local and --sibling options to modify filtering of jobs on
federated clusters.
-- Add SchedulerParameters option of bf_max_job_user_part to specifiy the
maximum number of jobs per user for any single partition. This differs from
bf_max_job_user in that a separate counter is applied to each partition
rather than having a single counter per user applied to all partitions.
-- Modify backfill logic so that bf_max_job_user, bf_max_job_part and
bf_max_job_user_part options can all be used independently of each other.
-- Add sprio -p/--partition option to filter jobs by partition name.
-- Add partition name to job priority factor response message.
-- Add sprio --local and --sibling options for use in federation of clusters.
-- Add sprio "%c" format to print cluster name in federation mode.
-- Modify sinfo logic to provided unified view of all nodes and partitions
in a federation, add --local option to only report local state information
even in a cluster, print cluster name with "%V" format option, and
optionally sort by cluster name.
-- If a task in a parallel job fails and it was launched with the
--kill-on-bad-exit option then terminate the remaining tasks using the
SIGCONT, SIGTERM and SIGKILL signals rather than just sending SIGKILL.
-- Include submit_time when doing the sort for job scheduling.
-- Modify sacct to report all jobs in federation by default. Also add --local
option.
-- Modify sacct to accept "--cluster all" option (in addition to the old
"--cluster -1", which is still accepted).
-- Modify sreport to report all jobs in federation by default. Also add --local
option.
-- sched/backfill: Improve assoc_limit_stop configuration parameter support.
-- KNL features: Always keep active and available features in the same order:
first site-specific features, next MCDRAM modes, last NUMA modes.
-- Changed default ProctrackType to cgroup.
-- Add "cluster_name" field to node_info_t and partition_info_t data structure.
It is filled in only when the cluster is part of a federation and
SHOW_FEDERATION flag used.
-- Functions slurm_load_node() slurm_load_partitions() modified to show all
nodes/partitions in a federation when the SHOW_FEDERATION flag is used.
-- Add --federation option to sacct, scontrol, sinfo, sprio, squeue, sreport to
show a federated view. Will show local view by default.
-- Add FederationParameters=fed_display slurm.conf option to configure status
commands to display a federated view by default if the cluster is a member
of a federation.
-- Log the down nodes whenever slurmctld restarts.
-- Report that "CPUs" plus "Boards" in node configuration invalid only if the
CPUs value is not equal to the total thread count.
-- Extend the output of the seff utility to also include the job's wall-clock
time.
-- Add bf_max_time to SchedulerParameters.
-- Add bf_max_job_assoc to SchedulerParameters.
-- Add new SchedulerParameters option bf_window_linear to control the rate at
which the backfill test window expands. This can be used on a system with
a modest number of running jobs (hundreds of jobs) to help prevent expected
start times of pending jobs to get pushed forward in time. On systems with
large numbers of running jobs, performance of the backfill scheduler will
suffer and fewer jobs will be evaluated.
-- Improve scheduling logic with respect to license use and node reboots.
-- CRAY - Alter algorithm to come up with the SLURM_ID_HASH.
-- Implement federated scheduling and federated status outputs.
-- The '-q' option to srun has changed from being the short form of
'--quit-on-interrupt' to '--qos'.
-- Change sched_min_interval default from 0 to 2 microseconds.
* Changes in Slurm 17.02.12
==========================
-- Fix segfault in slurmdbd hourly rollup when having a job outside a
reservation, with no end_time set, from an assoc that's in a reservation.
* Changes in Slurm 17.02.11
==========================
-- Fix insecure handling of user_name and gid fields. CVE-2018-10995.
* Changes in Slurm 17.02.10
==========================
-- Fix updating of requested TRES memory.
-- Cray modulefile: avoid removing /usr/bin from path on module unload.
-- Fix issue when resetting the partition pointers on nodes.
-- Show reason field in 'sinfo -R' when nodes is marked as failed.
-- Fix potential of slurmstepd segfaulting when the extern step fails to start.
-- Allow nodes state to be updated between FAIL and DRAIN.
-- Avoid registering a job'd credential multiple times.
-- Fix sbatch --wait to stop waiting after job is gone from memory.
-- Fix memory leak of MailDomain configuration string when slurmctld daemon is
reconfigured.
-- Fix to properly remove extern steps from the starting_steps list.
-- Fix Slurm to work correctly with HDF5 1.10+.
-- Add support in salloc/srun --bb option for "access_mode" in addition to
"access" for consistency with DW options.
-- Fix potential deadlock in _run_prog() in power save code.
-- MYSQL - Add dynamic_offset in the database to force range for auto
increment ids for the tres_table.
-- Avoid setting node in COMPLETING state indefinitely if the job initiating
the node reboot is cancelled while the reboot in in progress.
-- node_feature/knl_cray - Fix memory leaks that occur when slurmctld
reconfigured.
-- node_feature/knl_cray - Fix memory leak that can occur during normal
operation.
-- Fix job array dependency with "aftercorr" option and some task arrays in
the first job fail. This fix lets all task array elements that can run
proceed rather than stopping all subsequent task array elements.
-- Fix whole node allocation cpu counts when --hint=nomultihtread.
-- NRT - Fix issue when running on a HFI (p775) system with multiple protocols.
-- Fix uninitialized variables when unpacking slurmdb_archive_cond_t.
-- Fix security issue in accounting_storage/mysql plugin by always escaping
strings within the slurmdbd. CVE-2018-7033.
* Changes in Slurm 17.02.9
==========================
-- When resuming powered down nodes, mark DOWN nodes right after ResumeTimeout
has been reached (previous logic would wait about one minute longer).
-- Fix sreport not showing full column name for TRES Count.
-- Fix slurmdb_reservations_get() giving wrong usage data when job's spanned
reservation that was modified.
-- Fix sreport reservation utilization report showing bad data.
-- Show all TRES' on a reservation in sreport reservation utilization report by
default.
-- Fix sacctmgr show reservation handling "end" parameter.
-- Work around issue with sysmacros.h and gcc7 / glibc 2.25.
-- Fix layouts code to only allow setting a boolean.
-- Fix sbatch --wait to keep waiting even if a message timeout occurs.
-- CRAY - If configured with NodeFeatures=knl_cray and there are non-KNL
nodes which include no features the slurmctld will abort without
this patch when attemping strtok_r(NULL).
-- Fix regression in 17.02.7 which would run the spank_task_privileged as
part of the slurmstepd instead of it's child process.
-- Fix security issue in Prolog and Epilog by always prepending SPANK_ to
all user-set environment variables. CVE-2017-15566.
* Changes in Slurm 17.02.8
==========================
-- Add 'slurmdbd:' to the accounting plugin to notify message is from dbd
instead of local.
-- mpi/mvapich - Buffer being only partially cleared. No failures observed.
-- Fix for job --switch option on dragonfly network.
-- In salloc with --uid option, drop supplementary groups before changing UID.
-- jobcomp/elasticsearch - strip any trailing slashes from JobCompLoc.
-- jobcomp/elasticsearch - fix memory leak when transferring generated buffer.
-- Prevent slurmstepd ABRT when parsing gres.conf CPUs.
-- Fix sbatch --signal to signal all MPI ranks in a step instead of just those
on node 0.
Danny Auble
committed
-- Check multiple partition limits when scheduling a job that were previously
only checked on submit.
-- Cray: Avoid running application/step Node Health Check on the external
job step.
-- Optimization enhancements for partition based job preemption.
-- Address some build warnings from GCC 7.1, and one possible memory leak if
/proc is inaccessible.
-- If creating/altering a core based reservation with scontrol/sview on a
remote cluster correctly determine the select type.
-- Fix autoconf test for libcurl when clang is used.
-- Fix default location for cgroup_allowed_devices_file.conf to use correct
default path.
-- Document NewName option to sacctmgr.
-- Reject a second PMI2_Init call within a single step to prevent slurmstepd
from hanging.
-- Handle old 32bit values stored in the database for requested memory
correctly in sacct.
-- Fix memory leaks in the task/cgroup plugin when constraining devices.
-- Make extremely verbose info messages debug2 messages in the task/cgroup
plugin when constraining devices.
-- Fix issue that would deny the stepd access to /dev/null where GRES has a
'type' but no file defined.
-- Fix issue where the slurmstepd would fatal on job launch if you have no
gres listed in your slurm.conf but some in gres.conf.
-- Fix validating time spec to correctly validate various time formats.
-- Make scontrol work correctly with job update timelimit [+|-]=.
-- Reduce the visibily of a number of warnings in _part_access_check.
-- Prevent segfault in sacctmgr if no association name is specified for
an update command.
-- burst_buffer/cray plugin modified to work with changes in Cray UP05
software release.
-- Fix job reasons for jobs that are violating assoc MaxTRESPerNode limits.
-- Fix segfault when unpacking a 16.05 slurm_cred in a 17.02 daemon.
-- Fix setting TRES limits with case insensitive TRES names.
-- Add alias for xstrncmp() -- slurm_xstrncmp().
-- Fix sorting of case insensitive strings when using xstrcasecmp().
-- Gracefully handle race condition when reading /proc as process exits.
-- Avoid error on Cray duplicate setup of core specialization.
-- Skip over undefined (hidden in Slurm) nodes in pbsnodes.
-- Add empty hashes in perl api's slurm_load_node() for hidden nodes.
-- CRAY - Add rpath logic to work for the alpscomm libs.
-- Fixes for administrator extended TimeLimit (job reason & time limit reset).
-- Fix gres selection on systems running select/linear.
-- sview: Added window decorator for maximize,minimize,close buttons for all
systems.
Dominik Bartkiewicz
committed
-- squeue: interpret negative length format specifiers as a request to
delimit values with spaces.
-- Fix the torque pbsnodes wrapper script to parse a gres field with a type
set correctly.
* Changes in Slurm 17.02.7
==========================
-- Fix deadlock if requesting to create more than 10000 reservations.
-- Fix potential memory leak when creating partition name.
-- Execute the HealthCheckProgram once when the slurmd daemon starts rather
than executing repeatedly until an exit code of 0 is returned.
Alejandro Sanchez
committed
-- Set job/step start and end times to 0 when using --truncate and start > end.
-- Make srun --pty option ignore EINTR allowing windows to resize.
-- When resuming node only send one message to the slurmdbd.
-- Modify srun --pty option to use configured SrunPortRange range.
-- Fix issue with whole gres not being printed out with Slurm tools.
-- Fix issue with multiple jobs from an array are prevented from starting.
-- Fix for possible slurmctld abort with use of salloc/sbatch/srun
--gres-flags=enforce-binding option.
-- Fix race condition when using jobacct_gather/cgroup where the memory of the
step wasn't always gathered correctly.
-- Better debug when slurmdbd queue is filling up in the slurmctld.
-- Fixed truncation on scontrol show config output.
-- Serialize updates from from the dbd to the slurmctld.
-- Fix memory leak in slurmctld when agent queue to the DBD has filled up.
-- CRAY - Throttle step creation if trying to create too many steps at once.
Danny Auble
committed
-- If failing after switch_g_job_init happened make sure switch_g_job_fini is
called.
-- Fix minor memory leak if launch fails in the slurmstepd.
-- Fix issue where UnkillableStepProgram if step was in an ending state.
-- Fix bug when tracking multiple simultaneous spawned ping cycles.
-- jobcomp/elasticsearch plugin now saves state of pending requests on
slurmctld daemon shutdown so then can be recovered on restart.
-- Fix issue when an alternate munge key when communicating on a persistent
connection.
-- Document inconsistent behavior of GroupUpdateForce option.
-- Fix bug in selection of GRES bound to specific CPUs where the GRES count
is 2 or more. Previous logic could allocate CPUs not available to the job.
-- Increase buffer to handle long /proc/<pid>/stat output so that Slurm can
read correct RSS value and take action on jobs using more memory than
requested.
-- Fix srun job jobs that can run immediately to run in the highest priority
partion when multiple partitions are listed. scontrol show jobs can
potentially show the partition list in priority order.
-- Fix starting controller if StateSaveLocation path didn't exist.
-- Fix inherited association 'max' TRES limits combining multiple limits in
the tree.
-- Sort TRES id's on limits when getting them from the database.
-- Fix issue with pmi[2|x] when TreeWidth=1.
-- Correct buffer size used in determining specialized cores to avoid possible
truncation of core specification and not reserving the specified cores.
-- Close race condition on Slurm structures when setting DebugFlags.
-- Make it so the cray/switch plugin grabs new DebugFlags on a reconfigure.
-- Fix incorrect lock levels when creating or updating a reservation.
-- Fix overlapping reservation resize.
-- Add logic to help support Dell KNL systems where syscfg is different than
the normal Intel syscfg.
-- CRAY - Fix BB to handle type= correctly, regression in 17.02.6.
* Changes in Slurm 17.02.6
==========================
-- Fix configurator.easy.html to output the SelectTypeParameters line.
-- If a job requests a specific memory requirement then gets something else
from the slurmctld make sure the step allocation is made aware of it.
-- Fix missing initialization in slurmd.
-- Fix potential degradation when running HTC (> 100 jobs a sec) like
workflows through the slurmd.
-- Fix race condition which could leave a stepd hung on shutdown.
-- CRAY - Add configuration for ATP to the ansible play script.
-- Fix potential to corrupt DBD message.
-- burst_buffer logic modified to support sizes in both SI and EIC size units
(e.g. M/MiB for powers of 1024, MB for powers of 1000).
* Changes in Slurm 17.02.5
==========================
-- Prevent segfault if a job was blocked from running by a QOS that is then
deleted.
-- Improve selection of jobs to preempt when there are multiple partitions
with jobs subject to preemption.
-- Only set kmem limit when ConstrainKmemSpace=yes is set in cgroup.conf.
-- Fix bug in task/affinity that could result in slurmd fatal error.
-- Increase number of jobs that are tracked in the slurmd as finishing at one
time.
-- Note when a job finishes in the slurmd to avoid a race when launching a
batch job takes longer than it takes to finish.
-- Improve slurmd startup on large systems (> 10000 nodes)
-- Add LaunchParameters option of cray_net_exclusive to control whether all
jobs on the cluster have exclusive access to their assigned nodes.
-- Make sure srun inside an allocation gets --ntasks-per-[core|socket]
set correctly.
-- Only make the extern step at job creation.
-- Fix for job step task layout with --cpus-per-task option.
-- Fix --ntasks-per-core option/environment variable parsing to set
the requested value, instead of always setting one (srun).
-- Correct error message when ClusterName in configuration files does not match
the name in the slurmctld daemon's state save file.
-- Better checking when a job is finishing to avoid underflow on job's
submitted to a QOS/association.
-- Handle partition QOS submit limits correctly when a job is submitted to
more than 1 partition or when the partition is changed with scontrol.
-- Performance boost for when Slurm is dealing with credentials.
-- Fix race condition which could leave a stepd hung on shutdown.
-- Add lua support for opensuse.
* Changes in Slurm 17.02.4
==========================
-- Do not attempt to schedule jobs after changing the power cap if there are
already many active threads.
-- Job expansion example in FAQ enhanced to demonstrate operation in
heterogeneous environments.
Alejandro Sanchez
committed
-- Prevent scontrol crash when operating on array and no-array jobs at once.
-- knl_cray plugin: Log incomplete capmc output for a node.
-- knl_cray plugin: Change capmc parsing of mcdram_pct from string to number.
-- When rebooting a node and using the PrologFlags=alloc make sure the
prolog is ran after the reboot.
-- node_features/knl_generic - If a node is rebooted for a pending job, but
fails to enter the desired NUMA and/or MCDRAM mode then drain the node and
requeue the job.
-- node_features/knl_generic disable mode change unless RebootProgram
configured.
-- Add new burst_buffer function bb_g_job_revoke_alloc() to be executed
if there was a failure after the initial resource allocation. Does not
release previously allocated resources.
-- Test if the node_bitmap on a job is NULL when testing if the job's nodes
are ready. This will be NULL is a job was revoked while beginning.
-- Fix incorrect lock levels when testing when job will run or updating a job.
-- Add missing locks to job_submit/pbs plugin when updating a jobs
dependencies.
-- Add min_memory_per_node|cpu to the job_submit/lua plugin to deal with lua
not being able to deal with pn_min_memory being a uint64_t. Scripts are
urged to change to these new variables avoid issue. If not set the
variables will be 'nil'.
-- Calculate priority correctly when 'nice' is given.
-- Fix minor typos in the documentation.
-- node_features/knl_cray: Preserve non-KNL active features if slurmctld
reconfigured while node boot in progress.
-- node_features/knl_generic: Do not repeatedly log errors when trying to read
KNL modes if not KNL system.
-- Add missing QOS read lock to backfill scheduler.
-- When doing a dlopen on liblua only attempt the version compiled against.
-- Fix null-dereference in sreport cluster ulitization when configured with
memory-leak-debug.
-- Fix Partition info in 'scontrol show node'. Previously duplicate partition
names, or Partitions the node did not belong to could be displayed.
-- Fix it so the backup slurmdbd will take control correctly.
Alejandro Sanchez
committed
-- Fix unsafe use of MAX() macro, which could result in problems cleaning up
Alejandro Sanchez
committed
accounting plugins in slurmd, or repeat job cancellation attempts in
scancel.
-- Fix 'scontrol update reservation duration=unlimited' to set the duration
to 365-days (as is done elsewhere), rather than 49710 days.
-- Check if variable given to scontrol show job is a valid jobid.
-- Fix WithSubAccounts option to not include WithDeleted unless requested.
Dominik Bartkiewicz
committed
-- Prevent a job tested on multiple partitions from being marked
WHOLE_NODE_USER.
Dominik Bartkiewicz
committed
-- Prevent a race between completing jobs on a user-exclusive node from
leaving the node owned.
-- When scheduling take the nodes in completing jobs out of the mix to reduce
fragmentation. SchedulerParameters=reduce_completing_frag
-- For jobs submited to multiple partitions, report the job's earliest start
time for any partition.
-- Backfill partitions that use QOS Grp limits to "float" better.
-- node_features/knl_cray: don't clear configured GRES from non-KNL node.
-- sacctmgr - prevent segfault in command when a request is denied due
to a insufficient priviledges.
-- Add warning about libcurl-devel not being installed during configure.
-- Streamline job purge by handling file deletion on a separate thread.
-- Always set RLIMIT_CORE to the maximum permitted for slurmd, to ensure
core files are created even on non-developer builds.
-- Fix --ntasks-per-core option/environment variable parsing to set
the requested value, instead of always setting one.
-- If trying to cancel a step that hasn't started yet for some reason return
a good return code.
-- Fix issue with sacctmgr show where user=''
* Changes in Slurm 17.02.3
==========================
-- Increase --cpu_bind and --mem_bind field length limits.
-- Fix segfault when using AdminComment field with job arrays.
-- Clear Dependency field when all dependencies are satisfied.
-- Add --array-unique to squeue which will display one unique pending job
array element per line.
-- Reset backfill timers correctly without skipping over them in certain
circumstances.
-- When running the "scontrol top" command, make sure that all of the user's
jobs have a priority that is lower than the selected job. Previous logic
would permit other jobs with equal priority (no jobs with higher priority).
-- Fix perl api so we always get an allocation when calling Slurm::new().
-- Fix issue with cleaning up cpuset and devices cgroups when multiple steps
end at the same time.
-- Document that PriorityFlags option of DEPTH_OBLIVIOUS precludes the use of
FAIR_TREE.
-- Fix issue if an invalid message came in a Slurm daemon/command may abort.
-- Make it impossible to use CR_CPU* along with CR_ONE_TASK_PER_CORE. The
options are mutually exclusive.
-- ALPS - Fix scheduling when ALPS doesn't agree with Slurm on what nodes
are free.
-- When removing a partition make sure it isn't part of a reservation.
-- Fix seg fault if loading attempting to load non-existent burstbuffer plugin.
-- Fix to backfill scheduling with respect to QOS and association limits. Jobs
submitted to multiple partitions are most likley to be effected.
-- sched/backfill: Improve assoc_limit_stop configuration parameter support.
-- sched/backfill: Fix bug related to advanced reservations and the need to
reboot nodes to change KNL mode.
-- Preempt plugins - fix check for 'preempt_youngest_first' option.
-- Preempt plugins - fix incorrect casts in preempt_youngest_first mode.
-- Preempt/job_prio - fix incorrect casts in sort function.
-- Fix to make task/affinity work with ldoms where there are more than 64
cpus on the node.
-- When using node_features/knl_generic make it so the slurmd doesn't segfault
when shutting down.
Tim Wickberg
committed
-- Fix potential double-xfree() when using job arrays that can lead to
slurmctld crashing.
-- Fix priority/multifactor priorities on a slurmctld restart if not using
accounting_storage/[mysql|slurmdbd].
-- Fix NULL dereference reported by CLANG.
-- Update proctrack documentation to strongly encourage use of
proctrack/cgroup.
-- Fix potential memory leak if job fails to begin after nodes have been
selected for a job.
-- Handle a job that made it out of the select plugin without a job_resrcs
pointer.
-- Fix potential race condition when persistent connections are being closed at
shutdown.
-- Fix incorrect locks levels when submitting a batch job or updating a job
in general.
-- CRAY - Move delay waiting for job cleanup to after we check once.
-- MYSQL - Fix memory leak when loading archived jobs into the database.
-- Fix potential race condition when starting the priority/multifactor plugin's
decay thread.
-- Sanity check to make sure we have started a job in acct_policy.c before we
clear it as started.
-- Allow reboot program to use arguments.
-- Message Aggr - Remove race condition on slurmd shutdown with respects to
destroying a mutex.
-- Fix updating job priority on multiple partitions to be correct.
-- Don't remove admin comment when updating a job.
Dominik Bartkiewicz
committed
-- Return error when bad separator is given for scontrol update job licenses.
* Changes in Slurm 17.02.2
==========================
-- Update hyperlink to LBNL Node Health Check program.
-- burst_buffer/cray - Add support for line continuation.
-- If a job is cancelled by the user while it's allocated nodes are being
reconfigured (i.e. the capmc_resume program is rebooting nodes for the job)
and the node reconfiguration fails (i.e. the reboot fails), then don't
requeue the job but leave it in a cancelled state.
-- capmc_resume (Cray resume node script) - Do not disable changing a node's
active features if SyscfgPath is configured in the knl.conf file.
-- Improve the srun documentation for the --resv-ports option.
-- burst_buffer/cray - Fix parsing for discontinuous allocated nodes. A job
allocation of "20,22" must be expressed as "20\n22".
-- Fix rare segfault when shutting down slurmctld and still sending data to
the database.
-- Fix gres output of a job if it is updated while pending to be displayed
correctly with Slurm tools.
-- Fix missing unlock when job_list doesn't exist when starting priority/
multifactor.
-- Fix segfault if slurmctld is shutting down and the slurmdbd plugin was
in the middle of setting db_indexes.
-- Add ESLURM_JOB_SETTING_DB_INX to errno to note when a job can't be updated
because the dbd is setting a db_index.
-- Fix possible double insertion into database when a job is updated at the
moment the dbd is assigning a db_index.
-- Fix memory error when updating a job's licenses.
Danny Auble
committed
-- Fix seff to work correctly with non-standard perl installs.
-- Export missing slurmdbd_defs_[init|fini] needed for libslurmdb.so to work.
-- Fix sacct from returning way more than requested when querying against a job
array task id.
-- Fix double read lock of tres when updating gres or licenses on a job.
-- Make sure locks are always in place when calling
assoc_mgr_make_tres_str_from_array.
-- Prevent slurmctld SEGV when creating reservation with duplicated name.
-- Consider QOS flags Partition[Min|Max]Nodes when doing backfill.
-- Fix slurmdbd_defs.c to not have half symbols go to libslurm.so and the
other half go to libslurmdb.so.
-- Fix 'scontrol show jobs' to remove an errant newline when 'Switches' is
printed.
-- Better code for handling memory required by a task on a heterogeneous
system.
-- Fix regression in 17.02.0 with respects to GrpTresMins on a QOS or
Association.
-- Schedule interactive jobs quicker.
-- Perl API - correct value of MEM_PER_CPU constant to correctly handle
memory values.
-- Fix 'flags' variable to be 32 bit from the old 16 bit value in the perl api.
-- Export sched_nodes for a job in the perl api.
-- Improve error output when updating a reservation that has already started.
-- Fix --ntasks-per-node issue with srun so DenyOnLimit would work correctly.
-- node_features/knl_cray plugin - Fix memory leak.
Dominik Bartkiewicz
committed
-- Fix wrong cpu_per_task count issue on heterogeneous system when dealing with
steps.
-- Fix double free issue when removing usage from an association with sacctmgr.
-- Fix issue with SPANK plugins attempting to set null values as environment
variables, which leads to the command segfaulting on newer glibc versions.
-- Fix race condition on slurmctld startup when plugins have not gone through
init() ahead of the rpc_manager processing incoming messages.
-- job_submit/lua - expose admin_comment field.
-- Allow AdminComment field to be set by the job_submit plugin.
-- Allow AdminComment field to be changed by any Administrator.
Alejandro Sanchez
committed
-- MYSQL - Streamline job flush sql when doing a clean start on the slurmctld.
-- Fix potential infinite loop when talking to the DBD when shutting down
the slurmctld.
-- Fix MCS filter.
-- Make it so pmix can be included in the plugin rpm without having to
specify --with-pmix.
-- MYSQL - Fix initial load when not using he DBD.
-- Fix scontrol top to not make jobs priority 0 (held).
-- Downgrade info message about exceeding partition time limit to a debug2.
* Changes in Slurm 17.02.1-2
============================
-- Replace clock_gettime with time(NULL) for very old systems without the call.
* Changes in Slurm 17.02.1
==========================
-- Modify pam module to work when configured NodeName and NodeHostname differ.
-- Update to sbatch/srun man pages to explain the "filename pattern" clearer
-- Add %x to sbatch/srun filename pattern to represent the job name.
-- job_submit/lua - Add job "bitflags" field.
-- Update slurm.spec file to note obsolete RPMs.
-- Fix deadlock scenario when dumping configuration in the slurmctld.
-- Remove unneeded job lock when running assoc_mgr cache. This lock could
cause potential deadlock when/if TRES changed in the database and the
slurmctld wasn't made aware of the change. This would be very rare.
-- Fix missing locks in gres logic to avoid potential memory race.
Dominik Bartkiewicz
committed
-- If gres is NULL on a job don't try to process it when returning detailed
information about a job to scontrol.
-- Fix print of consumed energy in sstat when no energy is being collected.
-- Print formatted tres string when creating/updating a reservation.
-- Fix issues with QOS flags Partition[Min|Max]Nodes to work correctly.
-- Prevent manipulation of the cpu frequency and governor for batch or
extern steps. This addresses an issue where the batch step would
inadvertently set the cpu frequency maximum to the minimum value
supported on the node.
-- Convert a slurmctd power management data structure from array to list in
order to eliminate the possibility of zombie child suspend/resume
processes.
-- Burst_buffer/cray - Prevent slurmctld daemon abort if "paths" operation
fails. Now job will be held. Update job update time when held.
-- Fix issues with QOS flags Partition[Min|Max]Nodes to work correctly.
-- Refactor slurmctld agent logic to eliminate some pthreads.
-- Added "SyscfgTimeout" parameter to knl.conf configuration file.
-- Fix for CPU binding for job steps run under a batch job.
* Changes in Slurm 17.02.0
==========================
-- job_submit/lua - Make "immediate" parameter available.
-- Fix srun I/O race condtion to eliminate a error message that might be
generated if the application exits with outstanding stdin.
-- Fix regression when purging/archiving jobs/events.
-- Add new job state JOB_OOM indicating Out Of Memory condition as detected
by task/cgroup plugin.
-- If QOS has been added to the system go refigure out Deny/AllowQOS on
partitions.
-- Deny job with duplicate GRES requested.
-- Fix loading super old assoc_mgr usage without segfaulting.
-- CRAY systems: Restore TaskPlugins order of task/cray before task/cgroup.
-- Task/cray: Treat missing "mems" cgroup with "debug" messages rather than
"error" messages. The file may be missing at step termination due to a
change in how cgroups are released at job/step end.
-- Fix for job constraint specification with counts, --ntasks-per-node value,
and no node count.
-- Fix ordering of step task allocation to fill in a socket before going into
another one.
-- Fix configure to not require C++
Tim Wickberg
committed
-- job_submit/lua - Remove access to slurmctld internal reservation fields of
job_pend_cnt and job_run_cnt.
-- Prevent job_time_limit enforcement from blocking other internal operations
if a large number of jobs need to be cancelled.
-- Add 'preempt_youngest_order' option to preempt/partition_prio plugin.
-- Fix controller being able to talk to a pre-released DBD.
-- Added ability to override the invoking uid for "scontrol update job"
by specifying "--uid=<uid>|-u <uid>".
-- Changed file broadcast "offset" from 32 to 64 bits in order to support files
over 2 GB.
-- slurm.spec - do not install init scripts alongside systemd service files.
Alejandro Sanchez
committed
-- Add port info to 'sinfo' and 'scontrol show node'.
-- Fix errant definition of USE_64BIT_BITSTR which can lead to core dumps.
-- Move BatchScript to end of each job's information when using
"scontrol -dd show job" to make it more readable.
-- Add SchedulerParameters configuration parameter of "default_gbytes", which
treats numeric only (no suffix) value for memory and tmp disk space as being
in units of Gigabytes. Mostly for compatability with LSF.
-- Fix race condtion in srun/sattach logic which would prevent srun from
terminating.
-- Bitstring operations are now 64bit instead of 32bit.
-- Replace hweight() function in bitstring with faster version.
-- scancel would treat a non-numeric argument as the name of jobs to be
cancelled (a non-documented feature). Cancelling jobs by name now require
the "--jobname=" command line argument.
-- scancel modified to note that no jobs satisfy the filter options when the
--verbose option is used along with one or more job filters (e.g. "--qos=").
-- Change _pack_cred to use pack_bit_str_hex instead of pack_bit_fmt for
better scalability and performance.
-- Add BootTime configuration parameter to knl.conf file to optimize resource
allocations with respect to required node reboots.
-- Add node_features_p_boot_time() to node_features plugin to optimize
scheduling with respect to node reboots.
-- Avoid allocating resources to a job in the event that its run time plus boot
time (if needed) extent into an advanced reservation.
-- Burst_buffer/cray - Avoid stage-out operation if job never started.
-- node_features/knl_cray - Add capability to detected Uncorrectable Memory
Errors (UME) and if detected then log the event in all job and step stderr
with a message of the form:
error: *** STEP 1.2 ON tux1 UNCORRECTABLE MEMORY ERROR AT 2016-12-14T09:09:37 ***
Similar logic added to node_features/knl_generic in version 17.02.0pre4.
-- If job is allocated nodes which are powered down, then reset job start time
when the nodes are ready and do not charge the job for power up time.
-- Add the ability to purge transactions from the database.
-- Add support for requeue'ing of federated jobs (BETA).
-- Add support for interactive federated jobs (BETA).
-- Add the ability to purge rolled up usage from the database.
-- Properly set SLURM_JOB_GPUS environment variable for Prolog.
* Changes in Slurm 17.02.0pre4
==============================
-- Add support for per-partitiion OverTimeLimit configuration.
-- Add --mem_bind option of "sort" to run zonesort on KNL nodes at step start.
-- Add LaunchParameters=mem_sort option to configure running of zonesort
-- Add "FreeSpace" information for each pool to the "scontrol show burstbuffer"
output. Required changes to the burst_buffer_info_t data structure.
-- Add new node state flag of NODE_STATE_REBOOT for node reboots triggered by
"scontrol reboot" commands. Previous logic re-used NODE_STATE_MAINT flag,
which could lead to inconsistencies. Add "ASAP" option to "scontrol reboot"
command that will drain a node in order to reboot it as soon as possible,
then return it to service.
-- Allow unit conversion routine to convert 1024M to 1G.
-- switch/cray plugin - change legacy spool directory location.
-- Add new PriorityFlags option of INCR_ONLY, which prevents a job's priority
from being decremented.
-- Make it so we don't purge job start messages until after we purge step
messages. Hopefully this will reduce the number of messages lost when
filling up memory when the database/DBD is down.
-- Added SchedulingParameters option of "bf_job_part_count_reserve". Jobs below
the specified threshold will not have resources reserved for them.
-- If GRES are configured with file IDs, then "scontrol -d show node" will
not only identify the count of currently allocated GRES, but their specific
index numbers (e.g. "GresUsed=gpu:alpha:2(IDX:0,2),gpu:beta:0(IDX:N/A)").
Ditto for job information with "scontrol -d show job".
-- Add "GresEnforceBind=Yes" to "scontrol show job" output if so configured.
-- Add support for SALLOC_CONSTRAINT, SBATCH_CONSTRAINT and SLURM_CONSTRAINT
environment variables to set default constraints for salloc, sbatch and
srun commands respectively.
-- Provide limited support for the MemSpecLimit configuration parameter without
the task/cgroup plugin.
-- node_features/knl_generic - Add capability to detected Uncorrectable Memory
Errors (UME) and if detected then log the event in all job and step stderr
with a message of the form:
error: *** STEP 1.2 ON tux1 UNCORRECTABLE MEMORY ERROR AT 2016-12-14T09:09:37 ***
-- Add SLURM_JOB_GID to TaskProlog environment.
-- burst_buffer/cray - Remove leading zeros from node ID lists passed to
dw_wlm_cli program.
-- Add "Partitions" field to "scontrol show node" output.
-- Remove sched/wiki and sched/wiki2 plugins and associated code.
-- Remove SchedulerRootFilter option and slurm_get_root_filter() API call.
-- Add SchedulerParameters option of spec_cores_first to select specialized
cores from the lowest rather than highest number cores and sockets.
-- Add PrologFlags option of Serial to disable concurrent launch of
Prolog and Epilog scripts.
-- Fix security issue caused by insecure file path handling triggered by the
failure of a Prolog script. To exploit this a user needs to anticipate or
cause the Prolog to fail for their job. CVE-2016-10030.
* Changes in Slurm 17.02.0pre3
==============================
-- Add srun host & PID to job step data structures.
-- Avoid creating duplicate pending step records for the same srun command.
-- Rewrite srun's logic for pending steps for better efficiency (fewer RPCs).
-- Added new SchedulerParameters options step_retry_count and step_retry_time
to control scheduling behaviour of job steps waiting for resources.
-- Optimize resource allocation logic for --spread-job job option.
-- Modify cpu_bind and mem_bind map and mask options to accept a repetition
count to better support large task count. For example:
"mask_mem:0x0f*2,0xf0*2" is equivalent to "mask_mem:0x0f,0x0f,0xf0,0xf0".
-- Add support for --mem_bind=prefer option to prefer, but not restrict memory
-- Add mechanism to constrain kernel memory allocation using cgroups. New
cgroup.conf parameters added: ConstrainKmemSpace, MaxKmemPercent, and
MinKmemSpace.
-- Correct invokation of man2html, which previously could cause FreeBSD builds
to hang.
-- MYSQL - Unconditionally remove 'ignore' clause from 'alter ignore'.
-- Modify service files to not start Slurm daemons until after Munge has been
started.
NOTE: If you are not using Munge, but are using the "service" scripts to
start Slurm daemons, then you will need to remove this check from the
etc/slurm*service scripts.
-- Do not process SALLOC_HINT, SBATCH_HINT or SLURM_HINT environment variables
if any of the following salloc, sbatch or srun command line options are
specified: -B, --cpu_bind, --hint, --ntasks-per-core, or --threads-per-core.
-- burst_buffer/cray: Accept new jobs on backup slurmctld daemon without access
to dw_wlm_cli command. No burst buffer actions will take place.
-- Do not include SLURM_JOB_DERIVED_EC, SLURM_JOB_EXIT_CODE, or
SLURM_JOB_EXIT_CODE in PrologSlurmctld environment (not available yet).
-- Cray - set task plugin to fatal() if task/cgroup is not loaded after
task/cray in the TaskPlugin settings.
-- Remove separate slurm_blcr package. If Slurm is built with BLCR support,
the files will now be part of the main Slurm packages.
-- Replace sjstat, seff and sjobexit RPM packages with a single "contribs"
package.
-- Remove long since defunct slurmdb-direct scripts.
-- Add SbcastParameters configuration option to control default file
destination directory and compression algorithm.
-- Add new SchedulerParameter (max_array_tasks) to limit the maximum number of
tasks in a job array independently from the maximum task ID (MaxArraySize).
Alejandro Sanchez
committed
-- Fix issue where number of nodes is not properly allocated when sbatch and
salloc are requested with -n tasks < hosts from -w hostlist or from -N.
* Changes in Slurm 17.02.0pre2
==============================
-- Add new RPC (REQUEST_EVENT_LOG) so that slurmd and slurmstepd can log events
through the slurmctld daemon.
-- Remove sbatch --bb option. That option was never supported.
-- Automatically clean up task/cgroup cpuset and devices cgroups after steps
are completed.
-- Add federation read/write locks.
-- Limit job purge run time to 1 second at a time.
-- The database index for jobs is now 64 bits. If you happen to be close to
4 billion jobs in your database you will want to update your slurmctld at
the same time as your slurmdbd to prevent roll over of this variable as
it is 32 bit previous versions of Slurm.
-- Optionally lock slurmstepd in memory for performance reasons and to avoid
possible SIGBUS if the daemon is paged out at the time of a Slurm upgrade
(changing plugins). Controlled via new LaunchParameters options of
slurmstepd_memlock and slurmstepd_memlock_all.
-- Add event trigger on burst buffer errors (see strigger man page,
--burst_buffer option).
-- Add job AdminComment field which can only be set by a Slurm administrator.
-- Add salloc, sbatch and srun option of --delay-boot=<time>, which will
temporarily delay booting nodes into the desired state for a job in the
hope of using nodes already in the proper state which will be available at
a later time.
-- Add job burst_buffer_state and delay_boot fields to scontrol and squeue
output. Also add ability to modify delay_boot from scontrol.
-- Fix for node's available TRES array getting filled in with configured GRES
-- Log if job --bb option contains any unrecognized content.
-- Display configured and allocated TRES for nodes in scontrol show nodes.
-- Change all memory values (in MB) to uint64_t to accommodate > 2TB per node.
-- Add MailDomain configuration parameter to qualify email addresses.
-- Refactor the persistent connections within the federation code to use
the same logic that was found in the slurmdbd. Now both functionalities
share the same code.
-- Remove BlueGene/L and BlueGene/P support.
-- Add "flag" field to launch_tasks_request_msg. Remove the following fields
(moved into flags): multi_prog, task_flags, user_managed_io, pty,
buffered_stdio, and labelio.
-- Add protocol version to slurmd startup communications for slurmstepd to
permit changes in the protocol.
* Changes in Slurm 17.02.0pre1
==============================
-- burst_buffer/cray - Add support for rounding up the size of a buffer reqeust
if the DataWarp configuration "equalize_fragments" is used.
-- Rename "in" to "input" in slurm_step_io_fds data structure defined in
slurm.h. This is needed to avoid breaking Python with by using one of its
keywords in a Slurm data structure.
-- Remove eligible_time from jobcomp/elasticsearch.
-- Enable the deletion of a QOS, even if no clusters have been added to the
database.
-- SlurmDBD - change all timestamps to bigint from int to solve Y2038 problem.
-- Add salloc/sbatch/srun --spread-job option to distribute tasks over as many
nodes as possible. This also treats the --ntasks-per-node option as a
maximum value.
-- Add ConstrainKmemSpace to cgroup.conf, defaulting to yes, to allow
cgroup Kmem enforcement to be disabled while still using ConstrainRAMSpace.
-- Add support for sbatch --bbf option to specify a burst buffer input file.
-- Added burst buffer support for job arrays. Add new SchedulerParameters
configuration parameter of bb_array_stage_cnt=# to indicate how many pending
tasks of a job array should be made available for burst buffer resource
allocation.
-- Fix small memory leak when a job fails to load from state save.
-- Fix invalid read when attempting to delete clusters from database with
running jobs.
-- Fix small memory leak when deleting clusters from database.
-- Add SLURM_ARRAY_TASK_COUNT environment variable. Total number of tasks in a
job array (e.g. "--array=2,4,8" will set SLURM_ARRAY_TASK_COUNT=3).
-- Add new sacctmgr commands: "shutdown" (shutdown the server), "list stats"