Commits · 0c1d340018d24a99a46864ff1e36b68d7446c69c · Manuel G. Marciani / ces_slurm_simulator

16 Jul, 2018 1 commit
- Fix typo in faq · 0c1d3400
  Felip Moll authored Jul 16, 2018
  
  0c1d3400
13 Jul, 2018 1 commit

SlurmDBD - improve error message on archive load failure. · 1c27a2e6

Isaac Hartung authored Jul 12, 2018

Add errno to info message in the SlurmDBD log, and pass the actual
errno back to the sacctmgr process so the user can see it.

Bug 5152.

1c27a2e6

12 Jul, 2018 4 commits
- mpi/pmix: fixed the collectives canceling · f15c8183
  Boris Karasev authored Jun 16, 2018
```
- avoid `abort()` when a collective fails
- added logging of collective details for failure cases

Bug 5067
```
  f15c8183
- Make code compile with hdf5 1.10.2+ · 90c4e7e7
  Danny Auble authored Jul 12, 2018
```
Note, this is setting it up so we can use defunct functions.  It will
probably need to be properly fixed in a future version so we don't
do this.
```
  90c4e7e7
- Fix for potential deadlock in the assoc_mgr_get_user_assocs() · 80d38355
  Dominik Bartkiewicz authored Jul 12, 2018
```
Bug 5098.
```
  80d38355
- Fix issues with --exclusive=[user|mcs] to work correctly · 72736af2
  Dominik Bartkiewicz authored Jul 12, 2018
```
with preemption or when job requests a specific list of hosts.

Bug 5293.
```
  72736af2
09 Jul, 2018 1 commit
- Add news for 4daeedd8 · d10854d9
  Danny Auble authored Jul 09, 2018
  
  d10854d9
06 Jul, 2018 6 commits

Add workaround for importing newly installed namespace packages · da2ecda8
Thea Flowers authored Jun 22, 2018
```
Bug 5395
```
da2ecda8
Fix potential segfault when closing the mpi/pmi2 plugin. · 4daeedd8
Danny Auble authored Jul 06, 2018
```
Bug 5390
```
4daeedd8

Fix leaking freezer cgroups. · 7f9c4f73

Marshall Garey authored Jul 06, 2018

Continuation of 923c9b37.

There is a delay in the cgroup system when moving a PID from one cgroup
to another. It is usually short, but if we don't wait for the PID to
move before removing cgroup directories the PID previously belonged to,
we could leak cgroups. This was previously fixed in the cpuset and
devices subsystems. This uses the same logic to fix the freezer
subsystem.

Bug 5082.

7f9c4f73
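
The wait described in this commit can be sketched as follows. This is a minimal Python illustration, not Slurm's C implementation: the function name, timeout, and polling interval are hypothetical, and it assumes a cgroup layout in which a `cgroup.procs` file lists the member PIDs one per line.

```python
import os
import time


def wait_for_pid_to_leave(cgroup_dir, pid, timeout=1.0, interval=0.01):
    """Poll cgroup.procs until `pid` is no longer listed, or time out.

    Returns True if the PID is gone (or the cgroup no longer exists),
    False if it is still listed when the timeout expires.
    """
    procs_file = os.path.join(cgroup_dir, "cgroup.procs")
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(procs_file) as f:
                members = {int(line) for line in f if line.strip()}
        except FileNotFoundError:
            return True  # cgroup already removed, nothing to wait for
        if pid not in members:
            return True  # safe to remove the cgroup directory now
        if time.monotonic() >= deadline:
            return False  # removing the directory now could leak the cgroup
        time.sleep(interval)
```

A caller would remove the cgroup directory only when this returns True, which mirrors the ordering the commit message describes.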

Combine duplicate code in cgroup fini functions. · 923c9b37

Marshall Garey authored Jul 06, 2018

cpuset and devices subsystems have duplicate code to clean up the cgroup
and prevent leaking cgroups by moving the process to the root cgroup and
waiting for it to be moved.

Move this duplicate code to a common function so it can be used later by
the freezer subsystem.

Bug 5082.

923c9b37

Clarify Depth Mean Try Sched in sdiag man page · dd6ca4b0
Marshall Garey authored Jul 06, 2018
```
Bug 5227
```
dd6ca4b0
Fix test to make sure something happens to deem success. · 2f9a326e
Danny Auble authored Jul 05, 2018

2f9a326e

04 Jul, 2018 2 commits
- Add some corrections to FAQ and remove Slurm 1.3 string · 0985c8b1
  Felip Moll authored Jul 04, 2018
```
Bug 4451
```
  0985c8b1
- Combine the active and available node feature change logs · 3818159e
  Morris Jette authored Jul 04, 2018
```
So that changes to multiple nodes are reported on one line rather than one
line per node. Otherwise this could lead to performance issues when reloading
slurmctld on big systems.

Bug 4980
```
  3818159e
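
The idea of reporting one line per feature change rather than one line per node can be sketched like this. It is an illustrative Python sketch only: the function name and the log message wording are hypothetical, and node names are joined with commas instead of being compressed into a Slurm hostlist expression.

```python
from collections import defaultdict


def combined_feature_log_lines(changes):
    """Group (node, features) updates so each feature set is logged once."""
    nodes_by_features = defaultdict(list)
    for node, features in changes:
        nodes_by_features[features].append(node)
    # One log line per distinct feature string, listing every affected node.
    return [
        f"nodes {','.join(nodes)} features set to: {features}"
        for features, nodes in nodes_by_features.items()
    ]
```

With thousands of nodes sharing a handful of feature sets, this reduces the number of log lines from one per node to one per feature set.
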
03 Jul, 2018 2 commits

Clarify gres.conf Cores documentation · 3ee3795f

Felip Moll authored Jul 03, 2018

Slurm numbers the cores using an abstract index, starting from CPU 0
on the first socket, core, thread, and continuing until N on the last socket,
last core, last thread. Explain that in the documentation.

bug 5189

3ee3795f
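
One way to read the numbering described in this commit is as a sequential index over (socket, core, thread) with threads varying fastest. The sketch below assumes that ordering and a uniform topology; the function and parameter names are hypothetical and it is not taken from the Slurm source.

```python
def abstract_cpu_index(socket, core, thread, cores_per_socket, threads_per_core):
    """Map a (socket, core, thread) position to a sequential abstract index.

    Assumes every socket has the same number of cores and every core the
    same number of threads, with index 0 at socket 0, core 0, thread 0.
    """
    return (socket * cores_per_socket + core) * threads_per_core + thread
```

For a node with 2 sockets, 4 cores per socket, and 2 threads per core, the first position maps to 0 and the last socket, last core, last thread maps to 15.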

Fix _step_signal() from always returning success · 2ab24e04
Brian Christiansen authored Jul 02, 2018
```
Currently, no caller checks the return code.

Bug 5164
```
2ab24e04

02 Jul, 2018 1 commit
- Update StoragePass docs password restrictions · 0c606741
  Marshall Garey authored Jul 02, 2018
```
Can't have # character in the password since it is treated as a comment.

Bug 5294
```
  0c606741
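
To see why a `#` in the password is a problem, consider a deliberately simplified parser that treats everything after `#` as a comment. This is a hypothetical Python sketch, not Slurm's configuration parser, which is written in C and may handle more cases.

```python
def parse_conf_line(line):
    """Return (key, value) for a 'Key=Value' line, or None if it sets nothing."""
    content = line.split("#", 1)[0].strip()  # drop the comment part
    if "=" not in content:
        return None
    key, value = content.split("=", 1)
    return key.strip(), value.strip()
```

Under this assumption, a line such as `StoragePass=abc#123` yields the value `abc`, so the password is silently truncated.
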
27 Jun, 2018 2 commits

Fix incorrect quoting in x_ac_debug.m4. · 2bde148f

Pär Lindfors authored Jun 27, 2018

Only produces a whitespace difference in configure. Inadvertently
introduced by commit 103ebaac.

Bug 5335.

2bde148f

Docs - fix text cutoff issue in Firefox · 0891cf25

Michael Hinton authored Jun 22, 2018

Firefox handles flex differently than Chrome. When flex is set to 1,
the flex item does not respect the flex container's bounds, causing
text to be cut off.

Bug 5339.

0891cf25

26 Jun, 2018 4 commits

Fix problem when validating job memory on multi-partition requests. · f07f53fc

Dominik Bartkiewicz authored Jun 08, 2018

Some job fields can change in the course of scheduling. This patch
reinitializes previously adjusted job fields to their original value
when validating the job memory in multi-partition requests.

Bug 4895.

f07f53fc

Revert "Fix different issues when requesting memory per cpu/node." · d52d8f4f
Alejandro Sanchez authored Jun 08, 2018
```
This reverts commit bf4cb0b1.

Bug 5240, Bug 4895 and Bug 4976.
```
d52d8f4f

Prevent reboot of busy KNL node when asking for inactive features. · d8c5379b

Felip Moll authored Jun 26, 2018

When one asks for an inactive feature and also specifies the node with the -w flag,
the node will be rebooted even though it may contain running jobs.

Bug 4821

d8c5379b

Reorder proctrack/task plugin load in the slurmstepd to match that of slurmd · 164da888
Tim Wickberg authored Jun 25, 2018
```
and avoid the race condition that calling task before proctrack can introduce.

Bug 5319
```
164da888

25 Jun, 2018 2 commits
- Revert commit 0c7b30fe, this was needed for the last commit · 0d5ef523
  Morris Jette authored Jun 25, 2018
```
to work correctly.

Bug 5155
Bug 4516
```
  0d5ef523
- Add new job dependency type of "afterburstbuffer". The pending job will be · 3d4baee9
  Morris Jette authored Jun 25, 2018
```
delayed until the first job completes execution and its burst buffer
stage-out is completed.

Bug 4675
```
  3d4baee9
22 Jun, 2018 2 commits
- Add sanity check to make sure qos existed before setting it as default · ec0a0fd5
  Dominik Bartkiewicz authored Jun 22, 2018
```
Bug 5191
```
  ec0a0fd5
- Prevent slurmctld from aborting when attempting to set a non-existent qos as def_qos_id · c9682e1a
  Dominik Bartkiewicz authored Jun 22, 2018
```
Bug 5159.
```
  c9682e1a
20 Jun, 2018 5 commits

Docs - remove references to slurm-munge RPM package. · 3ced4ec8
Tim Wickberg authored Jun 20, 2018
```
MUNGE plugin is no longer packaged separately after the slurm.spec overhaul.
```
3ced4ec8
multi-partition --test-only enhancements · 0b8a6a48
Morris Jette authored Jun 20, 2018
```
Enhancements to commit 35a13703
bug 5185
```
0b8a6a48
multi-partition --test-only enhancements · 7111833b
Brian Christiansen authored Jun 20, 2018
```
Enhancements to commit 35a13703
bug 5185
```
7111833b

Make job_start_data() multi partition aware on REQUEST_JOB_WILL_RUN. · 35a13703

Alejandro Sanchez authored Jun 20, 2018

Previously the function was only testing against the first partition in
the job_record. Now it detects if the job request is multi partition and
if so then loops through all of them until the job will run in any or
until the end of the list, returning the error code from the last one if
the job won't run in any partition.

Bug 5185

35a13703
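
The loop described in this commit can be sketched as follows. This is an illustrative Python sketch rather than the actual `job_start_data()` code: the function name and the `test_partition` callback are hypothetical, and a return code of 0 is assumed to mean the job will run.

```python
def job_will_run(partitions, test_partition):
    """Test each requested partition in order.

    Returns (0, partition) for the first partition where the job will run,
    otherwise (last_error_code, None) after the whole list has been tried.
    """
    rc = None
    for part in partitions:
        rc = test_partition(part)
        if rc == 0:
            return rc, part  # stop at the first partition that works
    return rc, None  # error code from the last partition tested
```

Testing only the first partition, as the previous behavior did, corresponds to calling `test_partition(partitions[0])` and returning its result directly.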

avoid test suite leaving pending job · 95454171
Morris Jette authored Jun 19, 2018
```
This can happen if burst buffer logic is broken
```
95454171

19 Jun, 2018 2 commits

Don't enforce MaxQueryTimeRange with specific jobs · d41cb31a

Isaac Hartung authored Jun 19, 2018

When requesting specific jobids with sacct, the starttime of the request
is 0, which will cause the time range to be outside of the
MaxQueryTimeRange range -- if specified. When requesting specific
jobids, sacct should be able to find the job whenever it started --
unless confined to a smaller range with -S and/or -E.

Bug 5009

d41cb31a
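
The rule in this commit can be sketched as a small check. This is a hypothetical Python illustration: the function name and arguments are assumptions, times are plain seconds, and the real check in the Slurm source may be structured differently.

```python
def exceeds_max_query_range(start, end, max_range, job_ids=()):
    """Return True if a query should be rejected for spanning too long a window."""
    if max_range is None:
        return False  # no MaxQueryTimeRange configured
    if job_ids:
        return False  # specific job ids requested: do not enforce the limit
    return (end - start) > max_range
```

Without the `job_ids` exemption, a request for specific jobs has a start time of 0, so `end - start` is always far larger than any configured limit.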

Fix send_gids info. in NEWS and RELEASE_NOTES. · 4c49b4bb
Felip Moll authored Jun 19, 2018

4c49b4bb

18 Jun, 2018 1 commit
- MySQL - Prevent deadlock caused by archive logic locking reads. · 4c448fd7
  Danny Auble authored Jun 18, 2018
```
Specifically due to SELECT ... FOR UPDATE ones.

Bug 5086.
```
  4c448fd7
16 Jun, 2018 1 commit

Docs - update config references to SlurmDBD. · 97e4304e

Michael Hinton authored Jun 15, 2018

The only "database type storage plugin" is SlurmDBD, so refer
to it directly.

Mention the default database name in slurmdbd.conf if StorageHost
is not explicitly set.

97e4304e

15 Jun, 2018 2 commits
- job_submit/lua - fix access into reservation table. · d512be7b
  Marshall Garey authored Jun 15, 2018
```
Bug 5270.
```
  d512be7b
- Allow job_submit_plugin_modify() to change admin_comment field. · d939cb94
  Tim Wickberg authored Jun 14, 2018
```
Instead of unintentionally rejecting the update from a non-Administrator
if the job_submit plugin modified that field.

Bug 5306.
```
  d939cb94
13 Jun, 2018 1 commit
- Doc - add link to opa2slurm tool. · e086587c
  Tim Wickberg authored Jun 13, 2018
  
  e086587c