Commits · 877d8cd9edcad9c4ff21c85ed0477d8b86f6c842 · Manuel G. Marciani / ces_slurm_simulator

04 Dec, 2018 8 commits
- Merge branch 'slurm-18.08' · 877d8cd9
  Tim Wickberg authored Dec 04, 2018
  
  877d8cd9
- Docs - rewrite platforms.html page with current info. · 372b51e8
  Tim Wickberg authored Dec 04, 2018
```
Break out a list of Linux distributions as well.
```
  372b51e8
- Fix handling of 'slurmd -f' by setting SLURM_CONF earlier. · 401d1b47
  Marshall Garey authored Dec 04, 2018
```
Plugins reading in their own config files rely on the SLURM_CONF
environment variable pointing to the appropriate directory,
otherwise they will fall back to the built-in sysconfdir path.

Set the environment variable early enough so that the -f flag
operates correctly, but not before conf->conffile has definitely
been set. Remove the setenv call that happens before the first
slurmstepd is fork()'d as it is now redundant.

Bug 4774.
```
  401d1b47
- salloc - set SLURM_NTASKS_PER_CORE and SLURM_NTASKS_PER_SOCKET when appropriate. · a36b8a4d
  Alejandro Sanchez authored Dec 04, 2018
```
sbatch sets these, but salloc did not. This should make srun behavior
between the two consistent.

Bug 3861.
```
  a36b8a4d
- Merge branch 'slurm-18.08' · bb498289
  Tim Wickberg authored Dec 04, 2018
  
  bb498289
- Fix typo, use updated comment style while correcting. · 0803a148
  Tim Wickberg authored Dec 03, 2018
  
  0803a148
- Add 19.05 protocol_version blocks to _{pack,unpack}_launch_tasks_request_msg. · e858d26d
  Tim Wickberg authored Dec 03, 2018
```
While here, fix up function declarations to match current style guide.

No functional changes.
```
  e858d26d
- Add 19.05 protocol_version blocks to _{pack,unpack}_prolog_launch_msg. · d8f94356
  Tim Wickberg authored Dec 03, 2018
```
While here, fix up function declarations to match current style guide.

No functional changes.
```
  d8f94356
03 Dec, 2018 5 commits

When handling runaway jobs remove all usage before rollup to remove any · bf705c80
Marshall Garey authored Dec 03, 2018
```
time that wasn't existent instead of just updating lines that have time
with a lesser time.
```
bf705c80
Fix issue when job's environment is minimal and only contains variables · f1116c67
Dominik Bartkiewicz authored Dec 03, 2018
```
Slurm is going to replace internally.

Bug 5800
```
f1116c67
Remove a few missed references to ionodes. · aec91c8b
Tim Wickberg authored Dec 02, 2018

aec91c8b
Fix long line from previous commit. · bd600cc7
Tim Wickberg authored Dec 02, 2018

bd600cc7

Rework slurmstepd authentication. · 06863788

Tim Wickberg authored Dec 02, 2018

slurmstepd exclusively accepts API connections through a unix socket.
Before this patch, the client end (usually slurmd, but pam_slurm_adopt and
scontrol both can use this) retrieves an auth cred via MUNGE, serializes
that over the socket, after which the slurmstepd must send that credential
back to MUNGE for verification.

However, the only info used from that cred is the uid from the client side
of the socket. That info can be retrieved via SO_PEERCRED (on Linux) - this
is what MUNGE uses to authenticate its own credentials. And the client uid
is only checked in half of the API calls since the info exposed is not
considered sensitive.

So, rather than have every slurmd -> slurmstepd call involve a sequence of:

slurmd -> MUNGE for cred (authenticated using SO_PEERCRED internally)
slurmd -> slurmstepd over socket
slurmstepd -> MUNGE to validate credential

This can be simplified to:
slurmd -> slurmstepd over socket (auth using SO_PEERCRED directly)
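
As a rough illustration of the SO_PEERCRED lookup described above, the following minimal sketch reads the peer's uid from a connected unix socket. It is Linux-specific and is not the code from this commit; the function names `peer_uid` and `peer_is_self` are hypothetical, and the `socketpair()` demo only stands in for a real slurmd -> slurmstepd connection.

```c
#define _GNU_SOURCE          /* exposes struct ucred on glibc */
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Store the uid of the process on the other end of a unix socket in *uid.
 * Returns 0 on success, -1 on error. Linux-only (SO_PEERCRED).
 */
int peer_uid(int fd, uid_t *uid)
{
	struct ucred cred;
	socklen_t len = sizeof(cred);

	assert(uid);
	if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0)
		return -1;
	*uid = cred.uid;
	return 0;
}

/*
 * Hypothetical demo helper: both ends of a socketpair belong to this
 * process, so the reported peer uid should equal our effective uid.
 * Returns 1 if it does, 0 if not, -1 on error.
 */
int peer_is_self(void)
{
	int sv[2];
	uid_t uid;
	int rc;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	rc = peer_uid(sv[0], &uid);
	close(sv[0]);
	close(sv[1]);
	if (rc < 0)
		return -1;
	return uid == geteuid();
}
```

The kernel fills in the credentials itself, so the receiving side needs no extra round trip to a separate daemon to learn who connected.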

This simplified call path removes two socket connections, plus the overhead
from MUNGE's cryptographic operations, from the exchange. While performance
is not critical for slurmd -> slurmstepd communication, this also improves
performance for other system utilities such as pam_slurm_adopt (which needs
to connect to half of the extern stepds on the node on average), or a future
nss_slurm module which is expected to place an even higher load on this API.

The one caveat here is that the API was not built in a way that makes this
restructuring easy. The slurmstepd protocol version, which may be one or two
releases behind that of the slurmd, was only sent back to the slurmd _after_
the auth cred has been received and validated. So, to handle backwards
compatibility, we change over to sending the SLURM_PROTOCOL_VERSION instead
of SOCKET_CONNECT as the first int over the socket. If the slurmstepd
returns an error - since this value is not equal to SOCKET_CONNECT (zero)
as was required in older versions - we allow that connection to close, and
try to reconnect using the older RPC format instead. That fallback code
should be removed two versions after 19.05 is released.
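
The fallback described above can be sketched as a small simulation. This is an assumption-laden model, not the actual RPC code: the stepd is reduced to a function that accepts or rejects the first int, `NEW_PROTOCOL_VERSION` is a placeholder value rather than Slurm's real constant, and the names `stepd_handshake`, `old_stepd`, and `new_stepd` are hypothetical. Only the fact that `SOCKET_CONNECT` is zero comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define SOCKET_CONNECT 0              /* zero, per the commit message */
#define NEW_PROTOCOL_VERSION 0x2200   /* placeholder, not the real value */

enum handshake { HANDSHAKE_NEW, HANDSHAKE_LEGACY, HANDSHAKE_FAILED };

/* Hypothetical stand-in for a stepd: returns 0 if it accepts first_int. */
typedef int (*stepd_accept_fn)(uint32_t first_int);

/* Older stepd: anything other than SOCKET_CONNECT is an error. */
int old_stepd(uint32_t first_int)
{
	return (first_int == SOCKET_CONNECT) ? 0 : -1;
}

/* Newer stepd (assumed behavior): accepts a protocol version it knows. */
int new_stepd(uint32_t first_int)
{
	return (first_int == NEW_PROTOCOL_VERSION) ? 0 : -1;
}

/* Try the new format first; on rejection, reconnect with the old one. */
enum handshake stepd_handshake(stepd_accept_fn stepd)
{
	assert(stepd);
	if (stepd(NEW_PROTOCOL_VERSION) == 0)
		return HANDSHAKE_NEW;
	/* The old stepd closed the connection: retry in legacy format. */
	if (stepd(SOCKET_CONNECT) == 0)
		return HANDSHAKE_LEGACY;
	return HANDSHAKE_FAILED;
}
```

With these stand-ins, `stepd_handshake(new_stepd)` yields `HANDSHAKE_NEW` and `stepd_handshake(old_stepd)` yields `HANDSHAKE_LEGACY`, at the cost of one wasted connection attempt against an older stepd.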

06863788

02 Dec, 2018 2 commits

Rework debug3 messages in _handle_request. · 78ea3e01

Tim Wickberg authored Dec 02, 2018

Use __func__, and list the function name first in the message.

Drop one redundant message printing the request number - all paths
through the switch statement will print this out in some form.

Remove a ternary used to print SLURM_SUCCESS/SLURM_FAILURE and
print the raw return value. If you're staring at debug3 logs,
you should hopefully know how to interpret these values. :)

78ea3e01

Modify _handle_request to drop gid as an argument. · af96f57b

Tim Wickberg authored Dec 02, 2018

Not used, so don't bother retrieving it from the cred in _handle_accept.

Also, switch a printf format to %u instead of %d for uid_t.

af96f57b

29 Nov, 2018 5 commits
- Merge branch 'slurm-18.08' · 2e5b90ef
  Morris Jette authored Nov 29, 2018
  
  2e5b90ef
- Validate job_ptr in backfill before restoring preempt state. · 4dec76c9
  Dominik Bartkiewicz authored Nov 29, 2018
```
Bug 6121
```
  4dec76c9
- Merge branch 'slurm-18.08' · c0fe0b23
  Morris Jette authored Nov 29, 2018
  
  c0fe0b23
- Fix salloc and missing SLURM_NTASKS. · 8c910226
  Nate Rini authored Nov 28, 2018
```
Bug 6008
```
  8c910226
- Avoid step launch error with no_consume GRES configuration · f2f989ec
  Morris Jette authored Nov 28, 2018
```
Bug 6078
```
  f2f989ec
28 Nov, 2018 14 commits
- gres logic refactoring · bf4bfd4e
  Morris Jette authored Nov 28, 2018
```
This change permits setup of the type_ arrays in the job
gres data structure without setup of the topo_ arrays.

Bug 6078
```
  bf4bfd4e
- Extend test to test patch from bug 6077 · 7cbda917
  Morris Jette authored Nov 28, 2018
  
  7cbda917
- Fix issue when requesting invalid gres. · 80e2cc41
  Alejandro Sanchez authored Nov 28, 2018
```
Bug 6077
```
  80e2cc41
- In route/topology validate the slurmctld doesn't try to initialize the · cae90ff4
  Danny Auble authored Nov 28, 2018
```
node system.

Bug 6037
```
  cae90ff4
- Fix race condition in route/topology when the slurmctld is reconfigured. · f35bb686
  Marshall Garey authored Nov 28, 2018
```
Bug 6037
```
  f35bb686
- Fix test · 903da06b
  Morris Jette authored Nov 28, 2018
```
The test was failing if the GPU count exceeded the available CPU count
```
  903da06b
- Fix test · 41b49afc
  Morris Jette authored Nov 28, 2018
```
The test was failing due to bad logic in the case where a GRES of
craynetwork > 1 and the node count was 1
```
  41b49afc
- mpi/pmix: Remove unneeded libpmix callback drop in tree-based coll · bd283fd3
  Artem Y. Polyakov authored Nov 27, 2018
```
Bug 5983
```
  bd283fd3
- Split GRES "has_file" into flags · f2e4bed6
  Morris Jette authored Nov 28, 2018
```
Rename "has_file" to "config_flags" and independently set the flags
GRES_CONF_HAS_FILE and GRES_CONF_HAS_TYPE as appropriate. Previously
defining a GRES "Type" would set a default value of "File" to
"/dev/null"

Bug 6078
```
  f2e4bed6
- mpi/pmix: Fix double invocation of the PMIx lib fence callback · 674e78b6
  Artem Y. Polyakov authored Nov 05, 2018
```
In case of the error code paths (like collective timeout) it is possible
that a callback provided by PMIx will be called twice leading to a
segmentation fault.
This commit fixes it by properly accounting callback invocations.

Bug 5983
```
  674e78b6
- Improve node GRES copy function · bf4ddb33
  Morris Jette authored Nov 28, 2018
```
Permit independent copy of topo and type data structures.

Bug 6078
```
  bf4ddb33
- Cosmetic changes, no changes to logic · 58d8ede1
  Morris Jette authored Nov 28, 2018
  
  58d8ede1
- Change type of temp GRES counter · fcbaa06d
  Morris Jette authored Nov 27, 2018
```
A temporary GRES counter was being stored in a variable of type "int".
This changes the variable to type "uint64_t" without changing logic.
```
  fcbaa06d
- Initialize variables before use · bd0b580c
  Morris Jette authored Nov 27, 2018
```
No change in logic, just add initialization to eliminate some Coverity warnings.
```
  bd0b580c
27 Nov, 2018 6 commits

Change LONG_OPT_* macros in salloc/sbatch/srun into a single enum. · 3bffcb5f
Tim Wickberg authored Nov 27, 2018

3bffcb5f
Revert "mpi/pmix: Fix double invocation of the PMIx lib fence callback" · 2dfadf65
Danny Auble authored Nov 27, 2018
```
This reverts commit b0515009.
```
2dfadf65
mpi/pmix: Make multi-slurmd work correctly when using ring communication. · 17ddbd5e
Danny Auble authored Nov 27, 2018
```
Bug 5935
```
17ddbd5e

mpi/pmix: fixed the logging of collective state · e7803212

Boris Karasev authored Nov 11, 2018



This could have caused core dumps if communication failed for one
reason or another.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>

Bug 5935

e7803212

mpi/pmix: Fix double invocation of the PMIx lib fence callback · b0515009

Artem Y. Polyakov authored Nov 05, 2018

In case of the error code paths (like collective timeout) it is possible
that a callback provided by PMIx will be called twice leading to a
segmentation fault.
This commit fixes it by properly accounting callback invocations.
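
The commit message does not say how the invocations are accounted for. One common pattern for guaranteeing that a completion callback fires at most once is to disarm it before calling it, as in this minimal sketch. The struct and function names (`coll_state`, `coll_complete`, `count_cb`) are hypothetical and are not taken from the mpi/pmix plugin.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*fence_cb_fn)(int status, void *cbdata);

/* Hypothetical collective state; not the plugin's real structure. */
struct coll_state {
	fence_cb_fn cb;   /* completion callback, cleared once invoked */
	void *cbdata;     /* opaque pointer handed back to the callback */
};

/* Invoke the callback at most once; returns 1 if it fired, else 0. */
int coll_complete(struct coll_state *coll, int status)
{
	fence_cb_fn cb;

	assert(coll);
	cb = coll->cb;
	if (!cb)
		return 0;
	coll->cb = NULL;  /* disarm first, so a second call is a no-op */
	cb(status, coll->cbdata);
	return 1;
}

/* Example callback: counts its invocations through cbdata. */
void count_cb(int status, void *cbdata)
{
	(void) status;
	++*(int *) cbdata;
}
```

If both the normal path and an error path such as a timeout call `coll_complete()`, only the first call reaches the callback. A multi-threaded caller would additionally need a lock or an atomic exchange around the disarm step.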

b0515009

Clean up step on a failed node correctly. · e11f4af9

Morris Jette authored Nov 27, 2018

This patch does 2 things:
1. When a step fails on some node, then mark it as complete on those
   nodes. This is needed so that when the step ends on the other
   nodes, slurmctld recognizes the step as completely done.
2. If the step does not have the --no-kill option set, then when some
   node fails, send a request to terminate the step on ALL of its nodes.

Bug 5805
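
The two behaviors listed above can be modeled roughly as follows. This is only a sketch under simplifying assumptions: nodes are bits in a 64-bit mask, and the `step` struct, its fields, and the function names are hypothetical rather than slurmctld's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical step record; each node is one bit in a 64-bit mask. */
struct step {
	uint64_t nodes;       /* nodes the step runs on */
	uint64_t completed;   /* nodes where the step is marked complete */
	bool no_kill;         /* step was launched with --no-kill */
	bool kill_requested;  /* terminate request sent to all nodes */
};

/* Record that the step failed on the nodes in the 'failed' mask. */
void step_node_failed(struct step *s, uint64_t failed)
{
	assert(s);
	failed &= s->nodes;          /* ignore nodes outside the step */
	if (!failed)
		return;
	s->completed |= failed;      /* 1: mark complete on failed nodes */
	if (!s->no_kill)
		s->kill_requested = true;  /* 2: terminate on ALL nodes */
}

/* The step is fully done once every node has reported completion. */
bool step_done(const struct step *s)
{
	assert(s);
	return s->completed == s->nodes;
}
```

Without the first behavior, `step_done()` could never become true after a node failure, because the failed nodes would never report completion on their own.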

e11f4af9