Commits · f297242ec44ae36b4431154f9908527c1ab195b5 · Manuel G. Marciani / ces_slurm_simulator

21 Nov, 2012 3 commits

slurmstepd : correct a bug in the IO thread termination monitoring · f297242e

Matthieu Hautreux authored Nov 13, 2012

A dedicated thread (_kill_thr) is launched by slurmstepd at the end of a
step in order to destroy the IO thread if it does not manage to correctly
terminate by itself after 300 seconds.

Two bugs are corrected in this logic by this patch.

First, the sleep(300) call is not protected against interruptions: a signal
received by slurmstepd can cut the delay to a few seconds, forcing the IO
thread to terminate before the grace time expires. The logic is modified
to loop around sleep() so that the full delay is respected.

Second, to terminate the IO thread, a SIGKILL is delivered to the IO thread
using pthread_kill. However, sending SIGKILL using pthread_kill is a
process-wide operation (see man pthread_kill), thus all the slurmstepd
threads are killed and slurmstepd is terminated. This logic is modified
by using pthread_cancel() instead of pthread_kill() thus letting ...

f297242e

Correct a bug with -w in step management resulting in inadequate memory errors returned to srun · ac86cc37

Matthieu Hautreux authored Nov 12, 2012

When requesting a particular nodelist for a step, if at least one of the nodes
is still used by a former step (no REQUEST_STEP_COMPLETE received from that
node), the current behavior is to return ESLURM_INVALID_TASK_MEMORY, and srun
aborts with "Memory required by task is not available".

This can be reproduced by launching consecutive steps with the -w parameter set
to $SLURM_NODELIST and introducing delays in the spank epilog on the execution
nodes.

The behavior is changed to only defer the execution of the step by returning
ESLURM_NODES_BUSY when it is detected that some nodes are blocked because of
already used memory.

ac86cc37

Correct a bug in consecutive steps management due to asynchronous step completions · 4c97337d

Matthieu Hautreux authored Nov 12, 2012

When using consecutive steps, it appears that in some cases, the time required
by the slurmstepd on the execution nodes to inform the controller of the
completion of the step is longer than the time required to request the
following step. In that scenario, the controller can reject the step by
returning the error code ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE even if the
step could be executed once all the former steps had correctly finished.

This can be reproduced by launching consecutive steps and introducing delays in
the spank epilog on the execution nodes.

The behavior is changed to only defer the execution of the step by returning
ESLURM_NODES_BUSY when all the available nodes are not idle considering the
former steps.

4c97337d

20 Nov, 2012 2 commits
- Accounting - Fix issue where QOS usage was being zeroed out on a · 8b0b5ae7
  Danny Auble authored Nov 20, 2012
```
slurmctld restart.
```
  8b0b5ae7
- Reset node MAINT state flag when a reservation's nodes or flags change · cc97d84b
  Morris Jette authored Nov 19, 2012
  
  cc97d84b
19 Nov, 2012 3 commits
- BGQ - Fix job step timeout to actually happen when done from within an · 0500e007
  Danny Auble authored Nov 19, 2012
```
allocation.
```
  0500e007
- Modify use of OOM (out of memory protection) for Linux 2.6.36 kernel or later · 8ae5e73e
  Morris Jette authored Nov 19, 2012
```
NOTE: If you were setting the environment variable SLURMSTEPD_OOM_ADJ=-17,
it should be set to -1000 for Linux 2.6.36 kernel or later.
```
  8ae5e73e
- NEWS for e40883f1 · f42117c4
  Danny Auble authored Nov 19, 2012
  
  f42117c4
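
The NOTE in 8ae5e73e maps the old disable value -17 to -1000. As a rough illustration, and assuming the note refers to the move from the legacy `oom_adj` scale (-17..15) to the `oom_score_adj` scale (-1000..1000) used by Linux 2.6.36 and later, the sketch below converts a legacy value with the linear scaling the kernel is understood to apply. The function name is hypothetical and this is not Slurm code.

```c
/*
 * Hypothetical helper: convert a legacy oom_adj value (-17..15) to the
 * oom_score_adj scale (-1000..1000). Assumption: intermediate values
 * scale linearly by 1000/17, and both end points map to the extremes.
 */
static int oom_adj_to_score_adj(int oom_adj)
{
    if (oom_adj >= 15)      /* legacy maximum */
        return 1000;
    if (oom_adj <= -17)     /* legacy "never kill" value */
        return -1000;
    return (oom_adj * 1000) / 17;
}
```

Under this assumption, a site that exported SLURMSTEPD_OOM_ADJ=-17 on an older kernel would export -1000 on a newer one, which matches the note.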
09 Nov, 2012 1 commit
- BGQ - when srun -Q is given make runjob be quiet · e40883f1
  Danny Auble authored Nov 08, 2012
  
  e40883f1
07 Nov, 2012 4 commits
- BGQ - better fix for ntasks_per_node verification · e7d8ce15
  Danny Auble authored Nov 06, 2012
  
  e7d8ce15
- remove debug · 295394b2
  Danny Auble authored Nov 06, 2012
  
  295394b2
- BGQ - validate correct ntasks_per_node · 7eb1a451
  Danny Auble authored Nov 06, 2012
  
  7eb1a451
- BGQ - Fix issue when running srun outside of an allocation and only · 9e25da94
  Danny Auble authored Nov 06, 2012
```
specifying the number of tasks and not the number of nodes.
```
  9e25da94
05 Nov, 2012 2 commits
- Cray - Improve signal handling for spawned tasks on job cancel · 3ff9f17e
  Morris Jette authored Nov 05, 2012
```
On job kill request, send SIGCONT, SIGTERM, wait KillWait and send
SIGKILL. Previously just sent SIGKILL to tasks.
```
  3ff9f17e
- Add common function to return KillWait configuration parameter · 91be41da
  Morris Jette authored Nov 05, 2012
  
  91be41da
02 Nov, 2012 3 commits
- Remove duplicate NEWS item · c3fde3ce
  Morris Jette authored Nov 02, 2012
  
  c3fde3ce
- Update NEWS for start of v2.4.5 work · 832ca7df
  Morris Jette authored Nov 02, 2012
  
  832ca7df
- Update META for v2.4.4 tag · b8d6a058
  Morris Jette authored Nov 02, 2012
  
  b8d6a058
26 Oct, 2012 1 commit
- Improvements to how salloc handles PrologSlurmctld · fde39699
  Morris Jette authored Oct 26, 2012
  
  fde39699
25 Oct, 2012 3 commits
- Correction to slurmdbd communications failure handling logic · 26871b8d
  Morris Jette authored Oct 25, 2012
```
Incorrect error codes returned in some cases, especially if the slurmdbd is down
```
  26871b8d
- Move some logic in a test for cleaner output · a06e79a7
  Morris Jette authored Oct 25, 2012
  
  a06e79a7
- Cray - Defer salloc until after PrologSlurmctld completes. · a5645a19
  Morris Jette authored Oct 25, 2012
  
  a5645a19
24 Oct, 2012 1 commit
- smap - spread node information across multiple lines for larger systems. · 2c8bd966
  Morris Jette authored Oct 24, 2012
```
Previously for linux systems all information was placed on a single line.
```
  2c8bd966
23 Oct, 2012 4 commits
- BGQ - move variables to correct place. · 4bc03136
  Danny Auble authored Oct 22, 2012
  
  4bc03136
- BGQ - Cleaner handling of cnode failures when reported through the runjob · f6a33bad
  Danny Auble authored Oct 22, 2012
```
interface instead of through the normal method.
```
  f6a33bad
- BGQ - added debug when removing blocks · c3a79e14
  Danny Auble authored Oct 22, 2012
  
  c3a79e14
- fix comment · 378e0ca6
  Danny Auble authored Oct 22, 2012
  
  378e0ca6
22 Oct, 2012 1 commit
- BGQ - Fix for printing realtime server debug correctly. · 9054e4e0
  Danny Auble authored Oct 22, 2012
  
  9054e4e0
19 Oct, 2012 3 commits
- add new banner file · 35a2067e
  Danny Auble authored Oct 19, 2012
  
  35a2067e
- update larger systems running slurm · e431a1b6
  Danny Auble authored Oct 19, 2012
  
  e431a1b6
- Update html headers · c132f9e9
  Danny Auble authored Oct 19, 2012
  
  c132f9e9
18 Oct, 2012 9 commits
- BGQ - Make it so if a nodeboard goes in error any block using that midplane · ea39371a
  Danny Auble authored Oct 18, 2012
```
for passthrough gets removed on a dynamic system.
```
  ea39371a
- BGQ - Add logic to make it so blocks can't use a midplane with a nodeboard · 4b1f6608
  Danny Auble authored Oct 18, 2012
```
in error for passthrough.
```
  4b1f6608
- Fixed InactiveLimit math to work correctly · 13a8882a
  Danny Auble authored Oct 17, 2012
  
  13a8882a
- BGQ - Fixed InactiveLimit to work correctly to avoid scenarios where a · 65fef1ff
  Danny Auble authored Oct 17, 2012
```
user's pending allocation was started with srun and then for some reason
the slurmctld was brought down and while it was down the srun was removed.
```
  65fef1ff
- BGQ - fix thread id to handle before realtime thread correctly · 4bebf027
  Danny Auble authored Oct 17, 2012
```
previously it overwrote the poll_thread id
```
  4bebf027
- BGQ - increase size of variable for larger systems · 80e58be9
  Danny Auble authored Oct 17, 2012
  
  80e58be9
- remove unused variable. · a0b7be77
  Danny Auble authored Oct 17, 2012
  
  a0b7be77
- BGQ - Add functionality to make it so we track the actions on a block. · baf267e0
  Danny Auble authored Oct 17, 2012
```
This is needed for when a free request is added to a block but there are
jobs finishing up so we don't start new jobs on the block since they will
fail on start.
```
  baf267e0
- BGQ - add extra debugging to the runjob_mux plugin to print a job · 3338207e
  Danny Auble authored Oct 17, 2012
  
  3338207e