Commits · c53749072c9d141b31aa8a19fca5cc8aa91f4418 · Manuel G. Marciani / ces_slurm_simulator

16 Nov, 2015 2 commits
- burst_buffer/cray: Retry failed teardown as needed · 789f5c7e
  Morris Jette authored Nov 16, 2015
```
bug 2143
```
  789f5c7e
- backfill: test assoc & QOS nodes limits · dcc943b7
  Morris Jette authored Nov 16, 2015
```
Backfill scheduler: Test association and QOS node limits before reserving
    resources for pending job.
bug 2129
```
  dcc943b7
13 Nov, 2015 8 commits
- Show requested TRES in "squeue -O tres" when job is pending. · 1a67ac46
  Brian Christiansen authored Nov 13, 2015
  
  1a67ac46
- Prevent "scontrol update job" from updating jobs that have already finished. · 6981c318
  Brian Christiansen authored Nov 13, 2015
```
Prevents the following sequence from causing a segfault:
$ scontrol create partitionname=stuff nodes=ALL
$ sbatch --wrap="hostname" -o/dev/null -p stuff
Submitted batch job 1047468
$ scontrol delete partitionname=stuff
$ scontrol update jobid=1047468 partition=stuff
```
  6981c318
- Header for next tagged version · d4898475
  Danny Auble authored Nov 13, 2015
  
  d4898475
- Minor elasticsearch code improvements dealing with reverse association · db8959e3
  Danny Auble authored Nov 13, 2015
```
tree.
```
  db8959e3
- Add with_freeipmi option to spec file. · d9b9ea91
  Danny Auble authored Nov 13, 2015
  
  d9b9ea91
- Add unique identifiers to anchor tags in HTML generated from the man pages. · 0318c0d2
  Brian Christiansen authored Nov 12, 2015
```
Bug 2006
```
  0318c0d2
- Add REQUEST_ADD_EXTERN_PID option to add pid to the slurmstepd's extern · d53755af
  Danny Auble authored Nov 12, 2015
```
step.
```
  d53755af
- Add SchedulerParameters option of max_script_size · 2270ece4
  Morris Jette authored Nov 12, 2015
  
  2270ece4
12 Nov, 2015 3 commits
- Remove duplicate #define IS_NODE_POWER_UP · bbcb45f0
  Mark Roberts authored Nov 12, 2015
  
  bbcb45f0
- Add job array info to elasticsearch plugin · 255c5552
  Morris Jette authored Nov 11, 2015
  
  255c5552
- Add "sdiag reset" support for operators · 016bc183
  Morris Jette authored Nov 11, 2015
```
Previously only supported by SlurmUser and root.
```
  016bc183
11 Nov, 2015 4 commits

Sched/backfill fix of bf_max_job_array_resv enforcement · 53176bbd
Morris Jette authored Nov 11, 2015
```
Previously only reserved space for one task of pending job array.
```
53176bbd

Take node out of FUTURE state with reconfig · 79343222

Morris Jette authored Nov 11, 2015

Support taking node out of FUTURE state with "scontrol reconfig" command.
Previous logic would keep node in FUTURE state if that was the original
configuration when slurmctld started. If job was running on the node,
it will stay running, but the node may not be visible.

79343222

Fix job cancelation bug. · 8e66e267
David Bigagli authored Nov 11, 2015

8e66e267

Add more Prolog/EpilogSlurmctld env vars · 66659fac

Morris Jette authored Nov 10, 2015

Make SLURM_ARRAY_TASK_MIN, SLURM_ARRAY_TASK_MAX, and SLURM_ARRAY_TASK_STEP
    environment variables available to PrologSlurmctld and EpilogSlurmctld.

66659fac
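
As an illustration of how a PrologSlurmctld or EpilogSlurmctld program might consume the variables named in 66659fac, here is a minimal C sketch. The helper names `env_to_long` and `print_array_bounds` are hypothetical and not part of Slurm. It also assumes the three variables are absent for jobs that are not job arrays, which the commit message does not state.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the Slurm source): read an integer
 * environment variable. Returns 0 and stores the value in *out on
 * success, -1 if the variable is unset, empty, or not a base-10 integer.
 */
static int env_to_long(const char *name, long *out)
{
	const char *val = getenv(name);
	char *end = NULL;
	long parsed;

	if (!val || val[0] == '\0')
		return -1;
	errno = 0;
	parsed = strtol(val, &end, 10);
	if (errno != 0 || *end != '\0')
		return -1;
	*out = parsed;
	return 0;
}

/*
 * Print the job array bounds if all three variables are present.
 * Returns 0 if they were printed, -1 otherwise (assumed here to mean
 * the job is not a job array).
 */
static int print_array_bounds(void)
{
	long min, max, step;

	if (env_to_long("SLURM_ARRAY_TASK_MIN", &min) ||
	    env_to_long("SLURM_ARRAY_TASK_MAX", &max) ||
	    env_to_long("SLURM_ARRAY_TASK_STEP", &step))
		return -1;
	printf("array tasks: min=%ld max=%ld step=%ld\n", min, max, step);
	return 0;
}
```

A prolog program could call `print_array_bounds()` from its `main` and continue normally when it returns -1.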

10 Nov, 2015 5 commits
- MYSQL - Quote assoc table name in mysql query. · edd932af
  Hongjia Cao authored Nov 10, 2015
  
  edd932af
- Make 'extern' step show up in the database. · 129b5c33
  Danny Auble authored Nov 09, 2015
```
We needed to send a finish from each node in the step whether it had
any activity or not.  This way the controller knew things were done
on the node and the data was sent to the database.

Bug 2097
```
  129b5c33
- Make it possible to query 'extern' step with sstat. · 74a7c5c7
  Danny Auble authored Nov 09, 2015
  
  74a7c5c7
- Burst_buffer/cray: Don't stall scheduling · 7a6697d3
  Morris Jette authored Nov 09, 2015
```
Burst_buffer/cray: Don't stall scheduling of other jobs while a stage-in
    is in progress.
bug 2114
```
  7a6697d3
- burst_buffer/cray job purge logic · 9804e297
  Morris Jette authored Nov 09, 2015
```
Fix to purge terminated jobs with burst buffer errors.
bug 2123
```
  9804e297
09 Nov, 2015 2 commits
- Increase job prolog limit on slurmctld restart · 5a522aa7
  Morris Jette authored Nov 09, 2015
```
The prolog_running counter can now exceed 1. New logic raises limit
from 1 to 4 before preventing job recovery on restart.
```
  5a522aa7
- Prevent srun from core dumping should the step context be NULL. #2083 · 1da8e3f0
  David Bigagli authored Nov 09, 2015
  
  1da8e3f0
07 Nov, 2015 1 commit

Add burst_buffer.conf flag of TeardownFailure · e0aa4d87

Morris Jette authored Nov 06, 2015

Added burst_buffer.conf flag parameter of "TeardownFailure" which will
    teardown and remove a burst buffer after failed stage-in or stage-out.
    By default, the buffer will be preserved for analysis and manual teardown.
bug 2116

e0aa4d87

06 Nov, 2015 2 commits
- Update NEWS · b0a3125d
  David Bigagli authored Nov 06, 2015
  
  b0a3125d
- Fix TRES_MAX flag to work correctly. · 0cf690a9
  Danny Auble authored Nov 05, 2015
```
Bug 2106

The calculation was only being done for cpus and gres, not for memory
or nodes.
```
  0cf690a9
05 Nov, 2015 1 commit
- Fix typo for the "devices" cgroup subsystem in pam_slurm_adopt.c · 27662579
  Kilian Cavalotti authored Nov 04, 2015
  
  27662579
04 Nov, 2015 6 commits
- Update NEWS for start of v15.08.4 · 80d35792
  Morris Jette authored Nov 04, 2015
  
  80d35792
- Fixed counter of not indexed jobs, error_cnt post-increment changed to · dc632bcd
  Alejandro Sanchez authored Nov 04, 2015
```
pre-increment.
```
  dc632bcd
- Update NEWS for last DW commit · 800ad5d6
  Morris Jette authored Nov 04, 2015
  
  800ad5d6
- burst_buffer/cray: Don't call paths if no #DW commands · 7965eae1
  Morris Jette authored Nov 04, 2015
```
The "dw_wlm_cli paths" command returns an error if no #DW options.
```
  7965eae1
- Update NEWS with 14.11 fix for 2095 · 28474a18
  Brian Christiansen authored Nov 04, 2015
```
commit:508f866e
```
  28474a18
- Prevent systemd's slurmd service from killing slurmstepds on shutdown. · 508f866e
  Brian Christiansen authored Nov 04, 2015
```
Bug 2095
```
  508f866e
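
The post-increment to pre-increment change described in dc632bcd is easier to see in isolation. The sketch below is not the elasticsearch plugin code, and the function names are hypothetical. It only shows how the value of the increment expression differs when that value is used directly, for example in a log message.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; not taken from the Slurm source. */

/* Post-increment: the expression yields the value before the increment. */
static int report_post(int *error_cnt)
{
	return (*error_cnt)++;
}

/* Pre-increment: the expression yields the value after the increment. */
static int report_pre(int *error_cnt)
{
	return ++(*error_cnt);
}

/* Show what each variant reports for the first failure. */
static void demo(void)
{
	int a = 0, b = 0;
	int ra = report_post(&a);
	int rb = report_pre(&b);

	printf("post-increment reported %d (counter now %d)\n", ra, a);
	printf("pre-increment reported %d (counter now %d)\n", rb, b);
}
```

Both counters end at 1, but only the pre-increment form reports 1 for the first failure; the post-increment form reports 0.
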
03 Nov, 2015 4 commits
- Alphabetize debugflags when printing them out. · 8119ac7a
  Danny Auble authored Nov 03, 2015
  
  8119ac7a
- Modifications to pam_slurm_adopt to work correctly for the "extern" step. · ca682973
  Ryan Cox authored Nov 03, 2015
  
  ca682973
- Burstbuffer/Cray: Fix for persistent buffer use · 30b48560
  Morris Jette authored Nov 03, 2015
```
Add logic to call the "setup" function.
```
  30b48560
- MYSQL - Fix rollups for multiple jobs run by the same association · 34e24467
  Danny Auble authored Nov 02, 2015
```
in an hour counting multiple times.
```
  34e24467
02 Nov, 2015 1 commit

Return error on user job release of admin hold · 703504e5

Morris Jette authored Nov 02, 2015

Return permission denied if regular user tries to release job held by an
    administrator.
bug 2087

703504e5

30 Oct, 2015 1 commit

Fix reservation creation on DOWN nodes · 6aed461b

Deric Sullivan authored Oct 29, 2015

Fix creation of advanced reservation of cores on nodes which are DOWN.
There seems to be a bug with reservations using a node list (e.g.
Nodes=something + CoreCnt=something).  The result is a reservation made that's
arguably broken; listing the reservation (scontrol show reservation) will show
"Nodes=" (blank) and "CoreCnt=0".

It's very easy to reproduce, just by doing the following against a node in a
DOWN (also tested with POWER_UP) state:
scontrol create ReservationName=tmp_res StartTime=now EndTime=now+600 \
    Nodes=<some_non_idle_node> CoreCnt=1 Users=<some_valid_user>
scontrol show reservation
bug 2078

6aed461b