Commits · 78ae8647fb9547ac4e13c21bb5cd38cc0a1a1f09 · Manuel G. Marciani / ces_slurm_simulator

11 Sep, 2015 2 commits
- MYSQL - Change debug to print out with DebugFlags=DB_Step instead of debug4 · 78ae8647
  Danny Auble authored Sep 11, 2015
  
  78ae8647
- Fix slurmdbd backup to use DbdAddr when contacting the primary. · 5beb84db
  Brian Christiansen authored Sep 11, 2015
```
And add missing documentation.
Bug 1921
```
  5beb84db
10 Sep, 2015 5 commits
- Fix gres tracking for multiple steps · af1163a2
  Morris Jette authored Sep 10, 2015
```
GRES were not being properly tracked for multiple simultaneous steps.
A step which could have run later could be rejected as never being
able to run.
Replacement for commit dd842d79, which was reverted in commit 6f73812875c
bug 1925
```
  af1163a2
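
The distinction this fix draws (a step that could run later versus one that can never run) can be sketched as follows. This is a minimal illustration, not Slurm's internal code: `StepDecision`, `decide_step`, and their parameters are hypothetical names, and the rule assumes GRES are simple counts.

```python
from enum import Enum


class StepDecision(Enum):
    RUN = "run"
    HOLD = "hold"      # could run later, once other steps release GRES
    REJECT = "reject"  # can never run within this job allocation


def decide_step(requested: int, job_total: int, in_use_by_steps: int) -> StepDecision:
    """Classify a step's GRES request against the job's allocation.

    All names are hypothetical; this is not Slurm's internal API.
    """
    if requested > job_total:
        # More than the whole job allocation holds: waiting will not help.
        return StepDecision.REJECT
    if requested > job_total - in_use_by_steps:
        # Fits the allocation, but other steps currently hold the GRES.
        return StepDecision.HOLD
    return StepDecision.RUN
```

Under these assumptions, a step asking for 2 GRES in a job holding 4, with 3 in use by other steps, is held rather than rejected.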
- Fix unit conversion bug. · fa90d2c7
  David Bigagli authored Sep 10, 2015
  
  fa90d2c7
- Fix scontrol core dump. · 4e3ff395
  David Bigagli authored Sep 10, 2015
  
  4e3ff395
- Fix issue with GRES in steps so that if you have multiple exclusive steps · dd842d79
  Danny Auble authored Sep 09, 2015
```
and you use all the GRES up, instead of reporting the configuration isn't
available, you hold the requesting step until the GRES is available.
```
  dd842d79
- When requesting GRES in a step check for correct variable for the count. · 23182913
  Brian Gilmer authored Sep 09, 2015
  
  23182913
09 Sep, 2015 6 commits
- Update news · add1fa32
  David Bigagli authored Sep 09, 2015
  
  add1fa32
- don't truncate sview/squeue task ID info · 66f2bbc6
  Morris Jette authored Sep 08, 2015
```
Don't truncate task ID information in "squeue --array/-r" output.
Task ID info in sview also expanded to 64 characters (from ~16 chars).
```
  66f2bbc6
- Fix for burst_buffer/cray to parse type option correctly. · 606f1f1a
  Danny Auble authored Sep 08, 2015
```
Fix for bad cut and paste.  Looks like the original code was just a copy
of the access parsing.
```
  606f1f1a
- Make sure safe_limits was initialized before processing limits in the · 3a75d81c
  Danny Auble authored Sep 08, 2015
```
    slurmctld.
```
  3a75d81c
- improve json installation logic · b95b4e7f
  Morris Jette authored Sep 08, 2015
```
Use a more flexible mechanism to find the json installation.
```
  b95b4e7f
- Fix srun from inheriting the SLURM_CPU_BIND and SLURM_MEM_BIND environment... · 9e84a7da
  Brian Christiansen authored Sep 08, 2015
```
Fix srun from inheriting the SLURM_CPU_BIND and SLURM_MEM_BIND environment variables when running in an existing srun (e.g. an srun within an salloc).

Bug 1888
```
  9e84a7da
08 Sep, 2015 3 commits
- Fix missing else when packing an update partition message · 24f51d67
  Danny Auble authored Sep 08, 2015
  
  24f51d67
- Improve job state reason for required nodes down · 2ae66435
  jette authored Sep 08, 2015
```
Improve job state reason string when required nodes not available.
bug 1920
```
  2ae66435
- Update the slurm.conf man page to better document nohold_on_prolog_fail · 4be84f67
  David Bigagli authored Sep 08, 2015
  
  4be84f67
07 Sep, 2015 1 commit
- Improve pmi2 cleanup logic. · 4ba3d8a6
  David Bigagli authored Sep 07, 2015
  
  4ba3d8a6
03 Sep, 2015 1 commit

burst_buffer/cray bug work-around removed · 1f5c6a08

Morris Jette authored Sep 03, 2015

Remove our workaround for a bug in the Cray API that has recently
  been fixed.
Improve error message for root jobs trying to use DW.

1f5c6a08

02 Sep, 2015 6 commits
- Remove NEWS item for removed work · 6709d10f
  Morris Jette authored Sep 02, 2015
```
This change is already logged in the NEWS for v14.11.10
```
  6709d10f
- If AccountingEnforce=safe is set make sure a job can finish before going · f3d08706
  Danny Auble authored Sep 02, 2015
```
over the limit with grpwall on a QOS or association.
```
  f3d08706
- Print usage for GrpJobs GrpSubmitJobs and GrpWall even if there is no · a027c824
  Danny Auble authored Sep 02, 2015
```
limit.
```
  a027c824
- Add time to the partition QOS the job is running on instead of just the · 9743e05b
  Danny Auble authored Sep 02, 2015
```
job QOS.
```
  9743e05b
- Leave down/drain node unavailable when powered down · 4c3491a4
  Morris Jette authored Sep 01, 2015
```
Previous logic would set the avail_node_bitmap when
a node was powered down, even if the initial state was
DOWN or DRAINED. This made the node available for allocation
to a job, which we don't want until the DOWN or DRAIN
state is cleared.
bug 1893
```
  4c3491a4
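
The behavior described in this commit can be sketched roughly as below. It is only an illustration under stated assumptions: the availability bitmap is modeled as a plain list of booleans, node states as strings, and `mark_available_on_power_down` is a hypothetical name, not a Slurm function.

```python
# States in which a node must stay unavailable, per the commit message.
UNAVAILABLE_STATES = {"DOWN", "DRAINED"}


def mark_available_on_power_down(avail_bitmap: list[bool],
                                 node_index: int,
                                 node_state: str) -> None:
    """Set a node's availability bit on power down unless it is DOWN/DRAINED.

    Hypothetical helper: the bitmap is modeled as a list of booleans.
    """
    if node_state in UNAVAILABLE_STATES:
        # Leave the bit untouched so the node is not offered to jobs
        # until the DOWN or DRAIN state is cleared.
        return
    avail_bitmap[node_index] = True
```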
- Revert power_save patches · df5c3a1f
  Morris Jette authored Sep 01, 2015
```
This reverts commits
7660da9e
5c386455 and
f6c5302b
```
  df5c3a1f
01 Sep, 2015 9 commits
- Fix truncation of job reason in squeue. · 0b9a5d6a
  Brian Christiansen authored Sep 01, 2015
```
Bug 1741
```
  0b9a5d6a
- Put 14.11.10 message in the 15.08.1 NEWS section · 52ce6d55
  Danny Auble authored Sep 01, 2015
  
  52ce6d55
- Move note to correct version. · 7660da9e
  David Bigagli authored Sep 01, 2015
  
  7660da9e
- Put in note for next potential tag · 98e9196f
  Danny Auble authored Sep 01, 2015
  
  98e9196f
- Fix test21.30 and 21.34 to check grpwall better. · 887255fb
  Nathan Yee authored Sep 01, 2015
  
  887255fb
- If a node is down don't set it in power suspend mode. · f6c5302b
  David Bigagli authored Sep 01, 2015
  
  f6c5302b
- Mod files for legitimate tag. · 520b48ac
  Danny Auble authored Aug 31, 2015
  
  520b48ac
- Fix rpmbuild issue on CentOS 7. · e1bf0cae
  Brian Christiansen authored Aug 31, 2015
```
Bug 1896
```
  e1bf0cae
- Change QOS flag name from PartitionQOS to OverPartQOS to be a better · 0f647f14
  Danny Auble authored Aug 31, 2015
```
description.
```
  0f647f14
31 Aug, 2015 2 commits
- Update NEWS · 9ec514aa
  Brian Christiansen authored Aug 31, 2015
  
  9ec514aa
- Fix srun to use the NoInAddrAny TopologyParam option. · a3f55d79
  Brian Christiansen authored Aug 31, 2015
```
Bug 1867
```
  a3f55d79
28 Aug, 2015 1 commit

Requeue job if possible when slurmstepd aborts · d8e6f55d

Morris Jette authored Aug 28, 2015

This problem is reproducible by launching a job then killing the
  slurmstepd process. Under those conditions, requeue the job if
  possible (i.e. batch job with requeue option/configuration).
  This patch also improves the slurmctld logging when this happens.
bug 1889

d8e6f55d

27 Aug, 2015 4 commits

Correct RebootProgram usage · 82068b6b

Morris Jette authored Aug 27, 2015

Correct RebootProgram logic when executed outside of a maintenance
  reservation. Previous logic would mark the node up upon response
  to the reboot RPC (from slurmctld to slurmd) and when the node
  actually rebooted, flag that as an unexpected reboot. This new
  logic checks the node's uptime to not mark the compute node as
  being usable until the reboot actually takes place.
bug 1866

82068b6b
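
One plausible way to use a node's uptime for this check is sketched below. It is an assumption about the general approach, not the actual Slurm logic: `reboot_completed` and its parameters are hypothetical, and times are plain epoch seconds.

```python
def reboot_completed(reboot_requested_at: float,
                     now: float,
                     node_uptime_s: float) -> bool:
    """Return True if the node booted at or after the reboot request.

    Hypothetical helper: the boot time is derived from the reported uptime,
    so a node that merely acknowledged the reboot request (and still has a
    long uptime) is not yet considered usable.
    """
    boot_time = now - node_uptime_s
    return boot_time >= reboot_requested_at
```

With a request at t=1000 and a check at t=1100, an uptime of 50 seconds indicates the reboot happened, while an uptime of 5000 seconds indicates it has not.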

Add new job array env vars · a15b8bd8

Morris Jette authored Aug 27, 2015

Add environment variables SLURM_ARRAY_TASK_MAX, SLURM_ARRAY_TASK_MIN,
SLURM_ARRAY_TASK_STEP for job arrays.
bug 1600

a15b8bd8
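
A job script could use the new variables to reconstruct the set of task IDs in its array, as in the sketch below. It assumes the array was submitted as a simple range with an optional step; `array_task_ids` is a hypothetical helper, and defaulting the step to 1 when the variable is unset is an assumption of this example.

```python
import os
from typing import Mapping


def array_task_ids(env: Mapping[str, str] = os.environ) -> list[int]:
    """List the task IDs of a job array from its environment variables.

    Assumes a simple min-max:step array specification; a comma-separated
    task list would not be reproduced correctly by this sketch.
    """
    first = int(env["SLURM_ARRAY_TASK_MIN"])
    last = int(env["SLURM_ARRAY_TASK_MAX"])
    step = int(env.get("SLURM_ARRAY_TASK_STEP", "1"))  # default is an assumption
    return list(range(first, last + 1, step))
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try outside of a running job.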

- Cleaner copy for PriorityWeightTRES, it also fixes a core dump when trying · 1bf2d7b9
  Danny Auble authored Aug 27, 2015
```
to free it otherwise.
```
  1bf2d7b9
- Fix handling of requeued jobs with steps that are still finishing. · d049e065
  Brian Christiansen authored Aug 26, 2015
```
Bug 1826
```
  d049e065