Commits · c9673edc149d78fc985e718c91cd85512963b3f1 · Manuel G. Marciani / ces_slurm_simulator

02 Sep, 2015 6 commits
- Remove NEWS item for removed work · 6709d10f
  Morris Jette authored Sep 02, 2015
```
This change is already logged in the NEWS for v14.11.10
```
  6709d10f
- If AccountingEnforce=safe is set make sure a job can finish before going · f3d08706
  Danny Auble authored Sep 02, 2015
```
over the limit with grpwall on a QOS or association.
```
  f3d08706
- Print usage for GrpJobs GrpSubmitJobs and GrpWall even if there is no · a027c824
  Danny Auble authored Sep 02, 2015
```
limit.
```
  a027c824
- Add time to the partition QOS the job is running on instead of just the · 9743e05b
  Danny Auble authored Sep 02, 2015
```
job QOS.
```
  9743e05b
- Leave down/drain node unavailable when powered down · 4c3491a4
  Morris Jette authored Sep 01, 2015
```
Previous logic would set the avail_node_bitmap when
a node was powered down, even if the initial state was
DOWN or DRAINED. This made the node available for allocation
to a job, which we don't want until the DOWN or DRAIN
state is cleared.
bug 1893
```
  4c3491a4
- Revert power_save patches · df5c3a1f
  Morris Jette authored Sep 01, 2015
```
This reverts commits
7660da9e
5c386455 and
f6c5302b
```
  df5c3a1f
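
The availability rule described in 4c3491a4 can be sketched as follows. This is only an illustration of the stated behavior, not Slurm's C implementation: the `Node` class and `available_after_power_down` function are hypothetical names, and the real code operates on `avail_node_bitmap` rather than on per-node objects.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a node record (not a Slurm structure)."""
    name: str
    down: bool = False
    drained: bool = False


def available_after_power_down(node: Node) -> bool:
    """Return whether a powered-down node may be offered to jobs.

    Per the commit message, a node whose state is DOWN or DRAINED must
    stay unavailable until that state is cleared, even after power down.
    """
    return not (node.down or node.drained)


if __name__ == "__main__":
    for n in (Node("n1"), Node("n2", down=True), Node("n3", drained=True)):
        print(n.name, available_after_power_down(n))
```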
01 Sep, 2015 9 commits
- Fix truncation of job reason in squeue. · 0b9a5d6a
  Brian Christiansen authored Sep 01, 2015
```
Bug 1741
```
  0b9a5d6a
- Put 14.11.10 message in the 15.08.1 NEWS section · 52ce6d55
  Danny Auble authored Sep 01, 2015
  
  52ce6d55
- Move note to correct version. · 7660da9e
  David Bigagli authored Sep 01, 2015
  
  7660da9e
- Put in note for next potential tag · 98e9196f
  Danny Auble authored Sep 01, 2015
  
  98e9196f
- Fix test21.30 and 21.34 to check grpwall better. · 887255fb
  Nathan Yee authored Sep 01, 2015
  
  887255fb
- If a node is down don't set it in power suspend mode. · f6c5302b
  David Bigagli authored Sep 01, 2015
  
  f6c5302b
- Mod files for legitimate tag. · 520b48ac
  Danny Auble authored Aug 31, 2015
  
  520b48ac
- Fix rpmbuild issue on CentOS 7. · e1bf0cae
  Brian Christiansen authored Aug 31, 2015
```
Bug 1896
```
  e1bf0cae
- Change QOS flag name from PartitionQOS to OverPartQOS to be a better · 0f647f14
  Danny Auble authored Aug 31, 2015
```
description.
```
  0f647f14
31 Aug, 2015 2 commits
- Update NEWS · 9ec514aa
  Brian Christiansen authored Aug 31, 2015
  
  9ec514aa
- Fix srun to use the NoInAddrAny TopologyParam option. · a3f55d79
  Brian Christiansen authored Aug 31, 2015
```
Bug 1867
```
  a3f55d79
28 Aug, 2015 1 commit

Requeue job if possible when slurmstepd aborts · d8e6f55d

Morris Jette authored Aug 28, 2015

This problem is reproducible by launching a job then killing the
slurmstepd process. Under those conditions, requeue the job if
possible (i.e. batch job with requeue option/configuration).
This patch also improves the slurmctld logging when this happens.
bug 1889

d8e6f55d

27 Aug, 2015 5 commits

Correct RebootProgram usage · 82068b6b

Morris Jette authored Aug 27, 2015

Correct RebootProgram logic when executed outside of a maintenance
reservation. Previous logic would mark the node up upon response
to the reboot RPC (from slurmctld to slurmd) and when the node
actually rebooted, flag that as an unexpected reboot. This new
logic checks the node's up time to not mark the compute node as
being usable until the reboot actually takes place.
bug 1866

82068b6b
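
The up-time check described in 82068b6b can be illustrated with a small sketch. The function name, its parameters, and the exact comparison are assumptions made for illustration; the commit message only says that the node's up time is consulted before the node is marked usable.

```python
def reboot_completed(reboot_requested_at: float,
                     now: float,
                     node_uptime_s: float) -> bool:
    """Return True if the node booted after the reboot was requested.

    All arguments are in seconds; the two timestamps share one clock.
    The boot time is derived from the up time the node reports.
    """
    boot_time = now - node_uptime_s
    return boot_time >= reboot_requested_at


if __name__ == "__main__":
    # Node has been up for 5000 s, reboot requested 10 s ago: not rebooted yet.
    print(reboot_completed(1000.0, 1010.0, 5000.0))
    # Node has been up for 30 s, reboot requested 100 s ago: reboot happened.
    print(reboot_completed(1000.0, 1100.0, 30.0))
```

Under this reading, an acknowledgement of the reboot RPC alone would not be enough to mark the node up; only an up time shorter than the time since the request would be.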

Add new job array env vars · a15b8bd8

Morris Jette authored Aug 27, 2015

Add environment variables SLURM_ARRAY_TASK_MAX, SLURM_ARRAY_TASK_MIN,
SLURM_ARRAY_TASK_STEP for job arrays.
bug 1600

a15b8bd8
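
A job script could use the variables added in a15b8bd8 roughly as sketched below. The sketch assumes the three variables hold decimal integers and that the array was submitted as a simple range with a step; the helper name `array_task_ids` is hypothetical. For arrays given as an explicit list of indices, enumerating from min to max would not match the real task set.

```python
import os


def array_task_ids(env=os.environ):
    """List the task IDs implied by the job array min/max/step variables.

    Assumes a range-style array specification; the step defaults to 1
    when SLURM_ARRAY_TASK_STEP is not set.
    """
    lo = int(env["SLURM_ARRAY_TASK_MIN"])
    hi = int(env["SLURM_ARRAY_TASK_MAX"])
    step = int(env.get("SLURM_ARRAY_TASK_STEP", "1"))
    return list(range(lo, hi + 1, step))


if __name__ == "__main__":
    # Example values supplied by hand; inside a job they come from Slurm.
    example = {
        "SLURM_ARRAY_TASK_MIN": "0",
        "SLURM_ARRAY_TASK_MAX": "10",
        "SLURM_ARRAY_TASK_STEP": "5",
    }
    print(array_task_ids(example))
```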

Cleaner copy for PriorityWeightTRES, it also fixes a core dump when trying · 1bf2d7b9
Danny Auble authored Aug 27, 2015
```
to free it otherwise.
```
1bf2d7b9
Fix handling of requeued jobs with steps that are still finishing. · d049e065
Brian Christiansen authored Aug 26, 2015
```
Bug 1826
```
d049e065
Fix some potential deadlock issues when state files don't exist in the · f6bc60cc
Danny Auble authored Aug 26, 2015
```
association manager.
```
f6bc60cc

26 Aug, 2015 6 commits

Make MaxTRESPerUser work in sacctmgr. · 220f48d5
Danny Auble authored Aug 26, 2015

220f48d5

Prevent wrong job array task ID from being shown · 1d2545ca

Morris Jette authored Aug 26, 2015

Prevent job array task ID from being reported as NO_VAL if last task in the
array gets requeued. The problem is that when that task starts, the task
bitmap entry for it stays set, but the task counter gets decremented.
If that job then gets requeued, under some conditions a failure to schedule
it results in the array_task_id in the job record getting set to NO_VAL.
Then when building the job info to report for squeue/scontrol, the string
showing the pending task IDs is not rebuilt due to that counter being
zero. All indications are that the job runs fine, only the information
reported to squeue/scontrol is wrong.
bug 1790

1d2545ca

ALPS - Make it so srun --hint=nomultithread works correctly. · 9fc01907
Danny Auble authored Aug 25, 2015

9fc01907
MySQL - Fix minor memory leak if a connection ever goes away whilst using it. · c3d88da8
Danny Auble authored Aug 25, 2015

c3d88da8
Add AccountingStorageTRES to scontrol show config · 7e6cba40
Danny Auble authored Aug 25, 2015

7e6cba40
When restarting or reconfiguring the slurmctld, if job is completing handle · 52b09048
Danny Auble authored Aug 25, 2015
```
accounting correctly to avoid meaningless errors about overflow.
```
52b09048

25 Aug, 2015 4 commits
- Update 15.08.0rc2 NEWS with 14.11.9 fixes. · 548789a3
  Brian Christiansen authored Aug 25, 2015
  
  548789a3
- Fix testing for CR_Memory when CR_Memory and CR_ONE_TASK_PER_CORE are used with select/linear. · 3227b364
  Brian Christiansen authored Aug 25, 2015
```
Bug 1873
```
  3227b364
- Fix for message aggregation return rpcs where none of the messages are · ba990ddd
  Danny Auble authored Aug 25, 2015
```
intended for the head of the tree.
```
  ba990ddd
- ALPS - Fix compile to not link against -ljob and -lexpat with every lib · 92b6e921
  Danny Auble authored Aug 24, 2015
```
or binary.
```
  92b6e921
24 Aug, 2015 1 commit
- Fix issue with frontend systems (outside ALPS or BlueGene) where srun · 6533528f
  Danny Auble authored Aug 24, 2015
```
wouldn't get the correct protocol version to launch a step.
```
  6533528f
22 Aug, 2015 1 commit
- Update NEWS/RELEASE_NOTES · 1ed53007
  Morris Jette authored Aug 21, 2015
  
  1ed53007
21 Aug, 2015 5 commits
- Fix segfault in sreport when there was no response from the dbd. · 307bd82d
  Brian Christiansen authored Aug 21, 2015
```
Bug 1831
```
  307bd82d
- Sort job arrays in job queue according to array_task_id when priorities are equal. · 39ea3440
  Brian Christiansen authored Aug 21, 2015
```
Bug 1869
```
  39ea3440
- Fix a bug in squeue which prevented squeue -tPD from printing array jobs. · 703dd6a6
  David Bigagli authored Aug 21, 2015
  
  703dd6a6
- Fix preemption issue that could kill job at startup · 47f96cae
  Morris Jette authored Aug 20, 2015
```
Fix gang scheduling/preemption issue that could cancel job at startup.
I have not been able to reproduce the reported problem, but this
should prevent the reported problem.
bug 1880
```
  47f96cae
- Fix a couple of typos in NEWS · ce20585a
  Morris Jette authored Aug 20, 2015
  
  ce20585a
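
The ordering described in 39ea3440 (tie-breaking on `array_task_id` when priorities are equal) can be sketched as a sort key. The `QueuedJob` type and `queue_order` function are hypothetical, and two details are assumptions: that higher priority sorts first, and that the tie-break is ascending by task ID. The commit message states only that `array_task_id` is used.

```python
from typing import NamedTuple


class QueuedJob(NamedTuple):
    """Hypothetical queue entry; not a Slurm data structure."""
    job_id: int
    priority: int
    array_task_id: int


def queue_order(jobs):
    """Sort by priority (highest first), then by array_task_id (lowest first)."""
    return sorted(jobs, key=lambda j: (-j.priority, j.array_task_id))


if __name__ == "__main__":
    jobs = [
        QueuedJob(job_id=103, priority=10, array_task_id=3),
        QueuedJob(job_id=200, priority=20, array_task_id=0),
        QueuedJob(job_id=101, priority=10, array_task_id=1),
    ]
    for job in queue_order(jobs):
        print(job)
```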