- 07 Oct, 2015 6 commits
-
-
Morris Jette authored
-
Morris Jette authored
bug 2013
-
David Bigagli authored
-
Hongjia Cao authored
-
David Bigagli authored
-
Hongjia Cao authored
-
- 06 Oct, 2015 13 commits
-
-
Morris Jette authored
Create a "task" cgroup at job allocation time via the prolog container. A dummy "sleep" process will occupy the cgroup for as long as the job exists. bug 1994
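The mechanism described above can be sketched in shell (a hedged illustration, not Slurm's actual prolog code; `CGROUP_BASE` here points at a scratch directory so the sketch runs without root, whereas a real node would use a path under /sys/fs/cgroup):

```shell
# Keep a job's "task" cgroup alive by parking a dummy sleep process in it.
CGROUP_BASE="${CGROUP_BASE:-$(mktemp -d)}"
JOB_ID=1234                               # hypothetical job id
TASK_CG="$CGROUP_BASE/job_$JOB_ID/task_special"
mkdir -p "$TASK_CG"

# Launch a long-running sleep and record its PID in the cgroup's procs
# file; while this process exists, the cgroup cannot be removed.
sleep 1000 &
DUMMY_PID=$!
echo "$DUMMY_PID" > "$TASK_CG/cgroup.procs"

kill "$DUMMY_PID"   # cleanup for this sketch only; the real dummy
                    # process would persist for the job's lifetime
```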
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
requirements.
-
Axel Auweter authored
Add acct_gather_energy/ibmaem plugin for systems with IBM Systems Director Active Energy Manager.
-
Morris Jette authored
Conflicts: src/slurmctld/job_mgr.c
-
Thomas Cadeau authored
bug 2011
-
jette authored
It would not cause any problem other than excess memory being allocated, but it was found by Clang.
-
Danny Auble authored
','.
-
Morris Jette authored
Conflicts: src/common/proc_args.c
-
Morris Jette authored
bug 1999
-
- 05 Oct, 2015 4 commits
-
-
Morris Jette authored
A configuration of "DefMemPerNode=UNLIMITED" prevented more than one job from running at a time on a given node, which broke some tests. These changes prevent the tests from breaking.
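For context, a minimal slurm.conf fragment with the setting in question (illustrative only; the surrounding configuration is omitted):

```
# Illustrative slurm.conf excerpt: this default previously limited each
# node to one running job at a time due to the bug described above.
DefMemPerNode=UNLIMITED
```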
-
david authored
-
jette authored
-
jette authored
-
- 03 Oct, 2015 2 commits
-
-
Morris Jette authored
Conflicts: NEWS
-
Morris Jette authored
Don't requeue RPCs going out from slurmctld to DOWN nodes (doing so can generate repeating communication errors). bug 2002
-
- 02 Oct, 2015 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
This will only happen if a PING RPC for the node is already queued when the decision is made to power it down, then fails to get a response for the ping (since the node is already down). bug 1995
-
Morris Jette authored
If a job's CPUs/task ratio is increased due to the configured MaxMemPerCPU, then increase its allocated CPU count in order to enforce CPU limits. Previous logic would increase/set cpus_per_task as needed if a job's --mem-per-cpu was above the configured MaxMemPerCPU, but NOT increase the min_cpus or max_cpus variables. This resulted in allocating the wrong CPU count.
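The adjustment described above reduces to simple arithmetic; a hedged sketch follows (a simplification for illustration, not Slurm's actual code; the function and parameter names are hypothetical):

```python
import math

def adjust_for_max_mem_per_cpu(mem_per_cpu, max_mem_per_cpu,
                               cpus_per_task, min_cpus):
    """If the requested --mem-per-cpu exceeds MaxMemPerCPU, spread the
    memory over more CPUs: raise cpus_per_task AND scale min_cpus,
    which is the step the previous logic missed."""
    if mem_per_cpu <= max_mem_per_cpu:
        return mem_per_cpu, cpus_per_task, min_cpus
    ratio = math.ceil(mem_per_cpu / max_mem_per_cpu)
    return (math.ceil(mem_per_cpu / ratio),   # per-CPU memory now within limit
            cpus_per_task * ratio,            # old logic stopped here
            min_cpus * ratio)                 # the fix: scale min_cpus too

# e.g. --mem-per-cpu=8000 with MaxMemPerCPU=4000 doubles the CPU counts
print(adjust_for_max_mem_per_cpu(8000, 4000, 1, 2))  # (4000, 2, 4)
```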
-
Morris Jette authored
This will only happen if a PING RPC for the node is already queued when the decision is made to power it down, then fails to get a response for the ping (since the node is already down). bug 1995
-
- 01 Oct, 2015 2 commits
-
-
Danny Auble authored
values.
-
Morris Jette authored
This required a fairly major re-write of the select plugin logic. bug 1975
-
- 30 Sep, 2015 6 commits
-
-
Morris Jette authored
Correct some cgroup paths ("step_batch" vs. "step_4294967294", "step_exter" vs. "step_extern", and "step_extern" vs. "step_4294967295").
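The path fix above hinges on mapping Slurm's sentinel step IDs to readable names; a small sketch of the intended mapping (the numeric values come from the message itself, but the constant and function names here are illustrative assumptions):

```python
# Sentinel step IDs implied by the message above: 4294967294 for the
# batch step, 4294967295 for the extern step.
BATCH_STEP_ID = 4294967294
EXTERN_STEP_ID = 4294967295

def step_cgroup_name(step_id):
    """Return the human-readable cgroup directory name for a step."""
    if step_id == BATCH_STEP_ID:
        return "step_batch"
    if step_id == EXTERN_STEP_ID:
        return "step_extern"
    return "step_%u" % step_id

print(step_cgroup_name(4294967294))  # step_batch
```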
-
Morris Jette authored
Document that if a job's memory per CPU limit exceeds the system limit, the job's memory limit is decreased and its CPU count is increased automatically.
-
Morris Jette authored
If a job's CPUs/task ratio is increased due to the configured MaxMemPerCPU, then increase its allocated CPU count in order to enforce CPU limits. Previous logic would increase/set cpus_per_task as needed if a job's --mem-per-cpu was above the configured MaxMemPerCPU, but NOT increase the min_cpus or max_cpus variables. This resulted in allocating the wrong CPU count.
-
Brian Christiansen authored
Conflicts: NEWS src/slurmctld/job_mgr.c src/srun/libsrun/launch.c
-
Brian Christiansen authored
Continuation of 1252d1a1 Bug 1938
-
Morris Jette authored
Requeue/hold the batch job launch request if the job is already running. This is possible if a node went to the DOWN state, but its jobs remained active. In addition, if a prolog/epilog failed, DRAIN the node rather than setting it DOWN, which could kill jobs that could otherwise continue to run. bug 1985
-
- 29 Sep, 2015 3 commits
-
-
Morris Jette authored
This makes srun more consistent with salloc and sbatch.
-
Morris Jette authored
Previous logic would not report the termination signal, only the exit code, which could be meaningless.
-
Brian Christiansen authored
Bug 1938
-