Commits · 6c02e63ebfb03d3e7cefbf9e3ee5a6b965155886 · Manuel G. Marciani / ces_slurm_simulator

08 Oct, 2015 6 commits

Merge branch 'slurm-15.08' · 6c02e63e
Morris Jette authored Oct 08, 2015

Rename a test file for better clarity · 7198e637
Morris Jette authored Oct 08, 2015


Fix case where the primary and backup dbds would both be performing rollup. · b2eb504b

Brian Christiansen authored Oct 07, 2015

If the backup dbd happened to be doing a rollup when the primary resumed,
both the primary and the backup would be doing rollups, causing contention on
the database tables. The backup would wait for the rollup handler to finish
before giving up control.

The fix is to cancel the rollup_handler and let the backup begin to shut down so
that it will close any existing connections and then re-exec itself. The re-exec
helps because the rollup handler spawns a thread for each cluster to roll up, and
just cancelling the rollup handler doesn't cancel the threads it spawned.
The re-exec cleans up the dbd and locks. It only happens in
the backup if the primary resumed while a rollup was happening.

Bug 1988
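
The reason a re-exec clears out leftover threads can be illustrated outside of Slurm. The following is a minimal Python sketch, not the slurmdbd code: it assumes a POSIX system, and the three sleeping threads are hypothetical stand-ins for the per-cluster rollup threads. A child process starts the threads, then replaces itself with `os.execv`, and the fresh process image reports how many threads it has.

```python
import subprocess
import sys
import textwrap

# Child program: start background threads, then re-exec into a fresh
# interpreter that reports how many threads exist after the exec.
REEXEC_CHILD = textwrap.dedent("""
    import os, sys, threading, time

    # Hypothetical stand-ins for the per-cluster rollup threads.
    for _ in range(3):
        threading.Thread(target=time.sleep, args=(60,), daemon=True).start()

    sys.stdout.write("before exec: %d threads\\n" % threading.active_count())
    sys.stdout.flush()

    # exec replaces the process image; the worker threads do not survive.
    os.execv(sys.executable, [
        sys.executable, "-c",
        "import threading; print('after exec: %d threads' % threading.active_count())",
    ])
""")


def run_reexec_demo():
    """Run the child and return its output lines."""
    result = subprocess.run([sys.executable, "-c", REEXEC_CHILD],
                            capture_output=True, text=True, timeout=30)
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_reexec_demo():
        print(line)
```

On a POSIX system the child should report four threads before the exec (the main thread plus three workers) and one afterwards, without any of the workers having been cancelled individually.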


Fix case where if the backup slurmdbd has existing connections when it gives... · 44bb06bc

Brian Christiansen authored Oct 07, 2015

Fix case where, if the backup slurmdbd has existing connections when it gives up control, it would be killed.

If the backup had existing connections when giving up control, it would try to
signal the existing threads by using pthread_kill to send SIGKILL to the
threads. The problem is that SIGKILL doesn't go to the thread but to the main
process, so the backup dbd would be killed.
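
The behaviour described above can be reproduced in isolation. This is a small Python sketch under stated assumptions, not the slurmdbd code: it assumes a POSIX system and uses Python's `signal.pthread_kill`, with a sleeping thread as a hypothetical stand-in for a connection thread. The child aims SIGKILL at a single thread, and the parent then inspects how the child ended.

```python
import subprocess
import sys
import textwrap

# Child program: send SIGKILL to one worker thread via pthread_kill.
SIGKILL_CHILD = textwrap.dedent("""
    import signal, threading, time

    # Hypothetical stand-in for a thread serving an existing connection.
    worker = threading.Thread(target=time.sleep, args=(60,), daemon=True)
    worker.start()

    # Aimed at one thread, but SIGKILL cannot be caught or handled
    # per thread: the whole process is terminated.
    signal.pthread_kill(worker.ident, signal.SIGKILL)
    time.sleep(5)
    print("still alive")  # not expected to be reached
""")


def run_sigkill_demo():
    """Run the child and return (return code, captured stdout)."""
    result = subprocess.run([sys.executable, "-c", SIGKILL_CHILD],
                            capture_output=True, text=True, timeout=30)
    return result.returncode, result.stdout


if __name__ == "__main__":
    print(run_sigkill_demo())
```

A negative return code from `subprocess.run` means the child was terminated by that signal number, so a result of `-9` with empty output indicates that the entire process, not just the targeted thread, was killed.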


Fixed slurmctld not sending cold-start messages correctly to the database · 4ed2f8c6
Danny Auble authored Oct 07, 2015
```
when a cold-start (-c) happens to the slurmctld.
```

Remove SICP job option · 0f6bf406

Morris Jette authored Oct 07, 2015

This was intended as a step toward managing jobs across multiple
  clusters, but we will be pursuing a very different design.


07 Oct, 2015 21 commits
- Merge remote-tracking branch 'origin/slurm-14.11' into slurm-15.08 · 2dcc2732
  Danny Auble authored Oct 07, 2015
```
Conflicts:
	src/sacct/options.c
```
- Fix sacct -j, (nothing but a comma) to not return all jobs. · d5979ef6
  Danny Auble authored Oct 07, 2015
  
- Merge remote-tracking branch 'origin/slurm-15.08' · 6c48f39f
  Danny Auble authored Oct 07, 2015
  
- sacctmgr - Don't allow default account associations to be removed · 9f602cba
  Danny Auble authored Oct 07, 2015
```
from a user.

This would cause the slurmctld to cache the old default which wasn't valid
and cause the user to have to request the association always.
```
- Merge remote-tracking branch 'origin/slurm-14.11' into slurm-15.08 · f5d6b175
  Danny Auble authored Oct 07, 2015
```
Conflicts:
	NEWS
	src/plugins/accounting_storage/mysql/as_mysql_job.c
```
- Merge branch 'slurm-15.08' · fc0e12c3
  Morris Jette authored Oct 07, 2015
  
- Document sbatch cpu/mem binding env vars · 8e949f72
  Morris Jette authored Oct 07, 2015
```
bug 2009
```
- Merge branch 'slurm-15.08' · b9fcd7b3
  Morris Jette authored Oct 07, 2015
  
- Correct plane distribution test · a254c6a5
  Morris Jette authored Oct 07, 2015
```
A node could have fewer tasks allocated than the plane
  size, which broke the test. The plane size needs to be treated
  as a maximum consecutive rank value.
```
- Update documentation for who can set job priority · 0d3ecfe3
  Thomas Cadeau authored Oct 07, 2015
  
- Allow admin and operator to set job priority at submission · 2dab024d
  Morris Jette authored Oct 07, 2015
  
- Do not send burst buffer stage out email unless the job uses burst buffers · 3a63b4e0
  Morris Jette authored Oct 07, 2015
```
bug 2013
```
- Update NEWS. · 075668ae
  David Bigagli authored Oct 07, 2015
  
- Fix slurmctld allowing root to see job steps using squeue -s. · 1026d698
  Hongjia Cao authored Oct 07, 2015
  
- Update NEWS · 170d17d7
  David Bigagli authored Oct 07, 2015
  
- Fix srun core dump. · 30a5d677
  Hongjia Cao authored Oct 07, 2015
  
- Add missing files. · 4612dabb
  David Bigagli authored Oct 07, 2015
  
- Run autogen for the PMIX plugin. · 49cc4c2d
  David Bigagli authored Oct 07, 2015
  
- Fix compilation error. · bf95ca2f
  David Bigagli authored Oct 07, 2015
  
- Introduce the PMIX plugin for high performance MPI startup. · 3089921a
  Artem Polyakov authored Oct 07, 2015
  
- Fix issue with sacct printing 0_0 for arrays that had finished in the · 75ea13a3
  Danny Auble authored Oct 06, 2015
```
database but the start record hadn't made it yet.
```
06 Oct, 2015 13 commits
- Merge branch 'slurm-15.08' · 2cf413c9
  Morris Jette authored Oct 06, 2015
```
Conflicts:
	NEWS
	configure
```
- Add acct_gather_energy/ibmaem plugin · 14be4f65
  Axel Auweter authored Oct 06, 2015
```
Add acct_gather_energy/ibmaem plugin for systems with IBM Systems Director
    Active Energy Manager.
```
- Fix for prolog container cgroup · 80dcbf7e
  Morris Jette authored Oct 06, 2015
```
Create a "task" cgroup at job allocation time via the prolog container.
  A dummy "sleep" process will occupy the cgroup so long as the job exists.
bug 1994
```
- Cosmetic changes, no logic changes · 619ec0f1
  Morris Jette authored Oct 06, 2015
  
- Cosmetic changes, no logic changes · c4451a1f
  Morris Jette authored Oct 06, 2015
  
- Make debug print out correctly · ae323e27
  Danny Auble authored Oct 06, 2015
  
- MySQL - Improve the code for asking for jobs in a suspended state. · f0f3dfdb
  Danny Auble authored Oct 06, 2015
  
- Fix spec file to look for mariadb or mysql devel packages for build · 42e22f03
  Danny Auble authored Oct 06, 2015
```
requirements.
```
- Add acct_gather_energy/ibmaem plugin · 8937f58a
  Axel Auweter authored Oct 06, 2015
```
Add acct_gather_energy/ibmaem plugin for systems with IBM Systems Director
    Active Energy Manager.
```
- Merge branch 'slurm-15.08' · 5005f4f0
  Morris Jette authored Oct 06, 2015
  
- Merge branch 'slurm-14.11' into slurm-15.08 · dd13c747
  Morris Jette authored Oct 06, 2015
```
Conflicts:
	src/slurmctld/job_mgr.c
```
- Permit job_submit plugin to set a job's priority · 3b5f13fa
  Thomas Cadeau authored Oct 06, 2015
```
bug 2011
```
- Fix for use of uninitialized variable · da84d1d7
  jette authored Oct 05, 2015
```
It would not cause any problem other than excess memory being
allocated, but was found by CLANG.
```