- 15 Jun, 2015 6 commits
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
The logic assumed the reservation had a node bitmap, which was used to check for overlapping jobs. If there is no node bitmap (e.g. a licenses-only reservation), an abort would result.
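A minimal sketch of the kind of guard involved, using made-up structure and function names rather than the actual Slurm reservation code:

    /* Sketch only: treat a missing node bitmap (e.g. a licenses-only
     * reservation) as "no overlap" instead of dereferencing NULL. */
    #include <stdbool.h>
    #include <stdio.h>

    #define NODE_CNT 8

    struct node_bitmap { bool bits[NODE_CNT]; };
    struct resv_rec { struct node_bitmap *node_bitmap; };  /* NULL => licenses-only */
    struct job_rec  { struct node_bitmap *node_bitmap; };

    /* True only if both records have bitmaps and they share at least one node. */
    static bool resv_overlaps_job(const struct resv_rec *resv,
                                  const struct job_rec *job)
    {
        if (!resv->node_bitmap || !job->node_bitmap)
            return false;               /* nothing to compare, no abort */
        for (int i = 0; i < NODE_CNT; i++) {
            if (resv->node_bitmap->bits[i] && job->node_bitmap->bits[i])
                return true;
        }
        return false;
    }

    int main(void)
    {
        struct node_bitmap job_nodes = { .bits = { true } };
        struct resv_rec licenses_only = { .node_bitmap = NULL };
        struct job_rec job = { .node_bitmap = &job_nodes };

        printf("overlap: %d\n", resv_overlaps_job(&licenses_only, &job)); /* overlap: 0 */
        return 0;
    }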
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 12 Jun, 2015 4 commits
-
Brian Christiansen authored
Bug 1739
-
Brian Christiansen authored
Bug 1743
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 1743
-
- 11 Jun, 2015 9 commits
-
Brian Christiansen authored
Prevent double free.
-
Brian Christiansen authored
cpufreq variables weren't being initialized to NO_VAL when using the task/none plugin. As a result, the conditions in cpu_freq_reset did not stop test_cpu_owner_lock from being called.
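A self-contained sketch of the initialization pattern being described; the names and the NO_VAL value here are illustrative, not the actual Slurm code:

    /* Sketch: frequency fields start at a sentinel (NO_VAL) so the reset
     * routine can tell "never set" from "set", and skip its locking path
     * entirely when nothing was ever set. */
    #include <stdint.h>
    #include <stdio.h>

    #define NO_VAL 0xfffffffeU          /* "unset" sentinel (illustrative) */

    struct cpu_freq_data {
        uint32_t cpu_freq_min;
        uint32_t cpu_freq_max;
        uint32_t cpu_freq_gov;
    };

    /* A task plugin that never touches CPU frequency must still leave the
     * fields at NO_VAL; otherwise the reset path thinks there is state to
     * restore and takes the lock. */
    static void cpu_freq_data_init(struct cpu_freq_data *d)
    {
        d->cpu_freq_min = NO_VAL;
        d->cpu_freq_max = NO_VAL;
        d->cpu_freq_gov = NO_VAL;
    }

    static void cpu_freq_data_reset(const struct cpu_freq_data *d)
    {
        if (d->cpu_freq_min == NO_VAL &&
            d->cpu_freq_max == NO_VAL &&
            d->cpu_freq_gov == NO_VAL)
            return;                     /* nothing was set: skip lock/reset */
        printf("restoring CPU frequency settings\n");
    }

    int main(void)
    {
        struct cpu_freq_data d;
        cpu_freq_data_init(&d);
        cpu_freq_data_reset(&d);        /* no-op: all fields still NO_VAL */
        return 0;
    }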
-
Brian Christiansen authored
Conflicts: src/common/cpu_frequency.c
-
Brian Christiansen authored
Conflicts: src/common/cpu_frequency.c
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 1733
-
jette authored
-
Didier GAZEN authored
In your node_mgr fix to keep rebooted nodes down (commit 9cd15dfe), you forgot to consider the case of nodes that are powered up but respond only after ResumeTimeout seconds (the maximum time permitted). Such nodes are marked DOWN (because they did not respond within ResumeTimeout seconds) but should become silently available when ReturnToService=1, as stated in the slurm.conf manual. With your modification, when such nodes finally respond they are treated as rebooted nodes and remain in the DOWN state (with the new reason "Node unexpectedly rebooted") even when ReturnToService=1. Correction of commit 3c2b46af.
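A simplified sketch of the behaviour being described, with made-up field names rather than the real Slurm node_mgr structures: a newer boot time is only "unexpected" when the controller was not powering the node up itself, and a node that was DOWN merely for missing ResumeTimeout returns to service once it registers and ReturnToService=1.

    /* Sketch only: not the actual Slurm node_mgr code. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <time.h>

    struct node_rec {
        bool   down;                /* node currently marked DOWN */
        bool   was_powering_up;     /* controller was resuming/powering it up */
        bool   resume_timeout_hit;  /* DOWN only for missing ResumeTimeout */
        time_t boot_time;           /* boot time reported at this registration */
        time_t last_boot_time;      /* boot time seen at the previous one */
    };

    /* Called when a registration finally arrives from the node. */
    static void node_did_respond(struct node_rec *node, int return_to_service)
    {
        bool boot_time_changed = node->last_boot_time &&
                                 node->boot_time > node->last_boot_time;
        /* A newer boot time is only "unexpected" if we were not powering
         * the node up ourselves. */
        bool unexpected_reboot = boot_time_changed && !node->was_powering_up;

        if (unexpected_reboot) {
            node->down = true;      /* keep DOWN: "Node unexpectedly rebooted" */
            printf("node kept DOWN: unexpectedly rebooted\n");
        } else if (node->down && node->resume_timeout_hit && return_to_service) {
            node->down = false;     /* silently return to service */
            printf("node returned to service\n");
        }
        node->last_boot_time = node->boot_time;
        node->was_powering_up = false;
    }

    int main(void)
    {
        /* Node that booted during resume but responded after ResumeTimeout. */
        struct node_rec slow = { .down = true, .was_powering_up = true,
                                 .resume_timeout_hit = true,
                                 .boot_time = 200, .last_boot_time = 100 };
        node_did_respond(&slow, 1); /* ReturnToService=1 => back in service */
        return 0;
    }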
-
Didier GAZEN authored
-
- 10 Jun, 2015 9 commits
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
It was always failing when a node list was supplied on job submission.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Didier GAZEN authored
In your node_mgr fix to keep rebooted nodes down (commit 9cd15dfe), you forgot to consider the case of nodes that are powered up but respond only after ResumeTimeout seconds (the maximum time permitted). Such nodes are marked DOWN (because they did not respond within ResumeTimeout seconds) but should become silently available when ReturnToService=1, as stated in the slurm.conf manual. With your modification, when such nodes finally respond they are treated as rebooted nodes and remain in the DOWN state (with the new reason "Node unexpectedly rebooted") even when ReturnToService=1. My patch obtains the correct behaviour.
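For reference, an illustrative slurm.conf excerpt showing the two parameters involved (example values only; this is not the patch itself):

    # slurm.conf excerpt (example values)
    ReturnToService=1    # a DOWN node becomes available again once it registers with a valid configuration
    ResumeTimeout=600    # maximum seconds allowed between a resume request and the node becoming usable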
-
Morris Jette authored
Conflicts: doc/man/man5/slurm.conf.5 src/plugins/select/cons_res/job_test.c
-
Morris Jette authored
Equivalent fix as e1a00772 for select/serial rather than select/cons_res
-
- 09 Jun, 2015 12 commits
-
David Bigagli authored
-
Morris Jette authored
1. I submit a first job that uses 1 GPU:
   $ srun --gres gpu:1 --pty bash
   $ echo $CUDA_VISIBLE_DEVICES
   0
2. While the first one is still running, a 2-GPU job asking for 1 task per node waits (and I don't really understand why):
   $ srun --ntasks-per-node=1 --gres=gpu:2 --pty bash
   srun: job 2390816 queued and waiting for resources
3. Whereas a 2-GPU job requesting 1 core per socket (so just 1 socket) actually gets GPUs allocated from two different sockets!
   $ srun -n 1 --cores-per-socket=1 --gres=gpu:2 -p testk --pty bash
   $ echo $CUDA_VISIBLE_DEVICES
   1,2
With this change, #2 works the same way as #3. Bug 1725
-
Morris Jette authored
-
Brian Christiansen authored
Bug 1572
-
Brian Christiansen authored
Bug 1572
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
David Bigagli authored
option.
-
Brian Christiansen authored
-
Morris Jette authored
-
Morris Jette authored
Modify test to work if "." is not in search path.
Fix error message: change "sbatch" to "salloc".
-