- 21 May, 2014 6 commits
-
Danny Auble authored
-
Danny Auble authored
wait for.
-
Danny Auble authored
based on the mask given.
-
Danny Auble authored
task/affinity.
-
Danny Auble authored
thread in a core.
-
Danny Auble authored
it can bind cyclically across sockets.
-
- 20 May, 2014 7 commits
-
Morris Jette authored
Previous logic assumed cpus_per_task=1, so the ntasks_per_core option could spread the job across more cores than desired
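A minimal Python sketch (not SLURM's C implementation) of the accounting fix described above: the cores a job needs on a node must account for every CPU of every task, so pretending cpus_per_task=1 undercounts them.

```python
# Illustrative sketch, not SLURM source code: core accounting for a job step.
import math

def cores_needed(ntasks, cpus_per_task, threads_per_core):
    """Cores required when every task is charged its full CPU allotment."""
    return math.ceil(ntasks * cpus_per_task / threads_per_core)

# 4 tasks, 2 CPUs each, 2 hardware threads per core:
buggy = cores_needed(4, 1, 2)  # old logic assumed cpus_per_task=1 -> 2 cores
fixed = cores_needed(4, 2, 2)  # full accounting -> 4 cores
```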
-
Morris Jette authored
cpus-per-task support: try to pack all CPUs of each task onto one socket. Previous logic could spread a task's CPUs across multiple sockets.
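A hypothetical sketch of the packing idea (function and parameter names are invented, not SLURM's): place each task's CPUs on a single socket when one has room, rather than splitting a task across sockets.

```python
# Illustrative only: first-fit packing of whole tasks onto sockets.
def assign_sockets(ntasks, cpus_per_task, free_per_socket):
    """Return a socket index per task, or None if a task cannot be packed."""
    placement = []
    for _ in range(ntasks):
        for sock, free in enumerate(free_per_socket):
            if free >= cpus_per_task:          # whole task fits on this socket
                free_per_socket[sock] -= cpus_per_task
                placement.append(sock)
                break
        else:
            placement.append(None)             # would have to span sockets
    return placement

# Two sockets with 4 free CPUs each; three 2-CPU tasks pack as [0, 0, 1].
```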
-
Morris Jette authored
Previous logic was counting CPUs, but assuming each task would only use one CPU.
-
Dan Weeks authored
-
Danny Auble authored
This reverts commit b22268d8.
-
Danny Auble authored
-
Morris Jette authored
-
- 19 May, 2014 7 commits
-
Morris Jette authored
-
Nathan Yee authored
-
Morris Jette authored
-
Morris Jette authored
Conflicts: src/slurmctld/job_mgr.c
-
Morris Jette authored
Properly enforce job --requeue and --norequeue options. Previous logic in three places failed to do so (either ignoring the value, ANDing it with the JobRequeue configuration option, or using the JobRequeue configuration option by itself). bug 821
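A sketch of the intended precedence (names are illustrative, not SLURM's internals): an explicit --requeue/--norequeue on the job overrides the cluster-wide JobRequeue default, and the two values must never be ANDed together.

```python
# Illustrative precedence rule, not SLURM source code.
def job_requeueable(job_requeue_opt, job_requeue_conf):
    """job_requeue_opt: True/False if given on the command line, else None."""
    if job_requeue_opt is not None:
        return job_requeue_opt      # the user's explicit choice wins
    return job_requeue_conf         # otherwise fall back to JobRequeue
```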
-
Morris Jette authored
-
Morris Jette authored
There should be no change in behavior with the production code, but this will improve the robustness of the code if someone makes changes to the logic.
-
- 15 May, 2014 2 commits
-
Morris Jette authored
Add SelectTypeParameters option of CR_PACK_NODES to pack a job's tasks tightly on its allocated nodes rather than distributing them evenly across the allocated nodes. bug 819
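A minimal sketch of the two distributions being chosen between (illustrative only, not SLURM's implementation): CR_PACK_NODES fills the first allocated nodes before touching later ones, while the default spreads tasks evenly.

```python
# Illustrative only: packed vs. even task distribution across allocated nodes.
def distribute(ntasks, cpus_per_node, pack):
    counts = [0] * len(cpus_per_node)
    if pack:                                   # CR_PACK_NODES: fill node 0 first
        for i, cap in enumerate(cpus_per_node):
            counts[i] = min(cap, ntasks - sum(counts))
    else:                                      # default: round-robin placement
        i = 0
        while sum(counts) < ntasks:
            if counts[i] < cpus_per_node[i]:
                counts[i] += 1
            i = (i + 1) % len(cpus_per_node)
    return counts

# 4 tasks on two 16-CPU nodes: packed -> [4, 0], even -> [2, 2]
```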
-
Danny Auble authored
something you also get a signal which would produce deadlock. Fix Bug 601.
-
- 14 May, 2014 4 commits
-
Morris Jette authored
-
Morris Jette authored
Run EpilogSlurmctld if a job is killed during slurmctld reconfiguration. bug 806
-
Morris Jette authored
-
Morris Jette authored
Only if ALL of its partitions are hidden will a job be hidden by default. bug 812
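The corrected rule reduces to an all-of test; a minimal sketch, assuming a list of per-partition hidden flags for the job (field names are hypothetical):

```python
# Illustrative only: a job is hidden by default only when every partition
# it is submitted to is hidden.
def job_hidden(partition_hidden_flags):
    return all(partition_hidden_flags)

job_hidden([True, True])    # hidden: all partitions hidden
job_hidden([True, False])   # shown: at least one visible partition
```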
-
- 13 May, 2014 7 commits
-
Morris Jette authored
If a batch job launch request cannot be built (the script file is missing, a credential cannot be created, or the user does not exist on the selected compute node), then cancel the job in a graceful fashion. Previously, the bad RPC would be sent to the compute node and that node would be DRAINED. see bug 807
-
Morris Jette authored
Correct SelectTypeParameters=CR_LLN with job selection of specific nodes. Previous logic would in most instances allocate resources on all nodes to the job.
-
Morris Jette authored
Correct squeue's job node and CPU counts for requeued jobs. Previously, when a job was requeued, its CPU count reported was that of the previous execution. When combined with the --ntasks-per-node option, squeue would compute the expected node count. If the --exclusive option is also used, the node count reported by squeue could be off by a large margin (e.g. "sbatch --exclusive --ntasks-per-node=1 -N1 .." on requeue would use the number of CPUs on the allocated node to recompute the expected node count). bug 756
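A sketch of the arithmetic described above (not squeue's actual code): if a requeued job's CPU count is left at the previous execution's value, the node count derived from it can be off by a large margin.

```python
# Illustrative only: how a stale CPU count inflates the derived node count.
import math

def expected_nodes(job_cpus, ntasks_per_node, cpus_per_task=1):
    """Node count derived from CPUs: total tasks / tasks allowed per node."""
    ntasks = job_cpus // cpus_per_task
    return math.ceil(ntasks / ntasks_per_node)

# "sbatch --exclusive --ntasks-per-node=1 -N1" on a 16-CPU node: the job
# really used 1 node, but recomputing from 16 stale CPUs yields 16 nodes.
```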
-
Danny Auble authored
jobacct_gather/cgroup.
-
Morris Jette authored
Support SLURM_CONF path which does not have "slurm.conf" as the file name. bug 803
-
Morris Jette authored
-
Morris Jette authored
-
- 12 May, 2014 7 commits
-
Morris Jette authored
If a job has a non-responding node, retry the job step create rather than returning with a DOWN node error. bug 734
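An illustrative retry sketch (function names and retry parameters are invented, not SLURM's): rather than failing a step create outright when a node in the allocation is not responding, retry for a while before giving up.

```python
# Illustrative only: retry step creation while a node is unresponsive.
import time

def create_step(alloc_has_unresponsive_node, attempts=3, delay=0.0):
    for _ in range(attempts):
        if not alloc_has_unresponsive_node():
            return "step created"
        time.sleep(delay)           # give the node a chance to respond
    return "DOWN node error"        # only after the retries are exhausted
```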
-
Morris Jette authored
-
Morris Jette authored
-
Puenlap Lee authored
Also correct related documentation
-
Nathan Yee authored
Add force option to all file removals ("rm ..." to "rm -f ..."). bug 673
-
Morris Jette authored
-
Hongjia Cao authored
Completing nodes are removed when calling _try_sched() for a job, which is the case in select_nodes(). If _try_sched() thinks the job can run now but select_nodes() returns ESLURM_NODES_BUSY, the backfill loop will be ended.
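A hypothetical sketch of the hazard described above (names and structure are invented, not SLURM's backfill code): the quick feasibility test says "runnable now", the authoritative allocation then fails with ESLURM_NODES_BUSY, and ending the loop there abandons every later job in the queue.

```python
# Illustrative only: ending vs. continuing the backfill loop on NODES_BUSY.
ESLURM_NODES_BUSY = "busy"

def backfill(jobs, try_sched, select_nodes, stop_on_busy):
    started = []
    for job in jobs:
        if not try_sched(job):          # predicted start is in the future
            continue
        rc = select_nodes(job)          # the authoritative allocation attempt
        if rc == ESLURM_NODES_BUSY:
            if stop_on_busy:            # problematic: end the whole loop
                break
            continue                    # otherwise: consider later jobs
        started.append(job)
    return started
```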
-