Commits · 2e95c20b3bf9bcddd9b0fe0048e222fb8306c90b · Manuel G. Marciani / ces_slurm_simulator

18 Feb, 2015 2 commits
- Add SLURM_JOB_GPUS to Prolog · 2e95c20b
  Morris Jette authored Feb 17, 2015
```
Add SLURM_JOB_GPUS environment variable to those available in Prolog.
Also add list of environment variables available in the various
prologs and epilogs on the web page.
bug 1458
```
  2e95c20b
- Print FAIR_TREE in "scontrol show config" output for PriorityFlags. · 27eef95d
  Brian Christiansen authored Feb 17, 2015
  
  27eef95d
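A minimal sketch of how a Prolog written in Python might read the new SLURM_JOB_GPUS variable. The helper name `job_gpu_ids` is hypothetical, and the comma-separated value format (for example "0,1") is an assumption, not something stated in the commit message.

```python
import os


def job_gpu_ids(environ=os.environ):
    """Return the GPU IDs listed in SLURM_JOB_GPUS, or [] if unset or empty.

    Assumption: the value is a comma-separated list such as "0,1".
    """
    raw = environ.get("SLURM_JOB_GPUS", "")
    return [field.strip() for field in raw.split(",") if field.strip()]


if __name__ == "__main__":
    # In a real Prolog the variable would be set by Slurm before the script runs.
    print("GPUs allocated to this job:", job_gpu_ids())
```

Passing the environment as a parameter keeps the helper testable outside of a running job.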
17 Feb, 2015 4 commits
- BGQ - Close very small window where a step could have been removed before the · c169c935
  Danny Auble authored Feb 17, 2015
```
runjob happened, and the step was part of an array.  This is an addition to
commit 49e0f5f2
```
  c169c935
- BGQ - Fix issue with job arrays not being handled correctly · 49e0f5f2
  Danny Auble authored Feb 17, 2015
```
in the runjob_mux plugin.
```
  49e0f5f2
- Update NEWS · 6984348d
  Brian Christiansen authored Feb 17, 2015
```
Bug 1461
Commit: 2e2d924e
```
  6984348d
- Prevent slurmdbd abort if node DOWN with NULL reason · 2e2d924e
  Morris Jette authored Feb 17, 2015
```
See bug 1461
```
  2e2d924e
13 Feb, 2015 2 commits

Fix squeue. · c13e8540
David Bigagli authored Feb 13, 2015

c13e8540

Avoid triggering accounting if node state unchanged · 23f84ace

Morris Jette authored Feb 12, 2015

If a call was made to change a node's state to the same state it
was already in and set its reason to the same value it already
had, then an accounting record was generated. If a script, say
NodeHealthCheck, is repeatedly setting a node state (say DRAIN),
it could generate a huge number of redundant accounting records.
This eliminates these redundant records.
related to bug 1437

23f84ace
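The idea behind commit 23f84ace can be sketched as follows. This is an illustration only, not the slurmctld code: the dictionary layout, the `records` list, and the function name `set_node_state` are all hypothetical.

```python
def set_node_state(node, new_state, new_reason, records):
    """Update a node and append an accounting record only on a real change.

    Returns True if the state or reason actually changed.
    """
    changed = (node["state"], node["reason"]) != (new_state, new_reason)
    if changed:
        records.append((node["name"], new_state, new_reason))
        node["state"] = new_state
        node["reason"] = new_reason
    return changed


# A health-check script that sets the same state repeatedly
# now produces a single record instead of three.
node = {"name": "node01", "state": "IDLE", "reason": None}
records = []
for _ in range(3):
    set_node_state(node, "DRAIN", "health check failed", records)
```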

12 Feb, 2015 4 commits
- Start v14.11.5 NEWS file · 4531ab3f
  Morris Jette authored Feb 12, 2015
  
  4531ab3f
- Update META for v14.11.4 tag · 1b2c8e18
  Morris Jette authored Feb 12, 2015
  
  1b2c8e18
- Fix perlapi tests for libslurm perl module. · ea7a0c7c
  Brian Christiansen authored Feb 12, 2015
  
  ea7a0c7c
- Fix issue with "sreport cluster AccountUtilizationByUser" when using PrivateData=users. · 37b56085
  Brian Christiansen authored Feb 12, 2015
```
Bug 1446
```
  37b56085
11 Feb, 2015 1 commit
- MySQL - If a node state and reason are the same on a node state change · 1685ba56
  Danny Auble authored Feb 11, 2015
```
don't insert a new row in the event table.
```
  1685ba56
10 Feb, 2015 4 commits

Additional fix to 50e0c84f. · 50b43afd
Brian Christiansen authored Feb 09, 2015
```
uid's are 0 when associations are loaded.
```
50b43afd

Backfill scheduler bug on job's partition change · a0d12d0c

Morris Jette authored Feb 09, 2015

The backfill scheduler builds a queue of eligible job/partition
information and then proceeds to determine when and where those
jobs will start. The backfill scheduler can be configured to
periodically release locks in order to let other operations
take place. If the partition(s) associated with one of those
jobs changes during one of those periods, the job will still
be considered for scheduling in the old partition until the
backfill scheduler starts over with a new job/partition list.
This change to the backfill scheduler validates each job's
partition from the list based upon current information
(considering any partition changes).
See bug 1436

a0d12d0c
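The revalidation described in commit a0d12d0c might look roughly like this. It is a simplified sketch under stated assumptions: the queue is modelled as (job ID, partition name) pairs, and `revalidate` and `job_partitions` are hypothetical names, not Slurm identifiers.

```python
def revalidate(queue, job_partitions):
    """Drop queued (job_id, partition) pairs that no longer match the job.

    queue: pairs captured when the backfill pass started.
    job_partitions: current mapping of job_id to its set of partitions.
    """
    return [
        (job_id, part)
        for job_id, part in queue
        if part in job_partitions.get(job_id, ())
    ]


# Job 101 was moved out of "debug" while the scheduler had released its locks.
queue = [(101, "debug"), (101, "batch"), (102, "batch")]
job_partitions = {101: {"batch"}, 102: {"batch"}}
valid = revalidate(queue, job_partitions)
```

Checking each entry against current data avoids waiting for the next full rebuild of the job/partition list.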

Ensure bitstring size is valid · ecd593cc

Morris Jette authored Feb 09, 2015

If bitmap size is initially NO_VAL (0xfffffffe), then a tiny buffer
is allocated and accessing it can go off the end of the buffer.
This has not been observed in production, but only in the investigation
of another problem.

ecd593cc

Fix segfault in controller when deleting a user association of a user which... · 50e0c84f
Brian Christiansen authored Feb 09, 2015
```
 Fix segfault in controller when deleting a user association of a user which had been previously removed from the system.

 Bug 1238
```
50e0c84f

09 Feb, 2015 6 commits
- Fix job array task requeue race condition · ae0ba3d8
  Morris Jette authored Feb 09, 2015
```
Fix slurmctld initialization problem which could cause requeue of the last
task in a job array to fail if executed prior to the slurmctld loading
the maximum size of a job array into a variable in the job_mgr.c module.
```
  ae0ba3d8
- Fix bug that could lose task of job array · 0efa0ba4
  Morris Jette authored Feb 09, 2015
```
Fix slurmctld job recovery logic which could cause the last task in a job
array to be lost on restart.
```
  0efa0ba4
- Remove misleading error message · b16bd9f5
  Morris Jette authored Feb 09, 2015
  
  b16bd9f5
- Update docs for --hint=nomultithread. · e2c12d8e
  Brian Christiansen authored Feb 09, 2015
```
Only supported with task/affinity plugin.
```
  e2c12d8e
- Use default value if CgroupMountpoint not defined · 2f04a52a
  Pär Lindfors authored Feb 09, 2015
```
When CgroupMountpoint was not defined in cgroup.conf the mount point
got undefined. This resulted in cgroups not being released.
```
  2f04a52a
- Fix build for non-standard hwloc location · bd303aff
  Nicolas Joly authored Feb 09, 2015
  
  bd303aff
08 Feb, 2015 1 commit
- Merge remote-tracking branch 'origin/slurm-14.03' into slurm-14.11 · fa1a8b8b
  Danny Auble authored Feb 07, 2015
  
  fa1a8b8b
05 Feb, 2015 5 commits
- If a job is requeued because of RequeueExit or RequeueExitHold send · 3e5d8f8e
  David Bigagli authored Feb 05, 2015
```
event REQUEUED to slurmdbd.
```
  3e5d8f8e
- Describe possible cause of error message · 8a5c6354
  Morris Jette authored Feb 05, 2015
```
Related to bug 1429
```
  8a5c6354
- Merge pull request #100 from paran1/slurm-14.03 · ad66d638
  Brian Christiansen authored Feb 05, 2015
```
Improve "Prolog and Epilog Scripts" in slurm.conf(5)
```
  ad66d638
- Add information about PrologFlags to slurm.conf man page · a75686fb
  Pär Lindfors authored Feb 05, 2015
  
  a75686fb
- Fix SLURM_CLUSTER_NAME variable name in slurm.conf man page · 144cd1c7
  Pär Lindfors authored Feb 05, 2015
```
The environment variable name SLURM_JOB_CLUSTER_NAME should be
SLURM_CLUSTER_NAME. This is also available in Prolog and Epilog, so
remove note about it only being available in PrologSlurmctld and
EpilogSlurmctld.
```
  144cd1c7
04 Feb, 2015 4 commits

expand squeue "shared" job format more · 61b571a8
Morris Jette authored Feb 04, 2015

61b571a8

Report correct job "shared" field value · 3de14946

Morris Jette authored Feb 04, 2015

Previously it was not possible to distinguish between a job needing
exclusive nodes and the default job/partition configuration.

3de14946

job array slurmctld abort fix · 0ff342b5
Morris Jette authored Feb 04, 2015
```
Fix job array logic that can cause slurmctld to abort.
bug 1426
```
0ff342b5

Fix for CUDA v7.0+ · da2fba48

Morris Jette authored Feb 03, 2015

Enable CUDA v7.0+ use with a Slurm configuration of TaskPlugin=task/cgroup
and ConstrainDevices=yes (in cgroup.conf). With that configuration,
CUDA_VISIBLE_DEVICES will start at 0 rather than the device number.
bug 1421

da2fba48
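The numbering change in commit da2fba48 can be illustrated with a small sketch. The function name and arguments are hypothetical; the only behavior taken from the commit message is that the variable starts at 0 when devices are constrained by the cgroup, and uses the device numbers otherwise.

```python
def cuda_visible_devices(allocated, constrain_devices):
    """Build a CUDA_VISIBLE_DEVICES value for a job's allocated GPUs.

    allocated: device numbers on the node, e.g. [2, 3].
    constrain_devices: True when the cgroup hides all other devices,
    so the job sees its GPUs renumbered from 0.
    """
    if constrain_devices:
        return ",".join(str(i) for i in range(len(allocated)))
    return ",".join(str(dev) for dev in allocated)
```

For a job allocated devices 2 and 3, the constrained case yields "0,1" and the unconstrained case yields "2,3".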

03 Feb, 2015 7 commits
- Move cgroup.conf read logic · c6b13b0e
  Morris Jette authored Feb 03, 2015
```
Move the functions that read cgroup.conf from src/slurmd/common
to slurmd/common so that the gres/gpu plugin can use it.
```
  c6b13b0e
- Update scheduling configuration documentation. · 1ef7c0c6
  Brian Christiansen authored Feb 03, 2015
  
  1ef7c0c6
- More updates to licenses/resources documentation. · df531529
  Brian Christiansen authored Feb 03, 2015
  
  df531529
- Indent. · 9c7301be
  David Bigagli authored Feb 03, 2015
  
  9c7301be
- Print spurious message about the absence of cgroup.conf at log level · a26eaa64
  David Bigagli authored Feb 03, 2015
```
debug2 instead of info.
```
  a26eaa64
- When a job uses multiple partitions, set the environment variable · 1f37d3b8
  David Bigagli authored Feb 03, 2015
```
SLURM_JOB_PARTITION to be the one in which the job started.
```
  1f37d3b8
- Remove defunct sched/gang configuration · 76fe5e03
  Morris Jette authored Feb 03, 2015
  
  76fe5e03