Commits · 0820a630eb66a0e63a4387f9e38005145ed8857c · Manuel G. Marciani / ces_slurm_simulator

20 Apr, 2016 1 commit
- Move some definitions into alphabetic order · 0820a630
  Morris Jette authored Apr 20, 2016
```
No change in any logic or definitions
```
  0820a630
15 Apr, 2016 1 commit
- Fix for Coverity-reported bug · 4f0e0236
  Morris Jette authored Apr 15, 2016
  
  4f0e0236
14 Apr, 2016 1 commit

Set burst buffer reason for job · 49d483db

Morris Jette authored Apr 14, 2016

If a job fails stage-in, set its reason to BurstBufferOperation
with a string describing what happened. Previously the reason was
set to AdminHeld on stage-in failure.

49d483db

13 Apr, 2016 2 commits
- Fix typo in NEWS · 0f34e2ad
  Morris Jette authored Apr 13, 2016
  
  0f34e2ad
- Prevent deadlock for flow of data to the slurmdbd when sending reservation · cd7aa558
  Danny Auble authored Apr 12, 2016
```
that wasn't set up correctly.
```
  cd7aa558
12 Apr, 2016 3 commits
- power/cray - returned to operation · a7e03592
  Morris Jette authored Apr 12, 2016
```
power/cray - Fix bug introduced in 15.08.10 preventing operation in many
    cases.
bug 2628
```
  a7e03592
- power/cray - Prevent possible divide by zero · b6a8373c
  Morris Jette authored Apr 12, 2016
  
  b6a8373c
- Correct format in log · b118eca1
  Morris Jette authored Apr 12, 2016
```
Was printing integer using %u format
```
  b118eca1
11 Apr, 2016 7 commits

burst_buffer/cray persistent create/delete fail · 87e1cc67

Morris Jette authored Apr 11, 2016

burst_buffer/cray - Fix case where a script creating or deleting a persistent
    buffer would fail the "paths" operation and hold the job.
bug 2624

87e1cc67

MYSQL - Make the error message more specific when removing a reservation · bd8042e8
Danny Auble authored Apr 11, 2016
```
and it doesn't meet basic requirements.
```
bd8042e8
Correct error messages to reflect "remove" rather than "edit". · 3affa60f
Tim Wickberg authored Apr 11, 2016

3affa60f

backfill - minor performance enhancements · 395b5505

Morris Jette authored Apr 11, 2016

The gprof tool shows that most time is consumed by the bit_test()
  function as called from the select plugin, which in turn is called
  by the backfill scheduler. These changes replace the for loop end-points.
  Previous logic tested all possible nodes. The new logic identifies
  the first and last bits set in the node bitmap and uses those end-points
  instead. Note that the logic to find the first and last bits set starts off
  with a word-based search (testing for a 64-bit zero value rather than
  testing each individual bit). The net result is a small performance
  improvement.
bug 2588

395b5505
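
The end-point narrowing described in 395b5505 can be illustrated with a small sketch. This is Python, not Slurm's C code; the function names `first_last_set` and `set_bits` and the list-of-words bitmap layout are assumptions made for illustration only. Whole zero words are skipped first, and per-bit tests then run only between the first and last set bits.

```python
WORD_BITS = 64  # assumed word size, matching the 64-bit test in the message


def bit_test(words, i):
    """Return 1 if bit i is set. Bit i lives in words[i // 64]."""
    return (words[i // WORD_BITS] >> (i % WORD_BITS)) & 1


def first_last_set(words):
    """Return (first, last) set-bit indexes, or None if no bit is set.

    Hypothetical helper: zero words are skipped with one comparison each
    instead of 64 individual bit tests.
    """
    first = None
    for w_idx, word in enumerate(words):
        if word:  # word-level test for "any bit set"
            lowest = (word & -word).bit_length() - 1
            first = w_idx * WORD_BITS + lowest
            break
    if first is None:
        return None
    for w_idx in range(len(words) - 1, -1, -1):
        word = words[w_idx]
        if word:
            return first, w_idx * WORD_BITS + word.bit_length() - 1


def set_bits(words):
    """List the set bits, looping only between the narrowed end-points."""
    bounds = first_last_set(words)
    if bounds is None:
        return []
    first, last = bounds
    return [i for i in range(first, last + 1) if bit_test(words, i)]
```

With a sparse bitmap, the loop in `set_bits` covers only the span between the two end-points rather than every possible node index, which is the kind of saving the commit message describes.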

Fix three typos. · e6e87c92
Tim Wickberg authored Apr 11, 2016

e6e87c92
burst_buffer/cray fix for pre_run fail · 8f667db4
Morris Jette authored Apr 11, 2016
```
burst_buffer/cray - Decrement job's prolog_running counter if pre_run fails.
bug 2621
```
8f667db4

Reset job's prolog_running counter · f3f41e10

Morris Jette authored Apr 11, 2016

If a job is no longer in configuring state, then clear the prolog_running
  counter on slurmctld restart or reconfigure.
bug 2621

f3f41e10

09 Apr, 2016 2 commits

Fix for commit · 06776b12

Morris Jette authored Apr 08, 2016

For case where job can't start and there are no running jobs
to remove in order to establish estimated start time.

06776b12

backfill scheduling enhancement · e62a9270

Morris Jette authored Apr 08, 2016

When determining when a pending job will be able to start, rather
than testing after removing each running job and trying to schedule
the pending jobs, remove multiple jobs that all end about the
same time before testing. This reduces the number of calls to
the job placement logic, which is time consuming.

e62a9270
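
The batching idea in e62a9270 can be sketched as a toy model. This is an illustration in Python under simplifying assumptions, not the scheduler's actual logic: resources are a single node count, the placement test is a plain comparison, and `estimate_start` and `window` are hypothetical names.

```python
def estimate_start(free_nodes, needed, running, window=0):
    """Estimate when a pending job needing `needed` nodes could start.

    running: list of (end_time, nodes) for running jobs.
    window:  jobs ending within this many seconds of the first job in a
             group are removed together before one placement test.
             window=0 merges only jobs with identical end times.
    Returns (estimated_start_time or None, number_of_placement_tests).
    """
    tests = 0
    jobs = sorted(running)
    i = 0
    while i < len(jobs):
        group_end = jobs[i][0] + window
        # Release the nodes of every job that ends inside this window.
        while i < len(jobs) and jobs[i][0] <= group_end:
            free_nodes += jobs[i][1]
            last_end = jobs[i][0]
            i += 1
        tests += 1  # stands in for one call to the costly placement logic
        if free_nodes >= needed:
            return last_end, tests
    return None, tests
```

For four running jobs ending at 100, 105, 110 and 400 seconds, a pending job that needs three nodes takes three placement tests with `window=0` and one test with `window=60`. In this model the estimate can only be as precise as the window, which is the trade-off grouping implies.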

08 Apr, 2016 2 commits
- Add new list function · 654f3bf8
  Morris Jette authored Apr 08, 2016
```
list_peek_next(), like list_next() but WITHOUT advancing the pointer
```
  654f3bf8
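
The difference between advancing and peeking can be shown with a toy cursor. This is a Python sketch, not Slurm's list implementation; the class `ListIterator` and its method names are hypothetical stand-ins for the behavior the message describes.

```python
class ListIterator:
    """Toy cursor over a Python list (illustration only)."""

    def __init__(self, items):
        self._items = items
        self._pos = 0

    def next(self):
        """Return the next item and advance the cursor; None at the end."""
        if self._pos >= len(self._items):
            return None
        item = self._items[self._pos]
        self._pos += 1
        return item

    def peek_next(self):
        """Return what next() would return, leaving the cursor in place."""
        if self._pos >= len(self._items):
            return None
        return self._items[self._pos]
```

Repeated calls to `peek_next()` return the same item until `next()` is called, so a caller can look ahead at the following element before deciding whether to consume it.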
- Expand backfill scheduling logs · b3a49e14
  Morris Jette authored Apr 08, 2016
  
  b3a49e14
07 Apr, 2016 3 commits

Log poor backfill configuration parameters · 5675c5f7

Morris Jette authored Apr 07, 2016

Document and log cases where the max jobs per user or partition is
  equal to or greater than the max jobs test. In that case, a single
  user can easily stop all backfill scheduling.

5675c5f7
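
A check of the kind 5675c5f7 describes might look like the sketch below. It is written in Python for illustration; the commit message does not name the configuration parameters, so `max_job_test`, `max_job_user` and `max_job_part` are hypothetical names, and `None` is assumed to mean "no limit set".

```python
def backfill_config_warnings(max_job_test, max_job_user=None, max_job_part=None):
    """Return warnings for per-user/per-partition limits that cannot help.

    If a per-user (or per-partition) limit is at least as large as the
    total number of jobs the backfill scheduler tests, one user (or one
    partition) can fill the whole test budget.
    """
    warnings = []
    if max_job_user is not None and max_job_user >= max_job_test:
        warnings.append(
            f"max_job_user ({max_job_user}) >= max_job_test ({max_job_test})"
        )
    if max_job_part is not None and max_job_part >= max_job_test:
        warnings.append(
            f"max_job_part ({max_job_part}) >= max_job_test ({max_job_test})"
        )
    return warnings
```

An empty list means neither limit reaches the test budget; each returned string corresponds to one condition that would be worth logging.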

Fix handling for single-character prognames · 11320ebc
Sami Ilvonen authored Apr 07, 2016

11320ebc

fix for job "--contiguous" option · 47a07b54

Morris Jette authored Apr 06, 2016

Fix for job "--contiguous" option that could cause job allocation/launch
    failure or slurmctld crash.
bug 2573

47a07b54

06 Apr, 2016 8 commits
- Start NEWS for v15.08.11 · 3a8ecf32
  Morris Jette authored Apr 06, 2016
  
  3a8ecf32
- Update META for v15.08.10 tag · cb2ea0bb
  Morris Jette authored Apr 06, 2016
  
  cb2ea0bb
- Revert "Fix situation on a heterogeneous memory cluster where the order of" · 3ae45a51
  Danny Auble authored Apr 06, 2016
```
This reverts commit f559a55c.
```
  3ae45a51
- Fix situation on a heterogeneous memory cluster where the order of · f559a55c
  Danny Auble authored Apr 06, 2016
```
constraints mattered in a job.

Details include:
A job doesn't request memory but the system is running
with CR_*MEMORY with no default memory limit and the job requests nodes
with features of different sizes.  Previously the order of constraints
mattered where the smaller memory node would need to be requested first
or the job would fail.

Bug 2608
```
  f559a55c
- Don't change job time limit when updating unrelated field in a job · 594c7997
  Morris Jette authored Apr 06, 2016
```
Previous logic would get an account and/or QOS time limit and use
  that value to overwrite the incoming RPC's NO_VAL value, which
  would change a job's time limit when changing an unrelated
  field (e.g. priority, QOS, etc.).
bug 2610
```
  594c7997
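
One plausible shape of the problem fixed in 594c7997 is sketched below in Python. It is a loose reconstruction from the commit message, not Slurm's code: jobs and requests are plain dicts, `update_job_buggy` and `update_job_fixed` are hypothetical names, and `NO_VAL` is used here simply as a sentinel meaning "field not set in the request".

```python
NO_VAL = 0xFFFFFFFE  # sentinel: the request does not touch this field


def update_job_buggy(job, request, qos_limit):
    """Old behavior: the sentinel is overwritten by the QOS limit."""
    job = dict(job)
    time_limit = request.get("time_limit", NO_VAL)
    if time_limit == NO_VAL:
        time_limit = qos_limit  # the "not set" marker is lost here
    if time_limit != NO_VAL:
        job["time_limit"] = time_limit  # so this runs on every update
    if "priority" in request:
        job["priority"] = request["priority"]
    return job


def update_job_fixed(job, request, qos_limit):
    """New behavior: an unset time limit leaves the job's limit alone."""
    job = dict(job)
    time_limit = request.get("time_limit", NO_VAL)
    if time_limit != NO_VAL:
        job["time_limit"] = time_limit
    if "priority" in request:
        job["priority"] = request["priority"]
    return job
```

Updating only the priority of a job with a 30-minute limit under a 120-minute QOS limit changes the time limit to 120 in the first function and leaves it at 30 in the second.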
- Avoid double calculation on partition QOS if the job is using the same QOS. · e17a7eaf
  Danny Auble authored Apr 06, 2016
  
  e17a7eaf
- Fix for SEGV · 55d31288
  Morris Jette authored Apr 06, 2016
```
Prevent use of NULL pointer and SEGV when changing a job's QOS when
  the slurmdbd is not configured.
```
  55d31288
- Fix spelling of 'daemon'. · b714beb6
  Tim Wickberg authored Apr 06, 2016
  
  b714beb6
05 Apr, 2016 3 commits

Fix backfill scheduler race condition · d8b18ff8

Morris Jette authored Apr 05, 2016

Fix backfill scheduler race condition that could cause invalid pointer in
    select/cons_res plugin. Bug introduced in 15.08.9, commit:
    efd9d35e

The scenario is as follows:
1. Backfill scheduler is running, then releases locks
2. Main scheduling loop starts a job "A"
3. Backfill scheduler resumes, finds job "A" in its queue and
   resets its partition pointer.
4. Job "A" completes and tries to remove resource allocation record
   from select/cons_res data structure, but fails to find it because
   it is looking in the table for the wrong partition.
5. Job "A" record gets purged from slurmctld
6. Select/cons_res plugin attempts to operate on resource allocation
   data structure, finds pointer into the now purged data structure
   of job "A" and aborts or gets SEGV
Bug 2603

d8b18ff8
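
The six-step scenario in d8b18ff8 can be replayed in a toy model. This Python sketch is an illustration only: the per-partition allocation tables are plain dicts, the partition names "debug" and "batch" are hypothetical, and `run_scenario` is not a Slurm function.

```python
def run_scenario(reset_partition_pointer):
    """Replay the scenario; return (record_removed, partitions_with_stale_refs)."""
    alloc = {"debug": {}, "batch": {}}  # per-partition allocation tables
    jobs = {}                           # live job records

    # Step 2: the main scheduling loop starts job "A" in partition "debug".
    job = {"id": "A", "part": "debug"}
    jobs["A"] = job
    alloc[job["part"]]["A"] = job

    # Step 3: the backfill scheduler resumes and resets the partition pointer.
    if reset_partition_pointer:
        job["part"] = "batch"

    # Step 4: on completion, the record is looked up in the table for the
    # job's current partition, which may now be the wrong one.
    removed = alloc[job["part"]].pop("A", None) is not None

    # Step 5: the job record is purged.
    del jobs["A"]

    # Step 6: any allocation entry still naming a purged job is a stale reference.
    stale = [part for part, table in alloc.items()
             for job_id in table if job_id not in jobs]
    return removed, stale
```

With the pointer reset, the removal misses and the "debug" table keeps an entry for a job that no longer exists, which in C corresponds to the invalid pointer the message reports. Without the reset, the record is removed and nothing stale remains.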

Rename function, no real code change. The old function name was completely · 6f0c2d3f
Danny Auble authored Apr 05, 2016
```
misleading.
```
6f0c2d3f
Remove debug from commit 921c59e4 · 24566dd7
Danny Auble authored Apr 04, 2016

24566dd7

04 Apr, 2016 4 commits
- Remove duplicates from AccountingStorageTRES · 921c59e4
  Danny Auble authored Apr 04, 2016
  
  921c59e4
- Add slurm_set_accounting_storage_tres · 5751b9d6
  Danny Auble authored Apr 04, 2016
  
  5751b9d6
- If using PrologFlags=contain: Don't launch the extern step if a job is · 91a83e41
  Danny Auble authored Apr 04, 2016
```
canceled while launching.
```
  91a83e41
- Change in comment for greater clarity · 3f51a788
  Morris Jette authored Apr 04, 2016
  
  3f51a788
02 Apr, 2016 2 commits
- checkpoint/blcr plugin: Fix memory leak. · 08d520db
  Morris Jette authored Apr 02, 2016
  
  08d520db
- Fix potential divide by zero when tree_width=1 · ef8c5e1b
  Danny Auble authored Apr 01, 2016
  
  ef8c5e1b
01 Apr, 2016 1 commit
- Cosmetic change, no change to logic · fabc772e
  Morris Jette authored Apr 01, 2016
  
  fabc772e