Commits · f981a1f60ceee6a0492aa3c5b84103e58c3bb460 · Manuel G. Marciani / ces_slurm_simulator

14 Dec, 2019 6 commits
- Docs - effect -> affect in faq.html · f981a1f6
  Michael Hinton authored Dec 12, 2019
```
Bug 7308
```
  f981a1f6
- Docs - it's -> its in faq.html · f0436423
  Michael Hinton authored Dec 12, 2019
```
Bug 7308
```
  f0436423
- Docs - Use the correct form of word in faq.html · 0715590b
  Michael Hinton authored Dec 13, 2019
```
Bug 7308
```
  0715590b
- Docs - Add missing words in faq.html · a8fa5f0b
  Michael Hinton authored Dec 13, 2019
```
Bug 7308
```
  a8fa5f0b
- Docs - Remove extra words in faq.html · 514ee0d6
  Michael Hinton authored Dec 13, 2019
```
Bug 7308
```
  514ee0d6
- Docs - Proper capitalization in faq.html · b19d43ef
  Michael Hinton authored Dec 12, 2019
```
Bug 7308
```
  b19d43ef
12 Dec, 2019 1 commit
- Docs - fix mistake in map_gpu example. · 212eb666
  Michael Hinton authored Dec 11, 2019
  
  212eb666
10 Dec, 2019 4 commits
- Merge remote-tracking branch 'origin/slurm-18.08' into slurm-19.05 · b5d0f52e
  Danny Auble authored Dec 10, 2019
```
# Conflicts:
#	testsuite/expect/test34.2
```
  b5d0f52e
- Continuation of a5309c2a · 213226de
  Danny Auble authored Dec 10, 2019
  
  213226de
- Fix pending array tasks not always matching 1st task's reason · ee3d4715
  Michael Hinton authored Apr 18, 2019
```
Have the main scheduler and backfill scheduler make the reasons of
subsequent array tasks match the first array task, since they
sometimes didn't do this completely when the array was pending.

Bug 6814
```
  ee3d4715
- Fix formatting error in docs · ef026084
  Michael Hinton authored Apr 09, 2019
  
  ef026084
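The behavior fixed in ee3d4715 can be illustrated with a small sketch. This is not Slurm's C implementation: the `Task` record, its fields, and `sync_array_reasons` are hypothetical names, and the sketch assumes that "first array task" means the first pending task of each array when ordered by task ID.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical record for one array task; not a Slurm structure.
    array_job_id: int
    task_id: int
    state: str
    reason: str


def sync_array_reasons(tasks):
    """Give later pending tasks of an array the first pending task's reason."""
    first_reason = {}
    for task in sorted(tasks, key=lambda t: (t.array_job_id, t.task_id)):
        if task.state != "PENDING":
            continue  # only pending tasks carry a pending reason
        if task.array_job_id not in first_reason:
            first_reason[task.array_job_id] = task.reason
        else:
            task.reason = first_reason[task.array_job_id]
    return tasks
```

Tasks that are not pending are left alone, and each array is handled independently of the others.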
09 Dec, 2019 4 commits

- Merge branch 'bug7629' into slurm-19.05 · 78c67325
  Brian Christiansen authored Dec 09, 2019
  
  78c67325

Detect possible infinite loop when placing tasks · 8960b805

Nate Rini authored Aug 29, 2019

An infinite loop is possible when placing tasks if _at_tpn_limit()
activates on every possible node. If that happens, the job cannot be
placed and an error is returned instead.

Continuation of 16eb8b14

Bug 7629.

8960b805

Honor ntasks_per_node in _compute_c_b_task_dist() · 16eb8b14

Nate Rini authored Aug 27, 2019

Add _at_tpn_limit() as a helper to determine when a given node is over the
tasks_per_node limit and to log when that happens.

Bug 7629.

16eb8b14

Testsuite - Add -b/--begin-from-test option to regression.py · 438d6203

Marcin Stolarek authored Dec 05, 2019

This option may be useful when running with --stop-on-first-fail:
once the issue is fixed, it allows restarting from the test that
failed.

Bug 7433.

438d6203
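The idea behind 16eb8b14 and 8960b805 above (skip nodes that are at their tasks-per-node limit, and stop with an error once no node can take another task) can be sketched as follows. This is a simplified Python illustration, not the C code in _compute_c_b_task_dist(): `at_tpn_limit` and `place_tasks` are hypothetical stand-ins, a limit of 0 is assumed to mean "no limit", tasks are assumed to be spread round-robin, and an exception stands in for the returned error.

```python
def at_tpn_limit(tasks_on_node, ntasks_per_node):
    """True when a node already holds its per-node task limit (0 = no limit)."""
    return ntasks_per_node > 0 and tasks_on_node >= ntasks_per_node


def place_tasks(num_nodes, num_tasks, ntasks_per_node):
    """Spread tasks round-robin over nodes, honoring the per-node limit."""
    counts = [0] * num_nodes
    placed = 0
    while placed < num_tasks:
        progressed = False
        for node in range(num_nodes):
            if placed == num_tasks:
                break
            if at_tpn_limit(counts[node], ntasks_per_node):
                continue  # node is full, try the next one
            counts[node] += 1
            placed += 1
            progressed = True
        if not progressed:
            # Every node is at its limit: without this check the outer
            # loop would never terminate.
            raise RuntimeError("job cannot be placed: all nodes at limit")
    return counts
```

The `progressed` flag is the loop guard: a full pass over the nodes that places nothing means further passes cannot succeed either.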

06 Dec, 2019 1 commit
- Testsuite - Fix test31.2 if SchedulerParams has nohold_on_prolog_fail. · 2b95e0a6
  Felip Moll authored Jun 21, 2019
```
Bug 7274
```
  2b95e0a6
05 Dec, 2019 3 commits
- Reference correct variables in test. · 089ff62f
  Danny Auble authored Dec 05, 2019
  
  089ff62f
- Merge remote-tracking branch 'origin/slurm-18.08' into slurm-19.05 · 1c4e91d0
  Danny Auble authored Dec 05, 2019
  
  1c4e91d0
- Make it so test34.2 works if it was previously canceled midway. · a5309c2a
  Danny Auble authored Dec 05, 2019
  
  a5309c2a
04 Dec, 2019 2 commits
- Docs - fix typo for 'bind'. · 43b679c7
  Michael Hinton authored Dec 03, 2019
  
  43b679c7
- Docs - fix grammar for 'processes'. · 759db888
  Michael Hinton authored Dec 03, 2019
  
  759db888
03 Dec, 2019 1 commit
- Testsuite - Add test for mixed AND and XOR on test17.12 · 4f2331e4
  Marcin Stolarek authored Nov 27, 2019
```
Test regression bug 7378, commit 1c051c61.

Bug 7624.
```
  4f2331e4
02 Dec, 2019 3 commits

- Fix parsing of delay_boot in controller when additional args follow · 9daa0563
  Brian Christiansen authored Nov 13, 2019
```
Signed-off-by: Jason Booth <jbooth@schedmd.com>

Bug 7189
```
  9daa0563

Testsuite - Fix sleep in test17.12 · e2921114

Marcin Stolarek authored Nov 29, 2019

Increasing the job's sleep to 5s was required to make the test reliable.
It doesn't result in longer execution since we wait for the job to reach
the RUNNING state and then cancel it. A short sleep can cause the RUNNING
state to be missed, giving a false negative result - 'Job is DONE but
expected RUNNING'.

Bug 7624.

e2921114

Testsuite - Fix missing log_error on test17.12 · 32c98f96

Marcin Stolarek authored Dec 02, 2019

Previously there was no log_error call before setting exit_code to 1
due to an scontrol error.
Now the user can identify the reason for the final FAILURE result.

Bug 7624.

32c98f96
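The race described in e2921114 above (a job that sleeps too briefly can finish between two state checks) can be shown with a small deterministic simulation. This is only an illustration: the function names, the start delay, and the polling interval are hypothetical, since the commit message does not state how test17.12 actually polls.

```python
def job_state_at(t, start, runtime):
    """State of a simulated job at time t (seconds)."""
    if t < start:
        return "PENDING"
    if t < start + runtime:
        return "RUNNING"
    return "DONE"


def poll_sees_running(start, runtime, poll_interval, max_polls):
    """Poll at fixed intervals; report whether RUNNING was ever observed."""
    for i in range(max_polls):
        state = job_state_at(i * poll_interval, start, runtime)
        if state == "RUNNING":
            return True
        if state == "DONE":
            return False  # the whole RUNNING window fell between two polls
    return False
```

With a 1s polling interval, a job that starts at 0.2s and runs for 0.5s is never seen RUNNING, while the same job running for 5s is. Since the test cancels the job once RUNNING is seen, the longer sleep does not lengthen the test.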

28 Nov, 2019 3 commits
- Fix typo for 'transitional'. · 0921a953
  Nate Rini authored Nov 27, 2019
  
  0921a953
- Revert "Fix typo for 'transitional'." · 919d932e
  Tim Wickberg authored Nov 27, 2019
```
This reverts commit fea86e4c.
```
  919d932e
- Fix typo for 'transitional'. · fea86e4c
  Tim Wickberg authored Nov 27, 2019
  
  fea86e4c
26 Nov, 2019 6 commits
- Fix format build error on FreeBSD · 683415cc
  Broderick Gardner authored Nov 26, 2019
```
Bug 8153
```
  683415cc
- Grammatical fix for 'relinquished'. · d3974825
  Michael Hinton authored Nov 26, 2019
  
  d3974825
- Merge remote-tracking branch 'origin/slurm-18.08' into slurm-19.05 · 79bdeb98
  Danny Auble authored Nov 26, 2019
  
  79bdeb98
- Make Slurm compile on linux after sys/sysctl.h was deprecated. · 0f3ec361
  Danny Auble authored Nov 26, 2019
```
Bug 7987

Co-authored-by: Broderick Gardner <broderick@schedmd.com>
Signed-off-by: Broderick Gardner <broderick@schedmd.com>
```
  0f3ec361
- Testsuite - Use test name as job name in test9.9 · ba83c8f0
  Nate Rini authored Aug 29, 2019
```
This avoids possible overlapping with other jobs.

Bug 7661.
```
  ba83c8f0
- Fix typo for 'component'. · e3af99f3
  Michael Hinton authored Nov 25, 2019
  
  e3af99f3
21 Nov, 2019 3 commits

- Docs - clarify immediate allocation requests conflict with defer. · 5f233be4
  Alejandro Sanchez authored Nov 21, 2019
```
Bug 5175.

Signed-off-by: Marshall Garey <marshall@schedmd.com>
```
  5f233be4

Fix misleading error for immediate alloc requests and defer combination. · 1b13f532

Alejandro Sanchez authored Nov 20, 2019

When an allocation request was made with the immediate=1 argument and
SchedulerParameters included defer, Slurm returned a misleading
ESLURM_FRAGMENTATION error. The logic now returns a more appropriate
ESLURM_CAN_NOT_START_IMMEDIATELY error for this scenario by decoupling
defer from the too-fragmented logic in job_allocate().

Note that this doesn't change behavior: in the immediate + defer
combination, defer still takes precedence, meaning individual
submit-time allocation attempts are avoided regardless of immediate.

Bug 5175.

1b13f532

Reject unrunnable jobs submitted to reservations. · ab52c868

Marshall Garey authored Oct 03, 2019

This effectively reverts commit 73351553. That commit's message is,

     "Improve support for overlapping advanced reservations.
      Patch from Bill Brophy, Bull."

Jobs submitted to reservations that request more resources than are on a
node will pend forever because of that commit. Reverting that commit
causes those jobs to be immediately rejected. Also, that commit doesn't
appear to "improve support for overlapping advanced reservations" in any
way.

The job is already immediately rejected if it asks for more resources
than are on a node without being submitted to a reservation, or if the
job asks for more nodes than are currently in the reservation. So, this
commit just makes behavior consistent.

Bug 5175.

ab52c868
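The decoupling described in 1b13f532 above can be sketched as a before/after pair. This is a reading of the commit message only, not the code in job_allocate(): the two function names are hypothetical, the error codes are modeled as plain strings, and both functions assume the request was made with immediate set.

```python
ESLURM_FRAGMENTATION = "ESLURM_FRAGMENTATION"
ESLURM_CAN_NOT_START_IMMEDIATELY = "ESLURM_CAN_NOT_START_IMMEDIATELY"


def immediate_error_before(defer, too_fragmented):
    """Assumed old behavior: defer was folded into the fragmentation check."""
    if defer or too_fragmented:
        return ESLURM_FRAGMENTATION
    return None


def immediate_error_after(defer, too_fragmented):
    """Assumed new behavior: defer is reported separately from fragmentation."""
    if defer:
        return ESLURM_CAN_NOT_START_IMMEDIATELY
    if too_fragmented:
        return ESLURM_FRAGMENTATION
    return None
```

In both versions an immediate request fails whenever defer is set, which matches the note that behavior is unchanged; only the reported error differs.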

19 Nov, 2019 1 commit
- Fix typo in quickstart.shtml · 761616a3
  Elliot Waite authored Nov 19, 2019
  
  761616a3
18 Nov, 2019 1 commit
- Remove stray bluegene.conf.example file. · f9479db3
  Tim Wickberg authored Nov 18, 2019
  
  f9479db3
15 Nov, 2019 1 commit

Fix both socket-[un]constrained GRES allocation issues. · efcd853a

Michael Hinton authored Oct 23, 2019

Do not assume that these sock_gres_t pointers always exist:
bits_by_sock
bits_by_sock[s]

If they don't, that means there are no GRES constrained to the current
iteration's socket `s`, and so the logic shouldn't allocate the current
iteration's GRES `g`.

Analogously, do not assume that the bits_any_sock member pointer of
sock_gres_t is always valid. If it isn't, it means there are no
socket-unconstrained GRES available to satisfy the job request, so the
logic should not allocate the current iteration's GRES `g`.

Otherwise, job/node struct members holding GRES allocation information
would end up being incorrect, leading to improper allocations and also
leading to errors logged in slurmctld log at deallocation time like:

error: gres/gpu: job <X> dealloc node <Y> GRES count underflow (0 < 1)

Bug 7827

efcd853a
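The guards described in efcd853a above amount to checking that a bitmap exists before allocating from it. The sketch below models this in Python with `None` standing in for a NULL pointer. `SockGres` only mirrors the two member names from the commit message; the picker functions, the use of sets as bitmaps, and the bounds check on `s` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SockGres:
    # None models a NULL pointer in the C structure.
    bits_by_sock: Optional[List[Optional[Set[int]]]] = None
    bits_any_sock: Optional[Set[int]] = None


def pick_socket_gres(sg, s, needed):
    """Pick up to `needed` GRES bound to socket s, or none if absent."""
    if (sg.bits_by_sock is None or s >= len(sg.bits_by_sock)
            or sg.bits_by_sock[s] is None):
        return set()  # no socket-constrained GRES: allocate nothing
    return set(sorted(sg.bits_by_sock[s])[:needed])


def pick_any_socket_gres(sg, needed):
    """Pick up to `needed` socket-unconstrained GRES, or none if absent."""
    if sg.bits_any_sock is None:
        return set()  # no unconstrained GRES: allocate nothing
    return set(sorted(sg.bits_any_sock)[:needed])
```

Returning an empty set when a bitmap is missing keeps the allocation count at zero, which is the condition that avoids the later "GRES count underflow" error at deallocation time.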