Commits · dbf217497f73733e81db1d4a48e4d7ac20369576 · Manuel G. Marciani / ces_slurm_simulator

14 May, 2015 1 commit
- Update MemSpecLimit man documentation. · dbf21749
  Brian Christiansen authored May 14, 2015
  
  dbf21749
13 May, 2015 6 commits
- Add missing space · 44f2e6a4
  Danny Auble authored May 13, 2015
  
  44f2e6a4
- Additional debugging. · e8cb6276
  Brian Christiansen authored May 13, 2015
  
  e8cb6276
- Cray - Fix backup controller running native Slurm. · 5db73d97
  Brian Christiansen authored May 12, 2015
```
Bug 1627
```
  5db73d97
- Fix segfault when backup controller takes control for second time. · 54be6a4c
  Brian Christiansen authored May 12, 2015
  
  54be6a4c
- Ensure that xhash_free sets the hash table pointer to NULL. · 828e4d2d
  Brian Christiansen authored May 12, 2015
  
  828e4d2d
- Fix small memory leak in backup controller. · c77aa354
  Brian Christiansen authored May 12, 2015
  
  c77aa354
12 May, 2015 3 commits
- Update the sacctmgr man page. · 73fa2f58
  David Bigagli authored May 12, 2015
  
  73fa2f58
- Fix typo in previous commit · 721c7963
  Morris Jette authored May 11, 2015
  
  721c7963
- Load libtinfo as needed with ncurses tools · 584f4d68
  Morris Jette authored May 11, 2015
  
  584f4d68
11 May, 2015 2 commits
- Explain job option of --mem=0 means all memory · c0d6edc9
  Morris Jette authored May 11, 2015
```
This is a special case. This change documents the way Slurm has
always worked.
```
  c0d6edc9
- Purge old step data on job requeue · beecc7b0
  Morris Jette authored May 11, 2015

  Make sure that old step data is purged when a job is requeued.
  Without this logic, if a job terminates abnormally then old step
  data may be left in slurmctld. If the job is then requeued and
  started on a different node, referencing that old job step data
  can result in abnormal events. One specific failure mode is if
  the job is requeued on a node with a different number of cores,
  and the step terminated RPC arrives later, the job and step
  bitmaps of allocated cores can differ in size generating an
  abort.
  Bug 1660

  beecc7b0

08 May, 2015 4 commits
- Fix typo in commit 0896ca06 · 09ed0ad3
  Danny Auble authored May 08, 2015
  
  09ed0ad3
- Make sure each job has a wckey if that is something that is tracked. · 0896ca06
  Danny Auble authored May 08, 2015
  
  0896ca06
- Ensure that SLURM_JOB_NAME is in a job's allocation. · c4e0bd9d
  Brian Christiansen authored May 08, 2015
```
Bug 1618
```
  c4e0bd9d
- Preserve errno on execve() failure in task plugin · bf81e826
  Jonathon Nelson authored May 08, 2015
  
  bf81e826
07 May, 2015 4 commits
- Make full node reservations display correctly the core count instead of · 00099596
  Danny Auble authored May 07, 2015
```
cpu count.
```
  00099596
- Add contributor name · 6940e586
  Morris Jette authored May 07, 2015
  
  6940e586
- [PATCH 01/42] added missing -I compilation option · c91467fa
  Veronique Legrand authored May 07, 2015
  
  c91467fa
- Fix test12.4 to correctly fetch the user gid number. · d26ba8ee
  Nicolas Joly authored May 07, 2015
  
  d26ba8ee
06 May, 2015 2 commits
- Extend sleep in test for acctg info movement · b9674052
  Morris Jette authored May 06, 2015
  
  b9674052
- BLUEGENE - Set DB2NOEXITLIST when starting the slurmctld daemon to avoid · bce4b80f
  Danny Auble authored May 06, 2015
```
random crashing in db2 when the slurmctld is exiting.

Signed-off-by: Danny Auble <da@schedmd.com>
```
  bce4b80f
05 May, 2015 1 commit
- Cray: Add plugstack.conf.template sample SPANK config · 22a0e5a5
  Morris Jette authored May 05, 2015
  
  22a0e5a5
01 May, 2015 3 commits
- Remove trailing white spaces. · 0a391233
  David Bigagli authored May 01, 2015
  
  0a391233
- Correct contributor site name · d5bcb034
  Morris Jette authored May 01, 2015
  
  d5bcb034
- Fix sshare --Users: Initialize options, add usage info · b0b5bf3e
  Jens Svalgaard Kohrt authored May 01, 2015
  
  b0b5bf3e
30 Apr, 2015 6 commits
- Change test to match new scancel error message · bdf48983
  Morris Jette authored Apr 30, 2015
  
  bdf48983
- Change slurmctld agent timeout · 98e08216
  Morris Jette authored Apr 30, 2015
```
In slurmctld communication agent, make the thread timeout be the configured
value of MessageTimeout (or 30 seconds, whichever is larger) rather than
30 seconds.
```
  98e08216
- Fix typo in comments, no change in logic · 2f4b15d2
  Morris Jette authored Apr 30, 2015
  
  2f4b15d2
- Fix scancel step cancel bug · 5cb067fc
  Morris Jette authored Apr 30, 2015
```
Fix scancel bug which could return an error on attempt to signal a job step.
A simple "scancel 12.3" to signal a specific job step would fail. Adding
another option (say "-i", "--partition=", etc.) would avoid the failure.
```
  5cb067fc
- Improve the if logic · b4d68aac
  David Bigagli authored Apr 29, 2015
  
  b4d68aac
- Initialize variables to prevent core dump. · 613afa0b
  David Bigagli authored Apr 29, 2015
  
  613afa0b
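
The agent timeout rule described in commit 98e08216 above can be sketched in a few lines. This is an illustrative Python sketch, not Slurm's C code: the function name `agent_thread_timeout` and the constant `AGENT_TIMEOUT_FLOOR` are hypothetical, and only the "MessageTimeout or 30 seconds, whichever is larger" rule comes from the commit message.

```python
# Hypothetical name; 30 seconds is the floor stated in the commit message.
AGENT_TIMEOUT_FLOOR = 30


def agent_thread_timeout(message_timeout: int) -> int:
    """Return the agent thread timeout in seconds.

    Uses the configured MessageTimeout value, but never less than 30 seconds.
    """
    return max(message_timeout, AGENT_TIMEOUT_FLOOR)


if __name__ == "__main__":
    for configured in (10, 30, 60):
        print(configured, "->", agent_thread_timeout(configured))
```

With this rule, a site that raises MessageTimeout above 30 seconds gets a correspondingly longer agent thread timeout, while smaller values keep the previous 30-second behaviour.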
29 Apr, 2015 7 commits
- Parse "#_*" as all tasks in job array · 4c9f70b0
  Morris Jette authored Apr 29, 2015

  Modify slurmctld's parsing of a job_id string for the job_signal and
  job_requeue calls to treat a job ID value of "#_*" as representing
  all tasks in job ID number "#". Previously this was treated as invalid input.

  Also set the last_job_update time so that if a pending job is killed,
  then that is reported immediately by "squeue -i#" (previously it
  could keep reporting stale data).

  4c9f70b0
- Minor update to mailing list page · 82586d86
  Morris Jette authored Apr 29, 2015
```
Trying to avoid having technical questions sent to "sales@schedmd.com"
```
  82586d86
- Add link to NetBSD download package · 4ae726d2
  Morris Jette authored Apr 29, 2015
  
  4ae726d2
- Minor improvement in sched_min_interval logic · c74c5ff1
  jette authored Apr 28, 2015

  This prevents the queued scheduling thread from starting
  if the main scheduling loop is still running.

  c74c5ff1
- Revert "sad" · f70b7704
  Danny Auble authored Apr 24, 2015

  This reverts commit f9ebf5ad.

  Conflicts:
  	src/plugins/select/alps/basil_interface.c

  f70b7704
- ALPS - Have the slurmstepd running a batch job wait for an ALPS release · 2eefdbd6
  Danny Auble authored Apr 28, 2015
```
before ending the job.
```
  2eefdbd6
- ALPS - Add new cray.conf variable NoAPIDSignalOnKill. When set to yes this · d4d64877
  Danny Auble authored Apr 28, 2015

  will make it so the slurmctld will not signal the APIDs in a batch job.
  Instead it relies on the RPC coming from the slurmctld to kill the job to
  end things correctly.

  d4d64877
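
The job ID parsing change in commit 4c9f70b0 above can be illustrated with a small sketch. It is written in Python for brevity and is not the slurmctld parser: `JobSpec` and `parse_job_id` are hypothetical names, and only three forms are handled ("#", "#_#" and "#_*"), which is an assumption made to keep the example short.

```python
import re
from typing import NamedTuple, Optional


class JobSpec(NamedTuple):
    """Hypothetical result type for this sketch."""
    job_id: int
    task_id: Optional[int]  # a single array task, if one was named
    all_tasks: bool         # True only for the "#_*" form


# A job number, optionally followed by "_<task number>" or "_*".
_JOB_ID_RE = re.compile(r"(\d+)(?:_(\d+|\*))?")


def parse_job_id(text: str) -> JobSpec:
    """Parse "#", "#_#" or "#_*"; raise ValueError for anything else."""
    match = _JOB_ID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid job ID: {text!r}")
    job_id = int(match.group(1))
    task = match.group(2)
    if task is None:
        return JobSpec(job_id, None, False)
    if task == "*":
        # The form this commit makes valid: every task of the job array.
        return JobSpec(job_id, None, True)
    return JobSpec(job_id, int(task), False)
```

In this sketch `parse_job_id("123_*")` selects all tasks of job array 123, whereas an input such as `"123_"` is still rejected as invalid.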

28 Apr, 2015 1 commit
- Change SchedulerParameters sched_min_interval meaning · 5ab69ccb
  Morris Jette authored Apr 28, 2015

  Make this be the minimum time between the end of one scheduling
  cycle and the start of the next cycle (rather than using start
  times for both).
  Set the default value to 1,000,000 microseconds for Cray/ALPS
  systems.

  5ab69ccb
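
The difference between the old and new meaning of sched_min_interval in commit 5ab69ccb can be shown with a short sketch. This is an illustrative Python example under stated assumptions, not the scheduler's code: the function name `may_start_cycle` and its parameters are hypothetical, and times are plain integers in microseconds, matching the unit used in the commit message.

```python
def may_start_cycle(now_usec: int, reference_usec: int,
                    sched_min_interval_usec: int) -> bool:
    """Return True once sched_min_interval has elapsed since the reference time.

    Old meaning: reference_usec is the START of the previous cycle.
    New meaning: reference_usec is the END of the previous cycle.
    """
    return now_usec - reference_usec >= sched_min_interval_usec


if __name__ == "__main__":
    interval = 1_000_000      # the Cray/ALPS default from the commit message
    prev_start = 0
    prev_end = 800_000        # assumed: the previous cycle ran for 0.8 s
    now = 1_000_000

    # Old meaning: measured from the previous start, so a new cycle may begin.
    print(may_start_cycle(now, prev_start, interval))
    # New meaning: measured from the previous end, so it must wait longer.
    print(may_start_cycle(now, prev_end, interval))
```

Measuring from the end of the previous cycle guarantees an idle gap of at least sched_min_interval between cycles, even when a cycle itself takes a long time.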