Commits · f5e3ef3d13a9db2ec50bddbace10dde257d13008 · Manuel G. Marciani / ces_slurm_simulator

27 Jan, 2012 2 commits

Fix typo in accounting when using reservations. Patch from Alejandro Lucero Palau. · 92487dec
Danny Auble authored Jan 27, 2012
92487dec

Fix slurmd/slurmstepd deadlock condition · 3579aa43

Morris Jette authored Jan 26, 2012

This patch was previously applied to SLURM v2.4 and is being back-ported
due to problems being reported in SLURM v2.3. Original commit is here
https://github.com/SchedMD/slurm/commit/4c0eea7b8c20ccb1cacad51838a1ea8257cc637d

3579aa43

25 Jan, 2012 1 commit

Set DEFAULT flag in partition structure · 9f4ef925

Morris Jette authored Jan 24, 2012

Set DEFAULT flag in partition structure when slurmctld reads the
configuration file. Patch from Rémi Palancher. Note the flag is set
when the information is sent via RPC for sinfo.

9f4ef925

24 Jan, 2012 1 commit
- Start v2.3.4 NEWS · 10fcf40e
  Morris Jette authored Jan 24, 2012
  
  10fcf40e
22 Jan, 2012 1 commit

Fix for job_cnt_comp underflow errors · 3c839428

jette authored Jan 21, 2012

Fix race condition that could generate job_cnt_comp underflow errors on
front-end architectures (Cray or IBM BlueGene systems).

3c839428

20 Jan, 2012 1 commit

Fix for segv in slurmctld dependency processing · 49ecf2d0

Morris Jette authored Jan 20, 2012

Fix for possible invalid memory reference in slurmctld in job dependency
logic. Patch from Carles Fenoy (Barcelona Supercomputer Center).

49ecf2d0

19 Jan, 2012 1 commit
- Fix PrivateFlags bug when using Priority Multifactor plugin. If using sprio, all jobs would be returned even if the flag was set. · 854a2025
  Danny Auble authored Jan 19, 2012
```
Patch from Bill Brophy, Bull.
```
  854a2025
18 Jan, 2012 2 commits

Correction to --switch option implementation · 8f1d9b57

Morris Jette authored Jan 18, 2012

Fix bug in --switch option with topology resulting in bad switch count use.
Patch from Alejandro Lucero Palau (Barcelona Supercomputer Center).

8f1d9b57

Fix for possible deadlock in accounting logic · 4c0eea7b
Morris Jette authored Jan 18, 2012
```
Avoid calling jobacct_gather_g_getinfo() until there is data to read from the socket.
```
4c0eea7b

15 Jan, 2012 1 commit
- Note nature of latest patches · c8948368
  jette authored Jan 15, 2012
  
  c8948368
14 Jan, 2012 1 commit
- Have sacctmgr remove user records when no associations exist for that user. · 0fe8e29f
  Danny Auble authored Jan 13, 2012
  
  0fe8e29f
13 Jan, 2012 3 commits
- Fix for sacct printing CPUTime(RAW) where the value is greater than a 32-bit number. · adf582b0
  Danny Auble authored Jan 13, 2012
  adf582b0
- minor updates for latest commit · 08854a56
  Morris Jette authored Jan 13, 2012
  
  08854a56
- Let operators see reservation data even if private · 4c24fd7d
  Morris Jette authored Jan 12, 2012
```
Let operators see reservation data even if "PrivateData=reservations" flag
is set in slurm.conf. Patch from Don Albert, Bull.
```
  4c24fd7d
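
The CPUTime(RAW) fix in adf582b0 above concerns values that no longer fit in 32 bits. The following is a minimal sketch of why that matters, assuming CPU time is computed as elapsed seconds multiplied by the allocated CPU count; the function names are hypothetical and this is not the sacct code itself.

```python
UINT32_MAX = 2**32 - 1


def cpu_time_raw(elapsed_seconds, allocated_cpus):
    """CPU time in seconds, assumed here to be elapsed time times CPU count."""
    return elapsed_seconds * allocated_cpus


def truncate_to_uint32(value):
    """Simulate storing a value in an unsigned 32-bit field (hypothetical helper)."""
    return value & UINT32_MAX


# Illustrative job: 30 days of wall time on 2048 CPUs.
raw = cpu_time_raw(30 * 24 * 3600, 2048)

print(raw > UINT32_MAX)                # the product exceeds the 32-bit range
print(truncate_to_uint32(raw) == raw)  # so a 32-bit field cannot hold it intact
```

Large, long-running jobs reach this range quickly, which is presumably why the raw value needs a wider integer type when printed.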
09 Jan, 2012 2 commits

Fix bug in srun --multi-prog configuration file · f59f6a27

Morris Jette authored Jan 09, 2012

Fix bug in srun --multi-prog configuration file to avoid printing duplicate
record error when "*" is used at the end of the file for the task ID. It
means all task IDs not otherwise identified.

f59f6a27
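
The "*" rule described in f59f6a27 above can be sketched as follows. This is an illustrative parser, not the srun implementation: `parse_multi_prog` is a hypothetical helper, and the accepted syntax (single IDs, comma lists, ranges such as `1-3`, and a `*` catch-all) is an assumption made for the example.

```python
def parse_multi_prog(text, ntasks):
    """Map each task ID in range(ntasks) to a command line.

    Hypothetical sketch: "*" means every task ID not otherwise identified,
    so it must not trigger a duplicate-record error.
    """
    assigned = {}
    fallback = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 1)
        if len(fields) != 2:
            raise ValueError(f"missing command in line: {line!r}")
        ids, command = fields
        if ids == "*":
            fallback = command  # applied later to the unassigned task IDs only
            continue
        for part in ids.split(","):
            low, _, high = part.partition("-")
            for task in range(int(low), int(high or low) + 1):
                if task in assigned:
                    raise ValueError(f"duplicate record for task {task}")
                assigned[task] = command
    result = {}
    for task in range(ntasks):
        if task in assigned:
            result[task] = assigned[task]
        elif fallback is not None:
            result[task] = fallback
        else:
            raise ValueError(f"no program configured for task {task}")
    return result


config = "0 ./master\n1-2 ./worker\n* ./idle\n"
print(parse_multi_prog(config, 5))
```

Treating "*" as a fallback that is resolved after all explicit records have been read is what keeps it from colliding with task IDs that were already assigned.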

Fix possible slurmd deadlock from sbcast command. · cb3b9fb5

Morris Jette authored Jan 09, 2012

Fix race condition where sbcast command can result in deadlock of slurmd
daemon. Patch by Don Albert, Bull.

cb3b9fb5

04 Jan, 2012 1 commit

Made squeue -n and -w options more consistent · 15b47474

jette authored Jan 03, 2012

Made squeue -n and -w options more consistent with salloc, sbatch, srun,
and scancel. Patch by Don Lipari, LLNL.

15b47474

28 Dec, 2011 2 commits
- BLUEGENE - Added DefaultConnType to the bluegene.conf file. · d0704321
  Danny Auble authored Dec 28, 2011
```
This lets you specify any connection type (TORUS or MESH) as the default
in dynamic mode. Previously it always defaulted to TORUS.
```
  d0704321
- Permit gres count configuration of zero. · 0d779c41
  Morris Jette authored Dec 28, 2011
  
  0d779c41
27 Dec, 2011 1 commit

Add new command, sdiag · 4fdf2742

jette authored Dec 26, 2011

Add new command, sdiag, which reports a variety of job scheduling
statistics. Based upon work by Alejandro Lucero Palau, BSC.

4fdf2742

21 Dec, 2011 1 commit
- Modify PAM module to use same libslurm as built with · d46b33f6
  Morris Jette authored Dec 20, 2011
  
  d46b33f6
19 Dec, 2011 2 commits
- Fix bug in sview layout if node count less than configured grid_x_width. · be1f9868
  Morris Jette authored Dec 19, 2011
  
  be1f9868
- Modify srun --multi-prog argument processing · 47f6502e
  Morris Jette authored Dec 19, 2011
```
Behavior of srun --multi-prog modified so that any program arguments
specified on the command line will be appended to the program arguments
specified in the program configuration file.
```
  47f6502e
17 Dec, 2011 1 commit
- Note recent code changes · f455c48a
  Morris Jette authored Dec 16, 2011
  
  f455c48a
16 Dec, 2011 1 commit
- Fix man2html process to compile in the build directory instead of the source dir. · 66518aa6
  Danny Auble authored Dec 16, 2011
  66518aa6
15 Dec, 2011 1 commit

Prevent resetting a held job's priority · fa477448

Morris Jette authored Dec 14, 2011

Prevent resetting a held job's priority when updating other job parameters.
Patch from Alejandro Lucero Palau, BSC.

fa477448

14 Dec, 2011 2 commits
- Handle numeric suffix of "T" for terabyte units · f58a563f
  Morris Jette authored Dec 14, 2011
```
Patch from John Thiltges, University of Nebraska-Lincoln.
```
  f58a563f
- BGQ - more thorough handling of blocks with multiple jobs running on them. · 8b0aaa95
  Danny Auble authored Dec 13, 2011
  
  8b0aaa95
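
The "T" suffix handling noted in f58a563f above can be illustrated with a small sketch. It assumes binary multiples (1 K = 1024) and the suffix set K, M, G, T; both the helper name `parse_count` and those choices are assumptions for illustration, not the actual SLURM parsing code.

```python
# Assumed multipliers: powers of 1024, with "T" as the newly handled suffix.
_MULTIPLIERS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_count(text):
    """Parse an integer with an optional K/M/G/T suffix (hypothetical helper)."""
    text = text.strip()
    if not text:
        raise ValueError("empty value")
    suffix = text[-1].upper()
    if suffix in _MULTIPLIERS:
        return int(text[:-1]) * _MULTIPLIERS[suffix]
    return int(text)


print(parse_count("2T") == 2 * 1024**4)
print(parse_count("16"))
```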
13 Dec, 2011 1 commit
- BGQ - handle deadlock issue when a nodeboard goes into an error state. · 0d1a504b
  Danny Auble authored Dec 13, 2011
  
  0d1a504b
09 Dec, 2011 8 commits
- Add slashes in front of derived exit code when modifying a job. · 42b72f63
  Danny Auble authored Dec 09, 2011
  
  42b72f63
- Fixed issue with comment field being used in a job finishing before it starts in accounting. · 76381a75
  Danny Auble authored Dec 09, 2011
  76381a75
- Fixed issue with QOS preemption when adding new QOS. · 63791053
  Danny Auble authored Dec 09, 2011
  
  63791053
- Terminate job/step if srun process killed · 6aead2dd
  Morris Jette authored Dec 09, 2011
```
Add an srun shepherd process to cancel a job and/or step if the srun process
is killed abnormally (e.g. SIGKILL).
```
  6aead2dd
- Add slashes in front of derived exit code when modifying a job. · fca0660c
  Danny Auble authored Dec 09, 2011
  
  fca0660c
- Fixed issue with comment field being used in a job finishing before it starts in accounting. · a178318f
  Danny Auble authored Dec 09, 2011
  a178318f
- Fixed issue with QOS preemption when adding new QOS. · 614cd5fb
  Danny Auble authored Dec 09, 2011
  
  614cd5fb
- sacct search for jobs using filtering was ignoring wckey filter. · 66d68934
  Morris Jette authored Dec 09, 2011
  
  66d68934
08 Dec, 2011 2 commits
- BGQ - handle preemption · 1276285f
  Danny Auble authored Dec 07, 2011
  
  1276285f
- BLUEGENE - Fixed preemption issue. · bcc3c6a9
  Danny Auble authored Dec 07, 2011
  
  bcc3c6a9
06 Dec, 2011 1 commit

Permit pending job to exceed partition limit with QOS flag change. · 0e1abeda

Morris Jette authored Dec 06, 2011

One of our testers discovered a regression in version 2.3.1.  If a job is
pending due to PartitionNodeLimit and the limit is relieved with a
'sacctmgr modify qos name=<qos name> set flags=partitionmaxnodes' new jobs
exceeding the partition limit (but not the QOS limit) are allowed to run.
However, the pending job is never allowed to run.  Attached is a patch to
address this problem.  FYI, this problem doesn't exist in version 2.4.
Patch from Bill Brophy, Bull.

0e1abeda