- 10 Jan, 2013 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
- 09 Jan, 2013 14 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Nathan Yee authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
David Bigagli authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
allow the use of accounting features like associations, QOS, and limits, but do not keep track of jobs or steps in accounting.
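A hedged sketch of how this mode might look in slurm.conf. The option names below are real Slurm configuration options, but whether this exact combination is what the commit enables is an assumption:

```
# slurm.conf fragment (assumption: enforce associations/QOS/limits while
# asking accounting not to record job or step records)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageEnforce=associations,limits,qos,nojobs,nosteps
```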
-
Danny Auble authored
-
Morris Jette authored
-
- 08 Jan, 2013 12 commits
-
-
Danny Auble authored
-
Morris Jette authored
Conflicts: testsuite/expect/test15.21
-
jette authored
-
jette authored
-
Morris Jette authored
-
Nathan Yee authored
-
Morris Jette authored
-
Rod Schultz authored
One of our testers has observed that when a long-running job continues to run after a maintenance reservation comes into effect, sinfo reports the node as being in the allocated state while scontrol shows it to be in the maintenance state. This can happen when a node is not completely allocated (select/cons_res, a partition which is not Shared=EXCLUSIVE, jobs allocated without --exclusive, or jobs that are allocated only some of the CPUs on a node).

Execution paths leading up to calls to node_state_string (slurm_protocol_defs.c) or node_state_string_compact in scontrol test for allocated_cpus less than total_cpus on the node and set the node state to MIXED rather than ALLOCATED, while similar paths in sinfo do not. I think this is probably a bug, since the MIXED state is defined, and it is desirable that both commands return the same result.

The problem can be fixed with two logic changes (in multiple places): 1) node_state_string and node_state_string_compact have to check for MIXED as well as ALLOCATED before returning the MAINT state. This means that the reported state for the node with the allocated job will be MIXED. 2) sinfo must also check allocated_cpus less than total_cpus and set the state to MIXED before calling either node_state_string or node_state_string_compact. The attached patch (against 2.5.1) makes these changes. The attached script is a test case.
-
Morris Jette authored
-
Morris Jette authored
Phase 1 of effort. See the -a/--array option in "man sbatch" for details. Creates job records using sbatch. Reports job arrays using scontrol or squeue. More work coming soon...
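A minimal job-script fragment showing how an array submission might look with this feature (the script name and output pattern are illustrative assumptions; --array, SLURM_ARRAY_TASK_ID, and the %A/%a filename patterns are documented sbatch features):

```bash
#!/bin/bash
# Submit as a ten-task array with:  sbatch array_job.sh
#SBATCH --array=0-9
#SBATCH --output=slurm-%A_%a.out   # %A = array job ID, %a = array index
echo "running array task ${SLURM_ARRAY_TASK_ID}"
```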
-
Danny Auble authored
today) so bitmaps are always 32 bits. If one would like to use 64-bit bitmaps, just #define USE_64BIT_BITSTR in config.h.
-
Danny Auble authored
instead of a large array. This appears to speed up the process a great deal; before, we were seeing times of over 6000 usecs just to memset the array for a 5D system. With this patch the whole process takes around 1000 usecs on average, with many runs well under that.
-
- 07 Jan, 2013 1 commit
-
-
Danny Auble authored
-
- 04 Jan, 2013 5 commits
-
-
jette authored
Make sure out of memory gets logged properly for slurmctld in the foreground.
Fix slurmd and slurmdbd to log out of memory to stdout in the foreground.
-
jette authored
-
Mark A. Grondona authored
The MPIRUN_PROCESSES variable set by the mpi/mvapich plugin is probably not needed for most, if not all, recent versions of mvapich. This environment variable also negatively affects job scalability, since its length is proportional to the number of tasks in a job; in fact, for very large jobs, the increased environment size can lead to failures in execve(2). Since MPIRUN_PROCESSES *might* be required in some older versions of mvapich, this patch disables the setting of that variable unless SLURM_NEED_MVAPICH_MPIRUN_PROCESSES is set in the job's environment. (Thus, by default MPIRUN_PROCESSES is disabled, but the old behavior may be restored by setting the environment variable above.)
-
jette authored
https://github.com/SchedMD/slurm
-
jette authored
-
- 03 Jan, 2013 3 commits
-
-
Morris Jette authored
Conflicts: META, NEWS
-
Morris Jette authored
-
Morris Jette authored
-