1. 13 Jul, 2012 6 commits
    • slurmd: set SLURM_CONF in prolog/epilog environment · b2b5b908
      Mark A. Grondona authored
      Set SLURM_CONF in the default prolog/epilog environment instead
      of only in the spank prolog/epilog environment.
      
      This change fixes a potential hang during spank prolog/epilog
      execution caused by the possibility of memory allocation after
      fork(2) and before exec(2) when invoking the slurmstepd spank
      prolog|epilog.
      
      It also has the benefit that SLURM commands used in prolog and
      epilog scripts will use the correct slurm.conf file.
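      To make the fork/exec-safety point concrete, here is a minimal,
      purely hypothetical sketch (build_env() and run_script() are
      illustrative names, not slurmd's actual functions): the environment,
      including SLURM_CONF, is fully built in the parent, so the child does
      nothing between fork(2) and exec(2) that could allocate memory.
      
          #include <stdlib.h>
          #include <unistd.h>
          
          extern char **environ;
          
          /* Any allocation (setenv may malloc) happens here, in the parent. */
          static char **build_env(const char *conf_path)
          {
              setenv("SLURM_CONF", conf_path, 1);
              return environ;
          }
          
          static int run_script(const char *path, char **env)
          {
              pid_t pid = fork();
          
              if (pid == 0) {
                  /* Child: nothing but exec; no malloc() after fork(). */
                  char *argv[] = { (char *) path, NULL };
                  execve(path, argv, env);
                  _exit(127);                /* exec failed */
              }
              return (pid > 0) ? 0 : -1;     /* parent, or fork failure */
          }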
    • slurmstepd: don't call exec if task fails to get notification from parent · 9006dda4
      Mark A. Grondona authored
      If exec_wait_child_wait_for_parent() fails for any reason, it is safer
      to abort immediately rather than proceed to execute the user's job.
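      A reduced, illustrative version of that child-side check follows;
      wait_for_parent() stands in for exec_wait_child_wait_for_parent()
      and is not the real slurmstepd code:
      
          #include <errno.h>
          #include <unistd.h>
          
          /* Block until the parent writes one byte on the notification pipe. */
          static int wait_for_parent(int parentfd)
          {
              char c;
              ssize_t n;
          
              do {
                  n = read(parentfd, &c, sizeof(c));
              } while (n < 0 && errno == EINTR);
          
              return (n == 1) ? 0 : -1;
          }
          
          static void child_exec(int parentfd, char *const argv[], char *const envp[])
          {
              if (wait_for_parent(parentfd) < 0)
                  _exit(127);        /* abort: do not run the user's job */
              execve(argv[0], argv, envp);
              _exit(127);            /* exec itself failed */
          }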
    • slurmstepd: Kill remaining children if fork fails · 5b8dba9e
      Mark A. Grondona authored
      On a failure of fork(2), slurmstepd would print an error and exit,
      possibly leaving previously forked children waiting.
      
      Ensure a better cleanup by killing all active children on fork failure
      before exiting slurmstepd.
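      In outline, the cleanup amounts to something like the sketch below;
      the exec_wait_info layout and kill_forked_children() are simplified
      stand-ins, not the actual slurmstepd structures:
      
          #include <signal.h>
          #include <sys/types.h>
          
          struct exec_wait_info {
              pid_t pid;        /* forked child, or -1 if fork never happened */
              int   parentfd;   /* write end used to release the child */
              int   childfd;    /* read end the child blocks on */
          };
          
          /* On fork(2) failure, kill every child forked so far instead of
           * exiting and leaving them blocked on their notification pipes. */
          static void kill_forked_children(struct exec_wait_info *ei, int ntasks)
          {
              for (int i = 0; i < ntasks; i++) {
                  if (ei[i].pid > 0)
                      kill(ei[i].pid, SIGKILL);
              }
          }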
    • slurmstepd: Close childfd of exec_wait_info in parent · eca089e3
      Mark A. Grondona authored
      Close the read end of the pipe slurmstepd uses to notify children
      that it is time to call exec(2), in order to save one file descriptor
      per task. (Previously, the read side of the pipe wasn't closed until
      the exec_wait_info was destroyed.)
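      A sketch of that descriptor handling, under the assumption of a
      simplified exec_wait_info (the real code lives in src/slurmd/slurmstepd/):
      after fork(2) the parent closes the child's read end right away instead
      of holding it until the structure is destroyed, saving one descriptor
      per task.
      
          #include <sys/types.h>
          #include <unistd.h>
          
          struct exec_wait_info {
              pid_t pid;
              int   parentfd;   /* parent writes here to release the child */
              int   childfd;    /* child reads here until told to exec */
          };
          
          static int exec_wait_fork_child(struct exec_wait_info *ei)
          {
              int fds[2];
          
              if (pipe(fds) < 0)
                  return -1;
              ei->childfd  = fds[0];
              ei->parentfd = fds[1];
          
              ei->pid = fork();
              if (ei->pid == 0) {
                  close(ei->parentfd);   /* child keeps only the read end */
                  /* ... wait for the parent's notification, then exec ... */
                  _exit(0);
              }
              if (ei->pid > 0) {
                  close(ei->childfd);    /* parent no longer needs the read end */
                  ei->childfd = -1;
                  return 0;
              }
              return -1;                 /* fork failed */
          }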
    • squeue: report number of nodes in completing for completing jobs · 2ddc6e70
      Mark A. Grondona authored
      For some reason squeue was treating completing jobs the same as
      pending jobs and reported the number of nodes as the maximum of the
      requested nodelist size, the requested node count, or the requested
      CPUs (divided into nodes?).
      
      This contradicts the squeue man page, which explicitly states that
      the number of nodes reported for completing jobs should be only the
      nodes that are still allocated to the job.
      
      This patch removes the special handling of completing jobs in
      src/squeue/print.c:_get_node_cnt(), so that the squeue output for
      completing jobs matches the documentation. A comment is also added
      so that developers looking at the code understand what is going on.
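      The corrected behavior can be summarized with a small, purely
      illustrative stand-in for _get_node_cnt(); the field names below are
      assumptions, not the real slurm.h definitions:
      
          #include <stdint.h>
          
          /* Simplified stand-in for slurm's job_info_t. */
          typedef struct {
              int      completing;   /* nonzero once the job is COMPLETING */
              uint32_t num_nodes;    /* nodes currently allocated to the job */
              uint32_t max_nodes;    /* upper bound derived from the request */
          } job_info_stub_t;
          
          static uint32_t get_node_cnt(const job_info_stub_t *job)
          {
              /* Completing jobs: report only the nodes still allocated,
               * as documented in squeue(1). */
              if (job->completing)
                  return job->num_nodes;
          
              /* Pending jobs keep the estimate derived from the request. */
              return job->max_nodes ? job->max_nodes : job->num_nodes;
          }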
  2. 12 Jul, 2012 10 commits
  3. 11 Jul, 2012 4 commits
  4. 09 Jul, 2012 1 commit
  5. 06 Jul, 2012 1 commit
    • Fix for incorrect partition point for job · dd1d573f
      Carles Fenoy authored
      If a job is submitted to more than one partition, its partition
      pointer can be set to an invalid value. This can result in an
      incorrect count of CPUs allocated on a node, causing over- or
      under-allocation of its CPUs.
      Patch by Carles Fenoy, BSC.
      
      Hi all,
      
      After a tough day I've finally found the problem and a solution for 2.4.1.
      I was able to reproduce the described behavior by submitting jobs to two
      partitions. The job is then allocated in one partition, but in the schedule
      function the job's partition pointer is changed to the non-allocated one,
      which means the resources cannot be freed at the end of the job.
      
      I've solved this by moving the IS_PENDING test a few lines within the
      schedule function (job_scheduler.c).
      
      This is the code from the git HEAD (line 801). As this file has changed a
      lot since 2.4.x I have not made a patch, but I'm describing the solution
      here: I've moved the if (!IS_JOB_PENDING) test after the second line
      (part_ptr = ...). This prevents the job's partition pointer from being
      changed if the job is already starting in another partition.
      
          job_ptr = job_queue_rec->job_ptr;
          part_ptr = job_queue_rec->part_ptr;
          job_ptr->part_ptr = part_ptr;
          xfree(job_queue_rec);
          if (!IS_JOB_PENDING(job_ptr))
              continue;    /* started in other partition */
      Hope this is enough information to solve it.
      
      I've just realized (while writing this mail) that my solution has a memory leak as job_queue_rec is not freed.
      
      Regards,
      Carles Fenoy
  6. 04 Jul, 2012 1 commit
  7. 03 Jul, 2012 4 commits
  8. 02 Jul, 2012 5 commits
  9. 29 Jun, 2012 2 commits
  10. 28 Jun, 2012 2 commits
  11. 27 Jun, 2012 3 commits
  12. 26 Jun, 2012 1 commit