Commits · 5fe8c10f55982cd72806d678b5af403599193817 · Manuel G. Marciani / ces_slurm_simulator

22 May, 2012 1 commit
- Fix DefMemPerCPU for partition definitions · 5fe8c10f
  Danny Auble authored May 22, 2012
  
  5fe8c10f
16 May, 2012 2 commits
- updated for 2.3.6 (if it ever happens) · 7ca49560
  Danny Auble authored May 16, 2012
  
  7ca49560
- update META for new 2.3.5 tag · 616bd430
  Danny Auble authored May 16, 2012
  
  616bd430
09 May, 2012 2 commits

Clarify job step default gres allocation · 0ffeac6b
Morris Jette authored May 09, 2012

0ffeac6b

Reset priority of system held jobs when dependency is satisfied · bf9f2452

Don Lipari authored May 09, 2012

The symptom is that SLURM schedules lower priority jobs to run when higher priority, dependent jobs have their dependencies satisfied.  This happens because dependent jobs still have a priority of 1 when the job queue is sorted in the schedule() function.  The proposed fix forces jobs to have their priority updated when their dependencies are satisfied.

bf9f2452

07 May, 2012 1 commit

Job priority reset bug on slurmctld restart · 5e9dca41

Don Lipari authored May 07, 2012

Commit 8b14f388 of Jan 19, 2011 is causing problems with Moab cluster-scheduled machines.  In this case, Moab immediately hands off every submitted job to SLURM, where the job gets a zero priority.  Once Moab schedules the job, Moab raises the job's priority to 10,000,000 and the job runs.

When you happen to restart the slurmctld under such conditions, the sync_job_priorities() function runs, which attempts to raise job priorities into a higher range if they are getting too close to zero.  The problem as I see it is that you include the "boost" for zero priority jobs.  Hence the problem we are seeing is that once the slurmctld is restarted, a bunch of zero priority jobs are suddenly eligible.  The result is a disconnect between the top priority job Moab is trying to start and the top priority job SLURM sees.

I believe the fix is simple:

diff job_mgr.c~ job_mgr.c
6328,6329c6328,6331
<       while ((job_ptr = (struct job_record *) list_next(job_iterator)))
<               job_ptr->priority += prio_boost;
---
>       while ((job_ptr = (struct job_record *) list_next(job_iterator))) {
>               if (job_ptr->priority)
>                       job_ptr->priority += prio_boost;
>       }

Do you agree?

Don
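
The effect of the proposed change can be sketched in isolation. This is a minimal illustration, not the slurmctld code: `struct job` and `boost_priorities` are hypothetical stand-ins for `struct job_record` and the loop in `sync_job_priorities()`, and a plain array replaces the job list iterator.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct job_record; only the fields used here. */
struct job {
    uint32_t job_id;
    uint32_t priority;   /* 0 means the job is held */
};

/*
 * Add prio_boost to every job that has a non-zero priority.
 * Held jobs (priority 0) are skipped so they stay held.
 * Returns the number of jobs that were boosted.
 */
size_t boost_priorities(struct job *jobs, size_t n, uint32_t prio_boost)
{
    size_t boosted = 0;

    for (size_t i = 0; i < n; i++) {
        if (jobs[i].priority) {
            jobs[i].priority += prio_boost;
            boosted++;
        }
    }
    return boosted;
}
```

With jobs at priorities 0, 5 and 10,000,000 and a boost of 100, only the last two change; the zero priority job stays held, so it does not become eligible after a restart.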

5e9dca41

03 May, 2012 2 commits

Pick step's relative nodes based upon nodes allocated to job, not nodes available to job · 63833965
Matthieu Hautreux authored May 03, 2012

63833965

Fix segv in slurmctld for job step with relative option · 9bb178c3

Matthieu Hautreux authored May 03, 2012

Here is the way to reproduce it:
[root@cuzco27 georgioy]# salloc -n64 -N4 --exclusive
salloc: Granted job allocation 8
[root@cuzco27 georgioy]# srun -r 0 -n 30 -N 2 sleep 300 &
[root@cuzco27 georgioy]# srun -r 1 -n 40 -N 3 sleep 300 &
[root@cuzco27 georgioy]# srun: error: slurm_receive_msg: Zero Bytes were transmitted or received
srun: error: Unable to create job step: Zero Bytes were transmitted or received

9bb178c3

27 Apr, 2012 1 commit
- Fix minor issue where uid and gid were switched in sview for submitting · 8e5da472
  Danny Auble authored Apr 27, 2012
```
batch jobs.
```
  8e5da472
26 Apr, 2012 1 commit
- Revert commit 77645508 · 953fc030
  Morris Jette authored Apr 26, 2012
  
  953fc030
25 Apr, 2012 1 commit

Append "*" to default partition name with format and no size · 77645508

Don Albert authored Apr 25, 2012

There is a minor problem with the display of partition names in
"sinfo".  Without options, the partition name field displays an
asterisk "*" at the end of the name of the Default partition.  If you
specify a formatting option which contains the %P field specifier with
a width option (e.g., sinfo -o %8P) the asterisk also is appended to
the default partition name.  With no width option, the "%P" displays
the name based on the full length of the name string, however, no "*"
is appended on the default partition name.

The attached patch for version 2.4.0-pre4 corrects the problem so that
the "*" is correctly appended when %P with no width specifier is
used. The patch will also apply to version 2.3.4.

  -Don Albert-

77645508

24 Apr, 2012 1 commit
- Fix to job preemption logic to preempt multiple jobs at the same time. · 27155dc8
  Morris Jette authored Apr 24, 2012
  
  27155dc8
23 Apr, 2012 2 commits
- Avoid sched/wiki2 parsing problem if quotes in user working dir or wckey · cf81b117
  Morris Jette authored Apr 23, 2012
  
  cf81b117
- Add support for switches parameter to the job_submit/lua plugin · 50360372
  Par Andersson authored Apr 22, 2012
  
  50360372
20 Apr, 2012 1 commit
- CRAY - fix for handling memory requests from user for an allocation. · 5604c5b4
  Danny Auble authored Apr 20, 2012
```
Previously the code would come up with how much memory a PE should have
instead of the memory a node should have.
```
  5604c5b4
17 Apr, 2012 1 commit

fix sched/wiki2 (Moab) to support "#" in job record information · 6cd20848

Morris Jette authored Apr 16, 2012

Fix sched/wiki2 to support job account name, gres, partition name, wckey,
or working directory that contains "#" (a job record separator). Without
this patch, the parsing will probably stop once reaching the "#".

6cd20848

12 Apr, 2012 3 commits
- fixed bad endif for front-end systems · 066ab9cc
  Danny Auble authored Apr 12, 2012
  
  066ab9cc
- Better warning in the slurmd if running with FastSchedule=0 · 3a5a72eb
  Danny Auble authored Apr 12, 2012
```
and cons_res or gang scheduling.  The slurm.conf is always used for a node
configuration in this case, and it ignores what the actual hardware is
because in this situation the slurmctld needs to make a bunch of bitmaps
before the slurmd's register.
```
  3a5a72eb
- Fix issue where log message is more than 256 chars and then has a format · f9aa52fc
  Danny Auble authored Apr 12, 2012
  
  f9aa52fc
10 Apr, 2012 6 commits
- Comment use of --disable-salloc-background · 2ec730f6
  Morris Jette authored Apr 10, 2012
  
  2ec730f6
- Fix clearing of limit values if an admin removes the limit for max cpus · fd999b73
  Danny Auble authored Apr 10, 2012
```
and time limit where it was previously set by an admin.
```
  fd999b73
- Fix state restore of job limit set from admin value for min_cpus. · ae185ed8
  Danny Auble authored Apr 10, 2012
  
  ae185ed8
- Fix potential race condition if MinJobAge is very low (i.e. 1) and using · 0fed555a
  Danny Auble authored Apr 10, 2012
```
slurmdbd accounting and running large amounts of jobs (>50 per second).  Job
information could be corrupted before it had a chance to reach the DBD.
```
  0fed555a
- better debug in assoc_mgr for associations without access to their default · 780ca5bb
  Danny Auble authored Apr 10, 2012
```
qos.
```
  780ca5bb
- Revert d53b7c26 · 54e06e0f
  Morris Jette authored Apr 10, 2012
  
  54e06e0f
05 Apr, 2012 1 commit

Prevent users from extending the EndTime of running jobs · 62edab22

Don Lipari authored Apr 04, 2012

While safeguards are in place to prevent unauthorized users from extending the
TimeLimit of their running jobs, there were no such restrictions for extending
the EndTime. This patch adds the same constraints to modifying EndTime that
currently exist for modifying TimeLimit.

62edab22

03 Apr, 2012 1 commit

Limit depth of circular job dependency check · 0caecbc5

Morris Jette authored Apr 02, 2012

Add support for new SchedulerParameters of max_depend_depth defining the
maximum number of jobs to test for circular dependencies (i.e. job A waits
for job B to start and job B waits for job A to start). Default value is
10 jobs.
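
A depth-limited check of this kind can be sketched as follows. This is an illustration under simplifying assumptions, not the slurmctld implementation: each job waits for at most one other job, jobs are identified by array index, and `has_circular_dep`, `depends_on` and `NO_DEP` are hypothetical names. Only `max_depend_depth` comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker: the job does not wait for any other job. */
#define NO_DEP ((size_t)-1)

/*
 * depends_on[i] is the index of the job that job i waits for, or NO_DEP.
 * Follow the dependency chain from `start`, testing at most
 * max_depend_depth jobs. Returns true only if the chain leads back to
 * `start` within that limit.
 */
bool has_circular_dep(const size_t *depends_on, size_t n_jobs,
                      size_t start, size_t max_depend_depth)
{
    size_t cur = start;

    for (size_t tested = 0; tested < max_depend_depth; tested++) {
        if (cur >= n_jobs)          /* end of chain (also covers NO_DEP) */
            return false;
        cur = depends_on[cur];
        if (cur == start)
            return true;
    }
    return false;                   /* limit reached, stop searching */
}
```

The limit bounds the cost of the test and also guarantees termination when the chain runs into a loop that does not include the starting job.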

0caecbc5

02 Apr, 2012 1 commit
- Note gres File option does not support regular expressions. · fce94e9f
  Morris Jette authored Apr 02, 2012
  
  fce94e9f
30 Mar, 2012 1 commit
- Fixed moab_2_slurmdb.pl script to correctly work for end records. · 046a633b
  Danny Auble authored Mar 30, 2012
  
  046a633b
29 Mar, 2012 2 commits

Fix in select/cons_res+topology+job with node range count · f64b29a2

Morris Jette authored Mar 28, 2012

The problem was conflicting logic in the select/cons_res plugin. Some of the code was trying to give the job the maximum node count in the range, while other logic was trying to minimize spreading the job across multiple switches. As you note, this problem only happens when a range of node counts is specified with both the select/cons_res plugin and the topology/tree plugin in use, and even then it is not easy to reproduce (you included all of the details below).

Quoting Martin.Perry@Bull.com:

> Certain combinations of topology configuration and srun -N option produce
> spurious job rejection with "Requested node configuration is not
> available" with select/cons_res. The following example illustrates the
> problem.
>
> [sulu] (slurm) etc> cat slurm.conf
> ...
> TopologyPlugin=topology/tree
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core
> ...
>
> [sulu] (slurm) etc> cat topology.conf
> SwitchName=s1 Nodes=xna[13-26]
> SwitchName=s2 Nodes=xna[41-45]
> SwitchName=s3 Switches=s[1-2]
>
> [sulu] (slurm) etc> sinfo
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> ...
> jkob         up   infinite      4   idle xna[14,19-20,41]
> ...
>
> [sulu] (slurm) etc> srun -N 2-4 -n 4 -p jkob hostname
> srun: Force Terminated job 79
> srun: error: Unable to allocate resources: Requested node configuration is
> not available
>
> The problem does not occur with select/linear, or topology/none, or if -N
> is omitted, or for certain other values for -N (for example, -N 4-4 and -N
> 2-3 work ok). The problem seems to be in function _eval_nodes_topo in
> src/plugins/select/cons_res/job_test.c. The srun man page states that when
> -N is used, "the job will be allocated as many nodes as possible within
> the range specified and without delaying the initiation of the job."
> Consistent with this description, the requested number of nodes in the
> above example is 4 (req_nodes=4).  However, the code that selects the
> best-fit topology switches appears to make the selection based on the
> minimum required number of nodes (min_nodes=2). It therefore selects
> switch s1.  s1 has only 3 nodes from partition jkob. Since this is fewer
> than req_nodes the job is rejected with the "node configuration" error.
>
> I'm not sure where the code is going wrong.  It could be in the
> calculation of the number of needed nodes in function _enough_nodes.  Or
> it could be in the code that initializes/updates req_nodes or rem_nodes. I
> don't feel confident that I understand the logic well enough to propose a
> fix without introducing a regression.
>
> Regards,
> Martin
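
The mismatch described in the report can be reproduced with a small model. This is only a sketch of the symptom, not the code in `_eval_nodes_topo` and not the fix: `struct sw` and `best_fit_switch` are hypothetical, and each switch is reduced to the number of idle jkob nodes beneath it (s1: 3, s2: 1, s3: 4, taken from the sinfo and topology.conf output quoted above).

```c
#include <stddef.h>

/* Hypothetical model of a switch: name plus idle nodes beneath it. */
struct sw {
    const char *name;
    int avail_nodes;
};

/*
 * Best fit: return the index of the switch with the fewest available
 * nodes that still has at least `need` nodes, or -1 if none qualifies.
 */
int best_fit_switch(const struct sw *sws, size_t n, int need)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (sws[i].avail_nodes < need)
            continue;
        if (best < 0 || sws[i].avail_nodes < sws[best].avail_nodes)
            best = (int)i;
    }
    return best;
}
```

Selecting with `need` set to min_nodes (2) picks s1, which has only 3 nodes and so cannot satisfy req_nodes (4). Selecting with `need` set to req_nodes (4) picks s3, which can.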

f64b29a2

Format change, no change in logic · ebca432e
Morris Jette authored Mar 28, 2012

ebca432e

27 Mar, 2012 2 commits

Use site maximum for option switch wait time. · 85f8ac03

Morris Jette authored Mar 27, 2012

When the optional max_time is not specified for --switches=count, the site
max (SchedulerParameters=max_switch_wait=seconds) is used for the job.
Based on patch from Rod Schultz.
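
The defaulting rule can be sketched as below. The function name and the `WAIT_UNSPECIFIED` sentinel are hypothetical, chosen for illustration; how SLURM itself represents an omitted max_time is not shown in this commit message.

```c
#include <stdint.h>

/* Hypothetical sentinel: the job gave --switches=count with no max_time. */
#define WAIT_UNSPECIFIED UINT32_MAX

/*
 * Return the switch wait time (seconds) to use for a job: the job's own
 * max_time if it gave one, otherwise the site maximum configured through
 * SchedulerParameters=max_switch_wait.
 */
uint32_t effective_switch_wait(uint32_t job_max_time,
                               uint32_t site_max_switch_wait)
{
    if (job_max_time == WAIT_UNSPECIFIED)
        return site_max_switch_wait;
    return job_max_time;
}
```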

85f8ac03

Correction to init.d/slurmdbd exit code for status option · 471ba178
Morris Jette authored Mar 27, 2012
```
Patch by Bill Brophy, Bull.
```
471ba178

26 Mar, 2012 1 commit

Fixed the setting of SLURM_SUBMIT_DIR for Moab · a5d8962c

Morris Jette authored Mar 26, 2012

Patch by Don Lipari, LLNL.
https://github.com/chaos/slurm/commit/4de11bf0a8cd18207a60e7d3e1fa7a6fde0da431

a5d8962c

21 Mar, 2012 4 commits
- CRAY: Fix support for SlurmdTimeout=0 · 4dd9e697
  Morris Jette authored Mar 21, 2012
```
CRAY: Fix support for configuration with SlurmdTimeout=0 (never mark
node that is DOWN in ALPS as DOWN in SLURM).
```
  4dd9e697
- Minor test mods for old RedHat distro · 455283c2
  Morris Jette authored Mar 21, 2012
  
  455283c2
- make test work better on different systems · 47aebf2c
  Morris Jette authored Mar 21, 2012
  
  47aebf2c
- Modify Makefiles to support Hardening flags · a7e89e72
  Morris Jette authored Mar 20, 2012
  
  a7e89e72
20 Mar, 2012 2 commits
- Improve support for overlapping reservations · 73351553
  Morris Jette authored Mar 20, 2012
```
Improve support for overlapping advanced reservations.
Patch from Bill Brophy, Bull.
```
  73351553
- Merge pull request #13 from grondo/2.3-step-memcg-fixes · d835060d
  Morris Jette authored Mar 20, 2012
```
task/cgroup: minor job step memcg fixes
```
  d835060d