- 29 Jan, 2011 14 commits
-
-
Moe Jette authored
On Cray, wait_job means confirming the already existing ALPS reservation. This is handled already:
* for salloc, by select_g_job_ready() - hence no need to call it again;
* for batch jobs, it is done in the stepd manager.
Hence just print a warning to the user. 13_scontrol-no-wait_job.diff
-
Moe Jette authored
This adds support for execution of salloc on a local Cray system, disabling node sharing (still not supported on XT/XE). It further disables running salloc within salloc: since Cray uses process group / PAGG IDs for tracking its reservations, running salloc from within salloc invariably leads to an ALPS resource allocation error. Thirdly, it disables Cray node allocation on non-Cray systems, since this requires that the host on which salloc spawns the shell process is capable of Cray task launch. If it is not, the remote slurmctld will reserve the requested nodes, but the local host running salloc will neither be able to confirm the ALPS reservation (due to the absence of a local apbasil command), nor will it be able to run jobs on the compute nodes. To distinguish this case from general task launch (we use a frontend host where salloc could end up running jobs on different clusters, depending on the value exported via $SLURM_CONF), the following conditions are tested:
* Cray build support has been enabled (HAVE_CRAY);
* the loaded slurm.conf uses select/cray (required on Cray hosts);
* the local host does not have support for apbasil (HAVE_NATIVE_CRAY undefined).
Since the 'apbasil' command is only available on native Cray systems, this combination of conditions seems sufficient to prevent accidentally using salloc on a host which does not support it. (For sbatch the case is different, since the job script runs on the remote host.) 11_salloc.diff done with minor change for Cray emulation
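A minimal, self-contained sketch of the kind of compile-time plus run-time check these three conditions describe; the variable names, messages, and hard-coded configuration values are illustrative stand-ins, not the actual SLURM symbols.

    /* Sketch of the salloc guard described above: refuse to run when the
     * build has Cray support (HAVE_CRAY), slurm.conf selects select/cray,
     * but the local host has no apbasil support (HAVE_NATIVE_CRAY unset).
     * All names and values here are illustrative, not SLURM source. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define HAVE_CRAY 1             /* --enable-cray was used at build time */
    /* HAVE_NATIVE_CRAY left undefined: no local apbasil command            */

    static const char *select_type = "select/cray";  /* as read from slurm.conf */

    int main(void)
    {
    #if defined(HAVE_CRAY) && !defined(HAVE_NATIVE_CRAY)
        if (strcmp(select_type, "select/cray") == 0) {
            fprintf(stderr, "salloc: select/cray is configured, but this host "
                "cannot confirm ALPS reservations (no apbasil) - aborting\n");
            return EXIT_FAILURE;
        }
    #endif
        printf("salloc: proceeding with the allocation\n");
        return EXIT_SUCCESS;
    }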
-
Moe Jette authored
This puts the Basil inventory immediately before each (backfill) schedule. Having considered multiple alternatives, this is the most robust and least wasteful solution. The reason is that ALPS keeps internal node state, which can be changed
* by the administrator (xtprocadmin),
* by the node health checker programs (setting some nodes into 'suspect'),
* by ALPS itself.
Tracking this periodically, e.g. every HealthCheckInterval, may miss some state changes. The result would not be a crash, but a subsequently failed ALPS reservation, which would require undoing some of the slurm state. Also added the inventory to plugin/sched/wiki and wiki2 at get_node time. 09_Cray-INVENTORY-directly-before-schedule.diff
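A small sketch of the ordering this describes, refreshing the inventory immediately before every scheduling pass; the function names are placeholders, not the real plugin entry points.

    /* Sketch: refresh the ALPS/BASIL inventory immediately before every
     * (backfill) scheduling pass, so that node-state changes made by
     * xtprocadmin, the health checker, or ALPS itself are seen before any
     * reservation is attempted.  All names are placeholders, not SLURM code. */
    #include <stdio.h>
    #include <stdbool.h>
    #include <unistd.h>

    static bool run_basil_inventory(void)  /* placeholder for the BASIL INVENTORY call */
    {
        printf("refreshing ALPS inventory\n");
        return true;
    }

    static void backfill_schedule(void)    /* placeholder for the backfill pass */
    {
        printf("running backfill schedule\n");
    }

    int main(void)
    {
        for (int pass = 0; pass < 3; pass++) {
            /* Inventory first: a stale view here is what would otherwise
             * cause a later ALPS reservation to fail. */
            if (!run_basil_inventory())
                continue;       /* skip the pass rather than schedule on stale data */
            backfill_schedule();
            sleep(1);           /* stand-in for the scheduler interval */
        }
        return 0;
    }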
-
Moe Jette authored
select/cray: update compile-time and runtime support for Cray build. These changes update build support for Cray XT/XE:
1. renamed '--cray-xt' into '--cray', since XE systems are also supported;
2. autoconf rules to cover the various possible build cases:
   a) --enable-cray=off: HAVE_CRAY/HAVE_NATIVE_CRAY undefined,
   b) --enable-cray=on: HAVE_CRAY defined
      b1) local host is a native Cray system: HAVE_NATIVE_CRAY defined (requires installation of the mysql-devel and libexpat-devel packages),
      b2) local host is not a native Cray system: the conditionally built parts (basil_interface.c, libalps.la) are not built;
3. updated configure logic:
   - since Cray support depends on MySQL, reordered tests in configure.ac,
   - reordered logic with regard to the changes in (2),
   - an AM_CONDITIONAL to build the native-Cray parts conditionally,
   - updated configure messages (XT/XE);
4. run-time read_conf test to ensure use of select/cray is properly supported;
5. an update of the NEWS file due to the change in (1) ==> may conflict if you have a locally-updated copy.
I have compile-tested the three possible scenarios in (2).
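A standalone illustration of the three build cases in (2); the macro names follow the commit message, but the program itself is only a sketch, not SLURM source.

    /* Illustration of the build cases (a), (b1), (b2) described above.
     * Toggle the defines to emulate the corresponding ./configure result. */
    #include <stdio.h>

    #define HAVE_CRAY 1          /* --enable-cray=on                         */
    /* #define HAVE_NATIVE_CRAY 1   built on a native Cray (apbasil present) */

    int main(void)
    {
    #if !defined(HAVE_CRAY)
        puts("case (a): --enable-cray=off, no Cray support built");
    #elif defined(HAVE_NATIVE_CRAY)
        puts("case (b1): native Cray build, basil_interface.c / libalps.la built");
    #else
        puts("case (b2): Cray support without the native ALPS parts");
    #endif
        return 0;
    }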
-
Moe Jette authored
03_Bug-fix_slurmctld-swap-both-NodeAddr-and-NodeHostname-when-reordering.diff
-
Moe Jette authored
scontrol: warn user that the base node state cannot be changed on Cray. The base node state (UP, DOWN, ALLOCATED, ...) is handled by ALPS and inferred from reading the output of ALPS inventory requests. To avoid inconsistencies, it is not possible for a user to alter this node state. This patch adds a warning to scontrol if a user tries to change node state through slurm:
  palu> scontrol update NodeName=nid00171 State=DOWN
  State=DOWN can not be changed through slurm: use native Cray tools such as e.g. xtprocadmin(8)
The 'meta' states such as DRAIN can still be changed.
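A sketch of the distinction the message draws between the ALPS-managed base state and SLURM's 'meta' states; the state names accepted here and the helper function are illustrative, not the actual slurmctld/scontrol code.

    /* Sketch: reject changes to the ALPS-managed base state, but allow
     * 'meta' states such as DRAIN.  Illustrative only, not SLURM source. */
    #include <stdio.h>
    #include <string.h>

    static int cray_update_node_state(const char *new_state)
    {
        /* Meta states remain under SLURM's control (assumed set). */
        if (strcmp(new_state, "DRAIN") == 0 || strcmp(new_state, "RESUME") == 0)
            return 0;   /* accepted */

        /* Base states (UP, DOWN, ALLOCATED, ...) come from the ALPS inventory. */
        fprintf(stderr, "State=%s can not be changed through slurm: "
            "use native Cray tools such as e.g. xtprocadmin(8)\n", new_state);
        return -1;      /* rejected */
    }

    int main(void)
    {
        cray_update_node_state("DOWN");   /* rejected, prints the warning */
        cray_update_node_state("DRAIN");  /* accepted silently            */
        return 0;
    }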
-
Moe Jette authored
https://eris.llnl.gov/svn/slurm/trunk
This reverses some patches from Gerrit that were old; going to work forward now from the start.
-
Moe Jette authored
XT/XE systems, and build on native Cray XT/XE systems (auto-detected). Building on native Cray systems requires the cray-MySQL-devel-enterprise rpm and the expat XML parser library/headers.
select/cray: update compile-time and runtime support for Cray build. These changes update build support for Cray XT/XE:
1. renamed '--cray-xt' into '--cray', since XE systems are also supported;
2. autoconf rules to cover the various possible build cases:
   a) --enable-cray=off: HAVE_CRAY/HAVE_NATIVE_CRAY undefined,
   b) --enable-cray=on: HAVE_CRAY defined
      b1) local host is a native Cray system: HAVE_NATIVE_CRAY defined (requires installation of the mysql-devel and libexpat-devel packages),
      b2) local host is not a native Cray system: the conditionally built parts (basil_interface.c, libalps.la) are not built;
3. updated configure logic:
   - since Cray support depends on MySQL, reordered tests in configure.ac,
   - reordered logic with regard to the changes in (2),
   - an AM_CONDITIONAL to build the native-Cray parts conditionally,
   - updated configure messages (XT/XE);
4. run-time read_conf test to ensure use of select/cray is properly supported;
5. an update of the NEWS file due to the change in (1) ==> may conflict if you have a locally-updated copy.
I have compile-tested the three possible scenarios in (2).
-
Moe Jette authored
03_Cray-BASIL-node-ranking.diff
select/cray: perform node ranking. This supplies the select function-pointer to request a reordering of nodes based on the current Cray node ordering. The Cray node ordering is set internally via the ALPS_NIDORDER configuration variable, which controls the way ALPS considers nodes. This ordering in turn determines the order in which nodes subsequently appear in the Inventory output. The present patch exploits this fact and uses an auto-incrementing number to reflect the node ranking (counting is reversed since the parser returns the nodes in stack/LIFO order). The node ranking is performed on slurmctld (re-)configuration, hence the tests are more stringent: exit if the Inventory fails (this condition is extremely rare) or if no nodes are powered up (also a condition that can be cured by restarting slurmctld only when the system is ready).
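A small sketch of the reversed-counter ranking this describes: the parser returns the inventory nodes in stack (LIFO) order, so the rank counter runs backwards to recover the original ALPS ordering. The struct and list handling are simplified stand-ins, not the SLURM data structures.

    /* Sketch: assign node ranks in reverse while walking the LIFO-ordered
     * parse result, recovering the ALPS_NIDORDER ordering.  Illustrative. */
    #include <stdio.h>

    #define NUM_NODES 4

    struct node_entry {
        int nid;        /* Cray node id            */
        int node_rank;  /* ordering hint for SLURM */
    };

    int main(void)
    {
        /* Nodes as the parser returns them: last inventory entry first. */
        struct node_entry parsed[NUM_NODES] = {
            { .nid = 14 }, { .nid = 12 }, { .nid = 11 }, { .nid = 10 }
        };

        int rank = NUM_NODES;   /* auto-decrementing counter */
        for (int i = 0; i < NUM_NODES; i++)
            parsed[i].node_rank = --rank;   /* reversed: undoes the LIFO order */

        for (int i = 0; i < NUM_NODES; i++)
            printf("nid%05d -> rank %d\n", parsed[i].nid, parsed[i].node_rank);
        return 0;
    }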
-
Moe Jette authored
03_node-reordering-NodeHostName.diff
-
Moe Jette authored
02_Cray-BASIL-node-attributes-and-coordinates.diff
-
Moe Jette authored
02_salloc-no-node-update.diff
-
Moe Jette authored
01_Cray-BASIL-basic-support.diff plus 01_changes-from-first-revision-of-patch-01.diff
-
Moe Jette authored
eliminates some inappropriate error messages. 01_interactive-no-script.diff
-
- 28 Jan, 2011 3 commits
-
-
Moe Jette authored
expected start time was too far in the future for the backfill scheduler to compute.
-
Moe Jette authored
values as "*" rather than 65534. Patch from Rod Schulz, BULL.
-
Moe Jette authored
the job's priority using scontrol's "update jobid=..." rather than its "hold" or "holdu" commands.
-
- 27 Jan, 2011 3 commits
-
-
Moe Jette authored
-
Danny Auble authored
Fixed an issue when using an accounting storage plugin directly (without the slurmDBD): updates weren't always sent correctly to the slurmctld; appears to be OS dependent. Reported by Fredrik Tegenfeldt.
-
Moe Jette authored
clusters over 1000 nodes. Users can enable this by selecting the "Name" tab. This change dramatically improves scalability of sview.
-
- 26 Jan, 2011 3 commits
-
-
Don Lipari authored
-
Moe Jette authored
it were removed from the job's allocation. Now only the tasks on those nodes are terminated.
-
Danny Auble authored
Fix for checking whether a QOS overrides partition limits; previously, if not using a QOS, some limits would be overlooked.
-
- 25 Jan, 2011 2 commits
- 24 Jan, 2011 2 commits
-
-
Danny Auble authored
-
Moe Jette authored
-
- 22 Jan, 2011 1 commit
-
-
Moe Jette authored
Infrastructure in place plus logic in select/linear.
-
- 21 Jan, 2011 5 commits
-
-
Moe Jette authored
indicate job termination was due to preemption.
-
Moe Jette authored
"*** JOB 65547 CANCELLED AT 2011-01-21T12:59:33 DUE TO PREEMPTION ***".
-
Moe Jette authored
-
Danny Auble authored
Added flag "NoReserve" to a QOS to make it so all jobs are created equal within a QOS. So if larger, higher priority jobs are unable to run they don't prevent smaller jobs from running even if running the smaller jobs delay the start of the larger, higher priority jobs.
-
Danny Auble authored
BLUEGENE - fixed an issue where jobs wouldn't wait long enough for blocks to free and wanted to use blocks that were being freed for other jobs.
-
- 20 Jan, 2011 2 commits
-
-
Moe Jette authored
-
Danny Auble authored
BLUEGENE - This fixes a race condition in dynamic mode by copying the booted and job block lists before trying to create new blocks.
-
- 19 Jan, 2011 3 commits
- 18 Jan, 2011 1 commit
-
-
Danny Auble authored
-
- 15 Jan, 2011 1 commit
-
-
Danny Auble authored
-