Running an experiment at BSC

This documentation provides information for people running climate models at BSC. It is not meant to replace the autosubmit documentation, but to provide more local information, in particular to help newcomers to the climate prediction group run EC-Earth simulations.

Please feel free to complete and/or correct this documentation.
This documentation still needs to be checked and completed. It would be good if people added to this page some synthesised information about how to debug climate models at BSC. The part concerning the post-processing of experiments also needs to be completed.
Modification History:
Martin Ménégoz, 31/05/2016
Simon Wild, 09/04/2018

The user guide below is largely obsolete, but it still contains a helpful overview and is worth reading through. For a working tutorial and an up-to-date user guide, see https://earth.bsc.es/gitlab/es/auto-ecearth3/wikis/tutorials and https://earth.bsc.es/gitlab/es/auto-ecearth3/wikis/userguide

First step: Use autosubmit

To launch an experiment at BSC, for example a simulation with the ocean-atmosphere coupled version of EC-Earth, we have autosubmit at our disposal, a tool designed to run any model, and in particular climate models, on any machine. Use the autosubmit documentation to get an experiment ID, create an experiment with the model version that you need, and launch your experiment: http://autosubmit.readthedocs.io/en/latest/index.html. Autosubmit will provide you with a name for your new experiment.
The autosubmit tutorial under the above link needs some updates: the HPC name should be changed to 'marenostrum4' and the gitlab url needs to be updated to 'https://earth.bsc.es/gitlab/es/auto-ecearth3' (see NB3 below). More updates are likely necessary.

To load autosubmit on the Earth infrastructure:

module load autosubmit

NB1: the information needed by autosubmit to prepare an experiment (model version, HPC, etc.) is set in the file expdef_${exp}.conf. In particular, you have to indicate which model sources you want to use. At BSC, each model version generally lives in its own git project, for example https://earth.bsc.es/gitlab/es/auto-ecearth3 for the 2016 version.

NB2: do not launch an experiment at BSC from your local machine, but from the machine bscesautosubmit01 (to avoid overloading your own machine). To open a connection to this machine, type ssh -XY bscesautosubmit01.

NB3: MareNostrum users: auto-ecearth3 assumes that you have an SSH alias for connecting with a different user depending on the project to which the consumed hours are charged. If your project is bsc32 and your user is bsc32704, add the following lines to your .ssh/config:

Host mn-bsc32
HostName mn1.bsc.es
User bsc32704
IdentityFile ~/.ssh/id_rsa  

Make sure you add all MN login nodes to the config file. See also here: https://earth.bsc.es/gitlab/es/auto-ecearth3/wikis/ssh.
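When several login nodes have to be declared, it can help to generate the blocks rather than type them by hand. The following is only a minimal sketch: the function name ssh_host_block is hypothetical, and the host names and aliases of the other login nodes are not given on this page, so they must come from the wiki page linked above.

```shell
# Print one ~/.ssh/config block.
# ssh_host_block is a hypothetical helper, not part of auto-ecearth3.
# Arguments: alias, host name, user, identity file (optional).
ssh_host_block() {
    identity="$4"
    # Default taken from the example above; ssh expands the literal "~".
    [ -n "$identity" ] || identity='~/.ssh/id_rsa'
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    IdentityFile %s\n' "$identity"
}

# Reproduces the block shown above (project bsc32, user bsc32704).
ssh_host_block mn-bsc32 mn1.bsc.es bsc32704
```

Redirect the output with >> ~/.ssh/config once it looks right, and call the function again for each additional login node.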

NB4: If, when creating an experiment, you receive an error saying that a template, repository, etc. from gitlab cannot be found, it is most likely a permission issue.

Second step: configuration

Then, you have to configure your experiment. After the creation of your experiment, you will find a new directory corresponding to your experiment in /esnas/autosubmit/ including:

* pkl/: a directory used by autosubmit to “store” the workflow of your experiment.

* plot/: a directory where autosubmit generates plots to monitor your experiment.

* tmp/: a temporary directory where autosubmit runs the local setup of your experiment. Look in this directory if the Local Setup of your experiment fails.

* conf/: contains the configuration files that you need to fill in before running your experiment:

  • expdef_${exp}.conf: the definition of your experiment. Here you define the start dates, the number and length of chunks, and the model version you want to use.
  • autosubmit_${exp}.conf: autosubmit information. Generally, you will not modify this file, which defines in particular the version of autosubmit that you are using.
  • jobs_${exp}.conf: the information required by autosubmit when it submits your jobs (requested time, total number of processors). For EC-Earth versions earlier than EC-Earth 3.2Beta, you can use this table to find the number of processors and the wall clock time to set for each job: https://earth.bsc.es/wiki/doku.php?id=scaling:coupled_ec_earth . For EC-Earth 3.2Beta: https://earth.bsc.es/wiki/doku.php?id=scaling:coupled_ec_earth_3.2_beta .
  • platforms_${exp}.conf: information on how the computing time for your simulation is consumed, that is, which project the simulation is charged to. You can specify a test queue if you want to test a new development that you are implementing.
  • proj_${exp}.conf: resolution, number of processors for each component of the model, initial conditions, and options of the different components (atmospheric and oceanic models, coupling interface).

* proj: contains the directory model/ with all the source files of your climate model. This directory also contains the template files. There is a version of the templates for each HPC and each model version, since these files are used to pass the information from the configuration files (in the conf/ directory) to the code during a simulation. You will also find the different namelists defining the values of the flags or variables used by the different components of EC-Earth. For example, the following file sets up the cloud physical properties needed by the physical parametrisations in IFS:
/esnas/autosubmit/[exp]/git/model/sources/sources/ifs-36r4/src/ifs/phys_ec/sucldp.F90
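Before filling in the configuration, it can be useful to confirm that the five files listed under conf/ actually exist for your experiment. The sketch below assumes the layout described above (/esnas/autosubmit/<exp>/conf/); the function name check_conf_files is hypothetical and not part of autosubmit.

```shell
# Report which of the expected configuration files are missing.
# check_conf_files is a hypothetical helper, not an autosubmit command.
# Arguments: experiment ID, root directory (optional).
check_conf_files() {
    exp="$1"
    root="${2:-/esnas/autosubmit}"
    missing=0
    for prefix in expdef autosubmit jobs platforms proj; do
        file="$root/$exp/conf/${prefix}_${exp}.conf"
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every file was found.
    [ "$missing" -eq 0 ]
}
```

Call it as check_conf_files followed by your experiment ID; it prints nothing and returns 0 when all five files are present.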

Test Suite

For a first attempt, it may be worth copying the .conf files from a test experiment with bsc_trunk: https://earth.bsc.es/gitlab/es/auto-ecearth3/wikis/bsc_trunk.

Small configuration issues - May 2016

* The BSC network is too slow for the current version of autosubmit. To avoid premature failures, you have to set the following in your .bashrc:
export SAGA_PTY_SSH_TIMEOUT=90
* When running a model version older than EC-Earth3.1f on ecmwf or marenostrum, you have to create a directory named after your experiment in your scratch space /gpfs/scratch/bsc32/bsc[your_account] before launching your experiment. Alternatively, modify the localsetup.sh plugin by putting a “/” at the end of DEST=$SCRATCH_DIR/$HPCPROJ/$HPCUSER/$EXPID .
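For the timeout setting in the first item, a small helper avoids adding the same line to .bashrc several times. This is a sketch only: add_saga_timeout is a hypothetical name, and the target file defaults to $HOME/.bashrc.

```shell
# Append the timeout export to a shell start-up file, but only if the
# exact line is not already present.
# add_saga_timeout is a hypothetical helper name.
add_saga_timeout() {
    rc_file="${1:-$HOME/.bashrc}"
    line='export SAGA_PTY_SSH_TIMEOUT=90'
    touch "$rc_file"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$rc_file"; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

Run add_saga_timeout once, then open a new shell (or source the file) so that the variable is set before you start autosubmit.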

Third step: post-processing

To be completed.

Optional step: debugging

When the model fails, it is recommended to have a look at the output files of EC-Earth, and in particular:

NEMO output: ocean.output

If the simulation did not fail, you will find a line of “AAAAAAAA” at the end of this output file. Otherwise, search for “E R R O R” in this file to understand what happened (do not forget the spaces between the letters of E R R O R).
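The two markers can be checked from the command line. The sketch below is an illustration under assumptions: check_ocean_output is a hypothetical name, and looking only at the last 5 lines for the end marker is an arbitrary choice.

```shell
# Classify a NEMO ocean.output file using the two markers described above.
# check_ocean_output is a hypothetical helper name.
# Returns 0 (finished), 1 (error reported) or 2 (inconclusive).
check_ocean_output() {
    file="${1:-ocean.output}"
    if grep -n 'E R R O R' "$file"; then
        echo "NEMO reported an error (matching lines printed above)"
        return 1
    elif tail -n 5 "$file" | grep -q 'AAAAAAAA'; then
        echo "NEMO finished normally"
        return 0
    else
        echo "inconclusive: neither the end marker nor E R R O R was found"
        return 2
    fi
}
```

Run it in the directory that contains ocean.output, or pass the path of the file as the first argument.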

If you want to check the last NEMO output before a crash, you can assemble the results of the different processors into one single file (in the HPC directory where the code runs). For example, on marenostrum3 with NEMO running on 16 processors:

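#########################
# Go to the run directory of the failed member on the HPC
cd /gpfs/scratch/bsc32/bsc[your_account]/[exp]/[date]/[memb]
# Copy the rebuild_nemo tools into this directory
cp /gpfs/projects/bsc32/repository/apps/rebuild_nemo/rebuild_nemo* .
# Check that the shared libraries needed by the executable are found
ldd rebuild_nemo.exe
# Merge the 16 per-processor output.abort files into output.abort.nc
./rebuild_nemo output.abort 16
# Inspect the merged file
ncview output.abort.nc
#########################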

IFS output: NODE.*

If the simulation finished correctly, you will find “End of Heap Utilization Profile” at the end of this output file. Otherwise, search for “error” in the file.
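The same kind of check can be sketched for the IFS output files. The function name check_ifs_output is hypothetical; searching the last 20 lines for the end marker and matching “error” case-insensitively are assumptions made for this illustration.

```shell
# Check one or more IFS NODE.* files for the end marker described above.
# check_ifs_output is a hypothetical helper name.
# Returns 0 only if every file contains the end marker.
check_ifs_output() {
    status=0
    for file in "$@"; do
        if tail -n 20 "$file" | grep -q 'End of Heap Utilization Profile'; then
            echo "$file: finished normally"
        else
            echo "$file: end marker missing, lines mentioning an error:"
            grep -n -i 'error' "$file" || echo "  (none found)"
            status=1
        fi
    done
    return "$status"
}
```

A typical call from the run directory would be check_ifs_output NODE.* so that every output file is inspected.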

How to extend an experiment

This section explains how to add extra chunks to an experiment that has already finished.

  1. Modify expdef_${exp}.conf to add the chunks you want (set the total number of chunks you want to have at the end).
  2. Create the experiment (autosubmit create exp_ID; see the autosubmit manual for more details: http://autosubmit.readthedocs.io/en/latest/index.html)
  3. Run REMOTESETUP and INI. To run only these jobs, set LOCALSETUP to COMPLETED and SIM1 to SUSPENDED (use the autosubmit setstatus command; see the autosubmit manual for more details: http://autosubmit.readthedocs.io/en/latest/index.html)
  4. Recover all the jobs already completed (autosubmit recovery exp_ID -all -s; see the autosubmit manual for more details: http://autosubmit.readthedocs.io/en/latest/index.html)
  5. Check that LOCALPOST and PLOT are set to WAITING; they might be COMPLETED if the experiment had finished.
  6. Transfer the restart files of the last chunk from /esnas/exp/ecearth/restartfiles/ID/STARTDATE/fc0/restarts to /gpfs/scratch/bsc32/bsc3xxxx/ID/STARTDATE/fc0

The files to be transferred are:

  • RESTC_xxxxxxx for the coupler
  • RESTO_xxxxxxxx for the ocean
  • RESTA_xxxxxxx for the atmosphere

Untar these files and run the experiment as explained in the previous section.
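The untar step can be sketched as a small loop over the three prefixes listed above. This is an illustration under assumptions: untar_restarts is a hypothetical name, and the restart files are assumed to be tar archives that tar -xf can read directly.

```shell
# Unpack every restart archive found in a run directory.
# untar_restarts is a hypothetical helper name; the RESTC_, RESTO_ and
# RESTA_ prefixes come from the list above.
untar_restarts() {
    dir="${1:-.}"
    count=0
    for archive in "$dir"/RESTC_* "$dir"/RESTO_* "$dir"/RESTA_*; do
        # A pattern that matched nothing is left as-is: skip it.
        [ -f "$archive" ] || continue
        tar -xf "$archive" -C "$dir" || return 1
        count=$((count + 1))
    done
    echo "unpacked $count archive(s)"
}
```

Run it on the HPC in the directory the restart files were transferred to, for example by passing that directory as the first argument.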

tools/run_1st_exp.1543337820.txt.gz · Last modified: 2018/11/27 16:57 by rwhite