README.md 21.6 KB
Newer Older
rmarti1's avatar
rmarti1 committed
# Python AMIP reader <!-- omit in toc -->
Conversion from Fortran to Python of the EC Earth 4 project AMIP reader using the OASIS API from pyOASIS. This repository contains tests for running both implementations (Fortran code is taken from [EC Earth v3.3.3.1](https://earth.bsc.es/gitlab/es/auto-ecearth3/-/tree/3.3.3.1)) using the current setup for IFS and AMIP from EC Earth v3.3.3.1.
rmarti1's avatar
rmarti1 committed
The AMIP reader provides realistic sea surface temperature and sea ice data from 1979 to near present, thus enabling scientists to focus on the atmospheric model without the added complexity of ocean-atmosphere coupled model feedbacks. It is not meant to be used for climate change prediction, an endeavor that requires a coupled atmosphere-ocean model (e.g., see AMIP's sister project CMIP).

The new Python AMIP reader is a flexible, highly-customizable and ready out-of-the-box component with the following features:
* **Read and send any field** by including it in the [new namelist file](#new-namelist-file) and making sure the input netCDF files follow a [common structure](#add-a-new-field). The following fields have been tested:
  * SST (Sea Surface Temperature) and SIC (Sea Ice Concentration) constructed **mid-month fields** (exactly as the AMIP reader from [EC Earth v3.3.3.1](https://earth.bsc.es/gitlab/es/auto-ecearth3/-/tree/3.3.3.1)).
  * SST (Sea Surface Temperature) and SIC (Sea Ice Concentration) constructed **daily fields** (exactly as the AMIP reader for PRIMAVERA from [EC Earth v3.2.2 (PRIMAVERA production)](https://earth.bsc.es/gitlab/es/auto-ecearth3/-/tree/3.2.2_Primavera_production)).
  * Gridded emissions data as in these [TM5 routines](https://earth.bsc.es/gitlab/svn/ecearth-mirror/-/blob/ceb1f5b95ced1374f190d2a9f9b833a74627c5d8/sources/tm5mp/proj/co2/emission_read__co2.F90) from EC Earth in `m5mp/proj/co2emission_read__co2.F90`.
* **Backwards compatibility** with all the namelist files used by the old Fortran AMIP reader.
* **Dynamic `grids.nc` and `masks.nc` definitions** for OASIS based on the input netCDF files. It will not override any existing grid with the same name (as stated in the [OASIS 4.0 user guide](http://www.cerfacs.fr/oa4web/oasis3-mct_4.0/oasis3mct_UserGuide.pdf)), providing backwards compatibility with all current implementations.

All the source code for the new implementation can be found under the folder [`sources/amip-forcing/src/python/`](sources/amip-forcing/src/python/) of this repository.
Etienne Tourigny's avatar
Etienne Tourigny committed
A report on the implementation can be found in the BSC Earth Wiki at https://earth.bsc.es/wiki/lib/exe/fetch.php?media=degree_students:degree_thesis_rmartin.pdf .

rmarti1's avatar
rmarti1 committed
## Table of contents <!-- omit in toc -->
rmarti1's avatar
rmarti1 committed
- [Python dependencies](#python-dependencies)
- [Usage](#usage)
  - [New namelist file](#new-namelist-file)
    - [Example of the new namelist file](#example-of-the-new-namelist-file)
    - [Backwards compatibility](#backwards-compatibility)
  - [Add a new field](#add-a-new-field)
    - [Input netCDF files](#input-netcdf-files)
  - [Hardcode an operation (inner workings)](#hardcode-an-operation-inner-workings)
- [Run an example](#run-an-example)
rmarti1's avatar
rmarti1 committed
  - [Download the datasets](#download-the-datasets)
rmarti1's avatar
rmarti1 committed
  - [Compile](#compile)
  - [Run](#run)
- [Run the tests](#run-the-tests)
- [Visualize netCDF output files](#visualize-netcdf-output-files)
  - [Generate GIF files](#generate-gif-files)
rmarti1's avatar
rmarti1 committed

rmarti1's avatar
rmarti1 committed
## Python dependencies
The new Python implementation relies on the Python modules `f90nml`, `netCDF4` and `numpy`. In order to install these dependencies in your local machine you may execute the following command.
```bash
pip3 install f90nml netCDF4 numpy
```
Or if running in MN4 (assuming the modules `impi/2017.4` and `mkl/2017.4` are already loaded):
```bash
module load python/3.6.1
module load netcdf/4.2
```

## Usage
The new Python AMIP reader gets all its configuration from a new `namelist.amip` file. However, it can also be used as **drop-in replacement for the Fortran implementation without any additional configuration** for reading and sending the SST and SIC fields for both EC Earth v3.3.3.1 and EC Earth v3.2.2 (PRIMAVERA production). See the [backwards compatibility section](#backwards-compatibility) for more info.

In order to take advantage of all the new functionalities and have more control over the coupling fields and input data, it is highly recommended to use the [new `namelist.amip` configuration structure](#new-namelist-file). Although not necessary, it is also recommended to avoid setting the AMIP grid in the `grids.nc`, `masks.nc` and `areas.nc` files in advance, as the new AMIP reader is able to write them at runtime.

### New namelist file
Although the new AMIP reader works with the old `namelist.amip` configuration files, the new namelist configuration file provides greater flexibility to the user when reading and operating with the input data. Here is a scheme for the new namelist generator bash script taken from [`namelist_python.amip.sh`](./tests/data/namelist_python.amip.sh):
```bash
cat << EOF
!-----------------------------------------------------------------------
&NAMAMIP
!-----------------------------------------------------------------------
    RunLengthSec = ${leg_length_sec}
    TimeStepSec  = ${cpl_freq_amip_sec}
    StartYear    = ${leg_start_date_yyyymmdd:0:4}
    StartMonth   = ${leg_start_date_yyyymmdd:4:2}
    StartDay     = ${leg_start_date_yyyymmdd:6:2}
    FixYear      = ${ifs_cmip_fixyear}
    # Vars(1,:) = <id> <grid_name> <oasis_name> <file_pattern> <netcdf_variable> <yref_min> <yref_max> <timestep> <interpolate> <scale_factor> <offset> <min> <max>
    # ...
    # Vars(n,:) = ...
    LDebug       = false
!-----------------------------------------------------------------------
/
EOF
```

The main difference between the old and the new namelist files is the substitution of the fields exclusively defined for SST and SIC grids (`FileListSST`, `FileListSIC` and `LInterpolate`) for an agnostic array of fields `Vars(:,:)`. `Vars(:,:)` shape is `(n,13)`, being `n` the number of fields to exchange and each coupling field declaration must follow the structure below:

| Index  | Name   | Type                    | Description               |
| ------ | ------ | ----------------------- | ------------------------- |
| 0      | id | `string` | Unique identifier for the field. |
| 1      | grid_name | `string` | Grid name (must match one of the grids declared in the `namcouple` file). |
| 2      | oasis_name | `string` | Variable name (must match one of variables under the `grid_name` declared in the `namcouple` file). |
| 3      | file_pattern | `string` | File pattern of the input netCDF files. I.e: `'HadISST2_prelim_0to360_alldays_sst_[year].nc'`. All patterns must be between brackets `'[]'` and currently the only accepted pattern is: `year`. |
| 4      | netcdf_variable | `string` | Variable to read in the netCDF files. |
| 5      | yref_min | `int32` | Reference year for the `time` variable in the netCDF files. |
| 6      | yref_max | `int32` | The last year of the netCDF files. |
| 7      | timestep | `'monthly'` \| `'daily'` | The step in number of days of the `time` variable in the netCDF files. |
| 8      | interpolate | `true` \| `false` | Whether to interpolate or not (disabled if the timestep is set to `daily`). |
| 9      | scale_factor | `float64` | Scale factor applied after reading the input data. |
| 10     | offset | `float64` | Offset applied after reading the input data. |
| 11     | min | `float64` \| `None`* | Minimum value used to clip the input data before submitting the field. |
| 12     | max | `float64` \| `None`* | Maximum value used to clip the input data before submitting the field. |

\* In order to set a section as `None` you must leave blank the content between its comas. I.e: for setting the last section (`max`) as `None` in AMIP_sst, the field `Vars(1,:)` ends with `..., 273.15, 271.38, ,`. See the [example of the new namelist file](#example-of-the-new-namelist-file) for the complete configuration.

Below is an example of the old `namelist.amip` file used for AMIP forcing in EC Earth v3.3.3.1 next to the new equivalent one created by the script above:

#### Example of the old namelist file <!-- omit in toc -->
```
!-----------------------------------------------------------------------
&NAMAMIP
!-----------------------------------------------------------------------
    RunLengthSec = 5097600
    TimeStepSec  = 86400
    StartYear    = 1991
    StartMonth   = 01
    StartDay     = 01
    FixYear      = 0
    FileListSST  = 'tosbcs_input4MIPs_SSTsAndSeaIce_CMIP_PCMDI-AMIP-1-1-3_gn_187001-201706.nc'
    FileListSIC  = 'siconcbcs_input4MIPs_SSTsAndSeaIce_CMIP_PCMDI-AMIP-1-1-3_gn_187001-201706.nc'
    LDebug       = false
    LInterpolate = true
!-----------------------------------------------------------------------
/
```

#### Example of the new namelist file
```
!-----------------------------------------------------------------------
&NAMAMIP
!-----------------------------------------------------------------------
    RunLengthSec = 5097600
    TimeStepSec  = 86400
    StartYear    = 1991
    StartMonth   = 01
    StartDay     = 01
    FixYear      = 0
    Vars(1,:) = 'AMIP_sst_monthly', 'AMIP', 'AMIP_sst', 'tosbcs_input4MIPs_SSTsAndSeaIce_CMIP_PCMDI-AMIP-1-1-3_gn_187001-201706.nc', 'tosbcs', 1870, 2016, 'monthly', true, 1, 273.15, 271.38, ,
    Vars(2,:) = 'AMIP_sic_monthly', 'AMIP', 'AMIP_sic', 'siconcbcs_input4MIPs_SSTsAndSeaIce_CMIP_PCMDI-AMIP-1-1-3_gn_187001-201706.nc', 'siconcbcs', 1870, 2016, 'monthly', true, 0.01, 0, 0, 1,
    LDebug       = false
!-----------------------------------------------------------------------
/
```

#### Backwards compatibility
The new AMIP reader is backwards compatible with both EC Earth v3.3.3.1 and EC Earth v3.2.2 (PRIMAVERA production) namelist files. It achieves this by internally converting the configuration from the old namelist files to the new representation when reading the namelist file (source code in function `read_namelist()` in [`amip_utils.py`](./sources/amip-forcing/src/python/amip_utils.py)). Here is the internal conversion of the SST coupling field when using each old namelist file:

##### EC Earth v3.3.3.1 (AMIP forcing) <!-- omit in toc -->
| Property   | Value                   |
| ---------- | ----------------------- |
| id | `'AMIP_sst_monthly'` |
| grid_name | `'AMIP'` |
| oasis_name | `'AMIP_sst'` |
| file_pattern | `FileListSST` content from the old namelist file. |
| netcdf_variable | `'tosbcs'` | 
| yref_min | `1870` |
| yref_max | `2016` | The last year of the netCDF files. |
| timestep | `'monthly'` |
| interpolate | `LInterpolate` content from the old namelist file. |
| scale_factor | `1` | 
| offset | `273.15` |
| min | `271.38` |
| max | `None` |

##### EC Earth v3.2.2 (PRIMAVERA production) <!-- omit in toc -->
| Property   | Value                   |
| ---------- | ----------------------- |
| id | `'AMIP_sst_daily'` |
| grid_name | `'PSST'` |
| oasis_name | `'AMIP_sst'` |
| file_pattern | `AmipFileRoot` content from the old namelist file appending `'_sst_[year].nc'`. |
| netcdf_variable | `'sst'` | 
| yref_min | `1850` |
| yref_max | `2016` | The last year of the netCDF files. |
| timestep | `'daily'` |
| interpolate | `False` |
| scale_factor | `1` | 
| offset | `0` |
| min | `None` |
| max | `None` |

### Add a new field
Adding a new field solely consists on adding the input netCDF files to the runtime folder of the program and adding a new entry to the `Vars(:,:)` field in the namelist following the structure mentioned above (see the [new namelist file](#new-namelist-file) section).

#### Input netCDF files
All input netCDF files must contain the following variables (assuming the dimensions `time`, `lat` and `lon` are defined):

| Variable  | Type   | Shape                    | Description               |
| --------- | ------ | ------------------------ | ------------------------- |
| `time` | `double` | `(time)` | Days since the reference year defined as `yref_min` in its namelist entry. |
| `lat` or `latitude` | `double` | `(lat)` | Latitude. |
| `lon` or `longitude` | `double` | `(lon)` | Longitude. |
| Input variable name | `double` or `float` | `(time, ..., lat, lon)` | The variable name must match the `netcdf_variable` value in its namelist entry. |

### Hardcode an operation (inner workings)
If it is necessary to hardcode an operation with respect to one of the coupling fields, it is highly recommended to edit the code within the [`AMIPVar` class](sources/amip-forcing/src/python/amip_var.py). Right now, there are already hardcoded operations used to match the old Fortran implementation or read an specific index of the input netCDF variable. Here are some examples:

```python
logging.debug('{}: no time interpolation t,t2 {} {}'.format(self.id, t_local, self.t2))
# Hardcoded to match Fortran's AMIP clipping avoidance
if self.oasis_name == 'AMIP_sst':
    var_min = np.NINF
```

```python
# This section can be "hardcoded" by the user
# Hardcoded CO2 emissions data. Read only sector id 1: Energy
if self.var_name == 'CO2_em_anthro':
    raw_field = raw_field[1]
```

While the Python implementation was originally inspired by the Fortran implementation, it has now been refactored to take advantage of Python's OOP and simpler modules. For a deeper understanding, here are the simplified sequence and class diagrams of the current implementation.

![Sequence_Diagram](docs/Sequence_Diagram.png)

![Class_Diagram](docs/Class_Diagram.png)

## Run an example
rmarti1's avatar
rmarti1 committed
Running an example consists on executing one of the implementations (Fortran or Python) along with a mock component acting as IFS. This mock component simulates IFS reception behavior as designed in EC Earth v3.3.3.1. The detailed specification can be found below, with the coupling interval and run length taken directly from the `namelist.amip` file.
```python
# Constants
NAMELIST_FILE_NAME = 'namelist.amip'
L080_NX = 35718 # number of LAND grid cells at T159 resolution
L128_NX = 88838  # number of LAND grid cells at T255 resolution
L256_NX = 348528 # number of LAND grid cells at T511 resolution
```
rmarti1's avatar
rmarti1 committed

rmarti1's avatar
rmarti1 committed
**IMPORTANT NOTICE: This documentation assumes you are running an UNIX-based OS with Python +3.6 in the PATH, understand the basics of OASIS and are able to compile and run pyOASIS.**
rmarti1's avatar
rmarti1 committed

rmarti1's avatar
rmarti1 committed
The bash scripts for running the examples rely on the command `cdo` for checking the results. In order to install this dependency, you may execute the following code.
rmarti1's avatar
rmarti1 committed
```bash
apt install -y cdo
```
rmarti1's avatar
rmarti1 committed
Or if running in MN4:
```bash
module load cdo/1.9.3
rmarti1's avatar
rmarti1 committed
```

rmarti1's avatar
rmarti1 committed
### Download the datasets

Due to the large size of some of the required files and datasets, they are not provided in this repository. It is necessary to download them from MN4 before executing any code. Use the following script to download the required files (note you must change `bsc32074` to your username in MN4):
rmarti1's avatar
rmarti1 committed
```bash
rmarti1's avatar
rmarti1 committed
scp -r bsc32074@mn1.bsc.es:/gpfs/projects/bsc32/models/ecearth/v3.3.3.1/inidata/oasis/AMIP/* ./tests/data/forcing
scp -r bsc32074@mn1.bsc.es:/gpfs/projects/bsc32/models/ecearth/v3.3.3.1/inidata/amip-forcing/* ./tests/data/forcing
rmarti1's avatar
rmarti1 committed
# PRIMAVERA dataset
scp -r bsc32074@mn1.bsc.es:/gpfs/projects/bsc32/models/ecearth/v3.2.1/inidata/oasis/AMIP-reader/OLD_AMIP_NL/* ./tests/data/primavera
scp -r bsc32074@mn1.bsc.es:/gpfs/projects/bsc32/models/ecearth/v3.2.2/inidata/amip-primavera/* ./tests/data/primavera
rmarti1's avatar
rmarti1 committed
# Emissions dataset
scp -r bsc32074@mn1.bsc.es:/gpfs/projects/bsc32/models/ecearth/trunk/inidata/tm5/CMIP6/CO2-em-anthro_input4MIPs_emissions_CMIP_CEDS-2017-05-18_gn_200001-201412.nc ./tests/data/co2
rmarti1's avatar
rmarti1 committed
```
If you are downloading the data from PRIMAVERA (+74 GB), it is highly recommended to do it in the background:
```bash
nohup scp -r bsc32074@mn1.bsc.es:/gpfs/projects/bsc32/models/ecearth/v3.2.2/inidata/amip-primavera/* ./tests/data/primavera > nohup.out 2>&1
# ctrl + z
bg
```

Or if running in MN4:
```bash
cp /gpfs/projects/bsc32/models/ecearth/v3.3.3.1/inidata/oasis/AMIP/* ./tests/data/forcing
cp /gpfs/projects/bsc32/models/ecearth/v3.3.3.1/inidata/amip-forcing/* ./tests/data/forcing
cp /gpfs/projects/bsc32/models/ecearth/v3.2.1/inidata/oasis/AMIP-reader/OLD_AMIP_NL/* ./tests/data/primavera
cp /gpfs/projects/bsc32/models/ecearth/v3.2.2/inidata/amip-primavera/* ./tests/data/primavera
rmarti1's avatar
rmarti1 committed
cp /gpfs/projects/bsc32/models/ecearth/trunk/inidata/tm5/CMIP6/CO2-em-anthro_input4MIPs_emissions_CMIP_CEDS-2017-05-18_gn_200001-201412.nc ./tests/data/co2
rmarti1's avatar
rmarti1 committed

### Compile
rmarti1's avatar
rmarti1 committed
Each of the example codes is located in its respect folder inside the [`./sources/amip-forcing/`](./sources/amip-forcing/) folder ([`src/fortran/`](./sources/amip-forcing/src/fortran/) and [`src/python/`](./sources/amip-forcing/src/python/)). In the source folder for the Fortran code there is a [Makefile](./sources/amip-forcing/src/fortran/Makefile). Before compiling OASIS, you must create a `make.inc` file which redirects to a make file with your configuration. You may issue the following command to create your own `make.inc` file:
```bash
cp sources/oasis3-mct/util/make_dir/make.templ.inc sources/oasis3-mct/util/make_dir/make.inc 
```

This would be the contents in `make.inc`:
```
curr_path = $(dir $(realpath $(lastword $(MAKEFILE_LIST))))
include  $(curr_path)/make.linux_gnu_openmpi4
```
 
For running in MN4, you may change the `include` path to `make.mn4`, so the file contents of `make.inc` would look like this:
```
curr_path = $(dir $(realpath $(lastword $(MAKEFILE_LIST))))
include  $(curr_path)/make.mn4
```
rmarti1's avatar
rmarti1 committed
**WARNING: `make.mn4` assumes the modules `impi/2017.4` and `mkl/2017.4` are loaded.** You may need to modify or add a new makefile for your specific MPI setup.

By executing the following command, you will compile OASIS, pyOASIS and the Fortran implementation of this model.
rmarti1's avatar
rmarti1 committed
```bash
(cd ./sources/amip-forcing/src/fortran && make clean)
(cd ./sources/amip-forcing/src/fortran && make oasis)
(cd ./sources/amip-forcing/src/fortran && make)
```

If this is the first time you are compiling pyOASIS in the current session, you must update `LD_LIBRARY_PATH` and `PYTHONPATH` environment variables. Luckily, pyOASIS provides a automatic script:
rmarti1's avatar
rmarti1 committed
```bash
source sources/oasis3-mct/generated/python/init.sh
rmarti1's avatar
rmarti1 committed
```
```bash
source sources/oasis3-mct/generated/python/init.csh
rmarti1's avatar
rmarti1 committed
```
Alternatively, these lines could be added to your `~/.bashrc` or `~/.cshrc` file.

### Run
In order to run the example code with your desired programming language you must execute the script `run_example.sh` inside the [`tests/`](./tests/) folder. Here is an example on how to run the example with the AMIP reader running in Python and the IFS mock component with 4 MPI processes from 1990-01-01 to 1991-01-01 without a fixed year.
rmarti1's avatar
rmarti1 committed
```bash
bash ./run_example.sh python forcing 4 1990-01-01 1991-01-01 0 L128 true
rmarti1's avatar
rmarti1 committed
```

Here is the complete specification for the script:
```bash
./run_example.sh <lang> <amip_mode> <ifs_nproc> <leg_start_date> <leg_end_date> <ifs_cmip_fixyear> <ifs_grid> <amip_interpolate>
rmarti1's avatar
rmarti1 committed
```

Running this script will create a new folder in the root folder of this repository called `work_<lang>_<amip_mode>_<ifs_nproc>_<leg_start_date>_<leg_end_date>_<ifs_cmip_fixyear>_<ifs_grid>_<amip_interpolate>` with all the output and debug information from the run.
rmarti1's avatar
rmarti1 committed

## Run the tests
Execute the following command in the root folder of this repository to run all tests **except the large test** (make sure to meet all the [requirements](#requirements) stated above).
rmarti1's avatar
rmarti1 committed

If you are on MN4, it is highly recommended to start first an interactive session:
```
salloc -p interactive
```

rmarti1's avatar
rmarti1 committed
```bash
bash run_tests.sh
```
If you also want to run the large test along with the other ones, use the following:
```bash
bash run_tests.sh 1
```
rmarti1's avatar
rmarti1 committed

rmarti1's avatar
rmarti1 committed
This command will run both implementations (Fortran and Python) of the AMIP reader and check if all of them result in the same output. Here is an example of a successful test run:
rmarti1's avatar
rmarti1 committed
```
Running tests WITHOUT the large test. Use ./run_tests.sh 1 to run all tests including the large test.
rmarti1's avatar
rmarti1 committed
*****************************************************************
Running tests...
rmarti1's avatar
rmarti1 committed
*****************************************************************
Checking results...
Checking ./work_python_forcing_2_1892-01-01_1894-01-01_0_L128_true vs ./work_fortran_forcing_2_1892-01-01_1894-01-01_0_L128_true
cdo    diff: Processed 2 variables over 1462 timesteps [0.23s 52MB].
cdo    diff: Processed 2 variables over 1462 timesteps [0.22s 52MB].
cdo    diff: Processed 2 variables over 1462 timesteps [0.30s 52MB].
cdo    diff: Processed 2 variables over 1462 timesteps [0.31s 52MB].
rmarti1's avatar
rmarti1 committed
```

## Visualize netCDF output files
rmarti1's avatar
rmarti1 committed
In order to visualize the results from an experiment, you may first fix the grid and time axis of the `EXPOUT` generated files. First, move into the work directory created when running the experiment. For example:
cd ./tests/work_python_forcing_4_1990-01-01_1991-01-01_0_L128_true/
```

Then, from inside that directory, run the following script with the start date as the argument:
```bash
bash ./convert-ece4pyreader-grids.sh 19900101
```

As a result, you will now have 8 new files in the working directory for the experiment: `A_Ice_frac.nc`, `A_Ice_frac.grb`, `A_SST.nc`, `A_SST.grb`, `AMIP_sst.nc`, `AMIP_sic.nc`, `areas_AMIP.nc`, `areas_L128.grb`, `areas_L128.nc`, `masks_AMIP.nc`, `masks_L128.grb` and `masks_L128.nc`.

### Generate GIF files
Optionally, you may want to save visualization as a GIF file. In order to achieve this, you may first run `./convert-ece4pyreader-grids.sh` as before and then the script `nc_to_gif.py` located in the [`tests/`](./tests/) folder. If you are in the working folder and want to visualize the `A_Ice_frac.nc` file, you may run the following command:
./nc_to_gif.py work_python_forcing_4_1990-01-01_1991-01-01_0_L128_true A_Ice_frac
```

Here is the complete specification for the script:
```bash
./nc_to_gif.py <work_dir> <file>
```

Amirpasha Mozaffari's avatar
Amirpasha Mozaffari committed
Running the script will generate two files: a raw GIF file and an optimized one. In this case `A_Ice_frac.gif` and `A_Ice_frac_optimized.gif` are generated.