AgriculturalIndicators.md

---
title: "Agricultural Indicators"
author: "Earth Sciences department, Barcelona Supercomputing Center (BSC)"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
---

<!--
```{r, echo = FALSE}
knitr::opts_chunk$set(eval = FALSE)
```
-->

## Introduction

Apart from forecasts of Essential Climate Vaiables, Climate Services also provides a variety of the sectoral indicators are often required for Climate Services people, including researchers, decision-makers and farmers, etc.

In the project MEDGOLD, 10 indicators which were identified as critical indices for the three agricultural sectors - grape/wine, olive/olive oil and wheat/pasta - have been considered in this package CSIndicators.

The computing functions and the corresponding indicators are listed as follows:

---
1. **PeriodAccumulation.R -** Spring Total Precipitation (SprR) and Harvest Total Precipitation (HarvestR) 
2. **PeriodMean.R -** Growing Season Temperature (GST) and Spring Mean Temperature Maximum (SPRTX) 
3. **TotalTimeExceedingThreshold.R -** Number of Heat Stress Days - 35°C (SU35), 36°C (SU36), 40°C (SU40) and Spring Heat Stress Days - 32°C (Spr32)
4. **AccumulationExceedingThreshold.R -** Growing Degree Days (GDD)
5. **TotalSpellTimeExceedingThreshold.R -** Warm Spell Duration Index (WSDI)
---

The above functions can take both multidimensional arrays and the s2dv_cube objects (see note below). Taking PeriodAccumulation an example, **CST_**PeriodAccumulation handles the latter and PeriodAccumulation without the prefix can compute multidimensional arrays.

*Note: s2dv_cube and array classes can be handle by the functions in CSIndicators. See section 2 in vignette [Data retrieval and storage](https://cran.r-project.org/web/packages/CSTools/vignettes/Data_Considerations.html) from CSTools package for more information.*

There are some supplementary functions which must be called to smoothly run the above functions.

---
1. **SelectPeriodOnData.R -** to select the data in the requested period
2. **SelectPeriodOnDates.R -** to select the time dimension in the requested period
---

When period selection is needed, the `start` and `end` parameters have to be provided to cut out the portion in `time_dim`. Otherwise, the function will take the entire `time_dim`.

The examples of computing the aforementioned indicators are given by functions as follows.

### 1. PeriodAccumulation.R

Load the required libraries, CSIndicators and CSTools, by running

```{r}
library(CSIndicators)
library(CSTools)
library(s2dv)
library(s2dverification)
```

Here, we load the daily precipitation (**prlr** given in `var`) data sets of ECMWF SEAS5 seasonal forecast and ERA5 reanalysis for the four starting dates 20130401-20160401 with the entire 7-month forecast time, April-October.

The pathways of SEAS5 and ERA5 are given in the lists with some **whitecards** used to replace the variable name and iterative items such as year and month. See details of requirements in section 4 in vignette [Data retrieval and storage](https://cran.r-project.org/web/packages/CSTools/vignettes/Data_Considerations.html) from CSTools package.

The spatial domain covers parts of Douro Valley of Northern Portugal lon=[352.25, 353], lat=[41, 41.75]. These four values are provided in `lonmin`, `lonmax`, `latmin` and `latmax`.

With `grid` set to **r1440x721**, the SEAS5 forecast would be interpolated to the 0.25-degree ERA5 grid by using the **bicubic** method given in `method`.


```{r}
var <- 'prlr'; lonmax <- 353; lonmin <- 352.25; latmax <- 41.75; latmin <- 41
cdoMethod <- 'bicubic'; leadtimemin <- 1; leadtimemax <- 214
S5path_prlr <- list(path = '/esarchive/exp/ecmwf/system5c3s/original_files/chou/daily_mean/$VAR_NAME$_s0-24h/$VAR_NAME$_$YEAR$$MONTH$01.nc')
path_ERA5prlr_CDS <- list(path = '/esarchive/recon/ecmwf/era5/daily_mean/$VAR_NAME$_f1h-r1440x721cds/$VAR_NAME$_$YEAR$$MONTH$.nc')
sdates <- paste0(2013:2016, '04', '01')
prlr_dv <- CST_Load(var = var, exp = list(S5path_prlr), obs = list(path_ERA5prlr_CDS), sdates = sdates, lonmax = lonmax, lonmin = lonmin, latmax = latmax, latmin = latmin, storefreq = 'daily', leadtimemin = leadtimemin, leadtimemax = leadtimemax, nmember = 3, output = "lonlat", grid = "r1440x721", method = cdoMethod)
```

The output contains elements `prlr_dv$exp$data` and `prlr_dv$obs$data` with dimensions:

```{r}
dim(prlr_dv$exp$data)
dataset  member   sdate   ftime     lat     lon
      1       3       4     214       4       4
dim(prlr_dv$obs$data)
dataset  member   sdate   ftime     lat     lon
      1       1       4     214       4       4
```

To compute **SprR** of forecast and observation, we can run:

```{r}
SprR_exp <- CST_PeriodAccumulation(prlr_dv$exp, start = list(21, 4), end = list(21, 6))
SprR_obs <- CST_PeriodAccumulation(prlr_dv$obs, start = list(21, 4), end = list(21, 6))
```

The `start` and `end` are the initial and final dates and the day must be given before the month as above. They will be applied along the dimension `time_dim` (it is set to 'ftime' by default).

As mentioned, these paramters are optional, the function will take the entire timeseries when the period is not specified in `start` and `end`.

The dimensions of SprR forecasts and observations are:

```{r}
dim(SprR_exp$data)
dataset  member   sdate     lat     lon
      1       3       4       4       4
dim(SprR_obs$data)
dataset  member   sdate     lat     lon 
      1       1       4       4       4 
```

The forecast SprR for the 1st member from 2013-2016 of the 1st grid point in mm are:

```{r}
#SprR_exp$data[1,1,,1,1] * 86400 * 1000
#[1]  93.23205 230.41904 194.01412 226.52614
```

To compute **HarvestR**, run the following lines

```{r}
HarvestR_exp <- CST_PeriodAccumulation(prlr_dv$exp, start = list(21, 8), end = list(21, 10))
HarvestR_obs <- CST_PeriodAccumulation(prlr_dv$obs, start = list(21, 8), end = list(21, 10))
```

The forecast HarvestR for the 1st member from 2013-2016 of the 1st grid poinnt in mm are:

```{r}
HarvestR_exp$data[1,1,,1,1] * 86400 * 1000
#[1]  52.30026  42.88068 156.87961  32.18579
```

To compute the 2013-2016 ensemble-mean bias of forecast HarvestR, run

```{r}
fcst <- drop(HarvestR_exp$data) * 86400 * 1000
obs <- drop(HarvestR_obs$data) * 86400 * 1000

Bias <- MeanDims((fcst - InsertDim(obs, 1, dim(fcst)['member'])), 'member')
```
To plot the map of ensemble-mean bias of HarvestR forecast, run

```{r}
lon <- prlr_dv$obs$lon; lat <- prlr_dv$obs$lat

pname_root <- "/esarchive/scratch/cchou/MEDGOLD/"
# Load Douro Valley boundary
load(file = paste0(pname_root, "demo/input/DV_boundary.RData"))
douro$x <- douro$x - 360
cols <- c('#b2182b', '#d6604d', '#f4a582', '#fddbc7', '#d1e5f0', '#92c5de', '#4393c3', '#2166ac')
brks <- seq(-60, 60, by = 20)
toptitle <- 'Ensemble-mean bias of HarvestR in 2013'

PlotEquiMap(Bias[1,,], lon = lon, lat = lat, intylat = 1, intxlon = 1, width = 6, height = 6, filled.continents = FALSE, units = 'mm', title_scale = .8, axes_label_scale = 1, axes_tick_scale = 1, col_inf = cols[1], margin_scale = c(1, 1, 1, 1), cols = cols[2:7], col_sup = cols[8], brks = brks, toptitle = toptitle, colNA = 'white', bar_label_scale = 1.5, bar_extra_margin = c(0, 0, 0, 0), units_scale = 2, file = '/esarchive/scratch/cchou/MEDGOLD/grape/output/HarvestR_Bias_2013.pdf')
```
You will see the following maps of HarvestR bias in 2013.

**/esarchive/scratch/cchou/MEDGOLD/grape/output/HarvestR_Bias_2013.pdf**


### 2. PeriodMean.R

For the function PeriodMean, we use Growing Season Temperature (**GST**) as an example.

Firstly, we prepare a sample data of daily mean temperature by running

```{r}
var <- 'tas'; lonmax <- 353; lonmin <- 352.25; latmax <- 41.75; latmin <- 41
cdoMethod <- 'bicubic'; leadtimemin <- 1; leadtimemax <- 214
S5path <- list(path = '/esarchive/exp/ecmwf/system5c3s/daily_mean/$VAR_NAME$_f6h/$VAR_NAME$_$YEAR$$MONTH$01.nc')
ERA5path <- list(path = '/esarchive/recon/ecmwf/era5/daily_mean/$VAR_NAME$_f1h-r1440x721cds/$VAR_NAME$_$YEAR$$MONTH$.nc')
sdates <- paste0(2013:2016, '04', '01')
tas_dv <- CST_Load(var = var, exp = list(S5path), obs = list(ERA5path), sdates = sdates, lonmax = lonmax, lonmin = lonmin, latmax = latmax, latmin = latmin, storefreq = 'daily', leadtimemin = leadtimemin, leadtimemax = leadtimemax, nmember = 3, output = "lonlat", grid = "r1440x721", method = cdoMethod)
```

The output contains observations `tas_dv$obs$data` and forecast `tas_dv$exp$data`, and their dimensions and summaries are like

```{r}
dim(tas_dv$obs$data)
dataset  member   sdate   ftime     lat     lon 
      1       1       4     214       4       4 

dim(tas_dv$exp$data)
dataset  member   sdate   ftime     lat     lon 
      1       3       4     214       4       4 

summary(tas_dv$obs$data - 273.15)
 Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 3.63   14.38   17.89   17.65   21.24   30.21 

summary(tas_dv$exp$data - 273.15)
 Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.54   11.65   16.56   16.50   21.25   31.41 

```
To compute the GST for both observation and forecast, run the following lines

```{r}
tas_dv$exp$data <- tas_dv$exp$data - 273.15
tas_dv$obs$data <- tas_dv$obs$data - 273.15

GST_exp <- CST_PeriodMean(tas_dv$exp, start = list(1, 4), end = list(31, 10))
GST_obs <- CST_PeriodMean(tas_dv$obs, start = list(1, 4), end = list(31, 10))
```
The summaries and dimensions of the output are as follows:

```{r}
summary(GST_exp$data)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 14.23   15.78   16.50   16.50   17.17   18.70 

summary(GST_obs$data)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 15.34   16.85   17.72   17.65   18.41   19.60 

dim(GST_exp$data)
dataset  member   sdate     lat     lon 
      1       3       4       4       4 

dim(GST_obs$data)
dataset  member   sdate     lat     lon 
      1       1       4       4       4 
```

Here, we plot the 2013-2016 mean climatology of ERA5 GST by running 

```{r}
# compute ERA5 GST climatology
GST_Clim <- MeanDims(drop(GST_obs$data), 'sdate')

# plot the map of climatology
cols <- c('#ffffd4','#fee391','#fec44f','#fe9929','#ec7014','#cc4c02','#8c2d04')
brks <- seq(16, 18.5, by = 0.5)
toptitle <- '2013-2016 mean ERA5 GST'

PlotEquiMap(GST_Clim, lon = lon, lat = lat, intylat = 1, intxlon = 1, width = 6, height = 6, filled.continents = FALSE, units = '°C', title_scale = .8, axes_label_scale = 1, axes_tick_scale = 1, col_inf = cols[1], margin_scale = c(1, 1, 1, 1), cols = cols[2:6], col_sup = cols[7], , brks = brks, toptitle = toptitle, colNA = 'white', bar_label_scale = 1.5, bar_extra_margin = c(0, 0, 0, 0), units_scale = 2, file = '/esarchive/scratch/cchou/MEDGOLD/grape/output/GST_ERA5_Climatology.pdf')
```

The GST map is shown as below.

**/esarchive/scratch/cchou/MEDGOLD/grape/output/GST_ERA5_Climatology.pdf**


### 3. TotalTimeExceedingThreshold.R

For this function, **SU35** (Number of Heat Stress Days - 35°C) is taken as an example here.

The daily temperature maximum of the entire 7-month forecast period is needed for this indicator.

```{r}
var <- 'tasmax'; lonmax <- 353; lonmin <- 352.25; latmax <- 41.75; latmin <- 41
cdoMethod <- 'bicubic'; leadtimemin <- 1; leadtimemax <- 214
S5path <- list(path = '/esarchive/exp/ecmwf/system5c3s/daily/$VAR_NAME$/$VAR_NAME$_$YEAR$$MONTH$01.nc')
ERA5path <- list(path = '/esarchive/recon/ecmwf/era5/daily/$VAR_NAME$-r1440x721cds/$VAR_NAME$_$YEAR$$MONTH$.nc')
sdates <- paste0(2013:2016, '04', '01')
tasmax_dv <- CST_Load(var = var, exp = list(S5path), obs = list(ERA5path), sdates = sdates, lonmax = lonmax, lonmin = lonmin, latmax = latmax, latmin = latmin, storefreq = 'daily', leadtimemin = leadtimemin, leadtimemax = leadtimemax, nmember = 3, output = "lonlat", grid = "r1440x721", method = cdoMethod)
```

Change the unit of temperature to from °C to K for the comparison with the threshold defined.

After that, we can compute the SU35 by running

```{r}
tasmax_dv$exp$data <- tasmax_dv$exp$data - 273.15
tasmax_dv$obs$data <- tasmax_dv$obs$data - 273.15

threshold <- 35

SU35_exp <- CST_TotalTimeExceedingThreshold(tasmax_dv$exp, threshold = threshold, start = list(1, 4), end = list(31, 10))
SU35_obs <- CST_TotalTimeExceedingThreshold(tasmax_dv$obs, threshold = threshold, start = list(1, 4), end = list(31, 10))
```