diff --git a/vignettes/NAOindex_81to91.png b/vignettes/NAOindex_81to91.png
new file mode 100644
index 0000000000000000000000000000000000000000..7ee1220690eab76395e8d4dcde3f11a1c4d76553
Binary files /dev/null and b/vignettes/NAOindex_81to91.png differ
diff --git a/vignettes/NAOpredictions.png b/vignettes/NAOpredictions.png
new file mode 100644
index 0000000000000000000000000000000000000000..ec96e8651f8ccf72933be25b53213ac1afa36d8f
Binary files /dev/null and b/vignettes/NAOpredictions.png differ
diff --git a/vignettes/RMSSSforNAOprediction.png b/vignettes/RMSSSforNAOprediction.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e394b353f118262080045fb28c7423945405514
Binary files /dev/null and b/vignettes/RMSSSforNAOprediction.png differ
diff --git a/vignettes/ScoringForecast.md b/vignettes/ScoringForecast.md
new file mode 100644
index 0000000000000000000000000000000000000000..2f3679f74d96404cbd387f100e6d36a54676fee1
--- /dev/null
+++ b/vignettes/ScoringForecast.md
@@ -0,0 +1,254 @@
---
title: "Scoring a seasonal forecast"
output: github_document
---

This vignette illustrates several examples of how to score a seasonal forecast using different functions in s2dverification.

### 1-Load dependencies

This example requires the following system libraries:

- libssl-dev
- libnetcdf-dev
- cdo

The **s2dverification R package** should be loaded by running the following lines in R, once it is integrated into the CRAN mirror.

```r
library(s2dverification)
library(devtools)
```

### 2-Define the problem and load the data with the corresponding parameters

In this vignette we will quantify how skilful a seasonal forecast is. The model will be the EUROSIP multi-model seasonal forecasting system, and our ground truth will be observations represented by the ERAinterim reanalysis dataset.
For more information about both datasets, see the following documentation:
- [**ERAinterim**](https://www.ecmwf.int/en/forecasts/datasets/archive-datasets/reanalysis-datasets/era-interim).
- [**EUROSIP system**](https://www.ecmwf.int/en/forecasts/documentation-and-support/long-range/seasonal-forecast-documentation/eurosip-user-guide/multi-model).

Both datasets can be downloaded from the corresponding server. However, in this example we will use the BSC esarchive database. File path parameters for the `Load()` function are defined as follows:

```r
path_exp_EUROSIP <- list(name = 'system4_m1',
                         path = file.path('/esarchive/exp/EUROSIP/ecmwf',
                                          '$EXP_NAME$/monthly_mean',
                                          '$VAR_NAME$_f6h/$VAR_NAME$_$START_DATE$.nc'))

path_obs_eraI <- list(name = 'erainterim',
                      path = file.path('/esarchive/recon/ecmwf',
                                       '$OBS_NAME$/monthly_mean',
                                       '$VAR_NAME$_f6h/$VAR_NAME$_$YEAR$$MONTH$.nc'))
```

Our case study will be predicting the North Atlantic Oscillation index ([**NAO**](https://en.wikipedia.org/wiki/North_Atlantic_oscillation)), usually defined as the difference in the sea level pressure anomaly between the permanent high-pressure system over the Azores Islands and the low-pressure system over Iceland. Changes in the NAO index have a direct impact on European seasonal weather, since the NAO controls the intensity and track of storms over the continent: positive NAO values are associated with stronger storms and thus wet winters in Central and Northern Europe and its Atlantic facade, while negative NAO values correspond to reduced storm activity and rainfall in Northern Europe but increased activity in the South and over the Mediterranean Sea. Especially from November to April, the NAO is responsible for much of the variability of weather in the North Atlantic region, affecting wind speed and direction, the distribution of temperature and moisture, and the intensity, number and track of storms.
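This station-based definition can be sketched numerically. The snippet below is illustrative only: the anomaly values are toy numbers, not real data, and the normalization by the standard deviation mirrors the classic two-station index.

```r
#Station-based NAO sketch: normalized SLP anomaly difference between the
#Azores (high) and Iceland (low); toy values for illustration only
slp_azores  <- c(2.1, -0.5, 1.3, -1.8)   #SLP anomalies (hPa), toy values
slp_iceland <- c(-1.5, 0.8, -0.9, 2.2)
nao_station <- slp_azores / sd(slp_azores) - slp_iceland / sd(slp_iceland)
#Positive values: strengthened Azores High and Icelandic Low (NAO+)
```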
For this vignette we will use the whole North Atlantic as the geographical domain and we will select all forecasts starting in November for the 1981 - 1991 period. The EUROSIP system provides 7-month forecasts with 51 ensemble members.

```r
#North Atlantic domain definition
lonNA <- c(-90, 40)
latNA <- c(20, 80)

#Selected time periods
startDates <- paste(1981:1991, '1101', sep = '')

#Loading sea level pressure maps for the whole North Atlantic and for the specific period
data <- Load(var = 'psl',
             exp = list(path_exp_EUROSIP),
             obs = list(path_obs_eraI),
             sdates = startDates,
             output = 'lonlat',
             lonmin = lonNA[1], lonmax = lonNA[2],
             latmin = latNA[1], latmax = latNA[2])

#Retrieving observational and forecast data
obs <- data$obs
exp <- data$mod
```

### 3-NAO index, Forecast vs Observations

As a first step we will compare the predicted intensity of the NAO index for observations and for the forecast system.

In s2dverification the NAO index is computed using a common alternative definition: the NAO is the first mode of variability (computed using the singular value decomposition method) of the sea level pressure in the North Atlantic region [20N-80N, 80W-40E]. The `NAO()` function requires anomalies as input, which are previously computed with the `Ano_CrossValid()` function.

```r
#Computing anomalies along the sdates dimension
ano <- Ano_CrossValid(exp, obs)

#Computing the NAO index as the first mode of slp variability in the North Atlantic
nao <- NAO(ano$ano_exp, ano$ano_obs, data$lon, data$lat)

#For clarity, we normalize the NAO index of each member by the corresponding standard deviation
nao_exp_sd <- apply(nao$NAO_exp, MARGIN = 1, sd, na.rm = TRUE)
nao_obs_sd <- sd(nao$NAO_obs)
nao_exp_n <- nao$NAO_exp / nao_exp_sd
nao_obs_n <- nao$NAO_obs / nao_obs_sd

#Finally, plot the NAO index using box plots for all Decembers
PlotBoxWhisker(nao_exp_n, nao_obs_n, toptitle = "NAO index, DJF",
               monini = 12, yearini = 1981, freq = 1,
               drawleg = FALSE, fileout = NULL)
legend(x = 3.8, y = 2.6, c('EUROSIP', 'EraInterim'), col = c(2, 4), pch = 15)
```

![](NAOindex_81to91.png)

The figure above shows poor agreement between observations (blue line) and the forecast (whisker boxes): because of the large dispersion across the 51 ensemble members, the NAO signal almost disappears in the ensemble statistics (values close to 0).

### 4-Quantifying the skillfulness of the prediction. The RMSSS

Let's quantify how good or bad this prediction is, while also looking for a better understanding of what is happening. To do so, we will use the Root Mean Square Error (RMSE), which is the square root of the mean squared difference between the predicted and the true values.

s2dverification includes the `RMS()` function to directly calculate the RMSE. However, in order to interpret the output values more easily, we will use the `RMSSS()` function (Root Mean Square Error Skill Score). The RMSSS is the RMSE normalized by a reference RMSE, usually associated with the intrinsic variability of the system (for example, the standard deviation of the climatology). In s2dverification the score is computed such that the best RMSSS is 1 (RMSE equals 0), while an RMSSS of 0 means the RMSE equals the variability reference of the system (i.e. the standard deviation). The RMSSS can also be negative, meaning the RMSE is greater than the variability reference.

```r
#Number of ensemble members
Lmember <- length(exp[1, , 1, 1, 1, 1])

#Calculating the RMSSS of the ensemble from the mean NAO timeseries of all members
rmsss_m <- RMSSS(var_exp = array(apply(nao_exp_n, MARGIN = 2, FUN = mean), dim = c(1, 11)),
                 var_obs = nao_obs_n, pval = FALSE)

#Computing the RMSSS for each member
rmsss <- RMSSS(var_exp = nao_exp_n, var_obs = nao_obs_n, pval = FALSE)

#Plotting
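
#Sanity check of the skill-score definition on toy values (illustrative only,
#not the vignette data): RMSSS = 1 - RMSE(exp, obs) / RMSE(ref, obs), with the
#climatology (zero anomaly) as the reference forecast
toy_obs <- c(1, -1, 0.5, -0.5)
toy_exp <- c(0.8, -0.9, 0.6, -0.4)
toy_rmsss <- 1 - sqrt(mean((toy_exp - toy_obs)^2)) / sqrt(mean(toy_obs^2))
#toy_rmsss lies between 0 and 1: better than climatology, short of perfect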
layout(matrix(c(1, 2), 1, 2, byrow = TRUE))

#Plotting the RMSSS for all members
plot(rmsss, type = 'h', lwd = 5, col = 'grey', xlab = 'Members', mgp = c(3, 1, 0),
     main = sprintf('RMSSS for each member (RMSSS average %1.3f )', rmsss_m),
     ylim = c(-1, 1), cex.lab = 2, cex.axis = 2, cex.main = 1.9)
grid(col = 'grey')
lines(rep(0, Lmember), lwd = 1, col = 'black')

#Plotting a boxplot for the selected members with higher RMSSS
isel <- which(rmsss > 0.1)
rmsss_m_sel <- RMSSS(var_exp = array(apply(nao_exp_n[isel, ], MARGIN = 2, FUN = mean), dim = c(1, 11)),
                     var_obs = nao_obs_n, pval = FALSE)

PlotBoxWhisker(nao_exp_n[isel, ], nao_obs_n,
               toptitle = sprintf('NAO index, selected-members (RMSSS=%1.2f)', rmsss_m_sel),
               monini = 12, yearini = 1981, freq = 1,
               drawleg = FALSE, fileout = NULL)
legend(x = 4.95, y = 2.4, c('EUROSIP', 'EraInterim'),
       col = c(2, 4), pch = 15, cex = 0.9, lty = 0)
```

![](RMSSSforNAOprediction.png)

The figure above shows very different RMSSS values for the different members (left plot). Most of them have an RMSSS close to 0, so their prediction error is close to the system variability. **The RMSSS for the whole ensemble is 0.091**, which means the ensemble prediction is not very useful.

However, we can select the members with a better RMSSS, i.e. those closer to 1, and recompute the ensemble RMSSS. This is shown in the right plot, where only members 8, 23, 40 and 45 are used. Now most marked NAO events are correctly predicted, giving an **RMSSS of 0.66 for this selected-members ensemble**.

### 5-Quantifying the skillfulness of the prediction. The Brier Score

Another option to quantify the goodness of this prediction is to use the `BrierScore()` function. The Brier Score (BS) measures the accuracy of a probabilistic categorical prediction by calculating the mean squared difference between the predicted probability and the actual observation. A BS of 0 corresponds to a perfect prediction (no difference between prediction and observation), while the worst possible score is 1.
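As a minimal numerical illustration of the definition (toy values, not the vignette data):

```r
#Brier score: mean squared difference between forecast probability and
#binary outcome; values below are toy numbers for illustration only
obs_toy  <- c(1, 0, 1, 0)          #event occurred (1) or not (0)
prob_toy <- c(0.9, 0.1, 0.7, 0.4)  #forecast probabilities
bs_toy <- mean((prob_toy - obs_toy)^2)
#bs_toy is 0.0675, close to the perfect score of 0
```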
Moreover, the BS can be decomposed into 3 terms: BS = Reliability - Resolution + Uncertainty. Reliability measures how close the predicted probabilities are to the true probabilities (the lower the reliability term, the smaller the difference between prediction and observation and the better the score). Resolution measures how much the observed frequency, conditioned on the forecast, differs from the climatology (the higher the resolution, the greater the ability of the forecast to predict events different from the climatology). Uncertainty represents the inherent uncertainty of the event occurring (the higher the uncertainty, the more difficult it is to predict the event correctly).

In the case of the NAO index, we can discretize it into a categorical variable, defining a NAO event whenever the index is greater than 0 and associating the event with the value 1. Negative NAO index values then correspond to the non-occurrence of a NAO event and are assigned the value 0. For the forecast, we can convert the NAO index predictions of all ensemble members into a NAO prediction probability by computing the proportion of members predicting a positive NAO. For comparison, we do the same for the better-scoring selected-members ensemble of the previous section.
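Before applying `BrierScore()` to the forecast, the decomposition can be checked by hand on a small example. The values below are toy numbers, not the vignette data, and the forecasts are binned by distinct probability value, as in the standard Murphy decomposition:

```r
#Toy Brier score decomposition: BS = Reliability - Resolution + Uncertainty
obs  <- c(1, 0, 1, 1, 0)           #observed events
pred <- c(0.8, 0.2, 0.6, 0.4, 0.2) #forecast probabilities
bs   <- mean((pred - obs)^2)
obar <- mean(obs)                  #climatological frequency
bins <- split(seq_along(pred), pred)
rel <- sum(sapply(bins, function(i) length(i) * (mean(pred[i]) - mean(obs[i]))^2)) / length(obs)
res <- sum(sapply(bins, function(i) length(i) * (mean(obs[i]) - obar)^2)) / length(obs)
unc <- obar * (1 - obar)
#With binning by distinct forecast value, bs equals rel - res + unc exactly
```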
```r
#Number of start dates
Lsdates <- length(startDates)

#For observations
nao_event_obs <- rep(0, Lsdates)
nao_event_obs[which(nao_obs_n > 0)] <- 1

#For all members
nao_event_exp <- array(0, dim = c(Lmember, Lsdates))
nao_event_exp[which(nao_exp_n > 0)] <- 1
nao_event_exp_prob <- apply(nao_event_exp, MARGIN = 2, mean)

BS <- BrierScore(nao_event_obs, nao_event_exp_prob)

BSscore <- BS$bs
BSrel <- BS$rel
BSres <- BS$res
BSunc <- BS$unc

#For the selected members only
nao_event_exp_sel <- array(0, dim = c(length(isel), Lsdates))
nao_event_exp_sel[which(nao_exp_n[isel, ] > 0)] <- 1
nao_event_exp_sel_prob <- apply(nao_event_exp_sel, MARGIN = 2, mean)

BS_sel <- BrierScore(nao_event_obs, nao_event_exp_sel_prob)

BSscore_sel <- BS_sel$bs
BSrel_sel <- BS_sel$rel
BSres_sel <- BS_sel$res
BSunc_sel <- BS_sel$unc

#Plotting NAO events and prediction probabilities
layout(matrix(c(1, 2), 1, 2, byrow = TRUE))

#For the all-members ensemble
plot(BS$obs - 0.5, type = 'p', lwd = 5, col = 'red',
     xlab = 'NAO events', yaxt = 'n', ylab = 'NAO probabilities', mgp = c(3, 1, 0))
grid(col = 'grey')
lines(BS$obs - 0.5, type = 'h', lwd = 5, col = 'red', yaxt = 'n')
lines(BS$pred - 0.5, type = 'h', lwd = 4, col = 'blue', yaxt = 'n')
lines(BS$pred - 0.5, type = 'p', lwd = 3, col = 'blue', yaxt = 'n')
axis(2, at = seq(-0.5, 0.5, 0.25), labels = seq(0, 1, 0.25), mgp = c(3, 1, 0))
lines(rep(0, Lsdates), lwd = 2, col = 'black')
title('Predictions for the all-members ensemble')

#For the selected-members ensemble only
plot(BS_sel$obs - 0.5, type = 'p', lwd = 5, col = 'red',
     xlab = 'NAO events', yaxt = 'n', ylab = 'NAO probabilities', mgp = c(3, 1, 0))
grid(col = 'grey')
lines(BS_sel$obs - 0.5, type = 'h', lwd = 5, col = 'red', yaxt = 'n')
lines(BS_sel$pred - 0.5, type = 'h', lwd = 4, col = 'blue', yaxt = 'n')
lines(BS_sel$pred - 0.5, type = 'p', lwd = 3, col = 'blue', yaxt = 'n')
axis(2, at = seq(-0.5, 0.5, 0.25), labels = seq(0, 1, 0.25), mgp = c(3, 1, 0))
lines(rep(0, Lsdates), lwd = 2, col = 'black')
title('Predictions for the selected-members ensemble')
```

![](NAOpredictions.png)

For the all-members ensemble, the results are:
**Total BS = 0.213,**
Reliability = 0.138,
Resolution = 0.172,
Uncertainty = 0.248.

For the selected-members ensemble, the results are:
**Total BS = 0.136,**
Reliability = 0.015,
Resolution = 0.127,
Uncertainty = 0.248.

To put the total Brier Score values into context, note that the BS is defined such that a forecast that always issues a 50% probability would have a BS of 0.25.

Taking this into account, the all-members ensemble prediction, with a BS of 0.213, is only slightly better than flipping a coin in terms of Brier Score. The selected-members ensemble instead presents a clear improvement in the total Brier Score (a lower value). This is due to the better score (lower value) achieved by the reliability term in this ensemble, in agreement with the previous RMSSS results and also visible in the right plot of the figure above: the prediction probabilities are much closer to the NAO occurrence distribution. However, the resolution scores are not that different between the all-members and the selected-members ensembles, because even with a notable difference in RMSSS and reliability, the two ensembles predict the occurrence of positive/negative NAO events almost equally well (left plot). Finally, the uncertainty is the same in both cases, since the observations are the same.

### 6-Appendix: Forecast score ranges

| SCORE | RMSSS | BS (total) | Reliability | Resolution | Uncertainty |
|:-----:|:-----:|:----------:|:-----------:|:----------:|:--------------:|
| Best | 1 | 0 | 0 | 1 | Obs. dependent |
| Worst | -Inf | 1 | 1 | 0 | Obs. dependent |