% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/CST_Calibration.R
\name{Calibration}
\alias{Calibration}
\title{Forecast Calibration}
\usage{
Calibration(
  exp,
  obs,
  exp_cor = NULL,
  cal.method = "mse_min",
  eval.method = "leave-one-out",
  multi.model = FALSE,
  na.fill = TRUE,
  na.rm = TRUE,
  apply_to = NULL,
  alpha = NULL,
  memb_dim = "member",
  sdate_dim = "sdate",
  dat_dim = NULL,
  ncores = NULL
)
}
\arguments{
\item{exp}{A multidimensional array with named dimensions (at least 'sdate' 
and 'member') containing the seasonal hindcast experiment data. If a forecast 
is provided, the hindcast is used to calibrate it; if not, the hindcast 
itself is calibrated.}

\item{obs}{A multidimensional array with named dimensions (at least 'sdate') 
containing the observed data.}

\item{exp_cor}{An optional multidimensional array with named dimensions (at 
least 'sdate' and 'member') containing the seasonal forecast experiment 
data. If the forecast is provided, it will be calibrated using the hindcast 
and observations; if not, the hindcast will be calibrated instead. If there 
is only one corrected dataset, it should not have a dataset dimension. If 
there is a corresponding corrected dataset for each 'exp' forecast, the 
dataset dimension must have the same length as in 'exp'. The default value 
is NULL.}

\item{cal.method}{A character string indicating the calibration method to 
use: one of \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} 
or \code{rpc-based}. Default value is \code{mse_min}.}

\item{eval.method}{A character string indicating the sampling method to use: 
either \code{in-sample} or \code{leave-one-out}. Default value is 
the \code{leave-one-out} cross validation. If the forecast is 
provided, any chosen eval.method is overruled and a third option is 
used.}

\item{multi.model}{A boolean that is used only for the \code{mse_min} 
method. If multi-model ensembles or ensembles of different sizes are used, 
it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences 
between the two approaches are generally small but may become large when 
using small ensemble sizes. Using multi.model when the calibration method 
is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result.}

\item{na.fill}{A boolean that indicates what happens when calibration is 
not possible or would yield unreliable results. This is the case when three 
or fewer forecast-observation pairs are available for the training phase 
of the calibration. By default \code{na.fill} is \code{TRUE}, so NA 
values are returned. If \code{na.fill} is \code{FALSE}, the uncorrected 
data are returned.}

\item{na.rm}{A boolean that indicates whether to remove the NA values or 
not. The default value is \code{TRUE}.}

\item{apply_to}{A character string that indicates whether to apply the 
calibration to all forecasts (\code{"all"}) or only to those where the 
correlation between the ensemble mean and the observations is statistically 
significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}.}

\item{alpha}{A numeric value indicating the significance level for the 
correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == 
"sign"}.}

\item{memb_dim}{A character string indicating the name of the member 
dimension. By default, it is set to 'member'.}

\item{sdate_dim}{A character string indicating the name of the start date 
dimension. By default, it is set to 'sdate'.}

\item{dat_dim}{A character string indicating the name of the dataset 
dimension. The length of this dimension can be different between 'exp' and 
'obs'. The default value is NULL.}

\item{ncores}{An integer that indicates the number of cores for parallel 
computation with multiApply. The default value is NULL (one core).}
}
\value{
An array containing the calibrated forecasts with the dimensions 
nexp, nobs and the same dimensions as in the 'exp' array. nexp is the number 
of experiments (i.e., the length of 'dat_dim' in exp), and nobs is the number 
of observations (i.e., the length of 'dat_dim' in obs). If dat_dim is NULL, 
nexp and nobs are omitted. If 'exp_cor' is provided, the returned array has 
the same dimensions as 'exp_cor'.
}
\description{
Five types of member-by-member bias correction can be performed. 
The \code{"bias"} method corrects the bias only, the \code{"evmos"} method 
applies a variance inflation technique to ensure the correction of the bias 
and the correspondence of variance between forecast and observation (Van 
Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods 
\code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast 
variance and the ensemble spread as described in Doblas-Reyes et al. (2005) 
and Van Schaeybroeck and Vannitsem (2015), respectively. While the 
\code{"mse_min"} method minimizes a constrained mean-squared error using three 
parameters, the \code{"crps_min"} method features four parameters and 
minimizes the Continuous Ranked Probability Score (CRPS). The 
\code{"rpc-based"} method adjusts the forecast variance ensuring that the 
ratio of predictable components (RPC) is equal to one, as in Eade et al. 
(2014). Both in-sample and out-of-sample (leave-one-out cross 
validation) calibration are possible.
}
\details{
Both the \code{na.fill} and \code{na.rm} parameters can be used to 
indicate how the function has to handle the NA values. The \code{na.fill} 
parameter checks whether there are more than three forecast-observation pairs 
to perform the computation. If there are three or fewer pairs, the 
computation is not carried out, and the value returned by the function depends 
on the value of this parameter (either NA if \code{na.fill == TRUE} or the 
uncorrected value if \code{na.fill == FALSE}). On the other hand, \code{na.rm} 
tells the function whether to remove the missing values during 
the computation of the parameters needed to perform the calibration.
}
\examples{
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
a <- Calibration(exp = mod1, obs = obs1)
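# A hedged sketch, not part of the original examples: calibrate a separate
# forecast through 'exp_cor'. The array 'fcst1' is synthetic and its name is an
# assumption made for illustration; it holds a single start date.
fcst1 <- 1 : (1 * 3 * 1 * 5 * 6 * 7)
dim(fcst1) <- c(dataset = 1, member = 3, sdate = 1, ftime = 5, lat = 6, lon = 7)
b <- Calibration(exp = mod1, obs = obs1, exp_cor = fcst1)
# Assumed variation on the call above: in-sample correction of the bias only,
# applied to the hindcast itself.
d <- Calibration(exp = mod1, obs = obs1, cal.method = "bias",
                 eval.method = "in-sample")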
}
\references{
Doblas-Reyes, F. J., Hagedorn, R., & Palmer, T. N. (2005). The rationale 
behind the success of multi-model ensembles in seasonal forecasting-II 
calibration and combination. Tellus A, 57, 234-252. 
\doi{10.1111/j.1600-0870.2005.00104.x}

Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., 
Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate 
predictions underestimate the predictability of the real world? Geophysical 
Research Letters, 41(15), 5620-5628. \doi{10.1002/2014GL061146}

Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing 
through linear regression. Nonlinear Processes in Geophysics, 18(2), 
147. \doi{10.5194/npg-18-147-2011}

Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble 
post-processing using member-by-member approaches: theoretical aspects. 
Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. 
\doi{10.1002/qj.2397}
}
\seealso{
\code{\link{CST_Start}}
}
\author{
Verónica Torralba, \email{veronica.torralba@bsc.es}

Bert Van Schaeybroeck, \email{bertvs@meteo.be}
}