% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/CST_Calibration.R
\name{CST_Calibration}
\alias{CST_Calibration}
\title{Forecast Calibration}
\usage{
CST_Calibration(
  exp,
  obs,
  exp_cor = NULL,
  cal.method = "mse_min",
  eval.method = "leave-one-out",
  multi.model = FALSE,
  na.fill = TRUE,
  na.rm = TRUE,
  apply_to = NULL,
  alpha = NULL,
  memb_dim = "member",
  sdate_dim = "sdate",
  dat_dim = NULL,
  ncores = NULL
)
}
\arguments{
\item{exp}{An object of class \code{s2dv_cube} as returned by \code{CST_Start} 
function with at least 'sdate' and 'member' dimensions, containing the  
seasonal hindcast experiment data in the element named \code{data}. The 
hindcast is used to calibrate the forecast in case the forecast is provided; 
if not, the same hindcast will be calibrated instead.}
\item{obs}{An object of class \code{s2dv_cube} as returned by \code{CST_Start} 
function with at least 'sdate' dimension, containing the observed data in 
the element named \code{data}.}
\item{exp_cor}{An optional object of class \code{s2dv_cube} as returned by 
\code{CST_Start} function with at least 'sdate' and 'member' dimensions, 
containing the seasonal forecast experiment data in the element named 
\code{data}. If the forecast is provided, it will be calibrated using the 
hindcast and observations; if not, the hindcast will be calibrated instead. 
If there is only one corrected dataset, it should not have dataset dimension. 
If there is a corresponding corrected dataset for each 'exp' forecast, the 
dataset dimension must have the same length as in 'exp'. The default value 
is NULL.}
\item{cal.method}{A character string indicating the calibration method used, 
can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or 
\code{rpc-based}. Default value is \code{mse_min}.}
\item{eval.method}{A character string indicating the sampling method used. It 
can be either \code{in-sample} or \code{leave-one-out}. The default value is 
\code{leave-one-out} cross-validation. If the forecast is provided, any 
chosen eval.method is overruled and a third option is used: the hindcast is 
used to calibrate the forecast.}
\item{multi.model}{A boolean that is used only for the \code{mse_min} 
method. If multi-model ensembles or ensembles of different sizes are used, 
it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences 
between the two approaches are generally small but may become large when 
using small ensemble sizes. Using multi.model when the calibration method is 
\code{bias}, \code{evmos} or \code{crps_min} will not affect the result.}
\item{na.fill}{A boolean that indicates what happens in case calibration is 
not possible or would yield unreliable results. This happens when three or 
fewer forecast-observation pairs are available to perform the training phase 
of the calibration. By default \code{na.fill} is set to \code{TRUE}, so that 
NA values are returned. If \code{na.fill} is set to \code{FALSE}, the 
uncorrected data are returned.}
\item{na.rm}{A boolean that indicates whether to remove the NA values or not. 
The default value is \code{TRUE}. See Details section for further 
information about its use and compatibility with \code{na.fill}.}

\item{apply_to}{A character string that indicates whether to apply the 
calibration to all the forecasts (\code{"all"}) or only to those where the 
correlation between the ensemble mean and the observations is statistically
significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}.}

\item{alpha}{A numeric value indicating the significance level for the 
correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to 
== "sign"}.}

\item{memb_dim}{A character string indicating the name of the member dimension.
By default, it is set to 'member'.}
\item{sdate_dim}{A character string indicating the name of the start date 
dimension. By default, it is set to 'sdate'.}
\item{dat_dim}{A character string indicating the name of dataset dimension. 
The length of this dimension can be different between 'exp' and 'obs'. 
The default value is NULL.}

\item{ncores}{An integer that indicates the number of cores for parallel 
computations using multiApply function. The default value is one.}
}
\value{
An object of class \code{s2dv_cube} containing the calibrated 
forecasts in the element \code{data} with the dimensions nexp, nobs and the 
same dimensions as in the 'exp' object. nexp is the number of experiments 
(i.e., the length of 'dat_dim' in exp), and nobs is the number of 
observational datasets (i.e., the length of 'dat_dim' in obs). If dat_dim is 
NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array 
has the same dimensions as 'exp_cor'.
}
\description{
Five types of member-by-member bias correction can be performed. 
The \code{"bias"} method corrects the bias only, the \code{"evmos"} method 
applies a variance inflation technique to ensure the correction of the bias 
and the correspondence of variance between forecast and observation (Van 
Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods 
\code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast 
variance and the ensemble spread as described in Doblas-Reyes et al. (2005) 
and Van Schaeybroeck and Vannitsem (2015), respectively. While the 
\code{"mse_min"} method minimizes a constrained mean-squared error using three 
parameters, the \code{"crps_min"} method features four parameters and 
minimizes the Continuous Ranked Probability Score (CRPS). The 
\code{"rpc-based"} method adjusts the forecast variance ensuring that the 
ratio of predictable components (RPC) is equal to one, as in Eade et al. 
(2014). It is equivalent to function \code{Calibration} but for objects 
of class \code{s2dv_cube}.
}
\details{
Both the \code{na.fill} and \code{na.rm} parameters can be used to 
indicate how the function has to handle the NA values. The \code{na.fill} 
parameter checks whether there are more than three forecast-observation pairs 
to perform the computation. If there are three or fewer pairs, the 
computation is not carried out, and the value returned by the function depends 
on the value of this parameter (either NA if \code{na.fill == TRUE} or the 
uncorrected value if \code{na.fill == FALSE}). On the other hand, \code{na.rm} 
tells the function whether to remove the missing values during the 
computation of the parameters needed to perform the calibration.
}
\examples{
# Example 1:
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_Calibration(exp = exp, obs = obs, cal.method = "mse_min", eval.method = "in-sample")

# Example 2:
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
mod2 <- 1 : (1 * 3 * 1 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(mod2) <- c(dataset = 1, member = 3, sdate = 1, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
exp_cor <- list(data = mod2, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
attr(exp_cor, 'class') <- 's2dv_cube'
a <- CST_Calibration(exp = exp, obs = obs, exp_cor = exp_cor, cal.method = "evmos")
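
# Example 3 (illustrative sketch in base R, not the CSTools implementation):
# the idea behind cal.method = "bias" at a single grid point and lead time,
# assuming the correction is a plain shift by the mean hindcast bias.
# The names fc, ob, bias and fc_cal are hypothetical and used only here.
set.seed(1)
fc <- matrix(rnorm(3 * 4, mean = 2), nrow = 3)  # 3 members x 4 start dates
ob <- rnorm(4)                                  # 4 observations, one per start date
bias <- mean(colMeans(fc)) - mean(ob)           # ensemble-mean bias over start dates
fc_cal <- fc - bias                             # same shift applied to every member
# After the shift, the mean of the corrected hindcast matches the observed mean,
# while the ensemble spread at each start date is left untouched.
all.equal(mean(fc_cal), mean(ob))
all.equal(apply(fc_cal, 2, sd), apply(fc, 2, sd))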

}
\references{
Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the 
success of multi-model ensembles in seasonal forecasting-II calibration and 
combination. Tellus A. 2005;57:234-252. \doi{10.1111/j.1600-0870.2005.00104.x}

Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., 
Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate 
predictions underestimate the predictability of the real world? Geophysical 
Research Letters, 41(15), 5620-5628. \doi{10.1002/2014GL061146}

Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing 
through linear regression. Nonlinear Processes in Geophysics, 18(2), 
147. \doi{10.5194/npg-18-147-2011}

Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble 
post-processing using member-by-member approaches: theoretical aspects. 
Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818.  
\doi{10.1002/qj.2397}
}
\seealso{
\code{\link{CST_Start}}
}
\author{
Verónica Torralba, \email{veronica.torralba@bsc.es}

Bert Van Schaeybroeck, \email{bertvs@meteo.be}
}