diff --git a/.Rbuildignore b/.Rbuildignore index 6008b579d5dd55cd0c61aa4e05404c8437d91f3f..0a2185526f4f30e029d88db4c83f416220c149e6 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -1,5 +1,6 @@ .*\.git$ .*\.gitignore$ +.*\.gitlab$ .*\.tar.gz$ .*\.pdf$ .*^(?!inst)\.nc$ diff --git a/.gitlab/issue_templates/Default.md b/.gitlab/issue_templates/Default.md new file mode 100644 index 0000000000000000000000000000000000000000..30ba62047b6689018c721a83ea68ef042346cf78 --- /dev/null +++ b/.gitlab/issue_templates/Default.md @@ -0,0 +1,26 @@ +(This is a template to report problems or suggest a new development. Please fill in the relevant information and remove the rest.) + +Hi @aho (and @erifarov), + +#### Summary +(Bug: Summarize the bug and explain briefly the expected and the current behavior.) +(New development: Summarize the development needed.) + +#### Example +(Bug: Provide a **minimal reproducible example** and the error message. See [How To Build A Minimal Reproducible Example](https://docs.google.com/document/d/1zRlmsRwFDJctDB94x6HGf6ezu3HFHhEjaBu0hVcrwTI/edit#heading=h.skblym4acpw5)) +(New development: Provide an example script or useful piece of code if appliable.) + +```r +#Example: +exp <- PlotEquiMap(...) +``` +> Error in ColorBar: &%$("!* + +#### Module and Package Version +(Which R version are you using? e.g., R/4.1.2) +(What other modules and their versions required to reproduce this issue? e.g., PROJ/4.8.0-foss-2015a) +(Which R package versions are you using? Check with sessionInfo(). e.g., s2dv_1.4.1) +(Which machine are you using? WS, Nord3, hub, others...) + +#### Other Relevant Information +(Additional information, e.g., the plots.) diff --git a/DESCRIPTION b/DESCRIPTION index 3255c011668ec2f1cb85adc92bbc0009c6e154cb..ec571f4ce6bd0ba036ae274b7e085237d6c19863 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: s2dv Title: A Set of Common Tools for Seasonal to Decadal Verification -Version: 1.4.1 +Version: 2.0.0 Authors@R: c( person("BSC-CNS", role = c("aut", "cph")), person("An-Chi", "Ho", , "an.ho@bsc.es", role = c("aut", "cre")), @@ -51,3 +51,4 @@ LazyData: true SystemRequirements: cdo Encoding: UTF-8 RoxygenNote: 7.2.0 +Config/testthat/edition: 3 diff --git a/NAMESPACE b/NAMESPACE index 6224a157d96700787b4a63b5897d1d6c71d53996..9214a1a54c1b7c4bc7e1e81203b51ac5e674cc26 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -36,10 +36,13 @@ export(EuroAtlanticTC) export(Filter) export(GMST) export(GSAT) +export(GetProbs) export(Histo2Hindcast) export(InsertDim) export(LeapYear) export(Load) +export(MSE) +export(MSSS) export(MeanDims) export(NAO) export(Persistence) diff --git a/NEWS.md b/NEWS.md index a69b74c328eef248e7c6cdac9e839937fdb7feea..4ea74df7b20bb85cb94f3e81cdea5a1379c3551c 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,23 @@ +# s2dv 2.0.0 (Release date: 2023-10-11) +The compability break happens at the parameter changes. All the functionality remains +the same but please pay attention to the parameter changes like name or default value if some error is raised. + +**Bugfixes** +- ColorBar() bug fix for an if condition for warning when var_limits is not provided +- PlotEquiMap() and PlotLayout() are able to plot all NAs maps now. + +**Development** +- ACC() remove parameter "space_dim". Use "lat_dim" and "lon_dim" instead. +- ACC(), Ano_CrossValid(), RMS(), Corr(), and RatioSDRMS() parameter "memb_dim" is optional for obs +- Change the default value of the parameter "dat_dim" in all the functions to NULL (except Ano_CrossValid(), Clim(), and Consistent_Trend()) +- Change parameter "conf.lev" to "alpha" in all appliable functions +- New function: GetProbs(), MSE(), MSSS() +- RPSS() efficiency improvement +- CDORemap() new parameter "ncores" to use multiple cores +- RMSSS(), RPSS(), CRPSS(), AbsBiasSS() have parameter "sig_method.type" to choose the test type of Random Walk test +- CRPSS() has non-cross-validation climatological forecast +- RPS() and RPSS() have new parameter "na.rm" to set the criterion of NA amount + # s2dv 1.4.1 (Release date: 2023-06-02) - Resubmit to CRAN because it was archived due to dependency issue. diff --git a/R/ACC.R b/R/ACC.R index 64036f41e87eada3dcd6d4fba39614bdbcacde75..131d15a0ae3d0d442aa6867193ff3900654a6c1f 100644 --- a/R/ACC.R +++ b/R/ACC.R @@ -16,18 +16,13 @@ #' The dimension should be the same as 'exp' except the length of 'dat_dim' #' and 'memb_dim'. #'@param dat_dim A character string indicating the name of dataset (nobs/nexp) -#' dimension. The default value is 'dataset'. If there is no dataset -#' dimension, set NULL. +#' dimension. The default value is NULL (no dataset). #'@param lat_dim A character string indicating the name of the latitude #' dimension of 'exp' and 'obs' along which ACC is computed. The default value #' is 'lat'. #'@param lon_dim A character string indicating the name of the longitude #' dimension of 'exp' and 'obs' along which ACC is computed. The default value #' is 'lon'. -#'@param space_dim A character string vector of 2 indicating the name of the -#' latitude and longitude dimensions (in order) along which ACC is computed. -#' The default value is c('lat', 'lon'). This argument has been deprecated. -#' Use 'lat_dim' and 'lon_dim' instead. #'@param avg_dim A character string indicating the name of the dimension to be #' averaged, which is usually the time dimension. If no need to calculate mean #' ACC, set as NULL. The default value is 'sdate'. @@ -42,6 +37,13 @@ #'@param lonlatbox A numeric vector of 4 indicating the corners of the domain of #' interested: c(lonmin, lonmax, latmin, latmax). The default value is NULL #' and the whole data will be used. +#'@param alpha A numeric indicating the significance level for the statistical +#' significance test. The default value is 0.05. +#'@param pval A logical value indicating whether to compute the p-value or not. +#' The default value is TRUE. +#'@param sign A logical value indicating whether to retrieve the statistical +#' significance of the test Ho: ACC = 0 based on 'alpha'. The default value is +#' FALSE. #'@param conf A logical value indicating whether to retrieve the confidence #' intervals or not. The default value is TRUE. #'@param conftype A charater string of "parametric" or "bootstrap". @@ -53,10 +55,6 @@ #' make sure that your experiment and observation always have the same number #' of members. "bootstrap" requires 'memb_dim' has value. The default value is #' 'parametric'. -#'@param conf.lev A numeric indicating the confidence level for the -#' regression computation. The default value is 0.95. -#'@param pval A logical value indicating whether to compute the p-value or not. -#' The default value is TRUE. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -68,6 +66,12 @@ #' exp), and nobs is the number of observation (i.e., dat_dim in obs). If #' dat_dim is NULL, nexp and nobs are omitted. #'} +#'\item{macc}{ +#' The mean anomaly correlation coefficient with dimensions +#' c(nexp, nobs, the rest of the dimension except lat_dim, lon_dim, memb_dim, and +#' avg_dim). Only present if 'avg_dim' is not NULL. If dat_dim is NULL, nexp +#' and nobs are omitted. +#'} #'\item{conf.lower (if conftype = "parametric") or acc_conf.lower (if #' conftype = "bootstrap")}{ #' The lower confidence interval of ACC with the same dimensions as ACC. Only @@ -80,13 +84,10 @@ #'} #'\item{p.val}{ #' The p-value with the same dimensions as ACC. Only present if -#' \code{pval = TRUE} and code{conftype = "parametric"}. +#' \code{pval = TRUE} and \code{conftype = "parametric"}. #'} -#'\item{macc}{ -#' The mean anomaly correlation coefficient with dimensions -#' c(nexp, nobs, the rest of the dimension except lat_dim, lon_dim, memb_dim, and -#' avg_dim). Only present if 'avg_dim' is not NULL. If dat_dim is NULL, nexp -#' and nobs are omitted. +#'\item{$sign}{ +#' The statistical significance. Only present if \code{sign = TRUE}. #'} #'\item{macc_conf.lower}{ #' The lower confidence interval of MACC with the same dimensions as MACC. @@ -113,8 +114,9 @@ #'clim <- Clim(sampleData$mod, sampleData$obs) #'ano_exp <- Ano(sampleData$mod, clim$clim_exp) #'ano_obs <- Ano(sampleData$obs, clim$clim_obs) -#'acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat) -#'acc_bootstrap <- ACC(ano_exp, ano_obs, conftype = 'bootstrap', lat = sampleData$lat) +#'acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat, dat_dim = 'dataset') +#'acc_bootstrap <- ACC(ano_exp, ano_obs, conftype = 'bootstrap', +#' lat = sampleData$lat, dat_dim = 'dataset') #'# Combine acc results for PlotACC #'res <- array(c(acc$conf.lower, acc$acc, acc$conf.upper, acc$p.val), #' dim = c(dim(acc$acc), 4)) @@ -132,10 +134,10 @@ #'@importFrom stats qt qnorm quantile #'@importFrom ClimProjDiags Subset #'@export -ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', - space_dim = c('lat', 'lon'), avg_dim = 'sdate', memb_dim = 'member', - lat = NULL, lon = NULL, lonlatbox = NULL, - conf = TRUE, conftype = "parametric", conf.lev = 0.95, pval = TRUE, +ACC <- function(exp, obs, dat_dim = NULL, lat_dim = 'lat', lon_dim = 'lon', + avg_dim = 'sdate', memb_dim = 'member', + lat = NULL, lon = NULL, lonlatbox = NULL, alpha = 0.05, + pval = TRUE, sign = FALSE, conf = TRUE, conftype = "parametric", ncores = NULL) { # Check inputs @@ -154,26 +156,16 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { stop("Parameter 'exp' and 'obs' must have dimension names.") } - if(!all(names(dim(exp)) %in% names(dim(obs))) | - !all(names(dim(obs)) %in% names(dim(exp)))) { - stop("Parameter 'exp' and 'obs' must have same dimension names.") - } ## dat_dim if (!is.null(dat_dim)) { if (!is.character(dat_dim) | length(dat_dim) > 1) { stop("Parameter 'dat_dim' must be a character string or NULL.") } if (!dat_dim %in% names(dim(exp)) | !dat_dim %in% names(dim(obs))) { - stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension. ", "Set it as NULL if there is no dataset dimension.") } } - ## space_dim (deprecated) - if (!missing("space_dim")) { - .warning("Parameter 'space_dim' is deprecated. Use 'lat_dim' and 'lon_dim' instead.") - lat_dim <- space_dim[1] - lon_dim <- space_dim[2] - } ## lat_dim if (!is.character(lat_dim) | length(lat_dim) != 1) { stop("Parameter 'lat_dim' must be a character string.") @@ -202,10 +194,19 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', if (!is.character(memb_dim) | length(memb_dim) > 1) { stop("Parameter 'memb_dim' must be a character string.") } - if (!memb_dim %in% names(dim(exp)) | !memb_dim %in% names(dim(obs))) { - stop("Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension.", + if (!memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + stop("Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. ", "Set it as NULL if there is no member dimension.") } + # Add [member = 1] + if (memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + dim(obs) <- c(dim(obs), 1) + names(dim(obs))[length(dim(obs))] <- memb_dim + } + if (!memb_dim %in% names(dim(exp)) & memb_dim %in% names(dim(obs))) { + dim(exp) <- c(dim(exp), 1) + names(dim(exp))[length(dim(exp))] <- memb_dim + } } ## lat if (is.null(lat)) { @@ -234,6 +235,18 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', } else { select_lonlat <- FALSE } + ## alpha + if (!is.numeric(alpha) | any(alpha < 0) | any(alpha > 1) | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") + } + ## pval + if (!is.logical(pval) | length(pval) > 1) { + stop("Parameter 'pval' must be one logical value.") + } + ## sign + if (!is.logical(sign) | length(sign) > 1) { + stop("Parameter 'sign' must be one logical value.") + } ## conf if (!is.logical(conf) | length(conf) > 1) { stop("Parameter 'conf' must be one logical value.") @@ -246,15 +259,6 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', if (conftype == 'bootstrap' & is.null(memb_dim)) { stop("Parameter 'memb_dim' cannot be NULL when parameter 'conftype' is 'bootstrap'.") } - ## conf.lev - if (!is.numeric(conf.lev) | any(conf.lev < 0) | any(conf.lev > 1) | - length(conf.lev) > 1) { - stop("Parameter 'conf.lev' must be a numeric number between 0 and 1.") - } - } - ## pval - if (!is.logical(pval) | length(pval) > 1) { - stop("Parameter 'pval' must be one logical value.") } ## ncores if (!is.null(ncores)) { @@ -266,6 +270,9 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', ## exp and obs (2) name_exp <- sort(names(dim(exp))) name_obs <- sort(names(dim(obs))) + if(!all(name_exp %in% name_obs) | !all(name_obs %in% name_exp)) { + stop("Parameter 'exp' and 'obs' must have same dimension names.") + } if (!is.null(dat_dim)) { name_exp <- name_exp[-which(name_exp == dat_dim)] name_obs <- name_obs[-which(name_obs == dat_dim)] @@ -274,7 +281,7 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', name_exp <- name_exp[-which(name_exp == memb_dim)] name_obs <- name_obs[-which(name_obs == memb_dim)] } - if (!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + if (!identical(dim(exp)[name_exp], dim(obs)[name_obs])) { stop(paste0("Parameter 'exp' and 'obs' must have same length of ", "all the dimensions except 'dat_dim' and 'memb_dim'.")) } @@ -329,8 +336,8 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', fun = .ACC, dat_dim = dat_dim, avg_dim = avg_dim, lat = lat, - conftype = conftype, pval = pval, conf = conf, conf.lev = conf.lev, - ncores = ncores) + conftype = conftype, pval = pval, conf = conf, alpha = alpha, + sign = sign, ncores = ncores) # If bootstrap, calculate confidence level if (conftype == 'bootstrap') { @@ -346,8 +353,8 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', fun = .ACC_bootstrap, dat_dim = dat_dim, memb_dim = memb_dim, avg_dim = avg_dim, lat = lat, - conftype = conftype, pval = pval, conf = conf, conf.lev = conf.lev, - ncores = ncores) + conftype = conftype, pval = pval, conf = conf, alpha = alpha, + sign = sign, ncores = ncores) #NOTE: pval? res <- list(acc = res$acc, acc_conf.lower = res_conf$acc_conf.lower, @@ -360,8 +367,8 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', return(res) } -.ACC <- function(exp, obs, dat_dim = 'dataset', avg_dim = 'sdate', lat, - conf = TRUE, conftype = "parametric", conf.lev = 0.95, pval = TRUE) { +.ACC <- function(exp, obs, dat_dim = NULL, avg_dim = 'sdate', lat, alpha = 0.05, + pval = TRUE, sign = FALSE, conf = TRUE, conftype = "parametric") { # .ACC() should use all the spatial points to calculate ACC. It returns [nexp, nobs]. # If dat_dim = NULL, it returns a number. @@ -377,13 +384,14 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', nexp <- 1 nobs <- 1 } else { - nexp <- as.numeric(dim(exp)[length(dim(exp))]) - nobs <- as.numeric(dim(obs)[length(dim(obs))]) + nexp <- as.numeric(dim(exp)[dat_dim]) + nobs <- as.numeric(dim(obs)[dat_dim]) } if (is.null(avg_dim)) { acc <- array(dim = c(nexp = nexp, nobs = nobs)) if (pval) p.val <- array(dim = c(nexp = nexp, nobs = nobs)) + if (sign) signif <- array(dim = c(nexp = nexp, nobs = nobs)) if (conf) { conf.upper <- array(dim = c(nexp = nexp, nobs = nobs)) conf.lower <- array(dim = c(nexp = nexp, nobs = nobs)) @@ -394,6 +402,7 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', names(dim(acc))[3] <- avg_dim macc <- array(dim = c(nexp = nexp, nobs = nobs)) if (pval) p.val <- array(dim = c(nexp = nexp, nobs = nobs, dim(exp)[avg_dim])) + if (sign) signif <- array(dim = c(nexp = nexp, nobs = nobs, dim(exp)[avg_dim])) if (conf) { conf.upper <- array(dim = c(nexp = nexp, nobs = nobs, dim(exp)[avg_dim])) conf.lower <- array(dim = c(nexp = nexp, nobs = nobs, dim(exp)[avg_dim])) @@ -416,12 +425,12 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', } else { # [lat, lon, dat], [lat, lon, avg_dim], or [lat, lon, avg_dim, dat] # exp exp <- array(exp, dim = c(prod(dim_exp[1:2]), dim_exp[3:length(dim_exp)])) - mean_exp <- apply(exp, 2:length(dim(exp)), mean, na.rm = TRUE) # [avg_dim, (dat)] + mean_exp <- colMeans(exp, na.rm = TRUE) # [avg_dim, (dat)] mean_exp <- rep(as.vector(mean_exp), each = prod(dim_exp[1:2])) exp <- array(sqrt(wt) * (as.vector(exp) - mean_exp), dim = dim_exp) # obs obs <- array(obs, dim = c(prod(dim_obs[1:2]), dim_obs[3:length(dim_obs)])) - mean_obs <- apply(obs, 2:length(dim(obs)), mean, na.rm = TRUE) # [avg_dim, (dat)] + mean_obs <- colMeans(obs, na.rm = TRUE) # [avg_dim, (dat)] mean_obs <- rep(as.vector(mean_obs), each = prod(dim_obs[1:2])) obs <- array(sqrt(wt) * (as.vector(obs) - mean_obs), dim = dim_obs) } @@ -452,19 +461,21 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', # handle bottom = 0 if (is.infinite(acc[iexp, iobs])) acc[iexp, iobs] <- NA - # pval and conf - if (pval | conf) { + # pval, sign, and conf + if (pval | conf | sign) { if (conftype == "parametric") { # calculate effective sample size eno <- .Eno(as.vector(obs_sub), na.action = na.pass) - if (pval) { - t <- qt(conf.lev, eno - 2) # a number - p.val[iexp, iobs] <- sqrt(t^2 / (t^2 + eno - 2)) + if (pval | sign) { + t <- qt(1 - alpha, eno - 2) # a number + p.value <- sqrt(t^2 / (t^2 + eno - 2)) + if (pval) p.val[iexp, iobs] <- p.value + if (sign) signif[iexp, iobs] <- !is.na(p.value) & p.value <= alpha } if (conf) { - conf.upper[iexp, iobs] <- tanh(atanh(acc[iexp, iobs]) + qnorm(1 - (1 - conf.lev) / 2) / sqrt(eno - 3)) - conf.lower[iexp, iobs] <- tanh(atanh(acc[iexp, iobs]) + qnorm((1 - conf.lev) / 2) / sqrt(eno - 3)) + conf.upper[iexp, iobs] <- tanh(atanh(acc[iexp, iobs]) + qnorm(1 - alpha / 2) / sqrt(eno - 3)) + conf.lower[iexp, iobs] <- tanh(atanh(acc[iexp, iobs]) + qnorm(alpha / 2) / sqrt(eno - 3)) } } } @@ -491,8 +502,8 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', if (is.infinite(acc[iexp, iobs, i])) acc[iexp, iobs, i] <- NA } - # pval and conf - if (pval | conf) { + # pval, sign, and conf + if (pval | sign | conf) { if (conftype == "parametric") { # calculate effective sample size along lat_dim and lon_dim # combine lat_dim and lon_dim into one dim first @@ -500,15 +511,17 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', dim = c(space = prod(dim(obs_sub)[1:2]), dim(obs_sub)[3])) eno <- apply(obs_tmp, 2, .Eno, na.action = na.pass) # a vector of avg_dim - if (pval) { - t <- qt(conf.lev, eno - 2) # a vector of avg_dim - p.val[iexp, iobs, ] <- sqrt(t^2 / (t^2 + eno - 2)) + if (pval | sign) { + t <- qt(1 - alpha, eno - 2) # a vector of avg_dim + p.value <- sqrt(t^2 / (t^2 + eno - 2)) + if (pval) p.val[iexp, iobs, ] <- p.value + if (sign) signif[iexp, iobs, ] <- !is.na(p.value) & p.value <= alpha } if (conf) { conf.upper[iexp, iobs, ] <- tanh(atanh(acc[iexp, iobs, ]) + - qnorm(1 - (1 - conf.lev) / 2) / sqrt(eno - 3)) + qnorm(1 - alpha / 2) / sqrt(eno - 3)) conf.lower[iexp, iobs, ] <- tanh(atanh(acc[iexp, iobs, ]) + - qnorm((1 - conf.lev) / 2) / sqrt(eno - 3)) + qnorm(alpha / 2) / sqrt(eno - 3)) } } } @@ -527,9 +540,9 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', conf.lower <- as.vector(conf.lower) conf.upper <- as.vector(conf.upper) } - if (pval) { - p.val <- as.vector(p.val) - } + if (pval) p.val <- as.vector(p.val) + if (sign) signif <- as.vector(signif) + } else { dim(acc) <- dim(acc)[3:length(dim(acc))] macc <- as.vector(macc) @@ -537,55 +550,48 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', dim(conf.lower) <- dim(conf.lower)[3:length(dim(conf.lower))] dim(conf.upper) <- dim(conf.upper)[3:length(dim(conf.upper))] } - if (pval) { - dim(p.val) <- dim(p.val)[3:length(dim(p.val))] - } + if (pval) dim(p.val) <- dim(p.val)[3:length(dim(p.val))] + if (sign) dim(signif) <- dim(signif)[3:length(dim(signif))] } } # Return output if (is.null(avg_dim)) { - if (conf & pval) { - return(list(acc = acc, conf.lower = conf.lower, conf.upper = conf.upper, - p.val = p.val)) - } else if (conf & !pval) { - return(list(acc = acc, conf.lower = conf.lower, conf.upper = conf.upper, - macc = macc)) - } else if (!conf & pval) { - return(list(acc = acc, p.val = p.val)) - } else { - return(list(acc = acc)) - } + output <- list(acc = acc) } else { - if (conf & pval) { - return(list(acc = acc, conf.lower = conf.lower, conf.upper = conf.upper, - p.val = p.val, macc = macc)) - } else if (conf & !pval) { - return(list(acc = acc, conf.lower = conf.lower, conf.upper = conf.upper, - macc = macc)) - } else if (!conf & pval) { - return(list(acc = acc, p.val = p.val, macc = macc)) - } else { - return(list(acc = acc, macc = macc)) - } + output <- list(acc = acc, macc = macc) } + if (conf) output <- c(output, list(conf.lower = conf.lower, conf.upper = conf.upper)) + if (pval) output <- c(output, list(p.val = p.val)) + if (sign) output <- c(output, list(sign = signif)) + return(output) } -.ACC_bootstrap <- function(exp, obs, dat_dim = 'dataset', - avg_dim = 'sdate', memb_dim = NULL, lat, - conf = TRUE, conftype = "parametric", conf.lev = 0.95, - pval = TRUE) { +.ACC_bootstrap <- function(exp, obs, dat_dim = NULL, + avg_dim = 'sdate', memb_dim = NULL, lat, alpha = 0.05, + pval = TRUE, sign = FALSE, conf = TRUE, conftype = "parametric") { # if (is.null(avg_dim)) - # exp: [memb_exp, dat_exp, lat, lon] - # obs: [memb_obs, dat_obs, lat, lon] + # exp: [memb_exp, (dat_exp), lat, lon] + # obs: [memb_obs, (dat_obs), lat, lon] # if (!is.null(avg_dim)) - # exp: [memb_exp, dat_exp, avg_dim, lat, lon] - # obs: [memb_obs, dat_obs, avg_dim, lat, lon] + # exp: [memb_exp, (dat_exp), avg_dim, lat, lon] + # obs: [memb_obs, (dat_obs), avg_dim, lat, lon] + + if (is.null(dat_dim)) { + nexp <- 1 + nobs <- 1 + dim(exp) <- c(dim(exp)[1], dat = 1, dim(exp)[-1]) + dim(obs) <- c(dim(obs)[1], dat = 1, dim(obs)[-1]) + dat_dim <- 'dat' + remove_dat_dim <- TRUE + } else { + nexp <- as.numeric(dim(exp)[dat_dim]) + nobs <- as.numeric(dim(obs)[dat_dim]) + remove_dat_dim <- FALSE + } - nexp <- as.numeric(dim(exp)[2]) - nobs <- as.numeric(dim(obs)[2]) nmembexp <- as.numeric(dim(exp)[1]) nmembobs <- as.numeric(dim(obs)[1]) @@ -633,8 +639,8 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', } #calculate the ACC of the randomized field - tmpACC <- .ACC(drawexp, drawobs, conf = FALSE, pval = FALSE, avg_dim = avg_dim, - lat = lat) + tmpACC <- .ACC(drawexp, drawobs, conf = FALSE, pval = FALSE, sign = FALSE, + avg_dim = avg_dim, lat = lat, dat_dim = dat_dim) if (is.null(avg_dim)) { acc_draw[, , jdraw] <- tmpACC$acc } else { @@ -645,26 +651,38 @@ ACC <- function(exp, obs, dat_dim = 'dataset', lat_dim = 'lat', lon_dim = 'lon', #calculate the confidence interval if (is.null(avg_dim)) { - acc_conf.upper <- apply(acc_draw, c(1, 2), - function (x) { - quantile(x, 1 - (1 - conf.lev) / 2, na.rm = TRUE)}) - acc_conf.lower <- apply(acc_draw, c(1, 2), - function (x) { - quantile(x, (1 - conf.lev) / 2, na.rm = TRUE)}) + acc_conf.upper <- apply(acc_draw, c(1, 2), + function (x) { + quantile(x, 1 - alpha / 2, na.rm = TRUE)}) + acc_conf.lower <- apply(acc_draw, c(1, 2), + function (x) { + quantile(x, alpha / 2, na.rm = TRUE)}) } else { - acc_conf.upper <- apply(acc_draw, c(1, 2, 3), - function (x) { - quantile(x, 1 - (1 - conf.lev) / 2, na.rm = TRUE)}) - acc_conf.lower <- apply(acc_draw, c(1, 2, 3), - function (x) { - quantile(x, (1 - conf.lev) / 2, na.rm = TRUE)}) - macc_conf.upper <- apply(macc_draw, c(1, 2), - function (x) { - quantile(x, 1 - (1 - conf.lev) / 2, na.rm = TRUE)}) - macc_conf.lower <- apply(macc_draw, c(1, 2), - function (x) { - quantile(x, (1 - conf.lev) / 2, na.rm = TRUE)}) + acc_conf.upper <- apply(acc_draw, c(1, 2, 3), + function (x) { + quantile(x, 1 - alpha / 2, na.rm = TRUE)}) + acc_conf.lower <- apply(acc_draw, c(1, 2, 3), + function (x) { + quantile(x, alpha / 2, na.rm = TRUE)}) + macc_conf.upper <- apply(macc_draw, c(1, 2), + function (x) { + quantile(x, 1 - alpha / 2, na.rm = TRUE)}) + macc_conf.lower <- apply(macc_draw, c(1, 2), + function (x) { + quantile(x, alpha / 2, na.rm = TRUE)}) + } + + if (remove_dat_dim) { + if (is.null(avg_dim)) { + dim(acc_conf.lower) <- NULL + dim(acc_conf.upper) <- NULL + } else { + dim(acc_conf.lower) <- dim(acc_conf.lower)[-c(1, 2)] + dim(acc_conf.upper) <- dim(acc_conf.upper)[-c(1, 2)] + dim(macc_conf.lower) <- NULL + dim(macc_conf.upper) <- NULL + } } # Return output diff --git a/R/AbsBiasSS.R b/R/AbsBiasSS.R index 8cb1bcee6edabb662aace1ba9e1aaa8da5e3ba82..e55d3d8f6a106a481f528c139d6e9b09c952e5d6 100644 --- a/R/AbsBiasSS.R +++ b/R/AbsBiasSS.R @@ -10,9 +10,9 @@ #'of reference forecasts are the climatological forecast (average of the #'observations), a previous model version, or another model. It is computed as #'\code{AbsBiasSS = 1 - AbsBias_exp / AbsBias_ref}. The statistical significance -#'is obtained based on a Random Walk test at the 95% confidence level (DelSole -#'and Tippett, 2016). If there is more than one dataset, the result will be -#'computed for each pair of exp and obs data. +#'is obtained based on a Random Walk test at the confidence level specified +#'(DelSole and Tippett, 2016). If there is more than one dataset, the result +#'will be computed for each pair of exp and obs data. #' #'@param exp A named numerical array of the forecast with at least time #' dimension. @@ -36,6 +36,12 @@ #' The default value is NULL. #'@param na.rm A logical value indicating if NAs should be removed (TRUE) or #' kept (FALSE) for computation. The default value is FALSE. +#'@param sig_method.type A character string indicating the test type of the +#' significance method. Check \code{RandomWalkTest()} parameter +#' \code{test.type} for details. The default is 'two.sided.approx', which is +#' the default of \code{RandomWalkTest()}. +#'@param alpha A numeric of the significance level to be used in the statistical +#' significance test. The default value is 0.05. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -65,7 +71,8 @@ #'@import multiApply #'@export AbsBiasSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = NULL, - dat_dim = NULL, na.rm = FALSE, ncores = NULL) { + dat_dim = NULL, na.rm = FALSE, sig_method.type = 'two.sided.approx', + alpha = 0.05, ncores = NULL) { # Check inputs ## exp, obs, and ref (1) @@ -163,6 +170,22 @@ AbsBiasSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = NULL, if (!is.logical(na.rm) | length(na.rm) > 1) { stop("Parameter 'na.rm' must be one logical value.") } + ## alpha + if (any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | length(alpha) > 1)) { + stop("Parameter 'alpha' must be a number between 0 and 1.") + } + ## sig_method.type + #NOTE: These are the types of RandomWalkTest() + if (!sig_method.type %in% c('two.sided.approx', 'two.sided', 'greater', 'less')) { + stop("Parameter 'sig_method.type' must be 'two.sided.approx', 'two.sided', 'greater', or 'less'.") + } + if (sig_method.type == 'two.sided.approx') { + if (alpha != 0.05) { + .warning("DelSole and Tippett (2016) aproximation is valid for alpha ", + "= 0.05 only. Returning the significance at the 0.05 significance level.") + } + } + ## ncores if (!is.null(ncores)) { if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | @@ -202,13 +225,14 @@ AbsBiasSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = NULL, target_dims = target_dims, fun = .AbsBiasSS, dat_dim = dat_dim, - na.rm = na.rm, + na.rm = na.rm, alpha = alpha, sig_method.type = sig_method.type, ncores = ncores) return(output) } -.AbsBiasSS <- function(exp, obs, ref = NULL, dat_dim = NULL, na.rm = FALSE) { +.AbsBiasSS <- function(exp, obs, ref = NULL, dat_dim = NULL, na.rm = FALSE, + sig_method.type = 'two.sided.approx', alpha = 0.05) { # exp and obs: [sdate, (dat_dim)] # ref: [sdate, (dat_dim)] or NULL @@ -267,7 +291,9 @@ AbsBiasSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = NULL, bias_ref <- .Bias(exp = ref_data, obs = obs_data, na.rm = na.rm, absolute = TRUE, time_mean = FALSE) ## Skill score and significance biasSS[i, j] <- 1 - mean(bias_exp) / mean(bias_ref) - sign[i, j] <- .RandomWalkTest(skill_A = bias_exp, skill_B = bias_ref, sign = T, pval = F)$sign + sign[i, j] <- .RandomWalkTest(skill_A = bias_exp, skill_B = bias_ref, + test.type = sig_method.type, alpha = alpha, + sign = T, pval = F)$sign } } diff --git a/R/Ano_CrossValid.R b/R/Ano_CrossValid.R index d1996b9cd84023e8dc37e11af20b32b6e1cedd0f..13f7e977c37a1796879047f2be73324967d65f7a 100644 --- a/R/Ano_CrossValid.R +++ b/R/Ano_CrossValid.R @@ -72,10 +72,6 @@ Ano_CrossValid <- function(exp, obs, time_dim = 'sdate', dat_dim = c('dataset', any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { stop("Parameter 'exp' and 'obs' must have dimension names.") } - if(!all(names(dim(exp)) %in% names(dim(obs))) | - !all(names(dim(obs)) %in% names(dim(exp)))) { - stop("Parameter 'exp' and 'obs' must have same dimension name.") - } ## time_dim if (!is.character(time_dim) | length(time_dim) > 1) { stop("Parameter 'time_dim' must be a character string.") @@ -83,13 +79,38 @@ Ano_CrossValid <- function(exp, obs, time_dim = 'sdate', dat_dim = c('dataset', if (!time_dim %in% names(dim(exp)) | !time_dim %in% names(dim(obs))) { stop("Parameter 'time_dim' is not found in 'exp' or 'obs' dimension.") } + ## memb + if (!is.logical(memb) | length(memb) > 1) { + stop("Parameter 'memb' must be one logical value.") + } + ## memb_dim + if (!memb) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + stop("Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. ", + "Set it as NULL if there is no member dimension.") + } +# # Add [member = 1] +# if (memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { +# dim(obs) <- c(dim(obs), 1) +# names(dim(obs))[length(dim(obs))] <- memb_dim +# } +# if (!memb_dim %in% names(dim(exp)) & memb_dim %in% names(dim(obs))) { +# dim(exp) <- c(dim(exp), 1) +# names(dim(exp))[length(dim(exp))] <- memb_dim +# } + } + ## dat_dim + reset_obs_dim <- reset_exp_dim <- FALSE if (!is.null(dat_dim)) { if (!is.character(dat_dim)) { stop("Parameter 'dat_dim' must be a character vector.") } - if (!all(dat_dim %in% names(dim(exp))) | !all(dat_dim %in% names(dim(obs)))) { - stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + if (!any(dat_dim %in% names(dim(exp))) & !any(dat_dim %in% names(dim(obs)))) { + stop("Parameter 'dat_dim' is not found in 'exp' nor 'obs' dimension.", " Set it as NULL if there is no dataset dimension.") } # If dat_dim is not in obs, add it in @@ -98,28 +119,22 @@ Ano_CrossValid <- function(exp, obs, time_dim = 'sdate', dat_dim = c('dataset', ori_obs_dim <- dim(obs) dim(obs) <- c(dim(obs), rep(1, length(dat_dim[which(!dat_dim %in% names(dim(obs)))]))) names(dim(obs)) <- c(names(ori_obs_dim), dat_dim[which(!dat_dim %in% names(dim(obs)))]) - } else { - reset_obs_dim <- FALSE } - } else { - reset_obs_dim <- FALSE - } - ## memb - if (!is.logical(memb) | length(memb) > 1) { - stop("Parameter 'memb' must be one logical value.") + # If dat_dim is not in obs, add it in + if (any(!dat_dim %in% names(dim(exp)))) { + reset_exp_dim <- TRUE + ori_exp_dim <- dim(exp) + dim(exp) <- c(dim(exp), rep(1, length(dat_dim[which(!dat_dim %in% names(dim(exp)))]))) + names(dim(exp)) <- c(names(ori_exp_dim), dat_dim[which(!dat_dim %in% names(dim(exp)))]) + } } - ## memb_dim + # memb_dim and dat_dim if (!memb) { - if (!is.character(memb_dim) | length(memb_dim) > 1) { - stop("Parameter 'memb_dim' must be a character string.") - } - if (!memb_dim %in% names(dim(exp)) | !memb_dim %in% names(dim(obs))) { - stop("Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension.") - } if (!memb_dim %in% dat_dim) { stop("Parameter 'memb_dim' must be one element in parameter 'dat_dim'.") - } + } } + ## ncores if (!is.null(ncores)) { if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | @@ -136,7 +151,7 @@ Ano_CrossValid <- function(exp, obs, time_dim = 'sdate', dat_dim = c('dataset', name_obs <- name_obs[-which(name_obs == dat_dim[i])] } } - if(!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + if (!identical(dim(exp)[name_exp], dim(obs)[name_obs])) { stop(paste0("Parameter 'exp' and 'obs' must have the same length of ", "all dimensions except 'dat_dim'.")) } @@ -160,10 +175,10 @@ Ano_CrossValid <- function(exp, obs, time_dim = 'sdate', dat_dim = c('dataset', outrows_exp <- MeanDims(exp, pos, na.rm = FALSE) + MeanDims(obs, pos, na.rm = FALSE) outrows_obs <- outrows_exp - - for (i in 1:length(pos)) { - outrows_exp <- InsertDim(outrows_exp, pos[i], dim(exp)[pos[i]]) - outrows_obs <- InsertDim(outrows_obs, pos[i], dim(obs)[pos[i]]) +#browser() + for (i_pos in sort(pos)) { + outrows_exp <- InsertDim(outrows_exp, i_pos, dim(exp)[i_pos]) + outrows_obs <- InsertDim(outrows_obs, i_pos, dim(obs)[i_pos]) } exp_for_clim <- exp obs_for_clim <- obs @@ -184,16 +199,24 @@ Ano_CrossValid <- function(exp, obs, time_dim = 'sdate', dat_dim = c('dataset', # Remove dat_dim in obs if obs doesn't have at first place if (reset_obs_dim) { - res_obs_dim <- ori_obs_dim[-which(names(ori_obs_dim) == time_dim)] - if (!memb & memb_dim %in% names(res_obs_dim)) { - res_obs_dim <- res_obs_dim[-which(names(res_obs_dim) == memb_dim)] - } - if (is.integer(res_obs_dim) & length(res_obs_dim) == 0) { - res$obs <- as.vector(res$obs) - } else { - res$obs <- array(res$obs, dim = res_obs_dim) - } + tmp <- match(names(dim(res$obs)), names(ori_obs_dim)) + dim(res$obs) <- ori_obs_dim[tmp[which(!is.na(tmp))]] } + if (reset_exp_dim) { + tmp <- match(names(dim(res$exp)), names(ori_exp_dim)) + dim(res$exp) <- ori_exp_dim[tmp[which(!is.na(tmp))]] + } + +# res_obs_dim <- ori_obs_dim[-which(names(ori_obs_dim) == time_dim)] +# if (!memb & memb_dim %in% names(res_obs_dim)) { +# res_obs_dim <- res_obs_dim[-which(names(res_obs_dim) == memb_dim)] +# } +# if (is.integer(res_obs_dim) & length(res_obs_dim) == 0) { +# res$obs <- as.vector(res$obs) +# } else { +# res$obs <- array(res$obs, dim = res_obs_dim) +# } +# } return(res) } diff --git a/R/CDORemap.R b/R/CDORemap.R index 09a00040c6b9eb7b9660a344f1891c41312614b0..4ea14fd5864883da1f5db9a597b62da10d158747 100644 --- a/R/CDORemap.R +++ b/R/CDORemap.R @@ -76,6 +76,9 @@ #'@param write_dir Path to the directory where to create the intermediate #' files for CDO to work. By default, the R session temporary directory is #' used (\code{tempdir()}). +#'@param ncores An integer indicating the number of theads used for +#' interpolation (i.e., \code{-P} in cdo command.) The default value is NULL +#' and \code{-P} is not used. #' #'@return A list with the following components: #' \item{'data_array'}{The interpolated data array (if an input array @@ -223,7 +226,8 @@ #'@export CDORemap <- function(data_array = NULL, lons, lats, grid, method, avoid_writes = TRUE, crop = TRUE, - force_remap = FALSE, write_dir = tempdir()) { #, mask = NULL) { + force_remap = FALSE, write_dir = tempdir(), + ncores = NULL) { #, mask = NULL) { .isRegularVector <- function(x, tol = 0.1) { if (length(x) < 2) { #stop("The provided vector must be of length 2 or greater.") @@ -564,6 +568,13 @@ CDORemap <- function(data_array = NULL, lons, lats, grid, method, if (!dir.exists(write_dir)) { stop("Parameter 'write_dir' must point to an existing directory.") } + # Check ncores + if (!is.null(ncores)) { + if (!is.numeric(ncores) | any(ncores %% 1 != 0) | any(ncores < 0) | + length(ncores) > 1) { + stop("Parameter 'ncores' must be a positive integer.") + } + } # if (!is.null(mask)) { # if (!is.numeric(mask) || !is.array(mask)) { # stop("Parameter 'mask' must be a numeric array.") @@ -814,9 +825,17 @@ CDORemap <- function(data_array = NULL, lons, lats, grid, method, ',', format(lat_extremes[1], scientific = FALSE), ',', format(lat_extremes[2], scientific = FALSE), ' -') } - err <- try({ - system(paste0("cdo -s ", sellonlatbox, "remap", method, ",", grid, " ", tmp_file, " ", tmp_file2), ignore.stdout = T, ignore.stderr = T) - }) + if (is.null(ncores)) { + err <- try({ + system(paste0("cdo -s ", sellonlatbox, "remap", method, ",", grid, " ", + tmp_file, " ", tmp_file2), ignore.stdout = T, ignore.stderr = T) + }) + } else { + err <- try({ + system(paste0("cdo -P ", ncores," -s ", sellonlatbox, "remap", method, ",", + grid, " ", tmp_file, " ", tmp_file2), ignore.stdout = T, ignore.stderr = T) + }) + } file.remove(tmp_file) if (is(err, 'try-error') || err > 0) { stop("CDO remap failed. Possible problem: parameter 'grid'.") diff --git a/R/CRPSS.R b/R/CRPSS.R index e2b0df6edbd4cd1227c0e5c34991e013422c5048..159e2bdb2894ce509008da1cbe899d2bcdfe6e90 100644 --- a/R/CRPSS.R +++ b/R/CRPSS.R @@ -7,11 +7,10 @@ #'infinite and 1. If the CRPSS is positive, it indicates that the forecast has #'higher skill than the reference forecast, while a negative value means that it #'has a lower skill. Examples of reference forecasts are the climatological -#'forecast (same probabilities for all categories for all time steps), -#'persistence, a previous model version, or another model. It is computed as -#'CRPSS = 1 - CRPS_exp / CRPS_ref. The statistical significance is obtained -#'based on a Random Walk test at the 95% confidence level (DelSole and Tippett, -#'2016). +#'forecast, persistence, a previous model version, or another model. It is +#'computed as 'CRPSS = 1 - CRPS_exp / CRPS_ref. The statistical significance is +#'obtained based on a Random Walk test at the specified confidence level +#'(DelSole and Tippett, 2016). #' #'@param exp A named numerical array of the forecast with at least time #' dimension. @@ -22,9 +21,12 @@ #' least time and member dimension. The dimensions must be the same as 'exp' #' except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, #' it should not have dataset dimension. If there is corresponding reference -#' for each experiement, the dataset dimension must have the same length as in +#' for each experiment, the dataset dimension must have the same length as in #' 'exp'. If 'ref' is NULL, the climatological forecast is used as reference -#' forecast. The default value is NULL. +#' forecast. To build the climatological forecast, the observed values along +#' the whole time period are used as different members for all time steps. The +#' parameter 'clim.cross.val' controls whether to build it using +#' cross-validation. The default value is NULL. #'@param time_dim A character string indicating the name of the time dimension. #' The default value is 'sdate'. #'@param memb_dim A character string indicating the name of the member dimension @@ -36,6 +38,16 @@ #'@param Fair A logical indicating whether to compute the FairCRPSS (the #' potential CRPSS that the forecast would have with an infinite ensemble #' size). The default value is FALSE. +#'@param clim.cross.val A logical indicating whether to build the climatological +#' forecast in cross-validation (i.e. excluding the observed value of the time +#' step when building the probabilistic distribution function for that +#' particular time step). Only used if 'ref' is NULL. The default value is TRUE. +#'@param sig_method.type A character string indicating the test type of the +#' significance method. Check \code{RandomWalkTest()} parameter +#' \code{test.type} for details. The default is 'two.sided.approx', which is +#' the default of \code{RandomWalkTest()}. +#'@param alpha A numeric of the significance level to be used in the statistical +#' significance test. The default value is 0.05. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -67,7 +79,8 @@ #'@importFrom ClimProjDiags Subset #'@export CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', dat_dim = NULL, - Fair = FALSE, ncores = NULL) { + Fair = FALSE, clim.cross.val = TRUE, sig_method.type = 'two.sided.approx', + alpha = 0.05, ncores = NULL) { # Check inputs ## exp, obs, and ref (1) @@ -159,9 +172,28 @@ CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', } } ## Fair - if (!is.logical(Fair) | length(Fair) > 1) { + if (!is.logical(Fair) | is.na(Fair) | length(Fair) > 1) { stop("Parameter 'Fair' must be either TRUE or FALSE.") } + ## clim.cross.val + if (!is.logical(clim.cross.val) | is.na(clim.cross.val) | length(clim.cross.val) != 1) { + stop("Parameter 'clim.cross.val' must be either TRUE or FALSE.") + } + ## alpha + if (any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | length(alpha) > 1)) { + stop("Parameter 'alpha' must be a number between 0 and 1.") + } + ## sig_method.type + #NOTE: These are the types of RandomWalkTest() + if (!sig_method.type %in% c('two.sided.approx', 'two.sided', 'greater', 'less')) { + stop("Parameter 'sig_method.type' must be 'two.sided.approx', 'two.sided', 'greater', or 'less'.") + } + if (sig_method.type == 'two.sided.approx') { + if (alpha != 0.05) { + .warning("DelSole and Tippett (2016) aproximation is valid for alpha ", + "= 0.05 only. Returning the significance at the 0.05 significance level.") + } + } ## ncores if (!is.null(ncores)) { if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | length(ncores) > 1) { @@ -192,14 +224,16 @@ CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', fun = .CRPSS, time_dim = time_dim, memb_dim = memb_dim, dat_dim = dat_dim, - Fair = Fair, + Fair = Fair, clim.cross.val = clim.cross.val, + sig_method.type = sig_method.type, alpha = alpha, ncores = ncores) return(output) } -.CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', dat_dim = NULL, - Fair = FALSE) { +.CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', + dat_dim = NULL, Fair = FALSE, clim.cross.val = TRUE, + sig_method.type = 'two.sided.approx', alpha = 0.05) { # exp: [sdate, memb, (dat)] # obs: [sdate, (dat)] @@ -228,12 +262,13 @@ CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', obs_time_len <- dim(obs)[time_dim] if (is.null(dat_dim)) { - ## Without cross-validation: - ## ref <- array(data = rep(obs, each = obs_time_len), dim = c(obs_time_len, obs_time_len)) - ## With cross-validation (excluding the value of that year to create ref for that year): - ref <- array(data = NA, dim = c(obs_time_len, obs_time_len - 1)) - for (i in 1:obs_time_len) { - ref[i, ] <- obs[-i] + if (isFALSE(clim.cross.val)) { ## Without cross-validation + ref <- array(data = rep(obs, each = obs_time_len), dim = c(obs_time_len, obs_time_len)) + } else if (isTRUE(clim.cross.val)) { ## With cross-validation (excluding the value of that year to create ref for that year) + ref <- array(data = NA, dim = c(obs_time_len, obs_time_len - 1)) + for (i in 1:obs_time_len) { + ref[i, ] <- obs[-i] + } } names(dim(ref)) <- c(time_dim, memb_dim) @@ -247,12 +282,13 @@ CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', names(dim(crps_ref)) <- c(time_dim, 'nobs') for (i_obs in 1:nobs) { - ## Without cross-validation: - ## ref <- array(data = rep(obs[, i_obs], each = obs_time_len), dim = c(obs_time_len, obs_time_len)) - ## With cross-validation (excluding the value of that year to create ref for that year): - ref <- array(data = NA, dim = c(obs_time_len, obs_time_len - 1)) - for (i in 1:obs_time_len) { - ref[i, ] <- obs[-i, i_obs] + if (isFALSE(clim.cross.val)) { ## Without cross-validation + ref <- array(data = rep(obs[, i_obs], each = obs_time_len), dim = c(obs_time_len, obs_time_len)) + } else if (isTRUE(clim.cross.val)) { ## With cross-validation (excluding the value of that year to create ref for that year) + ref <- array(data = NA, dim = c(obs_time_len, obs_time_len - 1)) + for (i in 1:obs_time_len) { + ref[i, ] <- obs[-i, i_obs] + } } names(dim(ref)) <- c(time_dim, memb_dim) @@ -293,14 +329,18 @@ CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', for (i in 1:nexp) { for (j in 1:nobs) { crpss[i, j] <- 1 - crps_exp_mean[i, j] / crps_ref_mean[j] - sign[i, j] <- .RandomWalkTest(skill_A = crps_exp_mean[i, j], skill_B = crps_ref_mean[j], sign = T, pval = F)$sign + sign[i, j] <- .RandomWalkTest(skill_A = crps_exp_mean[i, j], skill_B = crps_ref_mean[j], + test.type = sig_method.type, alpha = alpha, + sign = T, pval = F)$sign } } } else { for (i in 1:nexp) { for (j in 1:nobs) { crpss[i, j] <- 1 - crps_exp_mean[i, j] / crps_ref_mean[i, j] - sign[i, j] <- .RandomWalkTest(skill_A = crps_exp_mean[i, j], skill_B = crps_ref_mean[i, j], sign = T, pval = F)$sign + sign[i, j] <- .RandomWalkTest(skill_A = crps_exp_mean[i, j], skill_B = crps_ref_mean[i, j], + test.type = sig_method.type, alpha = alpha, + sign = T, pval = F)$sign } } } @@ -308,7 +348,9 @@ CRPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', } else { crpss <- 1 - mean(crps_exp) / mean(crps_ref) # Significance - sign <- .RandomWalkTest(skill_A = crps_exp, skill_B = crps_ref, sign = T, pval = F)$sign + sign <- .RandomWalkTest(skill_A = crps_exp, skill_B = crps_ref, + test.type = sig_method.type, alpha = alpha, + sign = T, pval = F)$sign } return(list(crpss = crpss, sign = sign)) diff --git a/R/ColorBar.R b/R/ColorBar.R index 286b1152f59dee536bd81d4a31fcfe5e668e0287..b6da2bb32820cbfd9bcce4f5aef2f661bfc289c3 100644 --- a/R/ColorBar.R +++ b/R/ColorBar.R @@ -288,7 +288,7 @@ ColorBar <- function(brks = NULL, cols = NULL, vertical = TRUE, triangle_ends <- triangle_ends } } - if (plot) { + if (plot && !is.null(var_limits)) { if ((bar_limits[1] >= var_limits[1]) && !triangle_ends[1]) { .warning("There are variable values smaller or equal to the lower limit ", "of the colour bar and the lower triangle end has been ", diff --git a/R/Corr.R b/R/Corr.R index 3430647ad579137b0fef76f48c33c5f03a1c000a..c11fcf69a9bf4197982c072bebe5d7c051e59971 100644 --- a/R/Corr.R +++ b/R/Corr.R @@ -23,8 +23,7 @@ #'@param time_dim A character string indicating the name of dimension along #' which the correlations are computed. The default value is 'sdate'. #'@param dat_dim A character string indicating the name of dataset (nobs/nexp) -#' dimension. The default value is 'dataset'. If there is no dataset -#' dimension, set NULL. +#' dimension. The default value is NULL (no dataset). #'@param comp_dim A character string indicating the name of dimension along which #' obs is taken into account only if it is complete. The default value #' is NULL. @@ -47,7 +46,6 @@ #' FALSE. #'@param alpha A numeric indicating the significance level for the statistical #' significance test. The default value is 0.05. -#'@param conf.lev Deprecated. Use alpha now instead. alpha = 1 - conf.lev. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -91,24 +89,25 @@ #'leadtimes_per_startdate <- 60 #'corr <- Corr(MeanDims(smooth_ano_exp, 'member'), #' MeanDims(smooth_ano_obs, 'member'), -#' comp_dim = 'ftime', +#' comp_dim = 'ftime', dat_dim = 'dataset', #' limits = c(ceiling((runmean_months + 1) / 2), #' leadtimes_per_startdate - floor(runmean_months / 2))) #' #'# Case 2: Keep member dimension -#'corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member') +#'corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', dat_dim = 'dataset') #'# ensemble mean -#'corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', memb = FALSE) +#'corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', memb = FALSE, +#' dat_dim = 'dataset') #' #'@import multiApply #'@importFrom ClimProjDiags Subset #'@importFrom stats cor pt qnorm #'@export -Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', +Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = NULL, comp_dim = NULL, limits = NULL, method = 'pearson', memb_dim = NULL, memb = TRUE, pval = TRUE, conf = TRUE, sign = FALSE, - alpha = 0.05, conf.lev = NULL, ncores = NULL) { + alpha = 0.05, ncores = NULL) { # Check inputs ## exp and obs (1) @@ -126,10 +125,6 @@ Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { stop("Parameter 'exp' and 'obs' must have dimension names.") } - if(!all(names(dim(exp)) %in% names(dim(obs))) | - !all(names(dim(obs)) %in% names(dim(exp)))) { - stop("Parameter 'exp' and 'obs' must have same dimension name") - } ## time_dim if (!is.character(time_dim) | length(time_dim) > 1) { stop("Parameter 'time_dim' must be a character string.") @@ -176,8 +171,17 @@ Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', if (!is.character(memb_dim) | length(memb_dim) > 1) { stop("Parameter 'memb_dim' must be a character string.") } - if (!memb_dim %in% names(dim(exp)) | !memb_dim %in% names(dim(obs))) { - stop("Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension.") + if (!memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + stop("Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. Set it as NULL if there is no member dimension.") + } + # Add [member = 1] + if (memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + dim(obs) <- c(dim(obs), 1) + names(dim(obs))[length(dim(obs))] <- memb_dim + } + if (!memb_dim %in% names(dim(exp)) & memb_dim %in% names(dim(obs))) { + dim(exp) <- c(dim(exp), 1) + names(dim(exp))[length(dim(exp))] <- memb_dim } } ## memb @@ -196,12 +200,6 @@ Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', if (!is.logical(sign) | length(sign) > 1) { stop("Parameter 'sign' must be one logical value.") } - ## conf.lev - ##NOTE: remove the parameter and the warning after v1.4.0 - if (!missing("conf.lev")) { - .warning(paste0("Argument 'conf.lev' is deprecated. Please use 'alpha' instead. ", - "'alpha' = ", 1 - conf.lev, " is used."), tag = '! Deprecation: ') - } ## alpha if (!is.numeric(alpha) | alpha < 0 | alpha > 1 | length(alpha) > 1) { stop("Parameter 'alpha' must be a numeric number between 0 and 1.") @@ -224,7 +222,7 @@ Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', name_exp <- name_exp[-which(name_exp == memb_dim)] name_obs <- name_obs[-which(name_obs == memb_dim)] } - if(!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + if (!identical(dim(exp)[name_exp], dim(obs)[name_obs])) { stop(paste0("Parameter 'exp' and 'obs' must have same length of ", "all dimension except 'dat_dim' and 'memb_dim'.")) } @@ -257,7 +255,6 @@ Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', obs[which(outrows)] <- NA rm(obs_sub, outrows) } - if (!is.null(memb_dim)) { if (!memb) { #ensemble mean exp <- MeanDims(exp, memb_dim, na.rm = TRUE) @@ -282,7 +279,7 @@ Corr <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', return(res) } -.Corr <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', +.Corr <- function(exp, obs, dat_dim = NULL, memb_dim = 'member', time_dim = 'sdate', method = 'pearson', conf = TRUE, pval = TRUE, sign = FALSE, alpha = 0.05) { if (is.null(memb_dim)) { diff --git a/R/DiffCorr.R b/R/DiffCorr.R index 93078248dcf442e6d5500977f875ca1b57db8886..d42dfc24ec583b333ede3a36f038811a762b0ef1 100644 --- a/R/DiffCorr.R +++ b/R/DiffCorr.R @@ -30,8 +30,7 @@ #'@param method A character string indicating the correlation coefficient to be #' computed ("pearson" or "spearman"). The default value is "pearson". #'@param alpha A numeric of the significance level to be used in the statistical -#' significance test. If it is a numeric, "sign" will be returned. If NULL, the -#' p-value will be returned instead. The default value is NULL. +#' significance test (output "sign"). The default value is 0.05. #'@param handle.na A charcater string indicating how to handle missing values. #' If "return.na", NAs will be returned for the cases that contain at least one #' NA in "exp", "ref", or "obs". If "only.complete.triplets", only the time @@ -43,6 +42,11 @@ #' significantly different) or "one-sided" (to assess whether the skill of #' "exp" is significantly higher than that of "ref") following Steiger (1980). #' The default value is "two-sided". +#'@param pval A logical value indicating whether to return the p-value of the +#' significance test Ho: DiffCorr = 0. The default value is TRUE. +#'@param sign A logical value indicating whether to return the statistical +#' significance of the test Ho: DiffCorr = 0 based on 'alpha'. The default +#' value is FALSE. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -54,12 +58,12 @@ #'\item{$sign}{ #' A logical array of the statistical significance of the correlation #' differences with the same dimensions as the input arrays except "time_dim" -#' (and "memb_dim" if provided). Returned only if "alpha" is a numeric. +#' (and "memb_dim" if provided). Returned only if "sign" is TRUE. #'} #'\item{$p.val}{ #' A numeric array of the p-values with the same dimensions as the input arrays -#' except "time_dim" (and "memb_dim" if provided). Returned only if "alpha" is -#' NULL. +#' except "time_dim" (and "memb_dim" if provided). Returned only if "pval" is +#' TRUE. #'} #' #'@references @@ -79,8 +83,9 @@ #'@import multiApply #'@export DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', - memb_dim = NULL, method = 'pearson', alpha = NULL, - handle.na = 'return.na', test.type = "two-sided", ncores = NULL) { + memb_dim = NULL, method = 'pearson', alpha = 0.05, + handle.na = 'return.na', test.type = "two-sided", + pval = TRUE, sign = FALSE, ncores = NULL) { # Check inputs ## exp, ref, and obs (1) @@ -141,11 +146,8 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', "Monte-Carlo simulations that are done in Siegert et al., 2017")) } ## alpha - if (!is.null(alpha)) { - if (any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | - length(alpha) > 1)) { - stop('Parameter "alpha" must be NULL or a number between 0 and 1.') - } + if (sign & any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | length(alpha) > 1)) { + stop('Parameter "alpha" must be a number between 0 and 1.') } ## handle.na if (!handle.na %in% c('return.na', 'only.complete.triplets', 'na.fail')) { @@ -185,10 +187,12 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', } # output_dims - if (is.null(alpha)) { - output_dims <- list(diff.corr = NULL, p.val = NULL) - } else { - output_dims <- list(diff.corr = NULL, sign = NULL) + output_dims <- list(diff.corr = NULL) + if (pval) { + output_dims <- c(output_dims, list(p.val = NULL)) + } + if (sign) { + output_dims <- c(output_dims, list(sign = NULL)) } # Correlation difference if (is.array(N.eff)) { @@ -199,7 +203,7 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', output_dims = output_dims, fun = .DiffCorr, method = method, alpha = alpha, handle.na = handle.na, - test.type = test.type, ncores = ncores) + test.type = test.type, pval = pval, sign = sign, ncores = ncores) } else { output <- Apply(data = list(exp = exp, obs = obs, ref = ref), target_dims = list(exp = time_dim, obs = time_dim, @@ -207,16 +211,18 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', output_dims = output_dims, N.eff = N.eff, fun = .DiffCorr, method = method, alpha = alpha, handle.na = handle.na, - test.type = test.type, ncores = ncores) + test.type = test.type, pval = pval, sign = sign, ncores = ncores) } return(output) } -.DiffCorr <- function(exp, obs, ref, N.eff = NA, method = 'pearson', alpha = NULL, - handle.na = 'return.na', test.type = 'two.sided') { +.DiffCorr <- function(exp, obs, ref, N.eff = NA, method = 'pearson', alpha = 0.05, + handle.na = 'return.na', test.type = 'two.sided', + pval = TRUE, sign = FALSE) { - .diff.corr <- function(exp, obs, ref, method = 'pearson', N.eff = NA, alpha = NULL, test.type = 'two.sided') { + .diff.corr <- function(exp, obs, ref, method = 'pearson', N.eff = NA, alpha = 0.05, + test.type = 'two.sided', pval = TRUE, sign = FALSE) { # Correlation difference cor.exp <- cor(x = exp, y = obs, method = method) cor.ref <- cor(x = ref, y = obs, method = method) @@ -237,12 +243,14 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', ## H0: the skill of exp is not higher than that of ref ## H1: the skill of exp is higher than that of ref - - p.value <- pt(t, df = N.eff - 3, lower.tail = FALSE) - - if (is.null(alpha)) { + + if (pval | sign) { + p.value <- pt(t, df = N.eff - 3, lower.tail = FALSE) + } + if (pval) { output$p.val <- p.value - } else { + } + if (sign) { output$sign <- ifelse(!is.na(p.value) & p.value <= alpha & output$diff.corr > 0, TRUE, FALSE) } @@ -250,12 +258,14 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', ## H0: the skill difference of exp and ref is zero ## H1: the skill difference of exp and ref is different from zero - - p.value <- pt(abs(t), df = N.eff - 3, lower.tail = FALSE) - if (is.null(alpha)) { + if (pval | sign) { + p.value <- pt(abs(t), df = N.eff - 3, lower.tail = FALSE) + } + if (pval) { output$p.val <- p.value - } else { + } + if (sign) { output$sign <- ifelse(!is.na(p.value) & p.value <= alpha / 2, TRUE, FALSE) } @@ -278,20 +288,22 @@ DiffCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', ref <- ref[!nna] output <- .diff.corr(exp = exp, obs = obs, ref = ref, method = method, - N.eff = N.eff, alpha = alpha, test.type = test.type) + N.eff = N.eff, alpha = alpha, pval = pval, sign = sign, test.type = test.type) } else if (handle.na == 'return.na') { # Data contain NA, return NAs directly without passing to .diff.corr - if (is.null(alpha)) { - output <- list(diff.corr = NA, p.val = NA) - } else { - output <- list(diff.corr = NA, sign = NA) + output <- list(diff.corr = NA) + if (pval) { + output <- c(output, list(p.val = NA)) + } + if (sign) { + output <- c(output, list(sign = NA)) } } } else { ## There is no NA output <- .diff.corr(exp = exp, obs = obs, ref = ref, method = method, - N.eff = N.eff, alpha = alpha, test.type = test.type) + N.eff = N.eff, alpha = alpha, pval = pval, sign = sign, test.type = test.type) } return(output) diff --git a/R/GetProbs.R b/R/GetProbs.R new file mode 100644 index 0000000000000000000000000000000000000000..9960c53ff3604d6f1fff182290d647fea449df5b --- /dev/null +++ b/R/GetProbs.R @@ -0,0 +1,256 @@ +#'Compute probabilistic forecasts or the corresponding observations +#' +#'Compute probabilistic forecasts from an ensemble based on the relative +#'thresholds, or the probabilistic observations (i.e., which probabilistic +#'category was observed). A reference period can be specified to calculate the +#'absolute thresholds between each probabilistic category. The absolute +#'thresholds can be computed in cross-validation mode. If data is an ensemble, +#'the probabilities are calculated as the percentage of members that fall into +#'each category. For observations (or forecast without member dimension), 1 +#'means that the event happened, while 0 indicates that the event did not +#'happen. Weighted probabilities can be computed if the weights are provided for +#'each ensemble member and time step. +#' +#'@param data A named numerical array of the forecasts or observations with, at +#' least, time dimension. +#'@param time_dim A character string indicating the name of the time dimension. +#' The default value is 'sdate'. +#'@param memb_dim A character string indicating the name of the member dimension +#' to compute the probabilities of the forecast, or NULL if there is no member +#' dimension (e.g., for observations, or for forecast with only one ensemble +#' member). The default value is 'member'. +#'@param prob_thresholds A numeric vector of the relative thresholds (from 0 to +#' 1) between the categories. The default value is c(1/3, 2/3), which +#' corresponds to tercile equiprobable categories. +#'@param indices_for_quantiles A vector of the indices to be taken along +#' 'time_dim' for computing the absolute thresholds between the probabilistic +#' categories. If NULL, the whole period is used. The default value is NULL. +#'@param weights A named numerical array of the weights for 'data' with +#' dimensions 'time_dim' and 'memb_dim' (if 'data' has them). The default value +#' is NULL. The ensemble should have at least 70 members or span at least 10 +#' time steps and have more than 45 members if consistency between the weighted +#' and unweighted methodologies is desired. +#'@param cross.val A logical indicating whether to compute the thresholds +#' between probabilistic categories in cross-validation mode. The default value +#' is FALSE. +#'@param ncores An integer indicating the number of cores to use for parallel +#' computation. The default value is NULL. +#' +#'@return +#'A numerical array of probabilities with dimensions c(bin, the rest dimensions +#'of 'data' except 'memb_dim'). 'bin' dimension has the length of probabilistic +#'categories, i.e., \code{length(prob_thresholds) + 1}. +#' +#'@examples +#'data <- array(rnorm(2000), dim = c(ensemble = 25, sdate = 20, time = 4)) +#'res <- GetProbs(data = data, time_dim = 'sdate', memb_dim = 'ensemble', +#' indices_for_quantiles = 4:17) +#' +#'@import multiApply +#'@importFrom easyVerification convert2prob +#'@export +GetProbs <- function(data, time_dim = 'sdate', memb_dim = 'member', indices_for_quantiles = NULL, + prob_thresholds = c(1/3, 2/3), weights = NULL, cross.val = FALSE, ncores = NULL) { + + # Check inputs + ## data + if (is.null(data)) { + stop("Parameter 'data' cannot be NULL.") + } + if (!is.numeric(data)) { + stop("Parameter 'data' must be a numeric array.") + } + if (any(is.null(names(dim(data)))) | any(nchar(names(dim(data))) == 0)) { + stop("Parameter 'data' must have dimension names.") + } + ## time_dim + if (!is.character(time_dim) | length(time_dim) != 1) + stop('Parameter "time_dim" must be a character string.') + if (!time_dim %in% names(dim(data))) { + stop("Parameter 'time_dim' is not found in 'data' dimensions.") + } + ## memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(data))) { + stop("Parameter 'memb_dim' is not found in 'data' dimensions. If no member ", + "dimension exists, set it as NULL.") + } + } + ## prob_thresholds + if (!is.numeric(prob_thresholds) | !is.vector(prob_thresholds) | + any(prob_thresholds <= 0) | any(prob_thresholds >= 1)) { + stop("Parameter 'prob_thresholds' must be a numeric vector between 0 and 1.") + } + ## indices_for_quantiles + if (is.null(indices_for_quantiles)) { + indices_for_quantiles <- 1:dim(data)[time_dim] + } else { + if (!is.numeric(indices_for_quantiles) | !is.vector(indices_for_quantiles)) { + stop("Parameter 'indices_for_quantiles' must be NULL or a numeric vector.") + } else if (length(indices_for_quantiles) > dim(data)[time_dim] | + max(indices_for_quantiles) > dim(data)[time_dim] | + any(indices_for_quantiles < 1)) { + stop("Parameter 'indices_for_quantiles' should be the indices of 'time_dim'.") + } + } + ## weights + if (!is.null(weights)) { + if (!is.array(weights) | !is.numeric(weights)) + stop("Parameter 'weights' must be a named numeric array.") + +# if (is.null(dat_dim)) { + if (!is.null(memb_dim)) { + lendim_weights <- 2 + namesdim_weights <- c(time_dim, memb_dim) + } else { + lendim_weights <- 1 + namesdim_weights <- c(time_dim) + } + if (length(dim(weights)) != lendim_weights | + any(!names(dim(weights)) %in% namesdim_weights)) { + stop(paste0("Parameter 'weights' must have dimension ", + paste0(namesdim_weights, collapse = ' and '), ".")) + } + if (any(dim(weights)[namesdim_weights] != dim(data)[namesdim_weights])) { + stop(paste0("Parameter 'weights' must have the same dimension length as ", + paste0(namesdim_weights, collapse = ' and '), " dimension in 'data'.")) + } + weights <- Reorder(weights, namesdim_weights) + +# } else { +# if (length(dim(weights)) != 3 | any(!names(dim(weights)) %in% c(memb_dim, time_dim, dat_dim))) +# stop("Parameter 'weights' must have three dimensions with the names of 'memb_dim', 'time_dim' and 'dat_dim'.") +# if (dim(weights)[memb_dim] != dim(exp)[memb_dim] | +# dim(weights)[time_dim] != dim(exp)[time_dim] | +# dim(weights)[dat_dim] != dim(exp)[dat_dim]) { +# stop(paste0("Parameter 'weights' must have the same dimension lengths ", +# "as 'memb_dim', 'time_dim' and 'dat_dim' in 'exp'.")) +# } +# weights <- Reorder(weights, c(time_dim, memb_dim, dat_dim)) +# } + } + ## cross.val + if (!is.logical(cross.val) | length(cross.val) > 1) { + stop("Parameter 'cross.val' must be either TRUE or FALSE.") + } + ## ncores + if (!is.null(ncores)) { + if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | + length(ncores) > 1) { + stop("Parameter 'ncores' must be either NULL or a positive integer.") + } + } + + ############################### + + res <- Apply(data = list(data = data), + target_dims = c(time_dim, memb_dim), #, dat_dim), + output_dims = c("bin", time_dim), + fun = .GetProbs, +# dat_dim = dat_dim, + prob_thresholds = prob_thresholds, + indices_for_quantiles = indices_for_quantiles, + weights = weights, cross.val = cross.val, ncores = ncores)$output1 + + return(res) +} + +.GetProbs <- function(data, indices_for_quantiles, + prob_thresholds = c(1/3, 2/3), weights = NULL, cross.val = FALSE) { + # .GetProbs() is used in RPS, RPSS, ROCSS + # data + ## if data is exp: [sdate, memb] + ## if data is obs: [sdate, (memb)] + # weights: [sdate, (memb)], same as data + + # Add dim [memb = 1] to data if it doesn't have memb_dim + if (length(dim(data)) == 1) { + dim(data) <- c(dim(data), 1) + if (!is.null(weights)) dim(weights) <- c(dim(weights), 1) + } + # Absolute thresholds + if (cross.val) { + quantiles <- array(NA, dim = c(bin = length(prob_thresholds), sdate = dim(data)[1])) + for (i_time in 1:dim(data)[1]) { + if (is.null(weights)) { + quantiles[, i_time] <- quantile(x = as.vector(data[indices_for_quantiles[which(indices_for_quantiles != i_time)], ]), + probs = prob_thresholds, type = 8, na.rm = TRUE) + } else { + # weights: [sdate, memb] + sorted_arrays <- .sorted_distributions(data[indices_for_quantiles[which(indices_for_quantiles != i_time)], ], + weights[indices_for_quantiles[which(indices_for_quantiles != i_time)], ]) + sorted_data <- sorted_arrays$data + cumulative_weights <- sorted_arrays$cumulative_weights + quantiles[, i_time] <- approx(cumulative_weights, sorted_data, prob_thresholds, "linear")$y + } + } + + } else { + if (is.null(weights)) { + quantiles <- quantile(x = as.vector(data[indices_for_quantiles, ]), + probs = prob_thresholds, type = 8, na.rm = TRUE) + } else { + # weights: [sdate, memb] + sorted_arrays <- .sorted_distributions(data[indices_for_quantiles, ], + weights[indices_for_quantiles, ]) + sorted_data <- sorted_arrays$data + cumulative_weights <- sorted_arrays$cumulative_weights + quantiles <- approx(cumulative_weights, sorted_data, prob_thresholds, "linear")$y + } + quantiles <- array(rep(quantiles, dim(data)[1]), dim = c(bin = length(quantiles), dim(data)[1])) + } + # quantiles: [bin-1, sdate] + + # Probabilities + probs <- array(dim = c(dim(quantiles)[1] + 1, dim(data)[1])) # [bin, sdate] + for (i_time in 1:dim(data)[1]) { + if (anyNA(data[i_time, ])) { + probs[, i_time] <- rep(NA, dim = dim(quantiles)[1] + 1) + } else { + if (is.null(weights)) { + probs[, i_time] <- colMeans(easyVerification::convert2prob(data[i_time, ], + threshold = quantiles[, i_time])) + } else { + sorted_arrays <- .sorted_distributions(data[i_time, ], weights[i_time, ]) + sorted_data <- sorted_arrays$data + cumulative_weights <- sorted_arrays$cumulative_weights + # find any quantiles that are outside the data range + integrated_probs <- array(dim = dim(quantiles)) + for (i_quant in 1:dim(quantiles)[1]) { + # for thresholds falling under the distribution + if (quantiles[i_quant, i_time] < min(sorted_data)) { + integrated_probs[i_quant, i_time] <- 0 + # for thresholds falling over the distribution + } else if (max(sorted_data) < quantiles[i_quant, i_time]) { + integrated_probs[i_quant, i_time] <- 1 + } else { + integrated_probs[i_quant, i_time] <- approx(sorted_data, cumulative_weights, + quantiles[i_quant, i_time], "linear")$y + } + } + probs[, i_time] <- append(integrated_probs[, i_time], 1) - append(0, integrated_probs[, i_time]) + if (min(probs[, i_time]) < 0 | max(probs[, i_time]) > 1) { + stop(paste0("Probability in i_time = ", i_time, " is out of [0, 1].")) + } + } + } + } + + return(probs) +} + +.sorted_distributions <- function(data_vector, weights_vector) { + weights_vector <- as.vector(weights_vector) + data_vector <- as.vector(data_vector) + weights_vector <- weights_vector / sum(weights_vector) # normalize to 1 + sorter <- order(data_vector) + sorted_weights <- weights_vector[sorter] + cumulative_weights <- cumsum(sorted_weights) - 0.5 * sorted_weights + cumulative_weights <- cumulative_weights - cumulative_weights[1] # fix the 0 + cumulative_weights <- cumulative_weights / cumulative_weights[length(cumulative_weights)] # fix the 1 + return(list("data" = data_vector[sorter], "cumulative_weights" = cumulative_weights)) +} + diff --git a/R/InsertDim.R b/R/InsertDim.R index 533683d78f5c36863ed65d78e67983e8b8b1a0f2..ff88b581cda755db1c4519b8fb6d953ff146be12 100644 --- a/R/InsertDim.R +++ b/R/InsertDim.R @@ -8,8 +8,6 @@ #'@param lendim An integer indicating the length of the new dimension. #'@param name A character string indicating the name for the new dimension. #' The default value is NULL. -#'@param ncores An integer indicating the number of cores to use for parallel -#' computation. The default value is NULL. This parameter is deprecated now. #' #'@return An array as parameter 'data' but with the added named dimension. #' @@ -20,7 +18,7 @@ #' #'@import multiApply #'@export -InsertDim <- function(data, posdim, lendim, name = NULL, ncores = NULL) { +InsertDim <- function(data, posdim, lendim, name = NULL) { # Check inputs ## data @@ -61,9 +59,6 @@ InsertDim <- function(data, posdim, lendim, name = NULL, ncores = NULL) { stop("Parameter 'name' must be a character string.") } } - ## ncores - if (!missing("ncores")) - .warning("Argument 'ncores' is deprecated.", tag = '! Deprecation: ') ############################### # Calculate InsertDim diff --git a/R/Load.R b/R/Load.R index cca99bf3612242029f43ebeb9f54bf3bed45db58..47da2b32138902c1ba6ffadb6d8a35df9386b4b5 100644 --- a/R/Load.R +++ b/R/Load.R @@ -395,6 +395,10 @@ #' If not specified and the selected output type is 'lon', 'lat' or 'lonlat', #' this parameter takes as default value the grid of the first experimental #' dataset, which is read automatically from the source files.\cr +#' Note that the auto-detected grid type is not guarenteed to be correct, and +#' it won't be correct if the netCDF file doesn't contain global domain. +#' Please check the warning carefully to ensure the detected grid type is +#' expected, or assign this parameter even regridding is not needed. #' The grid must be supported by 'cdo' tools. Now only supported: rNXxNY #' or tTRgrid.\cr #' Both rNXxNY and tRESgrid yield rectangular regular grids. rNXxNY yields diff --git a/R/MSE.R b/R/MSE.R new file mode 100644 index 0000000000000000000000000000000000000000..61cf3bcfb0586682c7e3f107848959742d33e268 --- /dev/null +++ b/R/MSE.R @@ -0,0 +1,309 @@ +#'Compute mean square error +#' +#'Compute the mean square error for an array of forecasts and an array of +#'observations. The MSEs are computed along time_dim, the dimension which +#'corresponds to the start date dimension. If comp_dim is given, the MSEs are +#'computed only if obs along the comp_dim dimension are complete between +#'limits[1] and limits[2], i.e. there are no NAs between limits[1] and +#'limits[2]. This option can be activated if the user wants to account only +#'for the forecasts for which the corresponding observations are available at +#'all leadtimes.\cr +#'The confidence interval is computed by the chi2 distribution.\cr +#' +#'@param exp A named numeric array of experimental data, with at least #' 'time_dim' dimension. It can also be a vector with the same length as 'obs'. +#'@param obs A named numeric array of observational data, same dimensions as +#' parameter 'exp' except along 'dat_dim' and 'memb_dim'. It can also be a +#' vector with the same length as 'exp'. +#'@param time_dim A character string indicating the name of dimension along +#' which the correlations are computed. The default value is 'sdate'. +#'@param memb_dim A character string indicating the name of the member dimension +#' to compute the ensemble mean; it should be set to NULL if the input data are +#' already the ensemble mean. The default value is NULL. +#'@param dat_dim A character string indicating the name of dataset or member +#' (nobs/nexp) dimension. The datasets of exp and obs will be paired and +#' computed MSE for each pair. The default value is NULL. +#'@param comp_dim A character string indicating the name of dimension along which +#' obs is taken into account only if it is complete. The default value +#' is NULL. +#'@param limits A vector of two integers indicating the range along comp_dim to +#' be completed. The default value is c(1, length(comp_dim dimension)). +#'@param conf A logical value indicating whether to retrieve the confidence +#' intervals or not. The default value is TRUE. +#'@param alpha A numeric indicating the significance level for the statistical +#' significance test. The default value is 0.05. +#'@param ncores An integer indicating the number of cores to use for parallel +#' computation. The default value is NULL. +#' +#'@return +#'A list containing the numeric arrays with dimension:\cr +#' c(nexp, nobs, all other dimensions of exp except time_dim).\cr +#'nexp is the number of experiment (i.e., dat_dim in exp), and nobs is the +#'number of observation (i.e., dat_dim in obs).\cr +#'\item{$mse}{ +#' The mean square error. +#'} +#'\item{$conf.lower}{ +#' The lower confidence interval. Only present if \code{conf = TRUE}. +#'} +#'\item{$conf.upper}{ +#' The upper confidence interval. Only present if \code{conf = TRUE}. +#'} +#' +#'@examples +#'# Load sample data as in Load() example: +#'example(Load) +#'clim <- Clim(sampleData$mod, sampleData$obs) +#'ano_exp <- Ano(sampleData$mod, clim$clim_exp) +#'ano_obs <- Ano(sampleData$obs, clim$clim_obs) +#'smooth_ano_exp <- Smoothing(ano_exp, runmeanlen = 12, time_dim = 'ftime') +#'smooth_ano_obs <- Smoothing(ano_obs, runmeanlen = 12, time_dim = 'ftime') +#'res <- MSE(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', +#' comp_dim = 'ftime', limits = c(7, 54)) +#' +#'# Synthetic data: +#'exp1 <- array(rnorm(120), dim = c(dat = 3, sdate = 10, ftime = 4)) +#'obs1 <- array(rnorm(80), dim = c(dat = 2, sdate = 10, ftime = 4)) +#'res1 <- MSE(exp1, obs1, comp_dim = 'ftime', dat_dim = 'dat') +#' +#'exp2 <- array(rnorm(20), dim = c(sdate = 5, member = 4)) +#'obs2 <- array(rnorm(10), dim = c(sdate = 5, member = 2)) +#'res2 <- MSE(exp2, obs2, memb_dim = 'member') +#' +#'@import multiApply +#'@importFrom ClimProjDiags Subset +#'@importFrom stats qchisq +#'@export +MSE <- function(exp, obs, time_dim = 'sdate', dat_dim = NULL, memb_dim = NULL, + comp_dim = NULL, limits = NULL, conf = TRUE, alpha = 0.05, ncores = NULL) { + + # Check inputs + ## exp and obs (1) + if (is.null(exp) | is.null(obs)) { + stop("Parameter 'exp' and 'obs' cannot be NULL.") + } + if (!is.numeric(exp) | !is.numeric(obs)) { + stop("Parameter 'exp' and 'obs' must be a numeric array.") + } + if (is.null(dim(exp)) & is.null(dim(obs))) { #is vector + if (length(exp) == length(obs)) { + exp <- array(exp, dim = c(length(exp))) + names(dim(exp)) <- c(time_dim) + obs <- array(obs, dim = c(length(obs))) + names(dim(obs)) <- c(time_dim) + } else { + stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", + "dimensions time_dim and dat_dim, or vector of same length.")) + } + } else if (is.null(dim(exp)) | is.null(dim(obs))) { + stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", + "dimensions time_dim and dat_dim, or vector of same length.")) + } + if(any(is.null(names(dim(exp))))| any(nchar(names(dim(exp))) == 0) | + any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { + stop("Parameter 'exp' and 'obs' must have dimension names.") + } + ## time_dim + if (!is.character(time_dim) | length(time_dim) > 1) { + stop("Parameter 'time_dim' must be a character string.") + } + if (!time_dim %in% names(dim(exp)) | !time_dim %in% names(dim(obs))) { + stop("Parameter 'time_dim' is not found in 'exp' or 'obs' dimension.") + } + ## memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + } + } + ## dat_dim + if (!is.null(dat_dim)) { + if (!is.character(dat_dim) | length(dat_dim) > 1) { + stop("Parameter 'dat_dim' must be a character string or NULL.") + } + if (!dat_dim %in% names(dim(exp)) | !dat_dim %in% names(dim(obs))) { + stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + "Set it as NULL if there is no dataset dimension.") + } + } + ## comp_dim + if (!is.null(comp_dim)) { + if (!is.character(comp_dim) | length(comp_dim) > 1) { + stop("Parameter 'comp_dim' must be a character string.") + } + if (!comp_dim %in% names(dim(exp)) | !comp_dim %in% names(dim(obs))) { + stop("Parameter 'comp_dim' is not found in 'exp' or 'obs' dimension.") + } + } + ## limits + if (!is.null(limits)) { + if (is.null(comp_dim)) { + stop("Paramter 'comp_dim' cannot be NULL if 'limits' is assigned.") + } + if (!is.numeric(limits) | any(limits %% 1 != 0) | any(limits < 0) | + length(limits) != 2 | any(limits > dim(exp)[comp_dim])) { + stop(paste0("Parameter 'limits' must be a vector of two positive ", + "integers smaller than the length of paramter 'comp_dim'.")) + } + } + ## conf + if (!is.logical(conf) | length(conf) > 1) { + stop("Parameter 'conf' must be one logical value.") + } + ## alpha + if (!is.numeric(alpha) | alpha < 0 | alpha > 1 | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") + } + ## ncores + if (!is.null(ncores)) { + if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | + length(ncores) > 1) { + stop("Parameter 'ncores' must be a positive integer.") + } + } + ## exp and obs (2) + name_exp <- sort(names(dim(exp))) + name_obs <- sort(names(dim(obs))) + if (!is.null(memb_dim)) { + if (memb_dim %in% name_exp) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + } + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } + } + if (!is.null(dat_dim)) { + name_exp <- name_exp[-which(name_exp == dat_dim)] + name_obs <- name_obs[-which(name_obs == dat_dim)] + } + if (!all(name_exp == name_obs)) { + stop("Parameter 'exp' and 'obs' must have the same dimension names.") + } + if (!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + stop(paste0("Parameter 'exp' and 'obs' must have same length of ", + "all dimensions except 'dat_dim' and 'memb_dim'.")) + } + if (dim(exp)[time_dim] < 2) { + stop("The length of time_dim must be at least 2 to compute MSE.") + } + + ############################### + ## Ensemble mean + if (!is.null(memb_dim)) { + if (memb_dim %in% names(dim(exp))) { + exp <- MeanDims(exp, memb_dim, na.rm = T) + } + if (memb_dim %in% names(dim(obs))) { + obs <- MeanDims(obs, memb_dim, na.rm = T) + } + } + + ############################### + # Sort dimension + name_exp <- names(dim(exp)) + name_obs <- names(dim(obs)) + order_obs <- match(name_exp, name_obs) + obs <- Reorder(obs, order_obs) + + ############################### + # Calculate MSE + + # Remove data along comp_dim dim if there is at least one NA between limits + if (!is.null(comp_dim)) { + if (is.null(limits)) { + limits <- c(1, dim(obs)[comp_dim]) + } + pos <- which(names(dim(obs)) == comp_dim) + obs_sub <- Subset(obs, pos, list(limits[1]:limits[2])) + outrows <- is.na(MeanDims(obs_sub, pos, na.rm = FALSE)) + outrows <- InsertDim(outrows, pos, dim(obs)[comp_dim]) + obs[which(outrows)] <- NA + } + + res <- Apply(list(exp, obs), + target_dims = list(c(time_dim, dat_dim), + c(time_dim, dat_dim)), + fun = .MSE, + time_dim = time_dim, dat_dim = dat_dim, + conf = conf, alpha = alpha, + ncores = ncores) + return(res) +} + +.MSE <- function(exp, obs, time_dim = 'sdate', dat_dim = NULL, conf = TRUE, alpha = 0.05) { + + if (is.null(dat_dim)) { + # exp: [sdate] + # obs: [sdate] + nexp <- 1 + nobs <- 1 + ini_dims <- dim(exp) + dim(exp) <- c(ini_dims, dat_dim = 1) + dim(obs) <- c(ini_dims, dat_dim = 1) + } else { + # exp: [sdate, dat_exp] + # obs: [sdate, dat_obs] + nexp <- as.numeric(dim(exp)[2]) + nobs <- as.numeric(dim(obs)[2]) + } + + dif <- array(dim = c(dim(exp)[1], nexp = nexp, nobs = nobs)) + chi <- array(dim = c(nexp = nexp, nobs = nobs)) + + if (conf) { + conflow <- alpha / 2 + confhigh <- 1 - conflow + conf.lower <- array(dim = c(nexp = nexp, nobs = nobs)) + conf.upper <- array(dim = c(nexp = nexp, nobs = nobs)) + } + + # dif + for (i in 1:nobs) { + dif[, , i] <- sapply(1:nexp, function(x) {exp[, x] - obs[, i]}) + } + + mse <- colMeans(dif^2, na.rm = TRUE) # array(dim = c(nexp, nobs)) + + if (conf) { + #count effective sample along sdate. eno: c(nexp, nobs) +# eno <- Eno(dif, time_dim) # slower than for loop below? + eno <- array(dim = c(nexp = nexp, nobs = nobs)) + for (n_obs in 1:nobs) { + for (n_exp in 1:nexp) { + eno[n_exp, n_obs] <- .Eno(dif[, n_exp, n_obs], na.action = na.pass) + } + } + + # conf.lower + chi <- sapply(1:nobs, function(i) { + qchisq(confhigh, eno[, i] - 1) + }) + conf.lower <- (eno * mse ** 2 / chi) ** 0.5 + + # conf.upper + chi <- sapply(1:nobs, function(i) { + qchisq(conflow, eno[, i] - 1) + }) + conf.upper <- (eno * mse ** 2 / chi) ** 0.5 + } + + ################################### + # Remove nexp and nobs if dat_dim = NULL + if (is.null(dat_dim)) { + dim(mse) <- NULL + if (conf) { + dim(conf.lower) <- NULL + dim(conf.upper) <- NULL + } + } + + ################################### + + res <- list(mse = mse) + if (conf) res <- c(res, list(conf.lower = conf.lower, conf.upper = conf.upper)) + + return(res) + +} diff --git a/R/MSSS.R b/R/MSSS.R new file mode 100644 index 0000000000000000000000000000000000000000..a11c50c85ee9a20b33c13666b333b0e6f6b22a97 --- /dev/null +++ b/R/MSSS.R @@ -0,0 +1,435 @@ +#'Compute mean square error skill score +#' +#'Compute the mean square error skill score (MSSS) between an array of forecast +#''exp' and an array of observation 'obs'. The two arrays should have the same +#'dimensions except along 'dat_dim' and 'memb_dim'. The MSSSs are computed along +#''time_dim', the dimension which corresponds to the start date dimension. +#'MSSS computes the mean square error skill score of each exp in 1:nexp +#'against each obs in 1:nobs which gives nexp * nobs MSSS for each grid point +#'of the array.\cr +#'The p-value and significance test are optionally provided by an one-sided +#'Fisher test or Random Walk test.\cr +#' +#'@param exp A named numeric array of experimental data which contains at least +#' time dimension (time_dim). It can also be a vector with the same length as +#' 'obs', then the vector will automatically be 'time_dim'. +#'@param obs A named numeric array of observational data which contains at least +#' time dimension (time_dim). The dimensions should be the same as parameter +#' 'exp' except the length of 'dat_dim' and 'memb_dim' dimension. It can also +#' be a vector with the same length as 'exp', then the vector will +#' automatically be 'time_dim'. +#'@param ref A named numerical array of the reference forecast data with at +#' least time dimension, or 0 (typical climatological forecast) or 1 +#' (normalized climatological forecast). If it is an array, the dimensions must +#' be the same as 'exp' except 'memb_dim' and 'dat_dim'. If there is only one +#' reference dataset, it should not have dataset dimension. If there is +#' corresponding reference for each experiment, the dataset dimension must +#' have the same length as in 'exp'. If 'ref' is NULL, the typical +#' climatological forecast is used as reference forecast (equivalent to 0.) +#' The default value is NULL. +#'@param dat_dim A character string indicating the name of dataset (nobs/nexp) +#' dimension. The default value is NULL. +#'@param time_dim A character string indicating the name of dimension along +#' which the MSSS are computed. The default value is 'sdate'. +#'@param memb_dim A character string indicating the name of the member dimension +#' to compute the ensemble mean; it should be set to NULL if the data are +#' already the ensemble mean. The default value is NULL. +#'@param pval A logical value indicating whether to compute or not the p-value +#' of the test Ho: MSSS = 0. The default value is TRUE. +#'@param sign A logical value indicating whether to compute or not the +#' statistical significance of the test Ho: MSSS = 0. The default value is +#' FALSE. +#'@param alpha A numeric of the significance level to be used in the +#' statistical significance test. The default value is 0.05. +#'@param sig_method A character string indicating the significance method. The +#' options are "one-sided Fisher" (default) and "Random Walk". +#'@param sig_method.type A character string indicating the test type of the +#' significance method. Check \code{RandomWalkTest()} parameter +#' \code{test.type} for details if parameter "sig_method" is "Random Walk". The +#' default is NULL (since "one-sided Fisher" doesn't have different test +#' types.) +#'@param ncores An integer indicating the number of cores to use for parallel +#' computation. The default value is NULL. +#' +#'@return +#'A list containing the numeric arrays with dimension:\cr +#' c(nexp, nobs, all other dimensions of exp except time_dim).\cr +#'nexp is the number of experiment (i.e., dat_dim in exp), and nobs is the +#'number of observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and +#'nobs are omitted.\cr +#'\item{$msss}{ +#' A numerical array of the mean square error skill score. +#'} +#'\item{$p.val}{ +#' A numerical array of the p-value with the same dimensions as $msss. +#' Only present if \code{pval = TRUE}. +#'} +#'\item{sign}{ +#' A logical array of the statistical significance of the MSSS with the same +#' dimensions as $msss. Only present if \code{sign = TRUE}. +#'} +#' +#'@examples +#'# Load sample data as in Load() example: +#'example(Load) +#'clim <- Clim(sampleData$mod, sampleData$obs) +#'ano_exp <- Ano(sampleData$mod, clim$clim_exp) +#'ano_obs <- Ano(sampleData$obs, clim$clim_obs) +#'rmsss <- MSSS(ano_exp, ano_obs, dat_dim = 'dataset', memb_dim = 'member') +#' +#'# Synthetic data: +#'exp <- array(rnorm(30), dim = c(dataset = 2, time = 3, memb = 5)) +#'obs <- array(rnorm(15), dim = c(time = 3, dataset = 1)) +#'res <- MSSS(exp, obs, time_dim = 'time', dat_dim = 'dataset', memb_dim = 'memb') +#' +#'@rdname MSSS +#'@import multiApply +#'@importFrom stats pf +#'@export +MSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = NULL, + memb_dim = NULL, pval = TRUE, sign = FALSE, alpha = 0.05, + sig_method = 'one-sided Fisher', sig_method.type = NULL, ncores = NULL) { + + # Check inputs + ## exp, obs, and ref (1) + if (is.null(exp) | is.null(obs)) { + stop("Parameter 'exp' and 'obs' cannot be NULL.") + } + if (!is.numeric(exp) | !is.numeric(obs)) { + stop("Parameter 'exp' and 'obs' must be a numeric array.") + } + if (is.null(dim(exp)) & is.null(dim(obs))) { #is vector + if (length(exp) == length(obs)) { + exp <- array(exp, dim = c(length(exp))) + names(dim(exp)) <- c(time_dim) + obs <- array(obs, dim = c(length(obs))) + names(dim(obs)) <- c(time_dim) + } else { + stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", + "dimensions time_dim and dat_dim, or vector of same length.")) + } + } else if (is.null(dim(exp)) | is.null(dim(obs))) { + stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", + "dimensions time_dim and dat_dim, or vector of same length.")) + } + if (any(is.null(names(dim(exp))))| any(nchar(names(dim(exp))) == 0) | + any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { + stop("Parameter 'exp' and 'obs' must have dimension names.") + } + if (!is.null(ref)) { + if (!is.numeric(ref)) { + stop("Parameter 'ref' must be numeric.") + } + if (is.array(ref)) { + if (any(is.null(names(dim(ref))))| any(nchar(names(dim(ref))) == 0)) { + stop("Parameter 'ref' must have dimension names.") + } + } else if (length(ref) != 1 | any(!ref %in% c(0, 1))) { + stop("Parameter 'ref' must be a numeric array or number 0 or 1.") + } + } else { + ref <- 0 + } + if (!is.array(ref)) { # 0 or 1 + ref <- array(data = ref, dim = dim(exp)) + } + + ## time_dim + if (!is.character(time_dim) | length(time_dim) > 1) { + stop("Parameter 'time_dim' must be a character string.") + } + if (!time_dim %in% names(dim(exp)) | !time_dim %in% names(dim(obs))) { + stop("Parameter 'time_dim' is not found in 'exp' or 'obs' dimension.") + } + ## dat_dim + if (!is.null(dat_dim)) { + if (!is.character(dat_dim) | length(dat_dim) > 1) { + stop("Parameter 'dat_dim' must be a character string or NULL.") + } + if (!dat_dim %in% names(dim(exp)) | !dat_dim %in% names(dim(obs))) { + stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + " Set it as NULL if there is no dataset dimension.") + } + } + ## memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp))) { + stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + } + } + ## pval + if (!is.logical(pval) | length(pval) > 1) { + stop("Parameter 'pval' must be one logical value.") + } + ## sign + if (!is.logical(sign) | length(sign) > 1) { + stop("Parameter 'sign' must be one logical value.") + } + ## alpha + if (!is.numeric(alpha) | length(alpha) > 1) { + stop("Parameter 'alpha' must be one numeric value.") + } + ## sig_method + if (length(sig_method) != 1 | !any(sig_method %in% c('one-sided Fisher', 'Random Walk'))) { + stop("Parameter 'sig_method' must be one of 'one-sided Fisher' or 'Random Walk'.") + } + ## sig_method.type + if (sig_method == 'Random Walk') { + if (is.null(sig_method.type)) { + .warning("Parameter 'sig_method.type' must be specified if 'sig_method' is ", + "Random Walk. Assign it as 'two.sided'.") + sig_method.type <- "two.sided" + } + if (!any(sig_method.type %in% c('two.sided.approx', 'two.sided', 'greater', 'less'))) { + stop("Parameter 'sig_method.type' must be a method accepted by RandomWalkTest() parameter 'test.type'.") + } + if (sig_method.type == 'two.sided.approx' & pval == T) { + .warning("p-value cannot be calculated by Random Walk 'two.sided.approx' method.") + pval <- FALSE + if (alpha != 0.05) { + .warning("DelSole and Tippett (2016) aproximation is valid for alpha ", + "= 0.05 only. Returning the significance at the 0.05 significance level.") + alpha <- 0.05 + } + } + } + ## ncores + if (!is.null(ncores)) { + if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | length(ncores) > 1) { + stop("Parameter 'ncores' must be a positive integer.") + } + } + ## exp, obs, and ref (2) + name_exp <- sort(names(dim(exp))) + name_obs <- sort(names(dim(obs))) + if (!is.null(memb_dim)) { + if (memb_dim %in% name_exp) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + } + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } + } + if (!is.null(dat_dim)) { + name_exp <- name_exp[-which(name_exp == dat_dim)] + name_obs <- name_obs[-which(name_obs == dat_dim)] + } + if (!all(name_exp == name_obs)) { + stop("Parameter 'exp' and 'obs' must have the same dimension names.") + } + if (!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + stop(paste0("Parameter 'exp' and 'obs' must have same length of ", + "all dimensions except 'dat_dim' and 'memb_dim'.")) + } + + name_ref <- sort(names(dim(ref))) + if (!is.null(memb_dim) && memb_dim %in% name_ref) { + name_ref <- name_ref[-which(name_ref == memb_dim)] + } + if (!is.null(dat_dim)) { + if (dat_dim %in% name_ref) { + if (!identical(dim(exp)[dat_dim], dim(ref)[dat_dim])) { + stop(paste0("If parameter 'ref' has dataset dimension, it must be ", + "equal to dataset dimension of 'exp'.")) + } + name_ref <- name_ref[-which(name_ref == dat_dim)] + } + } + if (!identical(length(name_exp), length(name_ref)) | + !identical(dim(exp)[name_exp], dim(ref)[name_ref])) { + stop(paste0("Parameter 'exp' and 'ref' must have the same length of ", + "all dimensions except 'memb_dim' and 'dat_dim' if there is ", + "only one reference dataset.")) + } + + if (dim(exp)[time_dim] <= 2) { + stop("The length of time_dim must be more than 2 to compute MSSS.") + } + + ############################### + # # Sort dimension + # name_exp <- names(dim(exp)) + # name_obs <- names(dim(obs)) + # order_obs <- match(name_exp, name_obs) + # obs <- Reorder(obs, order_obs) + + ############################### + ## Ensemble mean + if (!is.null(memb_dim)) { + if (memb_dim %in% names(dim(exp))) { + exp <- MeanDims(exp, memb_dim, na.rm = T) + } + if (memb_dim %in% names(dim(obs))) { + obs <- MeanDims(obs, memb_dim, na.rm = T) + } + if (memb_dim %in% names(dim(ref))) { + ref <- MeanDims(ref, memb_dim, na.rm = T) + } + } + + ############################### + # Calculate MSSS + + data <- list(exp = exp, obs = obs, ref = ref) + if (!is.null(dat_dim)) { + if (dat_dim %in% names(dim(ref))) { + target_dims <- list(exp = c(time_dim, dat_dim), + obs = c(time_dim, dat_dim), + ref = c(time_dim, dat_dim)) + } else { + target_dims <- list(exp = c(time_dim, dat_dim), + obs = c(time_dim, dat_dim), + ref = c(time_dim)) + } + } else { + target_dims <- list(exp = time_dim, obs = time_dim, ref = time_dim) + } + + res <- Apply(data, + target_dims = target_dims, + fun = .MSSS, + time_dim = time_dim, dat_dim = dat_dim, + pval = pval, sign = sign, alpha = alpha, + sig_method = sig_method, sig_method.type = sig_method.type, + ncores = ncores) + + return(res) +} + +.MSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = NULL, pval = TRUE, + sign = FALSE, alpha = 0.05, sig_method = 'one-sided Fisher', + sig_method.type = NULL) { + # exp: [sdate, (dat)] + # obs: [sdate, (dat)] + # ref: [sdate, (dat)] or NULL + + if (is.null(ref)) { + ref <- array(data = 0, dim = dim(obs)) + } else if (identical(ref, 0) | identical(ref, 1)) { + ref <- array(ref, dim = dim(exp)) + } + + if (is.null(dat_dim)) { + # exp: [sdate] + # obs: [sdate] + nexp <- 1 + nobs <- 1 + nref <- 1 + # Add dat dim back temporarily + dim(exp) <- c(dim(exp), dat = 1) + dim(obs) <- c(dim(obs), dat = 1) + dim(ref) <- c(dim(ref), dat = 1) + + } else { + # exp: [sdate, dat_exp] + # obs: [sdate, dat_obs] + nexp <- as.numeric(dim(exp)[2]) + nobs <- as.numeric(dim(obs)[2]) + if (dat_dim %in% names(dim(ref))) { + nref <- as.numeric(dim(ref)[2]) + } else { + dim(ref) <- c(dim(ref), dat = 1) + nref <- 1 + } + } + + nsdate <- as.numeric(dim(exp)[1]) + + # MSE of forecast + dif1 <- array(dim = c(nsdate, nexp, nobs)) + names(dim(dif1)) <- c(time_dim, 'nexp', 'nobs') + + for (i in 1:nobs) { + dif1[, , i] <- sapply(1:nexp, function(x) {exp[, x] - obs[, i]}) + } + + mse_exp <- colMeans(dif1^2, na.rm = TRUE) # [nexp, nobs] + + # MSE of reference + dif2 <- array(dim = c(nsdate, nref, nobs)) + names(dim(dif2)) <- c(time_dim, 'nexp', 'nobs') + for (i in 1:nobs) { + dif2[, , i] <- sapply(1:nref, function(x) {ref[, x] - obs[, i]}) + } + mse_ref <- colMeans(dif2^2, na.rm = TRUE) # [nref, nobs] + if (nexp != nref) { + # expand mse_ref to nexp (nref is 1) + mse_ref <- array(mse_ref, dim = c(nobs = nobs, nexp = nexp)) + mse_ref <- Reorder(mse_ref, c(2, 1)) + } + + msss <- 1 - mse_exp / mse_ref + + ################################################# + + if (sig_method == 'one-sided Fisher') { + p_val <- array(dim = c(nexp = nexp, nobs = nobs)) + ## pval and sign + if (pval || sign) { + eno1 <- Eno(dif1, time_dim) + if (is.null(ref)) { + eno2 <- Eno(obs, time_dim) + eno2 <- array(eno2, dim = c(nobs = nobs, nexp = nexp)) + eno2 <- Reorder(eno2, c(2, 1)) + } else { + eno2 <- Eno(dif2, time_dim) + if (nref != nexp) { + eno2 <- array(eno2, dim = c(nobs = nobs, nexp = nexp)) + eno2 <- Reorder(eno2, c(2, 1)) + } + } + + F.stat <- (eno2 * mse_ref^2 / (eno2 - 1)) / ((eno1 * mse_exp^2 / (eno1- 1))) + tmp <- !is.na(eno1) & !is.na(eno2) & eno1 > 2 & eno2 > 2 + p_val <- 1 - pf(F.stat, eno1 - 1, eno2 - 1) + if (sign) signif <- p_val <= alpha + # If there isn't enough valid data, return NA + p_val[which(!tmp)] <- NA + if (sign) signif[which(!tmp)] <- NA + + # change not enough valid data msss to NA + msss[which(!tmp)] <- NA + } + + } else if (sig_method == "Random Walk") { + + if (sign) signif <- array(dim = c(nexp = nexp, nobs = nobs)) + if (pval) p_val <- array(dim = c(nexp = nexp, nobs = nobs)) + + for (i in 1:nexp) { + for (j in 1:nobs) { + error_exp <- array(data = abs(exp[, i] - obs[, j]), dim = c(time = nsdate)) + if (nref == nexp) { + error_ref <- array(data = abs(ref[, i] - obs[, j]), dim = c(time = nsdate)) + } else { + # nref = 1 + error_ref <- array(data = abs(ref - obs[, j]), dim = c(time = nsdate)) + } + aux <- .RandomWalkTest(skill_A = error_exp, skill_B = error_ref, + test.type = sig_method.type, + pval = pval, sign = sign, alpha = alpha) + if (sign) signif[i, j] <- aux$sign + if (pval) p_val[i, j] <- aux$p.val + } + } + } + + ################################### + # Remove extra dimensions if dat_dim = NULL + if (is.null(dat_dim)) { + dim(msss) <- NULL + if (pval) dim(p_val) <- NULL + if (sign) dim(signif) <- NULL + } + ################################### + + # output + res <- list(msss = msss) + if (pval) res <- c(res, list(p.val = p_val)) + if (sign) res <- c(res, list(sign = signif)) + + return(res) +} diff --git a/R/Plot2VarsVsLTime.R b/R/Plot2VarsVsLTime.R index 1c784dd5fb430090d00378f04b180ac0e6349111..12e04ae22d676c670d8fde7afd188f61292316b0 100644 --- a/R/Plot2VarsVsLTime.R +++ b/R/Plot2VarsVsLTime.R @@ -63,15 +63,17 @@ #'leadtimes_per_startdate <- 60 #'rms <- RMS(MeanDims(smooth_ano_exp, dim_to_mean), #' MeanDims(smooth_ano_obs, dim_to_mean), -#' comp_dim = required_complete_row, +#' comp_dim = required_complete_row, dat_dim = 'dataset', #' limits = c(ceiling((runmean_months + 1) / 2), -#' leadtimes_per_startdate - floor(runmean_months / 2))) +#' leadtimes_per_startdate - floor(runmean_months / 2))) #'smooth_ano_exp_m_sub <- smooth_ano_exp - InsertDim(MeanDims(smooth_ano_exp, 'member', #' na.rm = TRUE), #' posdim = 3, #' lendim = dim(smooth_ano_exp)['member'], #' name = 'member') +#'suppressWarnings({ #'spread <- Spread(smooth_ano_exp_m_sub, compute_dim = c('member', 'sdate')) +#'}) #'#Combine rms outputs into one array #'rms_combine <- abind::abind(rms$conf.lower, rms$rms, rms$conf.upper, along = 0) #'rms_combine <- Reorder(rms_combine, c(2, 3, 1, 4)) diff --git a/R/PlotACC.R b/R/PlotACC.R index a674ff69d91e5de3dac876d97db8e112c26eb6e8..6ea518203dd07520ca52392c857bd0e65f671928 100644 --- a/R/PlotACC.R +++ b/R/PlotACC.R @@ -65,8 +65,9 @@ #'ano_exp <- Ano(sampleData$mod, clim$clim_exp) #'ano_obs <- Ano(sampleData$obs, clim$clim_obs) -#'acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat) -#'acc_bootstrap <- ACC(ano_exp, ano_obs, lat = sampleData$lat, conftype = 'bootstrap') +#'acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat, dat_dim = 'dataset') +#'acc_bootstrap <- ACC(ano_exp, ano_obs, lat = sampleData$lat, conftype = 'bootstrap', +#' dat_dim = 'dataset') #'# Combine acc results for PlotACC #'res <- array(c(acc$conf.lower, acc$acc, acc$conf.upper, acc$p.val), #' dim = c(dim(acc$acc), 4)) @@ -86,7 +87,8 @@ PlotACC <- function(ACC, sdates, toptitle = "", sizetit = 1, ytitle = "", width = 8, height = 5, size_units = 'in', res = 100, ...) { # Process the user graphical parameters that may be passed in the call ## Graphical parameters to exclude - excludedArgs <- c("cex", "cex.axis", "cex.lab", "cex.main", "col", "lab", "las", "lty", "lwd", "mai", "mgp", "new", "pch", "pin", "ps", "pty") + excludedArgs <- c("cex", "cex.axis", "cex.lab", "cex.main", "col", "lab", "las", "lty", + "lwd", "mai", "mgp", "new", "pch", "pin", "ps", "pty") userArgs <- .FilterUserGraphicArgs(excludedArgs, ...) # If there is any filenames to store the graphics, process them diff --git a/R/PlotEquiMap.R b/R/PlotEquiMap.R index 2c98430d27e6418021ea52af30344b4b43bfbe8a..3b8f861f8b7dced6b9ddcba7b99b202717e59da9 100644 --- a/R/PlotEquiMap.R +++ b/R/PlotEquiMap.R @@ -253,7 +253,7 @@ #' } #'PlotEquiMap(sampleData$mod[1, 1, 1, 1, , ], sampleData$lon, sampleData$lat, #' toptitle = 'Predicted sea surface temperature for Nov 1960 from 1st Nov', -#' sizetit = 0.5) +#' title_scale = 0.5) #'@import graphics maps #'@importFrom grDevices dev.cur dev.new dev.off gray #'@importFrom stats cor @@ -530,7 +530,26 @@ PlotEquiMap <- function(var, lon, lat, varu = NULL, varv = NULL, title_scale <- sizetit } - var_limits <- c(min(var, na.rm = TRUE), max(var, na.rm = TRUE)) + if (!all(is.na(var))) { + var_limits <- c(min(var, na.rm = TRUE), max(var, na.rm = TRUE)) + } else { + .warning("All the data are NAs. The map will be filled with colNA.") + if (!is.null(brks) && length(brks) > 1) { + #NOTE: var_limits be like this to avoid warnings from ColorBar + var_limits <- c(min(brks, na.rm = TRUE) + diff(brks)[1], + max(brks, na.rm = TRUE)) + } else if (!is.null(bar_limits)) { + var_limits <- c(bar_limits[1] + 0.01, bar_limits[2]) + } else { + var_limits <- c(-0.5, 0.5) # random range since colorbar is not going to be plotted + if (drawleg) { + drawleg <- FALSE + .warning("All data are NAs. Color bar won't be drawn. If you want to have ", + "color bar still, define parameter 'brks' or 'bar_limits'.") + } + } + } + # Check: brks, cols, subsampleg, bar_limits, color_fun, bar_extra_labels, draw_bar_ticks # draw_separators, triangle_ends_scale, label_scale, units, units_scale, # bar_label_digits diff --git a/R/PlotLayout.R b/R/PlotLayout.R index c442bf77311df2a1f6b5aa5149b2ffbb3f66ca25..6553f8a5067a85d83619bbcacfc16411c29aabd5 100644 --- a/R/PlotLayout.R +++ b/R/PlotLayout.R @@ -346,13 +346,26 @@ PlotLayout <- function(fun, plot_dims, var, ..., special_args = NULL, } # Check the rest of parameters (unless the user simply wants to build an empty layout) - var_limits <- NULL if (!all(sapply(var, is_single_na))) { - var_limits <- c(min(unlist(var), na.rm = TRUE), max(unlist(var), na.rm = TRUE)) - if ((any(is.infinite(var_limits)) || var_limits[1] == var_limits[2])) { - stop("Arrays in parameter 'var' must contain at least 2 different values.") + if (!all(is.na(unlist(var)))) { + var_limits <- c(min(unlist(var), na.rm = TRUE), max(unlist(var), na.rm = TRUE)) + } else { + if (!is.null(brks)) { + #NOTE: var_limits be like this to avoid warnings from ColorBar + var_limits <- c(min(brks, na.rm = TRUE) + diff(brks)[1], + max(brks, na.rm = TRUE)) + } else if (!is.null(bar_limits)) { + var_limits <- c(bar_limits[1] + 0.01, bar_limits[2]) + } else { + var_limits <- c(-0.5, 0.5) # random range since colorbar is not going to be plotted + if (!isFALSE(drawleg)) { + drawleg <- FALSE + .warning("All data are NAs. Color bar won't be drawn. If you want to have color bar still, define parameter 'brks' or 'bar_limits'.") + } + } } } + colorbar <- ColorBar(brks, cols, FALSE, subsampleg, bar_limits, var_limits, triangle_ends, col_inf, col_sup, color_fun, plot = FALSE, draw_bar_ticks, diff --git a/R/PlotVsLTime.R b/R/PlotVsLTime.R index 94c82e0735c61c4ecf0f8186e09fbe681560a8cb..c51e31b3c04cb2d302e282a14d9b22f7a83be47b 100644 --- a/R/PlotVsLTime.R +++ b/R/PlotVsLTime.R @@ -79,11 +79,12 @@ #'leadtimes_per_startdate <- 60 #'corr <- Corr(MeanDims(smooth_ano_exp, dim_to_mean), #' MeanDims(smooth_ano_obs, dim_to_mean), -#' comp_dim = required_complete_row, +#' comp_dim = required_complete_row, dat_dim = 'dataset', #' limits = c(ceiling((runmean_months + 1) / 2), #' leadtimes_per_startdate - floor(runmean_months / 2))) #'# Combine corr results for plotting -#'corr_combine <- abind::abind(corr$conf.lower, corr$corr, corr$conf.upper, corr$p.val, along = 0) +#'corr_combine <- abind::abind(corr$conf.lower, corr$corr, corr$conf.upper, corr$p.val, +#' along = 0) #'corr_combine <- Reorder(corr_combine, c(2, 3, 1, 4)) #'\donttest{ #'PlotVsLTime(corr_combine, toptitle = "correlations", ytitle = "correlation", diff --git a/R/RMS.R b/R/RMS.R index 645e34b0c753b5017716e20fb3eaa56162089eb1..8f7e58b71c50039a58d9d3a814a83815de80e533 100644 --- a/R/RMS.R +++ b/R/RMS.R @@ -2,7 +2,7 @@ #' #'Compute the root mean square error for an array of forecasts and an array of #'observations. The RMSEs are computed along time_dim, the dimension which -#'corresponds to the startdate dimension. If comp_dim is given, the RMSEs are +#'corresponds to the start date dimension. If comp_dim is given, the RMSEs are #'computed only if obs along the comp_dim dimension are complete between #'limits[1] and limits[2], i.e. there are no NAs between limits[1] and #'limits[2]. This option can be activated if the user wishes to account only @@ -10,18 +10,19 @@ #'all leadtimes.\cr #'The confidence interval is computed by the chi2 distribution.\cr #' -#'@param exp A named numeric array of experimental data, with at least two -#' dimensions 'time_dim' and 'dat_dim'. It can also be a vector with the -#' same length as 'obs', then the vector will automatically be 'time_dim' and -#' 'dat_dim' will be 1. +#'@param exp A named numeric array of experimental data, with at least +#' 'time_dim' dimension. It can also be a vector with the same length as 'obs'. #'@param obs A named numeric array of observational data, same dimensions as -#' parameter 'exp' except along dat_dim. It can also be a vector with the same -#' length as 'exp', then the vector will automatically be 'time_dim' and -#' 'dat_dim' will be 1. +#' parameter 'exp' except along 'dat_dim' and 'memb_dim'. It can also be a +#' vector with the same length as 'exp'. #'@param time_dim A character string indicating the name of dimension along #' which the correlations are computed. The default value is 'sdate'. -#'@param dat_dim A character string indicating the name of member (nobs/nexp) -#' dimension. The default value is 'dataset'. +#'@param memb_dim A character string indicating the name of the member dimension +#' to compute the ensemble mean; it should be set to NULL if the input data are +#' already the ensemble mean. The default value is NULL. +#'@param dat_dim A character string indicating the name of dataset or member +#' (nobs/nexp) dimension. The datasets of exp and obs will be paired and +#' computed RMS for each pair. The default value is NULL. #'@param comp_dim A character string indicating the name of dimension along which #' obs is taken into account only if it is complete. The default value #' is NULL. @@ -29,8 +30,8 @@ #' be completed. The default value is c(1, length(comp_dim dimension)). #'@param conf A logical value indicating whether to retrieve the confidence #' intervals or not. The default value is TRUE. -#'@param conf.lev A numeric indicating the confidence level for the -#' regression computation. The default value is 0.95. +#'@param alpha A numeric indicating the significance level for the statistical +#' significance test. The default value is 0.05. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -38,7 +39,8 @@ #'A list containing the numeric arrays with dimension:\cr #' c(nexp, nobs, all other dimensions of exp except time_dim).\cr #'nexp is the number of experiment (i.e., dat_dim in exp), and nobs is the -#'number of observation (i.e., dat_dim in obs).\cr +#'number of observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and +#'nobs are omitted.\cr #'\item{$rms}{ #' The root mean square error. #'} @@ -51,24 +53,32 @@ #' #'@examples #'# Load sample data as in Load() example: -#' set.seed(1) -#' exp1 <- array(rnorm(120), dim = c(dataset = 3, sdate = 5, ftime = 2, lon = 1, lat = 4)) -#' set.seed(2) -#' obs1 <- array(rnorm(80), dim = c(dataset = 2, sdate = 5, ftime = 2, lon = 1, lat = 4)) -#' set.seed(2) -#' na <- floor(runif(10, min = 1, max = 80)) -#' obs1[na] <- NA -#' res <- RMS(exp1, obs1, comp_dim = 'ftime') -#' # Renew example when Ano and Smoothing are ready +#'example(Load) +#'clim <- Clim(sampleData$mod, sampleData$obs) +#'ano_exp <- Ano(sampleData$mod, clim$clim_exp) +#'ano_obs <- Ano(sampleData$obs, clim$clim_obs) +#'smooth_ano_exp <- Smoothing(ano_exp, runmeanlen = 12, time_dim = 'ftime') +#'smooth_ano_obs <- Smoothing(ano_obs, runmeanlen = 12, time_dim = 'ftime') +#'res <- RMS(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', +#' comp_dim = 'ftime', limits = c(7, 54)) #' -#'@rdname RMS +#'# Synthetic data: +#'exp1 <- array(rnorm(120), dim = c(dat = 3, sdate = 10, ftime = 4)) +#'obs1 <- array(rnorm(80), dim = c(dat = 2, sdate = 10, ftime = 4)) +#'na <- floor(runif(10, min = 1, max = 80)) +#'obs1[na] <- NA +#'res1 <- RMS(exp1, obs1, comp_dim = 'ftime', dat_dim = 'dat') +#' +#'exp2 <- array(rnorm(20), dim = c(sdate = 5, member = 4)) +#'obs2 <- array(rnorm(10), dim = c(sdate = 5, member = 2)) +#'res2 <- RMS(exp2, obs2, memb_dim = 'member') +#' #'@import multiApply #'@importFrom ClimProjDiags Subset #'@importFrom stats qchisq #'@export -RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', - comp_dim = NULL, limits = NULL, - conf = TRUE, conf.lev = 0.95, ncores = NULL) { +RMS <- function(exp, obs, time_dim = 'sdate', memb_dim = NULL, dat_dim = NULL, + comp_dim = NULL, limits = NULL, conf = TRUE, alpha = 0.05, ncores = NULL) { # Check inputs ## exp and obs (1) if (is.null(exp) | is.null(obs)) { @@ -79,26 +89,22 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', } if (is.null(dim(exp)) & is.null(dim(obs))) { #is vector if (length(exp) == length(obs)) { - exp <- array(exp, dim = c(length(exp), 1)) - names(dim(exp)) <- c(time_dim, dat_dim) - obs <- array(obs, dim = c(length(obs), 1)) - names(dim(obs)) <- c(time_dim, dat_dim) + exp <- array(exp, dim = c(length(exp))) + names(dim(exp)) <- c(time_dim) + obs <- array(obs, dim = c(length(obs))) + names(dim(obs)) <- c(time_dim) } else { - stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", - "dimensions time_dim and dat_dim, or vector of same length.")) + stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", + "dimensions time_dim and dat_dim, or vector of same length.")) } } else if (is.null(dim(exp)) | is.null(dim(obs))) { stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", "dimensions time_dim and dat_dim, or vector of same length.")) } - if(any(is.null(names(dim(exp))))| any(nchar(names(dim(exp))) == 0) | - any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { + if (any(is.null(names(dim(exp)))) | any(nchar(names(dim(exp))) == 0) | + any(is.null(names(dim(obs)))) | any(nchar(names(dim(obs))) == 0)) { stop("Parameter 'exp' and 'obs' must have dimension names.") } - if(!all(names(dim(exp)) %in% names(dim(obs))) | - !all(names(dim(obs)) %in% names(dim(exp)))) { - stop("Parameter 'exp' and 'obs' must have same dimension name") - } ## time_dim if (!is.character(time_dim) | length(time_dim) > 1) { stop("Parameter 'time_dim' must be a character string.") @@ -106,6 +112,15 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', if (!time_dim %in% names(dim(exp)) | !time_dim %in% names(dim(obs))) { stop("Parameter 'time_dim' is not found in 'exp' or 'obs' dimension.") } + ## memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + stop("Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension.") + } + } ## dat_dim if (!is.null(dat_dim)) { if (!is.character(dat_dim) | length(dat_dim) > 1) { @@ -140,9 +155,9 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', if (!is.logical(conf) | length(conf) > 1) { stop("Parameter 'conf' must be one logical value.") } - ## conf.lev - if (!is.numeric(conf.lev) | conf.lev < 0 | conf.lev > 1 | length(conf.lev) > 1) { - stop("Parameter 'conf.lev' must be a numeric number between 0 and 1.") + ## alpha + if (!is.numeric(alpha) | alpha < 0 | alpha > 1 | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") } ## ncores if (!is.null(ncores)) { @@ -151,22 +166,44 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', stop("Parameter 'ncores' must be a positive integer.") } } + ## exp and obs (2) name_exp <- sort(names(dim(exp))) name_obs <- sort(names(dim(obs))) + if (!is.null(memb_dim)) { + if (memb_dim %in% name_exp) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + } + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } + } if (!is.null(dat_dim)) { name_exp <- name_exp[-which(name_exp == dat_dim)] name_obs <- name_obs[-which(name_obs == dat_dim)] } - if(!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + if (!all(name_exp == name_obs)) { + stop("Parameter 'exp' and 'obs' must have the same dimension names.") + } + if (!identical(dim(exp)[name_exp], dim(obs)[name_obs])) { stop(paste0("Parameter 'exp' and 'obs' must have same length of ", - "all dimension except 'dat_dim'.")) + "all dimensions except 'dat_dim' and 'memb_dim'.")) } if (dim(exp)[time_dim] < 2) { stop("The length of time_dim must be at least 2 to compute RMS.") } - - + + ############################### + ## Ensemble mean + if (!is.null(memb_dim)) { + if (memb_dim %in% names(dim(exp))) { + exp <- MeanDims(exp, memb_dim, na.rm = T) + } + if (memb_dim %in% names(dim(obs))) { + obs <- MeanDims(obs, memb_dim, na.rm = T) + } + } + ############################### # Sort dimension name_exp <- names(dim(exp)) @@ -195,21 +232,20 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', c(time_dim, dat_dim)), fun = .RMS, time_dim = time_dim, dat_dim = dat_dim, - conf = conf, conf.lev = conf.lev, ncores_input = ncores, + conf = conf, alpha = alpha, ncores = ncores) return(res) } -.RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', - conf = TRUE, conf.lev = 0.95, ncores_input = NULL) { +.RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = NULL, conf = TRUE, alpha = 0.05) { if (is.null(dat_dim)) { # exp: [sdate] # obs: [sdate] nexp <- 1 nobs <- 1 ini_dims <- dim(exp) - dim(exp) <- c(ini_dims, dat_dim = 1) - dim(obs) <- c(ini_dims, dat_dim = 1) + dim(exp) <- c(ini_dims, dat = 1) + dim(obs) <- c(ini_dims, dat = 1) } else { # exp: [sdate, dat_exp] # obs: [sdate, dat_obs] @@ -218,10 +254,9 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', } dif <- array(dim = c(dim(exp)[1], nexp = nexp, nobs = nobs)) - chi <- array(dim = c(nexp = nexp, nobs = nobs)) if (conf) { - conflow <- (1 - conf.lev) / 2 + conflow <- alpha / 2 confhigh <- 1 - conflow conf.lower <- array(dim = c(nexp = nexp, nobs = nobs)) conf.upper <- array(dim = c(nexp = nexp, nobs = nobs)) @@ -232,12 +267,17 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', dif[, , i] <- sapply(1:nexp, function(x) {exp[, x] - obs[, i]}) } - rms <- apply(dif^2, c(2, 3), mean, na.rm = TRUE)^0.5 #array(dim = c(_exp, nobs)) - - if (conf) { - #eno <- Eno(dif, 1) #count effective sample along sdate. dim = c(nexp, nobs) - eno <- Eno(dif, time_dim, ncores = ncores_input) #change to this line when Eno() is done + rms <- colMeans(dif^2, na.rm = TRUE)^0.5 # [nexp, nobs] + if (conf) { #NOTE: pval and sign also need + #count effective sample along sdate. eno: c(nexp, nobs) +# eno <- Eno(dif, time_dim) # slower than for loop below? + eno <- array(dim = c(nexp = nexp, nobs = nobs)) + for (n_obs in 1:nobs) { + for (n_exp in 1:nexp) { + eno[n_exp, n_obs] <- .Eno(dif[, n_exp, n_obs], na.action = na.pass) + } + } # conf.lower chi <- sapply(1:nobs, function(i) { qchisq(confhigh, eno[, i] - 1) @@ -251,6 +291,16 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', conf.upper <- (eno * rms ** 2 / chi) ** 0.5 } +#NOTE: Not sure if the calculation is correct. p_val is reasonable compared to the chi-square chart though. +# if (pval | sign) { +# chi <- array(dim = c(nexp = nexp, nobs = nobs)) +# for (i in 1:nobs) { +# chi[, i] <- sapply(1:nexp, function(x) {sum((obs[, i] - exp[, x])^2 / exp[, x])}) +# } +# p_val <- pchisq(chi, eno - 1, lower.tail = FALSE) +# if (sign) signif <- p_val <= alpha +# } + ################################### # Remove nexp and nobs if dat_dim = NULL if (is.null(dat_dim)) { @@ -262,13 +312,9 @@ RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset', } ################################### - - if (conf) { - res <- list(rms = rms, conf.lower = conf.lower, conf.upper = conf.upper) - } else { - res <- list(rms = rms) - } + res <- list(rms = rms) + if (conf) res <- c(res, list(conf.lower = conf.lower, conf.upper = conf.upper)) return(res) -} \ No newline at end of file +} diff --git a/R/RMSSS.R b/R/RMSSS.R index b8b3cc0eb5b7e85923f9684739888d07dddd2cc7..c33a40e5a85b77246ab9f8fff35ded3e732a4374 100644 --- a/R/RMSSS.R +++ b/R/RMSSS.R @@ -2,27 +2,23 @@ #' #'Compute the root mean square error skill score (RMSSS) between an array of #'forecast 'exp' and an array of observation 'obs'. The two arrays should -#'have the same dimensions except along dat_dim, where the length can be -#'different, with the number of experiments/models (nexp) and the number of -#'observational datasets (nobs).\cr -#'RMSSS computes the root mean square error skill score of each jexp in 1:nexp -#'against each job in 1:nobs which gives nexp * nobs RMSSS for each grid point +#'have the same dimensions except along 'dat_dim' and 'memb_dim'. The RMSSSs +#'are computed along 'time_dim', the dimension which corresponds to the start +#'date dimension. +#'RMSSS computes the root mean square error skill score of each exp in 1:nexp +#'against each obs in 1:nobs which gives nexp * nobs RMSSS for each grid point #'of the array.\cr -#'The RMSSS are computed along the time_dim dimension which should correspond -#'to the start date dimension.\cr #'The p-value and significance test are optionally provided by an one-sided #'Fisher test or Random Walk test.\cr #' #'@param exp A named numeric array of experimental data which contains at least -#' two dimensions for dat_dim and time_dim. It can also be a vector with the -#' same length as 'obs', then the vector will automatically be 'time_dim' and -#' 'dat_dim' will be 1. +#' time dimension (time_dim). It can also be a vector with the same length as +#' 'obs', then the vector will automatically be 'time_dim'. #'@param obs A named numeric array of observational data which contains at least -#' two dimensions for dat_dim and time_dim. The dimensions should be the same -#' as paramter 'exp' except the length of 'dat_dim' dimension. The order of -#' dimension can be different. It can also be a vector with the same length as -#' 'exp', then the vector will automatically be 'time_dim' and 'dat_dim' will -#' be 1. +#' time dimension (time_dim). The dimensions should be the same as parameter +#' 'exp' except the length of 'dat_dim' and 'memb_dim' dimension. It can also +#' be a vector with the same length as 'exp', then the vector will +#' automatically be 'time_dim'. #'@param ref A named numerical array of the reference forecast data with at #' least time dimension, or 0 (typical climatological forecast) or 1 #' (normalized climatological forecast). If it is an array, the dimensions must @@ -30,15 +26,15 @@ #' reference dataset, it should not have dataset dimension. If there is #' corresponding reference for each experiment, the dataset dimension must #' have the same length as in 'exp'. If 'ref' is NULL, the typical -#' climatological forecast is used as reference forecast (equivelant to 0.) +#' climatological forecast is used as reference forecast (equivalent to 0.) #' The default value is NULL. #'@param dat_dim A character string indicating the name of dataset (nobs/nexp) -#' dimension. The default value is 'dataset'. +#' dimension. The default value is NULL. #'@param time_dim A character string indicating the name of dimension along #' which the RMSSS are computed. The default value is 'sdate'. #'@param memb_dim A character string indicating the name of the member dimension -#' to compute the ensemble mean; it should be set to NULL if the parameter 'exp' -#' and 'ref' are already the ensemble mean. The default value is NULL. +#' to compute the ensemble mean; it should be set to NULL if the data are +#' already the ensemble mean. The default value is NULL. #'@param pval A logical value indicating whether to compute or not the p-value #' of the test Ho: RMSSS = 0. The default value is TRUE. #'@param sign A logical value indicating whether to compute or not the @@ -48,6 +44,11 @@ #' statistical significance test. The default value is 0.05. #'@param sig_method A character string indicating the significance method. The #' options are "one-sided Fisher" (default) and "Random Walk". +#'@param sig_method.type A character string indicating the test type of the +#' significance method. Check \code{RandomWalkTest()} parameter +#' \code{test.type} for details if parameter "sig_method" is "Random Walk". The +#' default is NULL (since "one-sided Fisher" doesn't have different test +#' types.) #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -70,19 +71,34 @@ #'} #' #'@examples +#'# Load sample data as in Load() example: +#'example(Load) +#'clim <- Clim(sampleData$mod, sampleData$obs) +#'ano_exp <- Ano(sampleData$mod, clim$clim_exp) +#'ano_obs <- Ano(sampleData$obs, clim$clim_obs) +#'rmsss <- RMSSS(ano_exp, ano_obs, dat_dim = 'dataset', memb_dim = 'member') +#' #' set.seed(1) -#' exp <- array(rnorm(30), dim = c(dataset = 2, time = 3, memb = 5)) +#' exp1 <- array(rnorm(30), dim = c(dataset = 2, time = 3, memb = 5)) #' set.seed(2) -#' obs <- array(rnorm(15), dim = c(time = 3, memb = 5, dataset = 1)) -#' res <- RMSSS(exp, obs, time_dim = 'time', dat_dim = 'dataset') +#' obs1 <- array(rnorm(15), dim = c(time = 3, memb = 5, dataset = 1)) +#' res1 <- RMSSS(exp1, obs1, time_dim = 'time', dat_dim = 'dataset') +#' +#' exp2 <- array(rnorm(30), dim = c(lat = 2, time = 3, memb = 5)) +#' obs2 <- array(rnorm(15), dim = c(time = 3, lat = 2)) +#' res2 <- RMSSS(exp2, obs2, time_dim = 'time', memb_dim = 'memb') +#' +#' exp3 <- array(rnorm(30), dim = c(lat = 2, time = 3)) +#' obs3 <- array(rnorm(15), dim = c(time = 3, lat = 2)) +#' res3 <- RMSSS(exp3, obs3, time_dim = 'time') #' #'@rdname RMSSS #'@import multiApply #'@importFrom stats pf #'@export -RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', +RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = NULL, memb_dim = NULL, pval = TRUE, sign = FALSE, alpha = 0.05, - sig_method = 'one-sided Fisher', ncores = NULL) { + sig_method = 'one-sided Fisher', sig_method.type = NULL, ncores = NULL) { # Check inputs ## exp, obs, and ref (1) @@ -94,26 +110,22 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', } if (is.null(dim(exp)) & is.null(dim(obs))) { #is vector if (length(exp) == length(obs)) { - exp <- array(exp, dim = c(length(exp), 1)) - names(dim(exp)) <- c(time_dim, dat_dim) - obs <- array(obs, dim = c(length(obs), 1)) - names(dim(obs)) <- c(time_dim, dat_dim) + exp <- array(exp, dim = c(length(exp))) + names(dim(exp)) <- c(time_dim) + obs <- array(obs, dim = c(length(obs))) + names(dim(obs)) <- c(time_dim) } else { - stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", - "dimensions time_dim and dat_dim, or vector of same length.")) + stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", + "dimensions time_dim and dat_dim, or vector of same length.")) } } else if (is.null(dim(exp)) | is.null(dim(obs))) { stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", "dimensions time_dim and dat_dim, or vector of same length.")) } - if(any(is.null(names(dim(exp))))| any(nchar(names(dim(exp))) == 0) | - any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { + if (any(is.null(names(dim(exp)))) | any(nchar(names(dim(exp))) == 0) | + any(is.null(names(dim(obs)))) | any(nchar(names(dim(obs))) == 0)) { stop("Parameter 'exp' and 'obs' must have dimension names.") } - if(!all(names(dim(exp)) %in% names(dim(obs))) | - !all(names(dim(obs)) %in% names(dim(exp)))) { - stop("Parameter 'exp' and 'obs' must have same dimension name.") - } if (!is.null(ref)) { if (!is.numeric(ref)) { stop("Parameter 'ref' must be numeric.") @@ -125,6 +137,11 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', } else if (length(ref) != 1 | any(!ref %in% c(0, 1))) { stop("Parameter 'ref' must be a numeric array or number 0 or 1.") } + } else { + ref <- 0 + } + if (!is.array(ref)) { # 0 or 1 + ref <- array(data = ref, dim = dim(exp)) } ## time_dim @@ -152,14 +169,6 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', if (!memb_dim %in% names(dim(exp))) { stop("Parameter 'memb_dim' is not found in 'exp' dimension.") } - if (memb_dim %in% names(dim(obs))) { - if (identical(as.numeric(dim(obs)[memb_dim]), 1)) { - obs <- ClimProjDiags::Subset(x = obs, along = memb_dim, indices = 1, drop = 'selected') - } else { - stop("Not implemented for observations with members ('obs' can have 'memb_dim', ", - "but it should be of length = 1).") - } - } } ## pval if (!is.logical(pval) | length(pval) > 1) { @@ -177,78 +186,100 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', if (length(sig_method) != 1 | !any(sig_method %in% c('one-sided Fisher', 'Random Walk'))) { stop("Parameter 'sig_method' must be one of 'one-sided Fisher' or 'Random Walk'.") } - if (sig_method == "Random Walk" & pval == T) { - warning("p-value cannot be calculated by significance method 'Random Walk'.") - pval <- FALSE + ## sig_method.type + if (sig_method == 'Random Walk') { + if (is.null(sig_method.type)) { + .warning("Parameter 'sig_method.type' must be specified if 'sig_method' is ", + "Random Walk. Assign it as 'two.sided'.") + .warning("Note that in s2dv <= 1.4.1, Random Walk uses 'two.sided.approx' method.", + "If you want to retain the same functionality, please specify parameter ", + "'sig_method.type' as 'two.sided.approx'.") + sig_method.type <- "two.sided" + } + if (!any(sig_method.type %in% c('two.sided.approx', 'two.sided', 'greater', 'less'))) { + stop("Parameter 'sig_method.type' must be a method accepted by RandomWalkTest() parameter 'test.type'.") + } + if (sig_method.type == 'two.sided.approx' & pval == T) { + .warning("p-value cannot be calculated by Random Walk 'two.sided.approx' method.") + pval <- FALSE + if (alpha != 0.05) { + .warning("DelSole and Tippett (2016) aproximation is valid for alpha ", + "= 0.05 only. Returning the significance at the 0.05 significance level.") + alpha <- 0.05 + } + } } ## ncores if (!is.null(ncores)) { - if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | - length(ncores) > 1) { + if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | length(ncores) > 1) { stop("Parameter 'ncores' must be a positive integer.") } } - ## exp and obs (2) + ## exp, obs, and ref (2) name_exp <- sort(names(dim(exp))) name_obs <- sort(names(dim(obs))) if (!is.null(memb_dim)) { - name_exp <- name_exp[-which(name_exp == memb_dim)] + if (memb_dim %in% name_exp) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + } + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } } if (!is.null(dat_dim)) { - name_exp <- name_exp[-which(name_exp == dat_dim)] - name_obs <- name_obs[-which(name_obs == dat_dim)] + name_exp <- name_exp[-which(name_exp == dat_dim)] + name_obs <- name_obs[-which(name_obs == dat_dim)] + } + if (!all(name_exp == name_obs)) { + stop("Parameter 'exp' and 'obs' must have the same dimension names.") } - if(!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + if (!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { stop(paste0("Parameter 'exp' and 'obs' must have same length of ", - "all dimension except 'memb_dim' and 'dat_dim'.")) + "all dimensions except 'dat_dim' and 'memb_dim'.")) } - if (!is.null(ref)) { - name_ref <- sort(names(dim(ref))) - if (!is.null(memb_dim) && memb_dim %in% name_ref) { - name_ref <- name_ref[-which(name_ref == memb_dim)] - } - if (!is.null(dat_dim)) { - if (dat_dim %in% name_ref) { - if (!identical(dim(exp)[dat_dim], dim(ref)[dat_dim])) { - stop(paste0("If parameter 'ref' has dataset dimension, it must be ", - "equal to dataset dimension of 'exp'.")) - } - name_ref <- name_ref[-which(name_ref == dat_dim)] + + name_ref <- sort(names(dim(ref))) + if (!is.null(memb_dim) && memb_dim %in% name_ref) { + name_ref <- name_ref[-which(name_ref == memb_dim)] + } + if (!is.null(dat_dim)) { + if (dat_dim %in% name_ref) { + if (!identical(dim(exp)[dat_dim], dim(ref)[dat_dim])) { + stop(paste0("If parameter 'ref' has dataset dimension, it must be ", + "equal to dataset dimension of 'exp'.")) } + name_ref <- name_ref[-which(name_ref == dat_dim)] } - if (!identical(length(name_exp), length(name_ref)) | - !identical(dim(exp)[name_exp], dim(ref)[name_ref])) { - stop(paste0("Parameter 'exp' and 'ref' must have the same length of ", - "all dimensions except 'memb_dim' and 'dat_dim' if there is ", - "only one reference dataset.")) - } + } + if (!identical(length(name_exp), length(name_ref)) | + !identical(dim(exp)[name_exp], dim(ref)[name_ref])) { + stop(paste0("Parameter 'exp' and 'ref' must have the same length of ", + "all dimensions except 'memb_dim' and 'dat_dim' if there is ", + "only one reference dataset.")) } if (dim(exp)[time_dim] <= 2) { stop("The length of time_dim must be more than 2 to compute RMSSS.") } - - - ############################### -# # Sort dimension -# name_exp <- names(dim(exp)) -# name_obs <- names(dim(obs)) -# order_obs <- match(name_exp, name_obs) -# obs <- Reorder(obs, order_obs) - - + + ############################### - # Create ref array if needed - if (is.null(ref)) ref <- 0 - if (!is.array(ref)) { - ref <- array(data = ref, dim = dim(exp)) - } - + # # Sort dimension + # name_exp <- names(dim(exp)) + # name_obs <- names(dim(obs)) + # order_obs <- match(name_exp, name_obs) + # obs <- Reorder(obs, order_obs) + ############################### ## Ensemble mean if (!is.null(memb_dim)) { - exp <- MeanDims(exp, memb_dim, na.rm = T) - if (!is.null(ref) & memb_dim %in% names(dim(ref))) { + if (memb_dim %in% names(dim(exp))) { + exp <- MeanDims(exp, memb_dim, na.rm = T) + } + if (memb_dim %in% names(dim(obs))) { + obs <- MeanDims(obs, memb_dim, na.rm = T) + } + if (memb_dim %in% names(dim(ref))) { ref <- MeanDims(ref, memb_dim, na.rm = T) } } @@ -256,21 +287,6 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', ############################### # Calculate RMSSS -# if (!is.null(ref)) { # use "ref" as reference forecast -# if (!is.null(dat_dim) && dat_dim %in% names(dim(ref))) { -# target_dims_ref <- c(time_dim, dat_dim) -# } else { -# target_dims_ref <- c(time_dim) -# } -# data <- list(exp = exp, obs = obs, ref = ref) -# target_dims = list(exp = c(time_dim, dat_dim), -# obs = c(time_dim, dat_dim), -# ref = target_dims_ref) -# } else { -# data <- list(exp = exp, obs = obs) -# target_dims = list(exp = c(time_dim, dat_dim), -# obs = c(time_dim, dat_dim)) -# } data <- list(exp = exp, obs = obs, ref = ref) if (!is.null(dat_dim)) { if (dat_dim %in% names(dim(ref))) { @@ -291,14 +307,15 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', fun = .RMSSS, time_dim = time_dim, dat_dim = dat_dim, pval = pval, sign = sign, alpha = alpha, - sig_method = sig_method, + sig_method = sig_method, sig_method.type = sig_method.type, ncores = ncores) return(res) } -.RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', pval = TRUE, - sign = FALSE, alpha = 0.05, sig_method = 'one-sided Fisher') { +.RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = NULL, pval = TRUE, + sign = FALSE, alpha = 0.05, sig_method = 'one-sided Fisher', + sig_method.type = NULL) { # exp: [sdate, (dat)] # obs: [sdate, (dat)] # ref: [sdate, (dat)] or NULL @@ -321,8 +338,8 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', dim(ref) <- c(dim(ref), dat = 1) } else { - # exp: [sdate, dat_exp] - # obs: [sdate, dat_obs] + # exp: [sdate, dat_exp] + # obs: [sdate, dat_obs] nexp <- as.numeric(dim(exp)[2]) nobs <- as.numeric(dim(obs)[2]) if (dat_dim %in% names(dim(ref))) { @@ -343,40 +360,25 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', dif1[, , i] <- sapply(1:nexp, function(x) {exp[, x] - obs[, i]}) } - rms_exp <- apply(dif1^2, c(2, 3), mean, na.rm = TRUE)^0.5 #array(dim = c(nexp, nobs)) + rms_exp <- colMeans(dif1^2, na.rm = TRUE)^0.5 # [nexp, nobs] # RMS of reference -# if (!is.null(ref)) { - dif2 <- array(dim = c(nsdate, nref, nobs)) - names(dim(dif2)) <- c(time_dim, 'nexp', 'nobs') - for (i in 1:nobs) { - dif2[, , i] <- sapply(1:nref, function(x) {ref[, x] - obs[, i]}) - } - rms_ref <- apply(dif2^2, c(2, 3), mean, na.rm = TRUE)^0.5 #array(dim = c(nref, nobs)) - if (nexp != nref) { - # expand rms_ref to nexp (nref is 1) - rms_ref <- array(rms_ref, dim = c(nobs = nobs, nexp = nexp)) - rms_ref <- Reorder(rms_ref, c(2, 1)) - } -# } else { -# rms_ref <- array(colMeans(obs^2, na.rm = TRUE)^0.5, dim = c(nobs = nobs, nexp = nexp)) -## rms_ref[which(abs(rms_ref) <= (max(abs(rms_ref), na.rm = TRUE) / 1000))] <- max(abs( -## rms_ref), na.rm = TRUE) / 1000 -# rms_ref <- Reorder(rms_ref, c(2, 1)) -# #rms_ref above: [nexp, nobs] -# } + dif2 <- array(dim = c(nsdate, nref, nobs)) + names(dim(dif2)) <- c(time_dim, 'nexp', 'nobs') + for (i in 1:nobs) { + dif2[, , i] <- sapply(1:nref, function(x) {ref[, x] - obs[, i]}) + } + rms_ref <- colMeans(dif2^2, na.rm = TRUE)^0.5 # [nref, nobs] + if (nexp != nref) { + # expand rms_ref to nexp (nref is 1) + rms_ref <- array(rms_ref, dim = c(nobs = nobs, nexp = nexp)) + rms_ref <- Reorder(rms_ref, c(2, 1)) + } rmsss <- 1 - rms_exp / rms_ref ################################################# -# if (conf) { -# conflow <- (1 - conf.lev) / 2 -# confhigh <- 1 - conflow -# conf_low <- array(dim = c(nexp = nexp, nobs = nobs)) -# conf_high <- array(dim = c(nexp = nexp, nobs = nobs)) -# } - if (sig_method == 'one-sided Fisher') { p_val <- array(dim = c(nexp = nexp, nobs = nobs)) ## pval and sign @@ -407,11 +409,12 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', } } else if (sig_method == "Random Walk") { - signif <- array(dim = c(nexp = nexp, nobs = nobs)) + + if (sign) signif <- array(dim = c(nexp = nexp, nobs = nobs)) + if (pval) p_val <- array(dim = c(nexp = nexp, nobs = nobs)) + for (i in 1:nexp) { for (j in 1:nobs) { - - # Error error_exp <- array(data = abs(exp[, i] - obs[, j]), dim = c(time = nsdate)) if (nref == nexp) { error_ref <- array(data = abs(ref[, i] - obs[, j]), dim = c(time = nsdate)) @@ -419,7 +422,11 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', # nref = 1 error_ref <- array(data = abs(ref - obs[, j]), dim = c(time = nsdate)) } - signif[i, j] <- .RandomWalkTest(skill_A = error_exp, skill_B = error_ref)$sign + aux <- .RandomWalkTest(skill_A = error_exp, skill_B = error_ref, + test.type = sig_method.type, + pval = pval, sign = sign, alpha = alpha) + if (sign) signif[i, j] <- aux$sign + if (pval) p_val[i, j] <- aux$p.val } } } @@ -428,7 +435,7 @@ RMSSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = 'dataset', # Remove extra dimensions if dat_dim = NULL if (is.null(dat_dim)) { dim(rmsss) <- NULL - dim(p_val) <- NULL + if (pval) dim(p_val) <- NULL if (sign) dim(signif) <- NULL } ################################### diff --git a/R/ROCSS.R b/R/ROCSS.R index 7831a8830099d64f39ba2934ae21763d198fa742..2ca078279228c9e7297a5c7c7a1440b97a622768 100644 --- a/R/ROCSS.R +++ b/R/ROCSS.R @@ -6,24 +6,36 @@ #'curve can be summarized with the area under the ROC curve, known as the ROC #'score, to provide a skill value for each category. The ROCSS ranges between #'minus infinite and 1. A positive ROCSS value indicates that the forecast has -#'higher skill than the reference forecasts, meaning the contrary otherwise. -#'@param exp A named numerical array of the forecast with at least time and -#' member dimension. -#'@param obs A named numerical array of the observation with at least time -#' dimension. The dimensions must be the same as 'exp' except 'memb_dim' and -#' 'dat_dim'. -#'@param ref A named numerical array of the reference forecast data with at -#' least time and member dimension. The dimensions must be the same as 'exp' -#' except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, -#' it should not have dataset dimension. If there is corresponding reference -#' for each experiement, the dataset dimension must have the same length as in -#' 'exp'. If 'ref' is NULL, the random forecast is used as reference forecast. -#' The default value is NULL. +#'higher skill than the reference forecast, meaning the contrary otherwise.\cr +#'The function accepts either the data or the probabilities of each data as +#'inputs. If there is more than one dataset, RPSS will be computed for each pair +#'of exp and obs data. +#' +#'@param exp A named numerical array of either the forecast with at least time +#' and member dimensions, or the probabilities with at least time and category +#' dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. +#'@param obs A named numerical array of either the observations with at least +#' time dimension, or the probabilities with at least time and category +#' dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The +#' dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'. +#'@param ref A named numerical array of the reference forecast with at least +#' time and member dimensions, or the probabilities with at least time and +#' category dimensions. The probability can be generated by +#' \code{s2dv::GetProbs}. The dimensions must be the same as 'exp' except +#' 'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should +#' not have dataset dimension. If there is corresponding reference for each +#' experiment, the dataset dimension must have the same length as in 'exp'. +#' If 'ref' is NULL, the random forecast is used as reference forecast. The +#' default value is NULL. #'@param time_dim A character string indicating the name of the time dimension. #' The default value is 'sdate'. #'@param memb_dim A character string indicating the name of the member dimension #' to compute the probabilities of the forecast and the reference forecast. The -#' default value is 'member'. +#' default value is 'member'. If the data are probabilities, set memb_dim as +#' NULL. +#'@param cat_dim A character string indicating the name of the category +#' dimension that is needed when exp, obs, and ref are probabilities. The +#' default value is NULL, which means that the data are not probabilities. #'@param dat_dim A character string indicating the name of dataset dimension. #' The length of this dimension can be different between 'exp' and 'obs'. #' The default value is NULL. @@ -40,27 +52,34 @@ #' computation. The default value is NULL. #' #'@return -#'A numerical array of ROCSS with the same dimensions as 'exp' excluding -#''time_dim' and 'memb_dim' dimensions and including 'cat' dimension, which is -#'each category. The length if 'cat' dimension corresponds to the number of -#'probabilistic categories, i.e., 1 + length(prob_thresholds). If there are -#'multiple datasets, two additional dimensions 'nexp' and 'nobs' are added. +#'A numerical array of ROCSS with dimensions c(nexp, nobs, cat, the rest +#'dimensions of 'exp' except 'time_dim' and 'memb_dim' dimensions). nexp is the +#'number of experiment (i.e., dat_dim in exp), and nobs is the number of +#'observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and nobs are +#'omitted. dimension 'cat' refers to the probabilistic category, i.e., +#'\code{1 + length(prob_thresholds)}. #' #'@references #'Kharin, V. V. and Zwiers, F. W. (2003): #' https://doi.org/10.1175/1520-0442(2003)016%3C4145:OTRSOP%3E2.0.CO;2 #' #'@examples +#'# Use data as input #'exp <- array(rnorm(1000), dim = c(lon = 3, lat = 2, sdate = 60, member = 10)) #'ref <- array(rnorm(1000), dim = c(lon = 3, lat = 2, sdate = 60, member = 10)) #'obs <- array(rnorm(1000), dim = c(lon = 3, lat = 2, sdate = 60)) #'ROCSS(exp = exp, obs = obs) ## random forecast as reference forecast #'ROCSS(exp = exp, obs = obs, ref = ref) ## ref as reference forecast +#'# Use probs as input +#'exp_probs <- GetProbs(exp, memb_dim = 'member') +#'obs_probs <- GetProbs(obs, memb_dim = NULL) +#'ref_probs <- GetProbs(ref, memb_dim = 'member') +#'ROCSS(exp = exp_probs, obs = obs_probs, ref = ref_probs, memb_dim = NULL, cat_dim = 'bin') #' #'@import multiApply #'@importFrom easyVerification EnsRoca #'@export -ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', +ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', cat_dim = NULL, dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, cross.val = FALSE, ncores = NULL) { @@ -93,15 +112,31 @@ ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', if (!is.null(ref) & !time_dim %in% names(dim(ref))) { stop("Parameter 'time_dim' is not found in 'ref' dimension.") } - ## memb_dim - if (!is.character(memb_dim) | length(memb_dim) > 1) { - stop("Parameter 'memb_dim' must be a character string.") + ## memb_dim & cat_dim + if (is.null(memb_dim) + is.null(cat_dim) != 1) { + stop("Only one of the two parameters 'memb_dim' and 'cat_dim' can have value.") } - if (!memb_dim %in% names(dim(exp))) { - stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + ## memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp))) { + stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + } + if (!is.null(ref) & !memb_dim %in% names(dim(ref))) { + stop("Parameter 'memb_dim' is not found in 'ref' dimension.") + } } - if (!is.null(ref) & !memb_dim %in% names(dim(ref))) { - stop("Parameter 'memb_dim' is not found in 'ref' dimension.") + ## cat_dim + if (!is.null(cat_dim)) { + if (!is.character(cat_dim) | length(cat_dim) > 1) { + stop("Parameter 'cat_dim' must be a character string.") + } + if (!cat_dim %in% names(dim(exp)) | !cat_dim %in% names(dim(obs)) | + (!is.null(ref) & !cat_dim %in% names(dim(ref)))) { + stop("Parameter 'cat_dim' is not found in 'exp', 'obs', or 'ref' dimension.") + } } ## dat_dim if (!is.null(dat_dim)) { @@ -116,9 +151,11 @@ ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', ## exp, obs, and ref (2) name_exp <- sort(names(dim(exp))) name_obs <- sort(names(dim(obs))) - name_exp <- name_exp[-which(name_exp == memb_dim)] - if (memb_dim %in% name_obs) { - name_obs <- name_obs[-which(name_obs == memb_dim)] + if (!is.null(memb_dim)) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } } if (!is.null(dat_dim)) { name_exp <- name_exp[-which(name_exp == dat_dim)] @@ -131,7 +168,9 @@ ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', } if (!is.null(ref)) { name_ref <- sort(names(dim(ref))) - name_ref <- name_ref[-which(name_ref == memb_dim)] + if (!is.null(memb_dim)) { + name_ref <- name_ref[-which(name_ref == memb_dim)] + } if (!is.null(dat_dim)) { if (dat_dim %in% name_ref) { if (!identical(dim(exp)[dat_dim], dim(ref)[dat_dim])) { @@ -187,41 +226,54 @@ ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', output_dims <- c('nexp', 'nobs', 'cat') } ## target_dims - if (!memb_dim %in% names(dim(obs))) { - target_dims_obs <- c(time_dim, dat_dim) - } else { - target_dims_obs <- c(time_dim, memb_dim, dat_dim) + if (!is.null(memb_dim)) { + target_dims_exp <- c(time_dim, memb_dim, dat_dim) + if (!memb_dim %in% names(dim(obs))) { + target_dims_obs <- c(time_dim, dat_dim) + } else { + target_dims_obs <- c(time_dim, memb_dim, dat_dim) + } + } else { # cat_dim + target_dims_exp <- target_dims_obs <- c(time_dim, cat_dim, dat_dim) } - ## If ref doesn't have & dat_dim is not NULL - if (!is.null(ref) && !is.null(dat_dim) &&!dat_dim %in% names(dim(ref))) { - target_dims_ref <- c(time_dim, memb_dim) - } else { - target_dims_ref <- c(time_dim, memb_dim, dat_dim) + + if (!is.null(ref)) { # use "ref" as reference forecast + if (!is.null(memb_dim)) { + if (!is.null(dat_dim) && (dat_dim %in% names(dim(ref)))) { + target_dims_ref <- c(time_dim, memb_dim, dat_dim) + } else { + target_dims_ref <- c(time_dim, memb_dim) + } + } else { + target_dims_ref <- c(time_dim, cat_dim, dat_dim) + } } if (!is.null(ref)) { ## reference forecast is provided res <- Apply(data = list(exp = exp, obs = obs, ref = ref), - target_dims = list(exp = c(time_dim, memb_dim, dat_dim), + target_dims = list(exp = target_dims_exp, obs = target_dims_obs, ref = target_dims_ref), output_dims = output_dims, fun = .ROCSS, prob_thresholds = prob_thresholds, indices_for_clim = indices_for_clim, - time_dim = time_dim, dat_dim = dat_dim, + time_dim = time_dim, memb_dim = memb_dim, + cat_dim = cat_dim, dat_dim = dat_dim, cross.val = cross.val, ncores = ncores)$output1 } else { ## Random forecast as reference forecast res <- Apply(data = list(exp = exp, obs = obs), - target_dims = list(exp = c(time_dim, memb_dim, dat_dim), + target_dims = list(exp = target_dims_exp, obs = target_dims_obs), output_dims = output_dims, fun = .ROCSS, ref = ref, prob_thresholds = prob_thresholds, indices_for_clim = indices_for_clim, - time_dim = time_dim, dat_dim = dat_dim, + time_dim = time_dim, memb_dim = memb_dim, + cat_dim = cat_dim, dat_dim = dat_dim, cross.val = cross.val, ncores = ncores)$output1 } @@ -229,13 +281,19 @@ ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', return(res) } -.ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', dat_dim = NULL, prob_thresholds = c(1/3, 2/3), +.ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', + cat_dim = NULL, dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, cross.val = FALSE) { + #--- if memb_dim: # exp: [sdate, memb, (dat)] # obs: [sdate, (dat)] - # ref: [sdate, memb, (dat)] or NULL - + # ref: [sdate, memb, (dat)] or NULL + #--- if cat_dim: + # exp: [sdate, bin, (dat)] + # obs: [sdate, bin, (dat)] + # ref: [sdate, bin, (dat)] or NULL + if (is.null(dat_dim)) { nexp <- 1 nobs <- 1 @@ -262,25 +320,34 @@ ROCSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', for (exp_i in 1:nexp) { for (obs_i in 1:nobs) { - # Input dim for .get_probs - ## if exp: [sdate, memb] - ## if obs: [sdate, (memb)] - exp_probs <- .get_probs(ClimProjDiags::Subset(exp, dat_dim, exp_i, drop = 'selected'), - indices_for_quantiles = indices_for_clim, - prob_thresholds = prob_thresholds, cross.val = cross.val) - obs_probs <- .get_probs(data = ClimProjDiags::Subset(obs, dat_dim, obs_i, drop = 'selected'), - indices_for_quantiles = indices_for_clim, - prob_thresholds = prob_thresholds, cross.val = cross.val) - ## exp_probs and obs_probs: [bin, sdate] - + if (is.null(cat_dim)) { # calculate probs + # Input dim for .GetProbs + ## if exp: [sdate, memb] + ## if obs: [sdate, (memb)] + exp_probs <- .GetProbs(data = ClimProjDiags::Subset(exp, dat_dim, exp_i, drop = 'selected'), + indices_for_quantiles = indices_for_clim, + prob_thresholds = prob_thresholds, cross.val = cross.val) + obs_probs <- .GetProbs(data = ClimProjDiags::Subset(obs, dat_dim, obs_i, drop = 'selected'), + indices_for_quantiles = indices_for_clim, + prob_thresholds = prob_thresholds, cross.val = cross.val) + ## exp_probs and obs_probs: [bin, sdate] + } else { + exp_probs <- exp[, , exp_i] + obs_probs <- obs[, , obs_i] + } + ## ROCS (exp) rocs_exp[exp_i, obs_i, ] <- unlist(EnsRoca(ens = Reorder(exp_probs, c(time_dim, 'bin')), obs = Reorder(obs_probs, c(time_dim, 'bin')))[1:ncats]) if (!is.null(ref)) { - ref_probs <- .get_probs(ClimProjDiags::Subset(ref, dat_dim, exp_i, drop = 'selected'), - indices_for_quantiles = indices_for_clim, - prob_thresholds = prob_thresholds, cross.val = cross.val) + if (is.null(cat_dim)) { # calculate probs + ref_probs <- .GetProbs(ClimProjDiags::Subset(ref, dat_dim, exp_i, drop = 'selected'), + indices_for_quantiles = indices_for_clim, + prob_thresholds = prob_thresholds, cross.val = cross.val) + } else { + ref_probs <- ref[, , exp_i] + } rocs_ref[exp_i, obs_i, ] <- unlist(EnsRoca(ens = Reorder(ref_probs, c(time_dim, 'bin')), obs = Reorder(obs_probs, c(time_dim, 'bin')))[1:ncats]) } diff --git a/R/RPS.R b/R/RPS.R index 32d88a42b56b1c93742310552edaa2c3f7bb5383..c385f10cf32097c58e94386515762fcb72200405 100644 --- a/R/RPS.R +++ b/R/RPS.R @@ -8,22 +8,32 @@ #'(perfect forecast) and n-1 (worst possible forecast), where n is the number of #'categories. In the case of a forecast divided into two categories (the lowest #'number of categories that a probabilistic forecast can have), the RPS -#'corresponds to the Brier Score (BS; Wilks, 2011), therefore, ranges between 0 -#'and 1. If there is more than one dataset, RPS will be computed for each pair -#'of exp and obs data. +#'corresponds to the Brier Score (BS; Wilks, 2011), therefore ranging between 0 +#'and 1.\cr +#'The function first calculates the probabilities for forecasts and observations, +#'then use them to calculate RPS. Or, the probabilities of exp and obs can be +#'provided directly to compute the score. If there is more than one dataset, RPS +#'will be computed for each pair of exp and obs data. The fraction of acceptable +#'NAs can be adjusted. #' -#'@param exp A named numerical array of the forecast with at least time and -#' member dimension. -#'@param obs A named numerical array of the observation with at least time -#' dimension. The dimensions must be the same as 'exp' except 'memb_dim' and -#' 'dat_dim'. +#'@param exp A named numerical array of either the forecasts with at least time +#' and member dimensions, or the probabilities with at least time and category +#' dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. +#'@param obs A named numerical array of either the observation with at least +#' time dimension, or the probabilities with at least time and category +#' dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The +#' dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'. #'@param time_dim A character string indicating the name of the time dimension. #' The default value is 'sdate'. +#'@param memb_dim A character string indicating the name of the member dimension +#' to compute the probabilities of the forecast. The default value is 'member'. +#' If the data are probabilities, set memb_dim as NULL. +#'@param cat_dim A character string indicating the name of the category +#' dimension that is needed when the exp and obs are probabilities. The default +#' value is NULL, which means that the data are not probabilities. #'@param dat_dim A character string indicating the name of dataset dimension. #' The length of this dimension can be different between 'exp' and 'obs'. #' The default value is NULL. -#'@param memb_dim A character string indicating the name of the member dimension -#' to compute the probabilities of the forecast. The default value is 'member'. #'@param prob_thresholds A numeric vector of the relative thresholds (from 0 to #' 1) between the categories. The default value is c(1/3, 2/3), which #' corresponds to tercile equiprobable categories. @@ -33,15 +43,20 @@ #'@param Fair A logical indicating whether to compute the FairRPS (the #' potential RPS that the forecast would have with an infinite ensemble size). #' The default value is FALSE. -#'@param weights A named numerical array of the weights for 'exp'. If 'dat_dim' -#' is NULL, the dimension should include 'memb_dim' and 'time_dim'. Else, the -#' dimension should also include 'dat_dim'. The default value is NULL. The -#' ensemble should have at least 70 members or span at least 10 time steps and -#' have more than 45 members if consistency between the weighted and unweighted -#' methodologies is desired. -#'@param cross.val A logical indicating whether to compute the thresholds between -#' probabilistic categories in cross-validation. -#' The default value is FALSE. +#'@param weights A named numerical array of the weights for 'exp' probability +#' calculation. If 'dat_dim' is NULL, the dimensions should include 'memb_dim' +#' and 'time_dim'. Else, the dimension should also include 'dat_dim'. The +#' default value is NULL. The ensemble should have at least 70 members or span +#' at least 10 time steps and have more than 45 members if consistency between +#' the weighted and unweighted methodologies is desired. +#'@param cross.val A logical indicating whether to compute the thresholds +#' between probabilistic categories in cross-validation. The default value is +#' FALSE. +#'@param na.rm A logical or numeric value between 0 and 1. If it is numeric, it +#' means the lower limit for the fraction of the non-NA values. 1 is equal to +#' FALSE (no NA is acceptable), 0 is equal to TRUE (all NAs are acceptable). +# The function returns NA if the fraction of non-NA values in the data is less +#' than na.rm. Otherwise, RPS will be calculated. The default value is FALSE. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -55,16 +70,22 @@ #'Wilks, 2011; https://doi.org/10.1016/B978-0-12-385022-5.00008-7 #' #'@examples +#'# Use synthetic data #'exp <- array(rnorm(1000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50)) #'obs <- array(rnorm(1000), dim = c(lat = 3, lon = 2, sdate = 50)) #'res <- RPS(exp = exp, obs = obs) +#'# Use probabilities as inputs +#'exp_probs <- GetProbs(exp, time_dim = 'sdate', memb_dim = 'member') +#'obs_probs <- GetProbs(obs, time_dim = 'sdate', memb_dim = NULL) +#'res2 <- RPS(exp = exp_probs, obs = obs_probs, memb_dim = NULL, cat_dim = 'bin') +#' #' #'@import multiApply #'@importFrom easyVerification convert2prob #'@export -RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NULL, - prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, Fair = FALSE, - weights = NULL, cross.val = FALSE, ncores = NULL) { +RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', cat_dim = NULL, + dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, + Fair = FALSE, weights = NULL, cross.val = FALSE, na.rm = FALSE, ncores = NULL) { # Check inputs ## exp and obs (1) @@ -82,12 +103,27 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL if (!time_dim %in% names(dim(exp)) | !time_dim %in% names(dim(obs))) { stop("Parameter 'time_dim' is not found in 'exp' or 'obs' dimension.") } + ## memb_dim & cat_dim + if (is.null(memb_dim) + is.null(cat_dim) != 1) { + stop("Only one of the two parameters 'memb_dim' and 'cat_dim' can have value.") + } ## memb_dim - if (!is.character(memb_dim) | length(memb_dim) > 1) { - stop("Parameter 'memb_dim' must be a character string.") + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp))) { + stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + } } - if (!memb_dim %in% names(dim(exp))) { - stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + ## cat_dim + if (!is.null(cat_dim)) { + if (!is.character(cat_dim) | length(cat_dim) > 1) { + stop("Parameter 'cat_dim' must be a character string.") + } + if (!cat_dim %in% names(dim(exp)) | !cat_dim %in% names(dim(obs))) { + stop("Parameter 'cat_dim' is not found in 'exp' or 'obs' dimension.") + } } ## dat_dim if (!is.null(dat_dim)) { @@ -102,9 +138,11 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL ## exp and obs (2) name_exp <- sort(names(dim(exp))) name_obs <- sort(names(dim(obs))) - name_exp <- name_exp[-which(name_exp == memb_dim)] - if (memb_dim %in% name_obs) { - name_obs <- name_obs[-which(name_obs == memb_dim)] + if (!is.null(memb_dim)) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } } if (!is.null(dat_dim)) { name_exp <- name_exp[-which(name_exp == dat_dim)] @@ -141,7 +179,7 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL stop("Parameter 'cross.val' must be either TRUE or FALSE.") } ## weights - if (!is.null(weights)) { + if (!is.null(weights) & is.null(cat_dim)) { if (!is.array(weights) | !is.numeric(weights)) stop("Parameter 'weights' must be a named numeric array.") if (is.null(dat_dim)) { @@ -166,6 +204,14 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL weights <- Reorder(weights, c(time_dim, memb_dim, dat_dim)) } + } else if (!is.null(weights) & !is.null(cat_dim)) { + .warning(paste0("Parameter 'exp' and 'obs' are probabilities already, so parameter ", + "'weights' is not used. Change 'weights' to NULL.")) + weights <- NULL + } + ## na.rm + if (!isTRUE(na.rm) & !isFALSE(na.rm) & !(is.numeric(na.rm) & na.rm >= 0 & na.rm <= 1)) { + stop('"na.rm" should be TRUE, FALSE or a numeric between 0 and 1') } ## ncores if (!is.null(ncores)) { @@ -178,38 +224,54 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL ############################### # Compute RPS - if (!memb_dim %in% names(dim(obs))) { - target_dims_obs <- c(time_dim, dat_dim) - } else { - target_dims_obs <- c(time_dim, memb_dim, dat_dim) + + ## Decide target_dims + if (!is.null(memb_dim)) { + target_dims_exp <- c(time_dim, memb_dim, dat_dim) + if (!memb_dim %in% names(dim(obs))) { + target_dims_obs <- c(time_dim, dat_dim) + } else { + target_dims_obs <- c(time_dim, memb_dim, dat_dim) + } + } else { # cat_dim + target_dims_exp <- target_dims_obs <- c(time_dim, cat_dim, dat_dim) } + rps <- Apply(data = list(exp = exp, obs = obs), - target_dims = list(exp = c(time_dim, memb_dim, dat_dim), + target_dims = list(exp = target_dims_exp, obs = target_dims_obs), fun = .RPS, dat_dim = dat_dim, time_dim = time_dim, - memb_dim = memb_dim, + memb_dim = memb_dim, cat_dim = cat_dim, prob_thresholds = prob_thresholds, indices_for_clim = indices_for_clim, Fair = Fair, - weights = weights, cross.val = cross.val, ncores = ncores)$output1 + weights = weights, cross.val = cross.val, + na.rm = na.rm, ncores = ncores)$output1 # Return only the mean RPS - rps <- MeanDims(rps, time_dim, na.rm = FALSE) + rps <- MeanDims(rps, time_dim, na.rm = TRUE) return(rps) } -.RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NULL, - prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, Fair = FALSE, weights = NULL, - cross.val = FALSE) { - +.RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', cat_dim = NULL, + dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, + Fair = FALSE, weights = NULL, cross.val = FALSE, na.rm = FALSE) { + #--- if memb_dim: # exp: [sdate, memb, (dat)] # obs: [sdate, (memb), (dat)] # weights: NULL or same as exp + #--- if cat_dim: + # exp: [sdate, bin, (dat)] + # obs: [sdate, bin, (dat)] # Adjust dimensions to be [sdate, memb, dat] for both exp and obs - if (!memb_dim %in% names(dim(obs))) obs <- InsertDim(obs, posdim = 2, lendim = 1, name = memb_dim) + if (!is.null(memb_dim)) { + if (!memb_dim %in% names(dim(obs))) { + obs <- InsertDim(obs, posdim = 2, lendim = 1, name = memb_dim) + } + } if (is.null(dat_dim)) { nexp <- 1 @@ -232,33 +294,71 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL if (is.null(dim(exp_data))) dim(exp_data) <- c(dim(exp)[1:2]) if (is.null(dim(obs_data))) dim(obs_data) <- c(dim(obs)[1:2]) - if (!is.null(weights)) { - weights_data <- weights[ , , i] - if (is.null(dim(weights_data))) dim(weights_data) <- c(dim(weights)[1:2]) + # Find the fraction of NAs + ## If any member/bin is NA at this time step, it is not good value. + exp_mean <- rowMeans(exp_data) + obs_mean <- rowMeans(obs_data) + good_values <- !is.na(exp_mean) & !is.na(obs_mean) + + if (isTRUE(na.rm)) { + f_NAs <- 0 + } else if (isFALSE(na.rm)) { + f_NAs <- 1 } else { - weights_data <- weights + f_NAs <- na.rm } - exp_probs <- .get_probs(data = exp_data, indices_for_quantiles = indices_for_clim, - prob_thresholds = prob_thresholds, weights = weights_data, cross.val = cross.val) - # exp_probs: [bin, sdate] - obs_probs <- .get_probs(data = obs_data, indices_for_quantiles = indices_for_clim, - prob_thresholds = prob_thresholds, weights = NULL, cross.val = cross.val) - # obs_probs: [bin, sdate] - probs_exp_cumsum <- apply(exp_probs, 2, cumsum) - probs_obs_cumsum <- apply(obs_probs, 2, cumsum) + if (f_NAs <= sum(good_values) / length(obs_mean)) { + + exp_data <- exp_data[good_values, , drop = F] + obs_data <- obs_data[good_values, , drop = F] + + # If the data inputs are forecast/observation, calculate probabilities + if (is.null(cat_dim)) { + if (!is.null(weights)) { + weights_data <- weights[which(good_values) , , i] + if (is.null(dim(weights_data))) dim(weights_data) <- c(dim(weights)[1:2]) + } else { + weights_data <- weights #NULL + } - # rps: [sdate, nexp, nobs] - rps[ , i, j] <- apply((probs_exp_cumsum - probs_obs_cumsum)^2, 2, sum) - - if (Fair) { # FairRPS - ## adjustment <- rowSums(-1 * (1/R - 1/R.new) * ens.cum * (R - ens.cum)/R/(R - 1)) [formula taken from SpecsVerification::EnsRps] - R <- dim(exp)[2] #memb - R_new <- Inf - adjustment <- (-1) / (R - 1) * probs_exp_cumsum * (1 - probs_exp_cumsum) - adjustment <- apply(adjustment, 2, sum) - rps[ , i, j] <- rps[ , i, j] + adjustment + # Subset indices_for_clim + dum <- match(indices_for_clim, which(good_values)) + good_indices_for_clim <- dum[!is.na(dum)] + + exp_probs <- .GetProbs(data = exp_data, indices_for_quantiles = good_indices_for_clim, + prob_thresholds = prob_thresholds, weights = weights_data, cross.val = cross.val) + # exp_probs: [bin, sdate] + obs_probs <- .GetProbs(data = obs_data, indices_for_quantiles = good_indices_for_clim, + prob_thresholds = prob_thresholds, weights = NULL, cross.val = cross.val) + # obs_probs: [bin, sdate] + + } else { # inputs are probabilities already + exp_probs <- t(exp_data) + obs_probs <- t(obs_data) + } + + probs_exp_cumsum <- apply(exp_probs, 2, cumsum) + probs_obs_cumsum <- apply(obs_probs, 2, cumsum) + + # rps: [sdate, nexp, nobs] + rps [good_values, i, j] <- colSums((probs_exp_cumsum - probs_obs_cumsum)^2) + + if (Fair) { # FairRPS + ## adjustment <- rowSums(-1 * (1/R - 1/R.new) * ens.cum * (R - ens.cum)/R/(R - 1)) [formula taken from SpecsVerification::EnsRps] + R <- dim(exp)[2] #memb + R_new <- Inf + adjustment <- (-1) / (R - 1) * probs_exp_cumsum * (1 - probs_exp_cumsum) + adjustment <- colSums(adjustment) + rps[ , i, j] <- rps[ , i, j] + adjustment + } + + } else { ## not enough values different from NA + + rps[ , i, j] <- as.numeric(NA) + } + } } @@ -269,3 +369,5 @@ RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NUL return(rps) } + + diff --git a/R/RPSS.R b/R/RPSS.R index efd79500892e59c474460506dd95d054f67a5471..91ca8c21acd8a877f17d3d0cf5d1db5d67ea0d3c 100644 --- a/R/RPSS.R +++ b/R/RPSS.R @@ -5,32 +5,46 @@ #'assess whether a forecast presents an improvement or worsening with respect to #'a reference forecast. The RPSS ranges between minus infinite and 1. If the #'RPSS is positive, it indicates that the forecast has higher skill than the -#'reference forecast, while a negative value means that it has a lower skill. +#'reference forecast, while a negative value means that it has a lower skill.\cr #'Examples of reference forecasts are the climatological forecast (same #'probabilities for all categories for all time steps), persistence, a previous #'model version, and another model. It is computed as #'\code{RPSS = 1 - RPS_exp / RPS_ref}. The statistical significance is obtained -#'based on a Random Walk test at the 95% confidence level (DelSole and Tippett, -#'2016). If there is more than one dataset, RPS will be computed for each pair -#'of exp and obs data. +#'based on a Random Walk test at the specified confidence level (DelSole and +#'Tippett, 2016).\cr +#'The function accepts either the ensemble members or the probabilities of +#'each data as inputs. If there is more than one dataset, RPSS will be +#'computed for each pair of exp and obs data. The NA ratio of data will be +#'examined before the calculation. If the ratio is higher than the threshold +#'(assigned by parameter \code{na.rm}), NA will be returned directly. NAs are +#'counted by per-pair method, which means that only the time steps that all the +#'datasets have values count as non-NA values. #' -#'@param exp A named numerical array of the forecast with at least time and -#' member dimension. -#'@param obs A named numerical array of the observation with at least time -#' dimension. The dimensions must be the same as 'exp' except 'memb_dim' and -#' 'dat_dim'. -#'@param ref A named numerical array of the reference forecast data with at -#' least time and member dimension. The dimensions must be the same as 'exp' -#' except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, -#' it should not have dataset dimension. If there is corresponding reference -#' for each experiement, the dataset dimension must have the same length as in -#' 'exp'. If 'ref' is NULL, the climatological forecast is used as reference -#' forecast. The default value is NULL. +#'@param exp A named numerical array of either the forecast with at least time +#' and member dimensions, or the probabilities with at least time and category +#' dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. +#'@param obs A named numerical array of either the observation with at least +#' time dimension, or the probabilities with at least time and category +#' dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The +#' dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'. +#'@param ref A named numerical array of either the reference forecast with at +#' least time and member dimensions, or the probabilities with at least time and +#' category dimensions. The probabilities can be generated by +#' \code{s2dv::GetProbs}. The dimensions must be the same as 'exp' except +#' 'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should +#' not have dataset dimension. If there is corresponding reference for each +#' experiment, the dataset dimension must have the same length as in 'exp'. If +#' 'ref' is NULL, the climatological forecast is used as reference forecast. +#' The default value is NULL. #'@param time_dim A character string indicating the name of the time dimension. #' The default value is 'sdate'. #'@param memb_dim A character string indicating the name of the member dimension #' to compute the probabilities of the forecast and the reference forecast. The -#' default value is 'member'. +#' default value is 'member'. If the data are probabilities, set memb_dim as +#' NULL. +#'@param cat_dim A character string indicating the name of the category +#' dimension that is needed when exp, obs, and ref are probabilities. The +#' default value is NULL, which means that the data are not probabilities. #'@param dat_dim A character string indicating the name of dataset dimension. #' The length of this dimension can be different between 'exp' and 'obs'. #' The default value is NULL. @@ -43,19 +57,28 @@ #'@param Fair A logical indicating whether to compute the FairRPSS (the #' potential RPSS that the forecast would have with an infinite ensemble size). #' The default value is FALSE. -#'@param weights Deprecated and will be removed in the next release. Please use -#' 'weights_exp' and 'weights_ref' instead. -#'@param weights_exp A named numerical array of the forecast ensemble weights. -#' The dimension should include 'memb_dim', 'time_dim' and 'dat_dim' if there -#' are multiple datasets. All dimension lengths must be equal to 'exp' -#' dimension lengths. The default value is NULL, which means no weighting is -#' applied. The ensemble should have at least 70 members or span at least 10 -#' time steps and have more than 45 members if consistency between the weighted -#' and unweighted methodologies is desired. +#'@param weights_exp A named numerical array of the forecast ensemble weights +#' for probability calculation. The dimension should include 'memb_dim', +#' 'time_dim' and 'dat_dim' if there are multiple datasets. All dimension +#' lengths must be equal to 'exp' dimension lengths. The default value is NULL, +#' which means no weighting is applied. The ensemble should have at least 70 +#' members or span at least 10 time steps and have more than 45 members if +#' consistency between the weighted and unweighted methodologies is desired. #'@param weights_ref Same as 'weights_exp' but for the reference forecast. -#'@param cross.val A logical indicating whether to compute the thresholds between -#' probabilistics categories in cross-validation. -#' The default value is FALSE. +#'@param cross.val A logical indicating whether to compute the thresholds +#' between probabilistics categories in cross-validation. The default value is +#' FALSE. +#'@param na.rm A logical or numeric value between 0 and 1. If it is numeric, it +#' means the lower limit for the fraction of the non-NA values. 1 is equal to +#' FALSE (no NA is acceptable), 0 is equal to TRUE (all NAs are acceptable). +# The function returns NA if the fraction of non-NA values in the data is less +#' than na.rm. Otherwise, RPS will be calculated. The default value is FALSE. +#'@param sig_method.type A character string indicating the test type of the +#' significance method. Check \code{RandomWalkTest()} parameter +#' \code{test.type} for details. The default is 'two.sided.approx', which is +#' the default of \code{RandomWalkTest()}. +#'@param alpha A numeric of the significance level to be used in the statistical +#' significance test. The default value is 0.05. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -87,15 +110,26 @@ #' n/sum(n) #' }) #'dim(weights) <- c(member = 10, sdate = 50) +#'# Use data as input #'res <- RPSS(exp = exp, obs = obs) ## climatology as reference forecast #'res <- RPSS(exp = exp, obs = obs, ref = ref) ## ref as reference forecast #'res <- RPSS(exp = exp, obs = obs, ref = ref, weights_exp = weights, weights_ref = weights) +#'res <- RPSS(exp = exp, obs = obs, alpha = 0.01, sig_method.type = 'two.sided') +#' +#'# Use probs as input +#'exp_probs <- GetProbs(exp, memb_dim = 'member') +#'obs_probs <- GetProbs(obs, memb_dim = NULL) +#'ref_probs <- GetProbs(ref, memb_dim = 'member') +#'res <- RPSS(exp = exp_probs, obs = obs_probs, ref = ref_probs, memb_dim = NULL, +#' cat_dim = 'bin') +#' #'@import multiApply #'@export -RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', +RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', cat_dim = NULL, dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, - Fair = FALSE, weights = NULL, weights_exp = NULL, weights_ref = NULL, - cross.val = FALSE, ncores = NULL) { + Fair = FALSE, weights_exp = NULL, weights_ref = NULL, + cross.val = FALSE, na.rm = FALSE, + sig_method.type = 'two.sided.approx', alpha = 0.05, ncores = NULL) { # Check inputs ## exp, obs, and ref (1) @@ -126,15 +160,31 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', if (!is.null(ref) & !time_dim %in% names(dim(ref))) { stop("Parameter 'time_dim' is not found in 'ref' dimension.") } - ## memb_dim - if (!is.character(memb_dim) | length(memb_dim) > 1) { - stop("Parameter 'memb_dim' must be a character string.") + ## memb_dim & cat_dim + if (is.null(memb_dim) + is.null(cat_dim) != 1) { + stop("Only one of the two parameters 'memb_dim' and 'cat_dim' can have value.") } - if (!memb_dim %in% names(dim(exp))) { - stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + ## memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim) | length(memb_dim) > 1) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!memb_dim %in% names(dim(exp))) { + stop("Parameter 'memb_dim' is not found in 'exp' dimension.") + } + if (!is.null(ref) & !memb_dim %in% names(dim(ref))) { + stop("Parameter 'memb_dim' is not found in 'ref' dimension.") + } } - if (!is.null(ref) & !memb_dim %in% names(dim(ref))) { - stop("Parameter 'memb_dim' is not found in 'ref' dimension.") + ## cat_dim + if (!is.null(cat_dim)) { + if (!is.character(cat_dim) | length(cat_dim) > 1) { + stop("Parameter 'cat_dim' must be a character string.") + } + if (!cat_dim %in% names(dim(exp)) | !cat_dim %in% names(dim(obs)) | + (!is.null(ref) & !cat_dim %in% names(dim(ref)))) { + stop("Parameter 'cat_dim' is not found in 'exp', 'obs', or 'ref' dimension.") + } } ## dat_dim if (!is.null(dat_dim)) { @@ -149,9 +199,11 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', ## exp, obs, and ref (2) name_exp <- sort(names(dim(exp))) name_obs <- sort(names(dim(obs))) - name_exp <- name_exp[-which(name_exp == memb_dim)] - if (memb_dim %in% name_obs) { - name_obs <- name_obs[-which(name_obs == memb_dim)] + if (!is.null(memb_dim)) { + name_exp <- name_exp[-which(name_exp == memb_dim)] + if (memb_dim %in% name_obs) { + name_obs <- name_obs[-which(name_obs == memb_dim)] + } } if (!is.null(dat_dim)) { name_exp <- name_exp[-which(name_exp == dat_dim)] @@ -164,7 +216,9 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', } if (!is.null(ref)) { name_ref <- sort(names(dim(ref))) - name_ref <- name_ref[-which(name_ref == memb_dim)] + if (!is.null(memb_dim)) { + name_ref <- name_ref[-which(name_ref == memb_dim)] + } if (!is.null(dat_dim)) { if (dat_dim %in% name_ref) { if (!identical(dim(exp)[dat_dim], dim(ref)[dat_dim])) { @@ -206,16 +260,8 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', if (!is.logical(cross.val) | length(cross.val) > 1) { stop("Parameter 'cross.val' must be either TRUE or FALSE.") } - ## weights - if (!is.null(weights)) { - .warning(paste0("Parameter 'weights' is deprecated and will be removed in the next release. ", - "Use 'weights_exp' and 'weights_ref' instead. The value will be assigned ", - "to these two parameters now if they are NULL."), tag = '! Deprecation: ') - if (is.null(weights_exp)) weights_exp <- weights - if (is.null(weights_ref)) weights_ref <- weights - } ## weights_exp - if (!is.null(weights_exp)) { + if (!is.null(weights_exp) & is.null(cat_dim)) { if (!is.array(weights_exp) | !is.numeric(weights_exp)) stop("Parameter 'weights_exp' must be a named numeric array.") @@ -238,13 +284,16 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', "as 'memb_dim', 'time_dim' and 'dat_dim' in 'exp'.")) } weights_exp <- Reorder(weights_exp, c(time_dim, memb_dim, dat_dim)) - } - + } + } else if (!is.null(weights_exp) & !is.null(cat_dim)) { + .warning(paste0("Parameter 'exp' is probability already, so parameter ", + "'weights_exp' is not used. Change 'weights_exp' to NULL.")) + weights_exp <- NULL } ## weights_ref - if (!is.null(weights_ref)) { + if (!is.null(weights_ref) & is.null(cat_dim)) { if (!is.array(weights_ref) | !is.numeric(weights_ref)) - stop('Parameter "weights_ref" must be a named numeric array.') + stop("Parameter 'weights_ref' must be a named numeric array.") if (is.null(dat_dim) | ((!is.null(dat_dim)) && (!dat_dim %in% names(dim(ref))))) { if (length(dim(weights_ref)) != 2 | any(!names(dim(weights_ref)) %in% c(memb_dim, time_dim))) @@ -266,7 +315,29 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', } weights_ref <- Reorder(weights_ref, c(time_dim, memb_dim, dat_dim)) } - + } else if (!is.null(weights_ref) & !is.null(cat_dim)) { + .warning(paste0("Parameter 'ref' is probability already, so parameter ", + "'weights_ref' is not used. Change 'weights_ref' to NULL.")) + weights_ref <- NULL + } + ## na.rm + if (!isTRUE(na.rm) & !isFALSE(na.rm) & !(is.numeric(na.rm) & na.rm >= 0 & na.rm <= 1)) { + stop('"na.rm" should be TRUE, FALSE or a numeric between 0 and 1') + } + ## alpha + if (any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | length(alpha) > 1)) { + stop("Parameter 'alpha' must be a number between 0 and 1.") + } + ## sig_method.type + #NOTE: These are the types of RandomWalkTest() + if (!sig_method.type %in% c('two.sided.approx', 'two.sided', 'greater', 'less')) { + stop("Parameter 'sig_method.type' must be 'two.sided.approx', 'two.sided', 'greater', or 'less'.") + } + if (sig_method.type == 'two.sided.approx') { + if (alpha != 0.05) { + .warning("DelSole and Tippett (2016) aproximation is valid for alpha ", + "= 0.05 only. Returning the significance at the 0.05 significance level.") + } } ## ncores if (!is.null(ncores)) { @@ -279,50 +350,76 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', ############################### # Compute RPSS - if (!memb_dim %in% names(dim(obs))) { - target_dims_obs <- c(time_dim, dat_dim) - } else { - target_dims_obs <- c(time_dim, memb_dim, dat_dim) + + ## Decide target_dims + if (!is.null(memb_dim)) { + target_dims_exp <- c(time_dim, memb_dim, dat_dim) + if (!memb_dim %in% names(dim(obs))) { + target_dims_obs <- c(time_dim, dat_dim) + } else { + target_dims_obs <- c(time_dim, memb_dim, dat_dim) + } + } else { # cat_dim + target_dims_exp <- target_dims_obs <- c(time_dim, cat_dim, dat_dim) } if (!is.null(ref)) { # use "ref" as reference forecast - if (!is.null(dat_dim) && (dat_dim %in% names(dim(ref)))) { - target_dims_ref <- c(time_dim, memb_dim, dat_dim) - } else { + if (!is.null(memb_dim)) { + if (!is.null(dat_dim) && (dat_dim %in% names(dim(ref)))) { + target_dims_ref <- c(time_dim, memb_dim, dat_dim) + } else { target_dims_ref <- c(time_dim, memb_dim) + } + } else { + target_dims_ref <- c(time_dim, cat_dim, dat_dim) } data <- list(exp = exp, obs = obs, ref = ref) - target_dims = list(exp = c(time_dim, memb_dim, dat_dim), + target_dims = list(exp = target_dims_exp, obs = target_dims_obs, ref = target_dims_ref) } else { data <- list(exp = exp, obs = obs) - target_dims = list(exp = c(time_dim, memb_dim, dat_dim), + target_dims = list(exp = target_dims_exp, obs = target_dims_obs) } + output <- Apply(data, target_dims = target_dims, fun = .RPSS, time_dim = time_dim, memb_dim = memb_dim, - dat_dim = dat_dim, + cat_dim = cat_dim, dat_dim = dat_dim, prob_thresholds = prob_thresholds, indices_for_clim = indices_for_clim, Fair = Fair, weights_exp = weights_exp, weights_ref = weights_ref, cross.val = cross.val, + na.rm = na.rm, sig_method.type = sig_method.type, alpha = alpha, ncores = ncores) return(output) } -.RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', dat_dim = NULL, - prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, Fair = FALSE, - weights_exp = NULL, weights_ref = NULL, cross.val = FALSE) { - +.RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', cat_dim = NULL, + dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, + Fair = FALSE, weights_exp = NULL, weights_ref = NULL, cross.val = FALSE, + na.rm = FALSE, sig_method.type = 'two.sided.approx', alpha = 0.05) { + #--- if memb_dim: # exp: [sdate, memb, (dat)] # obs: [sdate, (memb), (dat)] # ref: [sdate, memb, (dat)] or NULL + #--- if cat_dim: + # exp: [sdate, bin, (dat)] + # obs: [sdate, bin, (dat)] + # ref: [sdate, bin, (dat)] or NULL + + if (isTRUE(na.rm)) { + f_NAs <- 0 + } else if (isFALSE(na.rm)) { + f_NAs <- 1 + } else { + f_NAs <- na.rm + } if (is.null(dat_dim)) { nexp <- 1 @@ -331,99 +428,181 @@ RPSS <- function(exp, obs, ref = NULL, time_dim = 'sdate', memb_dim = 'member', nexp <- as.numeric(dim(exp)[dat_dim]) nobs <- as.numeric(dim(obs)[dat_dim]) } - - # RPS of the forecast - rps_exp <- .RPS(exp = exp, obs = obs, time_dim = time_dim, memb_dim = memb_dim, dat_dim = dat_dim, - prob_thresholds = prob_thresholds, indices_for_clim = indices_for_clim, - Fair = Fair, weights = weights_exp, cross.val = cross.val) - # RPS of the reference forecast - if (is.null(ref)) { ## using climatology as reference forecast - if (!memb_dim %in% names(dim(obs))) { - obs <- InsertDim(obs, posdim = 2, lendim = 1, name = memb_dim) + # Calculate RPS + + if (!is.null(ref)) { + + # Adjust dimensions to be [sdate, memb, dat] for both exp, obs, and ref + ## Insert memb_dim in obs + if (!is.null(memb_dim)) { + if (!memb_dim %in% names(dim(obs))) { + obs <- InsertDim(obs, posdim = 2, lendim = 1, name = memb_dim) + } } + ## Insert dat_dim if (is.null(dat_dim)) { - dim(obs) <- c(dim(obs), nobs = nobs) + dim(obs) <- c(dim(obs), dat = nobs) + dim(exp) <- c(dim(exp), dat = nexp) + if (!is.null(weights_exp)) dim(weights_exp) <- c(dim(weights_exp), dat = nexp) + } + if (is.null(dat_dim) || (!is.null(dat_dim) && !dat_dim %in% names(dim(ref)))) { + nref <- 1 + dim(ref) <- c(dim(ref), dat = nref) + if (!is.null(weights_ref)) dim(weights_ref) <- c(dim(weights_ref), dat = nref) + } else { + nref <- as.numeric(dim(ref)[dat_dim]) # should be the same as nexp + } + + # Find good values then calculate RPS + rps_exp <- array(NA, dim = c(dim(exp)[time_dim], nexp = nexp, nobs = nobs)) + rps_ref <- array(NA, dim = c(dim(exp)[time_dim], nexp = nexp, nobs = nobs)) + for (i in 1:nexp) { + for (j in 1:nobs) { + for (k in 1:nref) { + if (nref != 1 & k!=i) { # if nref is 1 or equal to nexp, calculate rps + next + } + exp_data <- exp[, , i, drop = F] + obs_data <- obs[, , j, drop = F] + ref_data <- ref[, , k, drop = F] + exp_mean <- rowMeans(exp_data) + obs_mean <- rowMeans(obs_data) + ref_mean <- rowMeans(ref_data) + good_values <- !is.na(exp_mean) & !is.na(obs_mean) & !is.na(ref_mean) + dum <- match(indices_for_clim, which(good_values)) + good_indices_for_clim <- dum[!is.na(dum)] + + if (f_NAs <= sum(good_values) / length(good_values)) { + rps_exp[good_values,i,j] <- .RPS(exp = exp[good_values, , i], obs = obs[good_values, , j], + time_dim = time_dim, memb_dim = memb_dim, + cat_dim = cat_dim, dat_dim = NULL, + prob_thresholds = prob_thresholds, + indices_for_clim = good_indices_for_clim, + Fair = Fair, weights = weights_exp[good_values, , i], + cross.val = cross.val, na.rm = na.rm) + rps_ref[good_values,i,j] <- .RPS(exp = ref[good_values, , k], obs = obs[good_values, , j], + time_dim = time_dim, memb_dim = memb_dim, + cat_dim = cat_dim, dat_dim = NULL, + prob_thresholds = prob_thresholds, + indices_for_clim = good_indices_for_clim, + Fair = Fair, weights = weights_ref[good_values, , k], + na.rm = na.rm, cross.val = cross.val) + } + } + } } - rps_ref <- array(dim = c(dim(obs)[time_dim], nobs = nobs)) - - for (j in 1:nobs) { - obs_data <- obs[ , , j] - if (is.null(dim(obs_data))) dim(obs_data) <- c(dim(obs)[1:2]) - # obs_probs: [bin, sdate] - obs_probs <- .get_probs(data = obs_data, indices_for_quantiles = indices_for_clim, - prob_thresholds = prob_thresholds, weights = NULL, cross.val = cross.val) - # clim_probs: [bin, sdate] - clim_probs <- c(prob_thresholds[1], diff(prob_thresholds), 1 - prob_thresholds[length(prob_thresholds)]) - clim_probs <- array(clim_probs, dim = dim(obs_probs)) - - # Calculate RPS for each time step - probs_clim_cumsum <- apply(clim_probs, 2, cumsum) - probs_obs_cumsum <- apply(obs_probs, 2, cumsum) - rps_ref[ , j] <- apply((probs_clim_cumsum - probs_obs_cumsum)^2, 2, sum) - - # if (Fair) { # FairRPS - # ## adjustment <- rowSums(-1 * (1/R - 1/R.new) * ens.cum * (R - ens.cum)/R/(R - 1)) [formula taken from SpecsVerification::EnsRps] - # R <- dim(exp)[2] #memb - # R_new <- Inf - # adjustment <- (-1) / (R - 1) * probs_clim_cumsum * (1 - probs_clim_cumsum) - # adjustment <- apply(adjustment, 2, sum) - # rps_ref <- rps_ref + adjustment - # } + } else { # ref is NULL + rps_exp <- .RPS(exp = exp, obs = obs, time_dim = time_dim, memb_dim = memb_dim, + cat_dim = cat_dim, dat_dim = dat_dim, prob_thresholds = prob_thresholds, + indices_for_clim = indices_for_clim, Fair = Fair, weights = weights_exp, + cross.val = cross.val, na.rm = na.rm) + + # RPS of the reference forecast + if (!is.null(memb_dim)) { + if (!memb_dim %in% names(dim(obs))) { + obs <- InsertDim(obs, posdim = 2, lendim = 1, name = memb_dim) + } } + + rps_ref <- array(NA, dim = c(dim(obs)[time_dim], nexp = nexp, nobs = nobs)) + if (is.null(dat_dim)) { - dim(rps_ref) <- dim(exp)[time_dim] + dim(obs) <- c(dim(obs), nobs = nobs) + dim(exp) <- c(dim(exp), nexp = nexp) + dim(rps_exp) <- dim(rps_ref) } - } else { # use "ref" as reference forecast - if (!is.null(dat_dim) && (!dat_dim %in% names(dim(ref)))) { - remove_dat_dim <- TRUE - ref <- InsertDim(ref, posdim = 3, lendim = 1, name = dat_dim) - if (!is.null(weights_ref)) { - weights_ref <- InsertDim(weights_ref, posdim = 3, lendim = 1, name = dat_dim) + for (i in 1:nexp) { + for (j in 1:nobs) { + # Use good values only + good_values <- !is.na(rps_exp[, i, j]) + if (f_NAs <= sum(good_values) / length(good_values)) { + obs_data <- obs[good_values, , j] + if (is.null(dim(obs_data))) dim(obs_data) <- c(length(obs_data), 1) + + if (is.null(cat_dim)) { # calculate probs + # Subset indices_for_clim + dum <- match(indices_for_clim, which(good_values)) + good_indices_for_clim <- dum[!is.na(dum)] + obs_probs <- .GetProbs(data = obs_data, indices_for_quantiles = good_indices_for_clim, + prob_thresholds = prob_thresholds, weights = NULL, cross.val = cross.val) + } else { + obs_probs <- t(obs_data) + } + # obs_probs: [bin, sdate] + + clim_probs <- c(prob_thresholds[1], diff(prob_thresholds), + 1 - prob_thresholds[length(prob_thresholds)]) + clim_probs <- array(clim_probs, dim = dim(obs_probs)) + # clim_probs: [bin, sdate] + + # Calculate RPS for each time step + probs_clim_cumsum <- apply(clim_probs, 2, cumsum) + probs_obs_cumsum <- apply(obs_probs, 2, cumsum) + rps_ref[good_values, i, j] <- colSums((probs_clim_cumsum - probs_obs_cumsum)^2) } - } else { - remove_dat_dim <- FALSE - } + # if (Fair) { # FairRPS + # ## adjustment <- rowSums(-1 * (1/R - 1/R.new) * ens.cum * (R - ens.cum)/R/(R - 1)) [formula taken from SpecsVerification::EnsRps] + # R <- dim(exp)[2] #memb + # R_new <- Inf + # adjustment <- (-1) / (R - 1) * probs_clim_cumsum * (1 - probs_clim_cumsum) + # adjustment <- apply(adjustment, 2, sum) + # rps_ref <- rps_ref + adjustment + # } - rps_ref <- .RPS(exp = ref, obs = obs, time_dim = time_dim, memb_dim = memb_dim, dat_dim = dat_dim, - prob_thresholds = prob_thresholds, indices_for_clim = indices_for_clim, - Fair = Fair, weights = weights_ref, cross.val = cross.val) - if (!is.null(dat_dim)) { - if (isTRUE(remove_dat_dim)) { - dim(rps_ref) <- dim(rps_ref)[-2] } } } - - if (!is.null(dat_dim)) { - rps_exp_mean <- MeanDims(rps_exp, time_dim, na.rm = FALSE) - rps_ref_mean <- MeanDims(rps_ref, time_dim, na.rm = FALSE) + if (is.null(dat_dim)) { + dim(rps_ref) <- dim(rps_exp) <- dim(exp)[time_dim] + } + +#---------------------------------------------- + # Calculate RPSS + + if (!is.null(dat_dim)) { + # rps_exp and rps_ref: [sdate, nexp, nobs] + rps_exp_mean <- colMeans(rps_exp, na.rm = TRUE) + rps_ref_mean <- colMeans(rps_ref, na.rm = TRUE) rpss <- array(dim = c(nexp = nexp, nobs = nobs)) sign <- array(dim = c(nexp = nexp, nobs = nobs)) - if (length(dim(rps_ref_mean)) == 1) { - for (i in 1:nexp) { - for (j in 1:nobs) { - rpss[i, j] <- 1 - rps_exp_mean[i, j] / rps_ref_mean[j] - sign[i, j] <- .RandomWalkTest(skill_A = rps_exp[, i, j], skill_B = rps_ref[, j])$sign - } - } - } else { + if (any(!is.na(rps_exp_mean))) { for (i in 1:nexp) { for (j in 1:nobs) { rpss[i, j] <- 1 - rps_exp_mean[i, j] / rps_ref_mean[i, j] - sign[i, j] <- .RandomWalkTest(skill_A = rps_exp[, i, j], skill_B = rps_ref[, i, j])$sign + ind_nonNA <- !is.na(rps_exp[, i, j]) + if (!any(ind_nonNA)) { + sign[i, j] <- NA + } else { + sign[i, j] <- .RandomWalkTest(skill_A = rps_exp[ind_nonNA, i, j], skill_B = rps_ref[ind_nonNA, i, j], + test.type = sig_method.type, alpha = alpha, + sign = T, pval = F)$sign + } } } } - } else { - rpss <- 1 - mean(rps_exp) / mean(rps_ref) - # Significance - sign <- .RandomWalkTest(skill_A = rps_exp, skill_B = rps_ref, sign = T, pval = F)$sign + + # Turn NaN into NA + if (any(is.nan(rpss))) rpss[which(is.nan(rpss))] <- NA + + } else { # dat_dim is NULL + + ind_nonNA <- !is.na(rps_exp) + if (!any(ind_nonNA)) { + rpss <- NA + sign <- NA + } else { + # rps_exp and rps_ref: [sdate] + rpss <- 1 - mean(rps_exp, na.rm = TRUE) / mean(rps_ref, na.rm = TRUE) + sign <- .RandomWalkTest(skill_A = rps_exp[ind_nonNA], skill_B = rps_ref[ind_nonNA], + test.type = sig_method.type, alpha = alpha, + sign = T, pval = F)$sign + } } - + return(list(rpss = rpss, sign = sign)) } diff --git a/R/RatioPredictableComponents.R b/R/RatioPredictableComponents.R index 163d18fcdbd10524d24b6111a7b759527e5f611f..3d5cae58fc08aaf79847c080a18bedb958f95fc6 100644 --- a/R/RatioPredictableComponents.R +++ b/R/RatioPredictableComponents.R @@ -2,13 +2,13 @@ #' #'This function computes the ratio of predictable components (RPC; Eade et al., 2014). #' -#'@param exp A numerical array with, at least, 'time_dim' and 'member_dim' +#'@param exp A numerical array with, at least, 'time_dim' and 'memb_dim' #' dimensions. #'@param obs A numerical array with the same dimensions than 'exp' except the -#' 'member_dim' dimension. +#' 'memb_dim' dimension. #'@param time_dim A character string indicating the name of the time dimension. #' The default value is 'year'. -#'@param member_dim A character string indicating the name of the member +#'@param memb_dim A character string indicating the name of the member #' dimension. The default value is 'member'. #'@param na.rm A logical value indicating whether to remove NA values during #' the computation. The default value is FALSE. @@ -16,7 +16,7 @@ #' computation. The default value is NULL. #' #'@return An array of the ratio of the predictable components. it has the same -#' dimensions as 'exp' except 'time_dim' and 'member_dim' dimensions. +#' dimensions as 'exp' except 'time_dim' and 'memb_dim' dimensions. #' #'@examples #'exp <- array(data = runif(600), dim = c(year = 15, member = 10, lat = 2, lon = 2)) @@ -25,7 +25,7 @@ #' #'@import multiApply stats #'@export -RatioPredictableComponents <- function(exp, obs, time_dim = 'year', member_dim = 'member', na.rm = FALSE, ncores = NULL) { +RatioPredictableComponents <- function(exp, obs, time_dim = 'year', memb_dim = 'member', na.rm = FALSE, ncores = NULL) { ## Checkings if (is.null(exp)) { @@ -43,14 +43,14 @@ RatioPredictableComponents <- function(exp, obs, time_dim = 'year', member_dim = if (!(is.character(time_dim) & length(time_dim) == 1)) { stop("Parameter 'time_dim' must be a character string.") } - if (!(is.character(member_dim) & length(member_dim) == 1)) { - stop("Parameter 'member_dim' must be a character string.") + if (!(is.character(memb_dim) & length(memb_dim) == 1)) { + stop("Parameter 'memb_dim' must be a character string.") } if (!time_dim %in% names(dim(exp))) { stop("'exp' must have 'time_dim' dimension.") } - if (!member_dim %in% names(dim(exp))) { - stop("'exp' must have 'member_dim' dimension.") + if (!memb_dim %in% names(dim(exp))) { + stop("'exp' must have 'memb_dim' dimension.") } if (!time_dim %in% names(dim(obs))) { stop("'obs' must have 'time_dim' dimension.") @@ -68,7 +68,7 @@ RatioPredictableComponents <- function(exp, obs, time_dim = 'year', member_dim = } RPC <- multiApply::Apply(data = list(exp, obs), - target_dims = list(exp = c(time_dim, member_dim), + target_dims = list(exp = c(time_dim, memb_dim), obs = time_dim), output_dims = NULL, fun = .RatioPredictableComponents, diff --git a/R/RatioSDRMS.R b/R/RatioSDRMS.R index d527625c2daabfc5fc0fa2366ebec4a5292a46df..6040410b3b0e3eaffdc8f77fea75d4eaad806ed8 100644 --- a/R/RatioSDRMS.R +++ b/R/RatioSDRMS.R @@ -3,7 +3,7 @@ #'Compute the ratio between the standard deviation of the members around the #'ensemble mean in experimental data and the RMSE between the ensemble mean of #'experimental and observational data. The p-value is provided by a one-sided -#'Fischer test. +#'Fisher's test. #' #'@param exp A named numeric array of experimental data with at least two #' dimensions 'memb_dim' and 'time_dim'. @@ -11,8 +11,7 @@ #' dimensions 'memb_dim' and 'time_dim'. It should have the same dimensions as #' parameter 'exp' except along 'dat_dim' and 'memb_dim'. #'@param dat_dim A character string indicating the name of dataset (nobs/nexp) -#' dimension. If there is no dataset dimension, set as NULL. The default value -#' is 'dataset'. +#' dimension. The default value is NULL (no dataset). #'@param memb_dim A character string indicating the name of the member #' dimension. It must be one dimension in 'exp' and 'obs'. The default value #' is 'member'. @@ -26,20 +25,19 @@ #'@return A list of two arrays with dimensions c(nexp, nobs, the rest of #' dimensions of 'exp' and 'obs' except memb_dim and time_dim), which nexp is #' the length of dat_dim of 'exp' and nobs is the length of dat_dim of 'obs'. -#' (only present if \code{pval = TRUE}) of the one-sided Fisher test with -#'Ho: SD/RMSE = 1.\cr\cr +#' If dat_dim is NULL, nexp and nobs are omitted. \cr #'\item{$ratio}{ #' The ratio of the ensemble spread and RMSE. #'} #'\item{$p_val}{ -#' The p-value of the one-sided Fisher test with Ho: SD/RMSE = 1. Only present +#' The p-value of the one-sided Fisher's test with Ho: SD/RMSE = 1. Only present #' if \code{pval = TRUE}. #'} #' #'@examples #'# Load sample data as in Load() example: #'example(Load) -#'rsdrms <- RatioSDRMS(sampleData$mod, sampleData$obs) +#'rsdrms <- RatioSDRMS(sampleData$mod, sampleData$obs, dat_dim = 'dataset') #'# Reorder the data in order to plot it with PlotVsLTime #'rsdrms_plot <- array(dim = c(dim(rsdrms$ratio)[1:2], 4, dim(rsdrms$ratio)[3])) #'rsdrms_plot[, , 2, ] <- rsdrms$ratio @@ -52,7 +50,7 @@ #' #'@import multiApply #'@export -RatioSDRMS <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', +RatioSDRMS <- function(exp, obs, dat_dim = NULL, memb_dim = 'member', time_dim = 'sdate', pval = TRUE, ncores = NULL) { # Check inputs @@ -67,14 +65,10 @@ RatioSDRMS <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', stop(paste0("Parameter 'exp' and 'obs' must be array with as least two ", "dimensions memb_dim and time_dim.")) } - if(any(is.null(names(dim(exp))))| any(nchar(names(dim(exp))) == 0) | - any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { + if (any(is.null(names(dim(exp))))| any(nchar(names(dim(exp))) == 0) | + any(is.null(names(dim(obs))))| any(nchar(names(dim(obs))) == 0)) { stop("Parameter 'exp' and 'obs' must have dimension names.") } - if(!all(names(dim(exp)) %in% names(dim(obs))) | - !all(names(dim(obs)) %in% names(dim(exp)))) { - stop("Parameter 'exp' and 'obs' must have the same dimension names.") - } ## dat_dim if (!is.null(dat_dim)) { if (!is.character(dat_dim) | length(dat_dim) > 1) { @@ -88,8 +82,18 @@ RatioSDRMS <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', if (!is.character(memb_dim) | length(memb_dim) > 1) { stop("Parameter 'memb_dim' must be a character string.") } - if (!memb_dim %in% names(dim(exp)) | !memb_dim %in% names(dim(obs))) { - stop("Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension.") + if (!memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + stop("Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. ", + "Set it as NULL if there is no member dimension.") + } + # Add [member = 1] + if (memb_dim %in% names(dim(exp)) & !memb_dim %in% names(dim(obs))) { + dim(obs) <- c(dim(obs), 1) + names(dim(obs))[length(dim(obs))] <- memb_dim + } + if (!memb_dim %in% names(dim(exp)) & memb_dim %in% names(dim(obs))) { + dim(exp) <- c(dim(exp), 1) + names(dim(exp))[length(dim(exp))] <- memb_dim } ## time_dim if (!is.character(time_dim) | length(time_dim) > 1) { @@ -107,7 +111,7 @@ RatioSDRMS <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', } name_exp <- name_exp[-which(name_exp == memb_dim)] name_obs <- name_obs[-which(name_obs == memb_dim)] - if(!all(dim(exp)[name_exp] == dim(obs)[name_obs])) { + if (!identical(dim(exp)[name_exp], dim(obs)[name_obs])) { stop(paste0("Parameter 'exp' and 'obs' must have same length of ", "all the dimensions except 'dat_dim' and 'memb_dim'.")) } diff --git a/R/Regression.R b/R/Regression.R index 1cd12e6e206bfda316bca0ac5d98f7da1f703228..535f179c6ac15f46b3a23c9a58dd9f35e0c9ba62 100644 --- a/R/Regression.R +++ b/R/Regression.R @@ -19,8 +19,10 @@ #' or not. The default value is TRUE. #'@param conf A logical value indicating whether to retrieve the confidence #' intervals or not. The default value is TRUE. -#'@param conf.lev A numeric indicating the confidence level for the -#' regression computation. The default value is 0.95. +#'@param sign A logical value indicating whether to compute or not the +#' statistical significance of the test The default value is FALSE. +#'@param alpha A numeric of the significance level to be used in the +#' statistical significance test. The default value is 0.05. #'@param na.action A function or an integer. A function (e.g., na.omit, #' na.exclude, na.fail, na.pass) indicates what should happen when the data #' contain NAs. A numeric indicates the maximum number of NA position (it @@ -60,6 +62,10 @@ #' A numeric array with same dimensions as parameter 'daty' and 'datax' except #' the 'reg_dim' dimension, The array contains the p-value. #'} +#'\item{sign}{ +#' A logical array of the statistical significance of the regression with the +#' same dimensions as $regression. Only present if \code{sign = TRUE}. +#'} #'\item{$filtered}{ #' A numeric array with the same dimension as paramter 'datay' and 'datax', #' the filtered datay from the regression onto datax along the 'reg_dim' @@ -74,13 +80,13 @@ #'datax <- sampleData$obs[, 1, , ] #'names(dim(datax)) <- c('sdate', 'ftime') #'res1 <- Regression(datay, datax, formula = y~poly(x, 2, raw = TRUE)) -#'res2 <- Regression(datay, datax, conf.lev = 0.9) +#'res2 <- Regression(datay, datax, alpha = 0.1) #' #'@importFrom stats lm na.omit confint #'@import multiApply #'@export Regression <- function(datay, datax, reg_dim = 'sdate', formula = y ~ x, - pval = TRUE, conf = TRUE, conf.lev = 0.95, + pval = TRUE, conf = TRUE, sign = FALSE, alpha = 0.05, na.action = na.omit, ncores = NULL) { # Check inputs @@ -134,9 +140,13 @@ Regression <- function(datay, datax, reg_dim = 'sdate', formula = y ~ x, if (!is.logical(conf) | length(conf) > 1) { stop("Parameter 'conf' must be one logical value.") } - ##conf.lev - if (!is.numeric(conf.lev) | conf.lev < 0 | conf.lev > 1 | length(conf.lev) > 1) { - stop("Parameter 'conf.lev' must be a numeric number between 0 and 1.") + ## sign + if (!is.logical(sign) | length(sign) > 1) { + stop("Parameter 'sign' must be one logical value.") + } + ## alpha + if (!is.numeric(alpha) | alpha < 0 | alpha > 1 | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") } ## na.action if (!is.function(na.action) & !is.numeric(na.action)) { @@ -169,33 +179,27 @@ Regression <- function(datay, datax, reg_dim = 'sdate', formula = y ~ x, ############################### # Calculate Regression - if (conf & pval) { - output_dims <- list(regression = 'stats', conf.lower = 'stats', - conf.upper = 'stats', p.val = NULL, filtered = reg_dim) - } else if (conf & !pval) { - output_dims <- list(regression = 'stats', conf.lower = 'stats', - conf.upper = 'stats', filtered = reg_dim) - } else if (!conf & pval) { - output_dims <- list(regression = 'stats', - p.val = NULL, filtered = reg_dim) - } else if (!conf & !pval) { - output_dims <- list(regression = 'stats', filtered = reg_dim) - } - + + ## output_dims + output_dims <- list(regression = 'stats', filtered = reg_dim) + if (conf) output_dims <- c(output_dims, list(conf.lower = 'stats', conf.upper = 'stats')) + if (pval) output_dims <- c(output_dims, list(p.val = NULL)) + if (sign) output_dims <- c(output_dims, list(sign = NULL)) + res <- Apply(list(datay, datax), target_dims = reg_dim, output_dims = output_dims, fun = .Regression, - formula = formula, pval = pval, conf = conf, - conf.lev = conf.lev, na.action = na.action, + formula = formula, pval = pval, conf = conf, sign = sign, + alpha = alpha, na.action = na.action, ncores = ncores) return(invisible(res)) } -.Regression <- function(y, x, formula = y~x, pval = TRUE, conf = TRUE, - conf.lev = 0.95, na.action = na.omit) { +.Regression <- function(y, x, formula = y~x, pval = TRUE, conf = TRUE, + sign = FALSE, alpha = 0.05, na.action = na.omit) { NApos <- 1:length(x) NApos[which(is.na(x) | is.na(y))] <- NA @@ -211,12 +215,13 @@ Regression <- function(datay, datax, reg_dim = 'sdate', formula = y ~ x, lm.out <- lm(formula, data = data.frame(x = x, y = y), na.action = na.action) coeff <- lm.out$coefficients if (conf) { - conf.lower <- confint(lm.out, level = conf.lev)[, 1] - conf.upper <- confint(lm.out, level = conf.lev)[, 2] + conf.lower <- confint(lm.out, level = 1 - alpha)[, 1] + conf.upper <- confint(lm.out, level = 1 - alpha)[, 2] } - if (pval) { + if (pval | sign) { f <- summary(lm.out)$fstatistic - p.val <- pf(f[1], f[2], f[3],lower.tail = F) + p.val <- pf(f[1], f[2], f[3], lower.tail = F) + if (sign) signif <- !is.na(p.val) & p.val <= alpha } filtered[!is.na(NApos)] <- y[!is.na(NApos)] - lm.out$fitted.values @@ -228,25 +233,17 @@ Regression <- function(datay, datax, reg_dim = 'sdate', formula = y ~ x, conf.lower[which(!is.na(conf.lower))] <- NA conf.upper[which(!is.na(conf.upper))] <- NA } - if (pval) { - p.val[which(!is.na(p.val))] <- NA - } + if (pval) p.val[which(!is.na(p.val))] <- NA + if (sign) signif[which(!is.na(signif))] <- NA filtered[which(!is.na(filtered))] <- NA } } - if (conf & pval) { - return(list(regression = coeff, conf.lower = conf.lower, conf.upper = conf.upper, - p.val = p.val, filtered = filtered)) - } else if (conf & !pval) { - return(list(regression = coeff, conf.lower = conf.lower, conf.upper = conf.upper, - filtered = filtered)) - } else if (!conf & pval) { - return(list(regression = coeff, - p.val = p.val, filtered = filtered)) - } else if (!conf & !pval) { - return(list(regression = coeff, filtered = filtered)) - } + res <- list(regression = coeff, filtered = filtered) + if (conf) res <- c(res, list(conf.lower = conf.lower, conf.upper = conf.upper)) + if (pval) res <- c(res, list(p.val = p.val)) + if (sign) res <- c(res, list(sign = signif)) + return(res) } diff --git a/R/ResidualCorr.R b/R/ResidualCorr.R index 6f03ecee1cbf31fa30aa010aeda20aaff7019313..18ca539fb9d0ed540af062999d61fac2b942c3b6 100644 --- a/R/ResidualCorr.R +++ b/R/ResidualCorr.R @@ -37,14 +37,18 @@ #' computed ("pearson", "kendall", or "spearman"). The default value is #' "pearson". #'@param alpha A numeric of the significance level to be used in the statistical -#' significance test. If it is a numeric, "sign" will be returned. If NULL, the -#' p-value will be returned instead. The default value is NULL. +#' significance test (output "sign"). The default value is 0.05. #'@param handle.na A charcater string indicating how to handle missing values. #' If "return.na", NAs will be returned for the cases that contain at least one #' NA in "exp", "ref", or "obs". If "only.complete.triplets", only the time #' steps with no missing values in all "exp", "ref", and "obs" will be used. If #' "na.fail", an error will arise if any of "exp", "ref", or "obs" contains any #' NA. The default value is "return.na". +#'@param pval A logical value indicating whether to return the p-value of the +#' significance test Ho: DiffCorr = 0. The default value is TRUE. +#'@param sign A logical value indicating whether to return the statistical +#' significance of the test Ho: DiffCorr = 0 based on 'alpha'. The default +#' value is FALSE. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -56,12 +60,12 @@ #'\item{$sign}{ #' A logical array indicating whether the residual correlation is statistically #' significant or not with the same dimensions as the input arrays except "time_dim" -#' (and "memb_dim" if provided). Returned only if "alpha" is a numeric. +#' (and "memb_dim" if provided). Returned only if "sign" is TRUE. #'} #'\item{$p.val}{ #' A numeric array of the p-values with the same dimensions as the input arrays -#' except "time_dim" (and "memb_dim" if provided). Returned only if "alpha" is -#' NULL. +#' except "time_dim" (and "memb_dim" if provided). Returned only if "pval" is +#' TRUE. #'} #' #'@examples @@ -73,8 +77,8 @@ #'@import multiApply #'@export ResidualCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', - memb_dim = NULL, method = 'pearson', alpha = NULL, - handle.na = 'return.na', ncores = NULL) { + memb_dim = NULL, method = 'pearson', alpha = 0.05, + handle.na = 'return.na', pval = TRUE, sign = FALSE, ncores = NULL) { # Check inputs ## exp, ref, and obs (1) @@ -132,16 +136,21 @@ ResidualCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', stop('Parameter "method" must be "pearson", "kendall", or "spearman".') } ## alpha - if (!is.null(alpha)) { - if (any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | - length(alpha) > 1)) { - stop('Parameter "alpha" must be NULL or a number between 0 and 1.') - } + if (sign & any(!is.numeric(alpha) | alpha <= 0 | alpha >= 1 | length(alpha) > 1)) { + stop('Parameter "alpha" must be a number between 0 and 1.') } ## handle.na if (!handle.na %in% c('return.na', 'only.complete.triplets', 'na.fail')) { stop('Parameter "handle.na" must be "return.na", "only.complete.triplets" or "na.fail".') } + ## pval + if (!is.logical(pval) | length(pval) > 1) { + stop("Parameter 'pval' must be one logical value.") + } + ## sign + if (!is.logical(sign) | length(sign) > 1) { + stop("Parameter 'sign' must be one logical value.") + } ## ncores if (!is.null(ncores)) { if (any(!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | @@ -169,14 +178,15 @@ ResidualCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', if (is.null(dim(exp))) exp <- array(exp, dim = c(dim_exp[time_dim])) if (is.null(dim(ref))) ref <- array(ref, dim = c(dim_ref[time_dim])) } - + # output_dims - if (is.null(alpha)) { - output_dims <- list(res.corr = NULL, p.val = NULL) - } else { - output_dims <- list(res.corr = NULL, sign = NULL) + output_dims <- list(res.corr = NULL) + if (pval) { + output_dims <- c(output_dims, list(p.val = NULL)) } - + if (sign) { + output_dims <- c(output_dims, list(sign = NULL)) + } # Residual correlation if (is.array(N.eff)) { @@ -186,23 +196,26 @@ ResidualCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', ref = time_dim, N.eff = NULL), output_dims = output_dims, fun = .ResidualCorr, method = method, - alpha = alpha, handle.na = handle.na, ncores = ncores) + alpha = alpha, handle.na = handle.na, pval = pval, sign = sign, + ncores = ncores) } else { output <- Apply(data = list(exp = exp, obs = obs, ref = ref), target_dims = list(exp = time_dim, obs = time_dim, ref = time_dim), output_dims = output_dims, N.eff = N.eff, fun = .ResidualCorr, method = method, - alpha = alpha, handle.na = handle.na, ncores = ncores) + alpha = alpha, handle.na = handle.na, pval = pval, sign = sign, + ncores = ncores) } return(output) } -.ResidualCorr <- function(exp, obs, ref, N.eff = NA, method = 'pearson', alpha = NULL, - handle.na = 'return.na') { +.ResidualCorr <- function(exp, obs, ref, N.eff = NA, method = 'pearson', alpha = 0.05, + handle.na = 'return.na', pval = TRUE, sign = FALSE) { # exp and ref and obs: [time] - .residual.corr <- function(exp, obs, ref, method, N.eff, alpha) { + .residual.corr <- function(exp, obs, ref, N.eff = NA, method = 'pearson', alpha = 0.05, + pval = TRUE, sign = FALSE) { # Residuals of 'exp' and 'obs' (regressing 'ref' out in both 'exp' and 'obs') exp_res <- lm(formula = y ~ x, data = list(y = exp, x = ref), na.action = NULL)$residuals @@ -218,9 +231,13 @@ ResidualCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', } t <- abs(output$res.corr) * sqrt(N.eff - 2) / sqrt(1 - output$res.corr^2) - if (is.null(alpha)) { # p-value - output$p.val <- pt(q = t, df = N.eff - 2, lower.tail = FALSE) - } else { + if (pval | sign) { # p-value + p.value <- pt(q = t, df = N.eff - 2, lower.tail = FALSE) + } + if (pval) { + output$p.val <- p.value + } + if (sign) { t_alpha2_n2 <- qt(p = alpha / 2, df = N.eff - 2, lower.tail = FALSE) if (!anyNA(c(t, t_alpha2_n2)) & t >= t_alpha2_n2) { output$sign <- TRUE @@ -245,20 +262,22 @@ ResidualCorr <- function(exp, obs, ref, N.eff = NA, time_dim = 'sdate', ref <- ref[!nna] output <- .residual.corr(exp = exp, obs = obs, ref = ref, method = method, - N.eff = N.eff, alpha = alpha) + N.eff = N.eff, alpha = alpha, pval = pval, sign = sign) } else if (handle.na == 'return.na') { - # Data contain NA, return NAs directly without passing to .diff.corr - if (is.null(alpha)) { - output <- list(res.corr = NA, p.val = NA) - } else { - output <- list(res.corr = NA, sign = NA) + # Data contain NA, return NAs directly without passing to .residual.corr + output <- list(res.corr = NA) + if (pval) { + output <- c(output, list(p.val = NA)) + } + if (sign) { + output <- c(output, list(sign = NA)) } } } else { ## There is no NA output <- .residual.corr(exp = exp, obs = obs, ref = ref, method = method, - N.eff = N.eff, alpha = alpha) + N.eff = N.eff, alpha = alpha, pval = pval, sign = sign) } return(output) diff --git a/R/Spectrum.R b/R/Spectrum.R index 2cbb16793d42fb398bc45ca7173fcd0f09283567..a75ead6d340c4a555e42619711889f3e50c59145 100644 --- a/R/Spectrum.R +++ b/R/Spectrum.R @@ -15,8 +15,8 @@ #' evenly spaced in time. #'@param time_dim A character string indicating the dimension along which to #' compute the frequency spectrum. The default value is 'ftime'. -#'@param conf.lev A numeric indicating the confidence level for the Monte-Carlo -#' significance test. The default value is 0.95. +#'@param alpha A numeric indicating the significance level for the Monte-Carlo +#' significance test. The default value is 0.05. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -45,7 +45,7 @@ #'@import multiApply #'@importFrom stats spectrum cor rnorm sd quantile #'@export -Spectrum <- function(data, time_dim = 'ftime', conf.lev = 0.95, ncores = NULL) { +Spectrum <- function(data, time_dim = 'ftime', alpha = 0.05, ncores = NULL) { # Check inputs ## data @@ -69,9 +69,9 @@ Spectrum <- function(data, time_dim = 'ftime', conf.lev = 0.95, ncores = NULL) { if (!time_dim %in% names(dim(data))) { stop("Parameter 'time_dim' is not found in 'data' dimension.") } - ## conf.lev - if (!is.numeric(conf.lev) | conf.lev < 0 | conf.lev > 1 | length(conf.lev) > 1) { - stop("Parameter 'conf.lev' must be a numeric number between 0 and 1.") + ## alpha + if (!is.numeric(alpha) | alpha < 0 | alpha > 1 | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") } ## ncores if (!is.null(ncores)) { @@ -88,13 +88,13 @@ Spectrum <- function(data, time_dim = 'ftime', conf.lev = 0.95, ncores = NULL) { target_dims = time_dim, fun = .Spectrum, output_dims = c(time_dim, 'stats'), - conf.lev = conf.lev, + alpha = alpha, ncores = ncores)$output1 return(output) } -.Spectrum <- function(data, conf.lev = 0.95) { +.Spectrum <- function(data, alpha = 0.05) { # data: [time] data <- data[is.na(data) == FALSE] @@ -119,7 +119,7 @@ Spectrum <- function(data, time_dim = 'ftime', conf.lev = 0.95, ncores = NULL) { store[jt, ] <- toto2$spec } for (jx in 1:length(tmp$spec)) { - output[jx, 3] <- quantile(store[, jx], conf.lev) + output[jx, 3] <- quantile(store[, jx], 1 - alpha) } } else { output <- NA diff --git a/R/Spread.R b/R/Spread.R index d1d8f6d159bceab7781f9d131ad47589381ed708..5fba8cab793ba4eccb6c05801a15a1c3535e47fc 100644 --- a/R/Spread.R +++ b/R/Spread.R @@ -16,8 +16,8 @@ #' kept (FALSE) for computation. The default value is TRUE. #'@param conf A logical value indicating whether to compute the confidence #' intervals or not. The default value is TRUE. -#'@param conf.lev A numeric value of the confidence level for the computation. -#' The default value is 0.95. +#'@param alpha A numeric of the significance level to be used in the +#' statistical significance test. The default value is 0.05. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -52,7 +52,9 @@ #' posdim = 3, #' lendim = dim(smooth_ano_exp)['member'], #' name = 'member') +#'suppressWarnings({ #'spread <- Spread(smooth_ano_exp_m_sub, compute_dim = c('member', 'sdate')) +#'}) #' #'\dontrun{ #'PlotVsLTime(Reorder(spread$iqr, c('dataset', 'stats', 'ftime')), @@ -81,7 +83,7 @@ #'@importFrom stats IQR sd mad runif quantile #'@export Spread <- function(data, compute_dim = 'member', na.rm = TRUE, - conf = TRUE, conf.lev = 0.95, ncores = NULL) { + conf = TRUE, alpha = 0.05, ncores = NULL) { # Check inputs ## data @@ -113,9 +115,9 @@ Spread <- function(data, compute_dim = 'member', na.rm = TRUE, if (!is.logical(conf) | length(conf) > 1) { stop("Parameter 'conf' must be one logical value.") } - ## conf.lev - if (!is.numeric(conf.lev) | any(conf.lev < 0) | any(conf.lev > 1) | length(conf.lev) > 1) { - stop("Parameter 'conf.lev' must be a numeric number between 0 and 1.") + ## alpha + if (any(!is.numeric(alpha) | alpha < 0 | alpha > 1) | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") } ## ncores if (!is.null(ncores)) { @@ -134,14 +136,14 @@ Spread <- function(data, compute_dim = 'member', na.rm = TRUE, output_dims = list(iqr = 'stats', maxmin = 'stats', sd = 'stats', mad = 'stats'), na.rm = na.rm, - conf = conf, conf.lev = conf.lev, + conf = conf, alpha = alpha, ncores = ncores) return(output) } .Spread <- function(data, compute_dim = 'member', na.rm = TRUE, - conf = TRUE, conf.lev = 0.95) { + conf = TRUE, alpha = 0.05) { # data: compute_dim. [member] or [member, sdate] for example @@ -159,24 +161,24 @@ Spread <- function(data, compute_dim = 'member', na.rm = TRUE, res_sd <- rep(res_sd, 3) res_mad <- rep(res_mad, 3) - conf_low <- (1 - conf.lev) / 2 + conf_low <- alpha / 2 conf_high <- 1 - conf_low # Create vector for saving bootstrap result - iqr_bs <- c() - maxmin_bs <- c() - sd_bs <- c() - mad_bs <- c() + iqr_bs <- rep(NA, 100) + maxmin_bs <- rep(NA, 100) + sd_bs <- rep(NA, 100) + mad_bs <- rep(NA, 100) # bootstrapping for 100 times num <- length(data) for (jmix in 1:100) { drawings <- round(runif(num, 0.5, num + 0.5)) - iqr_bs <- c(iqr_bs, IQR(data[drawings], na.rm = na.rm)) - maxmin_bs <- c(maxmin_bs, max(data[drawings], na.rm = na.rm) - - min(data[drawings], na.rm = na.rm)) - sd_bs <- c(sd_bs, sd(data[drawings], na.rm = na.rm)) - mad_bs <- c(mad_bs, mad(data[drawings], na.rm = na.rm)) + iqr_bs[jmix] <- IQR(data[drawings], na.rm = na.rm) + maxmin_bs[jmix] <- max(data[drawings], na.rm = na.rm) - + min(data[drawings], na.rm = na.rm) + sd_bs[jmix] <- sd(data[drawings], na.rm = na.rm) + mad_bs[jmix] <- mad(data[drawings], na.rm = na.rm) } # Calculate confidence interval with the bootstrapping results diff --git a/R/StatSeasAtlHurr.R b/R/StatSeasAtlHurr.R index 9d0ec3865ace345c0f17407ee6f7294a081cd9b0..764215a4fe2081d4a2be80d184c4dcfe29063f7d 100644 --- a/R/StatSeasAtlHurr.R +++ b/R/StatSeasAtlHurr.R @@ -7,7 +7,7 @@ #'large (bounded by 30N-30S). The anomalies are for the JJASON season.\cr #'The estimated seasonal average is either 1) number of hurricanes, 2) number #'of tropical cyclones with lifetime >=48h or 3) power dissipation index -#'(PDI; in 10^11 m^3 s^{-2}).\cr +#'(PDI; in 10^11 m^3 s^(-2)).\cr #'The statistical models used in this function are described in references. #' #'@param atlano A numeric array with named dimensions of Atlantic sea surface diff --git a/R/Trend.R b/R/Trend.R index d709101a681eec1bab3104081bb184356d2fd0ba..e10fe19901a581d92c2c05bde61c9f350395c3df 100644 --- a/R/Trend.R +++ b/R/Trend.R @@ -15,12 +15,14 @@ #' points along 'time_dim' dimension. The default value is 1. #'@param polydeg A positive integer indicating the degree of polynomial #' regression. The default value is 1. +#'@param alpha A numeric indicating the significance level for the statistical +#' significance test. The default value is 0.05. #'@param conf A logical value indicating whether to retrieve the confidence #' intervals or not. The default value is TRUE. -#'@param conf.lev A numeric indicating the confidence level for the -#' regression computation. The default value is 0.95. #'@param pval A logical value indicating whether to compute the p-value or not. #' The default value is TRUE. +#'@param sign A logical value indicating whether to retrieve the statistical +#' significance based on 'alpha'. The default value is FALSE. #'@param ncores An integer indicating the number of cores to use for parallel #' computation. The default value is NULL. #' @@ -37,7 +39,7 @@ #' A numeric array with the first dimension 'stats', followed by the same #' dimensions as parameter 'data' except the 'time_dim' dimension. The length #' of the 'stats' dimension should be \code{polydeg + 1}, containing the -#' lower limit of the \code{conf.lev}\% confidence interval for all the +#' lower limit of the \code{(1-alpha)}\% confidence interval for all the #' regression coefficients with the same order as \code{$trend}. Only present #' \code{conf = TRUE}. #'} @@ -45,7 +47,7 @@ #' A numeric array with the first dimension 'stats', followed by the same #' dimensions as parameter 'data' except the 'time_dim' dimension. The length #' of the 'stats' dimension should be \code{polydeg + 1}, containing the -#' upper limit of the \code{conf.lev}\% confidence interval for all the +#' upper limit of the \code{(1-alpha)}\% confidence interval for all the #' regression coefficients with the same order as \code{$trend}. Only present #' \code{conf = TRUE}. #'} @@ -54,6 +56,9 @@ #' 'stats' is 1, followed by the same dimensions as parameter 'data' except #' the 'time_dim' dimension. Only present if \code{pval = TRUE}. #'} +#'\item{$sign}{ +#' The statistical significance. Only present if \code{sign = TRUE}. +#'} #'\item{$detrended}{ #' A numeric array with the same dimensions as paramter 'data', containing the #' detrended values along the 'time_dim' dimension. @@ -69,8 +74,8 @@ #'@import multiApply #'@importFrom stats anova #'@export -Trend <- function(data, time_dim = 'ftime', interval = 1, polydeg = 1, - conf = TRUE, conf.lev = 0.95, pval = TRUE, ncores = NULL) { +Trend <- function(data, time_dim = 'ftime', interval = 1, polydeg = 1, alpha = 0.05, + conf = TRUE, pval = TRUE, sign = FALSE, ncores = NULL) { # Check inputs ## data @@ -103,18 +108,22 @@ Trend <- function(data, time_dim = 'ftime', interval = 1, polydeg = 1, length(polydeg) > 1) { stop("Parameter 'polydeg' must be a positive integer.") } + ## alpha + if (!is.numeric(alpha) | any(alpha < 0) | any(alpha > 1) | length(alpha) > 1) { + stop("Parameter 'alpha' must be a numeric number between 0 and 1.") + } ## conf if (!is.logical(conf) | length(conf) > 1) { stop("Parameter 'conf' must be one logical value.") } - ## conf.lev - if (!is.numeric(conf.lev) | conf.lev < 0 | conf.lev > 1 | length(conf.lev) > 1) { - stop("Parameter 'conf.lev' must be a numeric number between 0 and 1.") - } ## pval if (!is.logical(pval) | length(pval) > 1) { stop("Parameter 'pval' must be one logical value.") } + ## sign + if (!is.logical(sign) | length(sign) > 1) { + stop("Parameter 'sign' must be one logical value.") + } ## ncores if (!is.null(ncores)) { if (!is.numeric(ncores)) { @@ -126,32 +135,29 @@ Trend <- function(data, time_dim = 'ftime', interval = 1, polydeg = 1, ############################### # Calculate Trend - if (conf & pval) { - output_dims <- list(trend = 'stats', conf.lower = 'stats', - conf.upper = 'stats', p.val = 'stats', detrended = time_dim) - } else if (conf & !pval) { - output_dims <- list(trend = 'stats', conf.lower = 'stats', - conf.upper = 'stats', detrended = time_dim) - } else if (!conf & pval) { - output_dims <- list(trend = 'stats', p.val = 'stats', detrended = time_dim) - } else { - output_dims <- list(trend = 'stats', detrended = time_dim) - } + + ## output_dims + output_dims <- list(trend = 'stats') + if (conf) output_dims <- c(output_dims, list(conf.lower = 'stats', conf.upper = 'stats')) + if (pval) output_dims <- c(output_dims, list(p.val = 'stats')) + if (sign) output_dims <- c(output_dims, list(sign = 'stats')) + + output_dims <- c(output_dims, list(detrended = time_dim)) output <- Apply(list(data), target_dims = time_dim, fun = .Trend, output_dims = output_dims, interval = interval, - polydeg = polydeg, conf = conf, - conf.lev = conf.lev, pval = pval, + polydeg = polydeg, alpha = alpha, conf = conf, + pval = pval, sign = sign, ncores = ncores) return(invisible(output)) } -.Trend <- function(x, interval = 1, polydeg = 1, - conf = TRUE, conf.lev = 0.95, pval = TRUE) { +.Trend <- function(x, interval = 1, polydeg = 1, alpha = 0.05, + conf = TRUE, pval = TRUE, sign = FALSE) { # x: [ftime] mon <- seq(x) * interval @@ -168,12 +174,14 @@ Trend <- function(data, time_dim = 'ftime', interval = 1, polydeg = 1, trend <- lm.out$coefficients #intercept, slope1, slope2,... if (conf) { - conf.lower <- confint(lm.out, level = conf.lev)[, 1] - conf.upper <- confint(lm.out, level = conf.lev)[, 2] + conf.lower <- confint(lm.out, level = (1 - alpha))[, 1] + conf.upper <- confint(lm.out, level = (1 - alpha))[, 2] } - if (pval) { - p.val <- as.array(stats::anova(lm.out)$'Pr(>F)'[1]) + if (pval | sign) { + p.value <- as.array(stats::anova(lm.out)$'Pr(>F)'[1]) + if (pval) p.val <- p.value + if (sign) signif <- !is.na(p.value) & p.value <= alpha } detrended <- c() @@ -189,21 +197,16 @@ Trend <- function(data, time_dim = 'ftime', interval = 1, polydeg = 1, conf.upper <- rep(NA, polydeg + 1) } - if (pval) { - p.val <- as.array(NA) - } + if (pval) p.val <- as.array(NA) + if (sign) signif <- as.array(FALSE) } - if (conf & pval) { - return(list(trend = trend, conf.lower = conf.lower, conf.upper = conf.upper, - p.val = p.val, detrended = detrended)) - } else if (conf & !pval) { - return(list(trend = trend, conf.lower = conf.lower, conf.upper = conf.upper, - detrended = detrended)) - } else if (!conf & pval) { - return(list(trend = trend, p.val = p.val, detrended = detrended)) - } else { - return(list(trend = trend, detrended = detrended)) - } + output <- list(trend = trend) + if (conf) output <- c(output, list(conf.lower = conf.lower, conf.upper = conf.upper)) + if (pval) output <- c(output, list(p.val = p.val)) + if (sign) output <- c(output, list(sign = signif)) + output <- c(output, list(detrended = detrended)) + + return(output) } diff --git a/R/UltimateBrier.R b/R/UltimateBrier.R index d2c4ac936016ef4f663f6005877b6a3220a06e88..44498a365633d16d612869768b567e63c6a2eb18 100644 --- a/R/UltimateBrier.R +++ b/R/UltimateBrier.R @@ -12,7 +12,7 @@ #' 'memb_dim', the length must be 1. The dimensions should be consistent with #' 'exp' except 'dat_dim' and 'memb_dim'. It can be provided by \code{Ano()}. #'@param dat_dim A character string indicating the name of the dataset -#' dimension in 'exp' and 'obs'. The default value is 'dataset'. If there is no dataset +#' dimension in 'exp' and 'obs'. The default value is NULL (no dataset). #' dimension, set NULL. #'@param memb_dim A character string indicating the name of the member #' dimension in 'exp' (and 'obs') for ensemble mean calculation. The default @@ -81,12 +81,12 @@ #'clim <- Clim(sampleData$mod, sampleData$obs) #'exp <- Ano(sampleData$mod, clim$clim_exp) #'obs <- Ano(sampleData$obs, clim$clim_obs) -#'bs <- UltimateBrier(exp, obs) -#'bss <- UltimateBrier(exp, obs, type = 'BSS') +#'bs <- UltimateBrier(exp, obs, dat_dim = 'dataset') +#'bss <- UltimateBrier(exp, obs, type = 'BSS', dat_dim = 'dataset') #' #'@import SpecsVerification plyr multiApply #'@export -UltimateBrier <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', time_dim = 'sdate', +UltimateBrier <- function(exp, obs, dat_dim = NULL, memb_dim = 'member', time_dim = 'sdate', quantile = TRUE, thr = c(5/100, 95/100), type = 'BS', decomposition = TRUE, ncores = NULL) { @@ -223,7 +223,7 @@ UltimateBrier <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', ti return(res) } -.UltimateBrier <- function(exp, obs, dat_dim = 'dataset', memb_dim = 'member', thr = c(5/100, 95/100), +.UltimateBrier <- function(exp, obs, dat_dim = NULL, memb_dim = 'member', thr = c(5/100, 95/100), type = 'BS', decomposition = TRUE) { # If exp and obs are probablistics # exp: [sdate, nexp] diff --git a/R/Utils.R b/R/Utils.R index 8770af99f12205401ee8d5727f3e44a4e1f987d8..362bdf8ff412601fd19d3ac14b7304c9f457a4a3 100644 --- a/R/Utils.R +++ b/R/Utils.R @@ -100,44 +100,7 @@ } position } - .t2nlatlon <- function(t) { - ## As seen in cdo's griddes.c: ntr2nlat() - nlats <- (t * 3 + 1) / 2 - if ((nlats > 0) && (nlats - trunc(nlats) >= 0.5)) { - nlats <- ceiling(nlats) - } else { - nlats <- round(nlats) - } - if (nlats %% 2 > 0) { - nlats <- nlats + 1 - } - ## As seen in cdo's griddes.c: compNlon(), and as specified in ECMWF - nlons <- 2 * nlats - keep_going <- TRUE - while (keep_going) { - n <- nlons - if (n %% 8 == 0) n <- trunc(n / 8) - while (n %% 6 == 0) n <- trunc(n / 6) - while (n %% 5 == 0) n <- trunc(n / 5) - while (n %% 4 == 0) n <- trunc(n / 4) - while (n %% 3 == 0) n <- trunc(n / 3) - if (n %% 2 == 0) n <- trunc(n / 2) - if (n <= 8) { - keep_going <- FALSE - } else { - nlons <- nlons + 2 - if (nlons > 9999) { - stop("Error: pick another gaussian grid truncation. It doesn't fulfill the standards to apply FFT.") - } - } - } - c(nlats, nlons) - } - .nlat2t <- function(nlats) { - trunc((nlats * 2 - 1) / 3) - } - found_file <- NULL dims <- NULL grid_name <- units <- var_long_name <- NULL @@ -291,6 +254,10 @@ } else { grid_name <- paste0('t', .nlat2t(grid_lats), 'grid') } + if (is.null(work_piece[['grid']])) { + .warning(paste0("Detect the grid type to be '", grid_name, "'. ", + "If it is not expected, assign parameter 'grid' to avoid wrong result.")) + } } # If a common grid is requested, we will also calculate its size which we will use # later on. @@ -1809,89 +1776,3 @@ GradientCatsColorBar <- function(nmap, brks = NULL, cols = NULL, vertical = TRUE } -.get_probs <- function(data, indices_for_quantiles, prob_thresholds, weights = NULL, cross.val = FALSE) { - # if exp: [sdate, memb] - # if obs: [sdate, (memb)] - - # Add dim [memb = 1] to obs if it doesn't have memb_dim - if (length(dim(data)) == 1) dim(data) <- c(dim(data), 1) - - # Absolute thresholds - if (cross.val) { - quantiles <- array(NA, dim = c(bin = length(prob_thresholds), sdate = dim(data)[1])) - for (i in 1:dim(data)[1]) { - if (is.null(weights)) { - quantiles[,i] <- quantile(x = as.vector(data[indices_for_quantiles[which(indices_for_quantiles != i)], ]), - probs = prob_thresholds, type = 8, na.rm = TRUE) - } else { - # weights: [sdate, memb] - sorted_arrays <- .sorted_distributions(data[indices_for_quantiles[which(indices_for_quantiles != i)], ], - weights[indices_for_quantiles[which(indices_for_quantiles != i)], ]) - sorted_data <- sorted_arrays$data - cumulative_weights <- sorted_arrays$cumulative_weights - quantiles[,i] <- approx(cumulative_weights, sorted_data, prob_thresholds, "linear")$y - } - } - } else { - if (is.null(weights)) { - quantiles <- quantile(x = as.vector(data[indices_for_quantiles, ]), probs = prob_thresholds, type = 8, na.rm = TRUE) - } else { - # weights: [sdate, memb] - sorted_arrays <- .sorted_distributions(data[indices_for_quantiles, ], weights[indices_for_quantiles, ]) - sorted_data <- sorted_arrays$data - cumulative_weights <- sorted_arrays$cumulative_weights - quantiles <- approx(cumulative_weights, sorted_data, prob_thresholds, "linear")$y - } - quantiles <- array(rep(quantiles, dim(data)[1]),dim = c(bin = length(quantiles), dim(data)[1])) - } - - # quantiles: [bin-1, sdate] - # Probabilities - probs <- array(dim = c(dim(quantiles)[1] + 1, dim(data)[1])) # [bin, sdate] - for (i_time in 1:dim(data)[1]) { - if (anyNA(data[i_time, ])) { - probs[, i_time] <- rep(NA, dim = dim(quantiles)[1] + 1) - } else { - if (is.null(weights)) { - probs[, i_time] <- colMeans(easyVerification::convert2prob(data[i_time, ], - threshold = quantiles[,i_time])) - } else { - sorted_arrays <- .sorted_distributions(data[i_time, ], weights[i_time, ]) - sorted_data <- sorted_arrays$data - cumulative_weights <- sorted_arrays$cumulative_weights - # find any quantiles that are outside the data range - integrated_probs <- array(dim = dim(quantiles)) - for (i_quant in 1:dim(quantiles)[1]) { - # for thresholds falling under the distribution - if (quantiles[i_quant, i_time] < min(sorted_data)) { - integrated_probs[i_quant, i_time] <- 0 - # for thresholds falling over the distribution - } else if (max(sorted_data) < quantiles[i_quant, i_time]) { - integrated_probs[i_quant, i_time] <- 1 - } else { - integrated_probs[i_quant, i_time] <- approx(sorted_data, cumulative_weights, quantiles[i_quant, i_time], - "linear")$y - } - } - probs[, i_time] <- append(integrated_probs[,i_time], 1) - append(0, integrated_probs[,i_time]) - if (min(probs[, i_time]) < 0 | max(probs[, i_time]) > 1) { - stop(paste0("Probability in i_time = ", i_time, " is out of [0, 1].")) - } - } - } - } - return(probs) -} - -.sorted_distributions <- function(data_vector, weights_vector) { - weights_vector <- as.vector(weights_vector) - data_vector <- as.vector(data_vector) - weights_vector <- weights_vector / sum(weights_vector) # normalize to 1 - sorter <- order(data_vector) - sorted_weights <- weights_vector[sorter] - cumulative_weights <- cumsum(sorted_weights) - 0.5 * sorted_weights - cumulative_weights <- cumulative_weights - cumulative_weights[1] # fix the 0 - cumulative_weights <- cumulative_weights / cumulative_weights[length(cumulative_weights)] # fix the 1 - return(list("data" = data_vector[sorter], "cumulative_weights" = cumulative_weights)) -} - diff --git a/man/ACC.Rd b/man/ACC.Rd index 7df6abb21bbcade409685f8b4798edbcc472bdd6..e1a8fb2e0d566e388c3b2afe210b9f95ea6c2f8f 100644 --- a/man/ACC.Rd +++ b/man/ACC.Rd @@ -7,19 +7,19 @@ ACC( exp, obs, - dat_dim = "dataset", + dat_dim = NULL, lat_dim = "lat", lon_dim = "lon", - space_dim = c("lat", "lon"), avg_dim = "sdate", memb_dim = "member", lat = NULL, lon = NULL, lonlatbox = NULL, + alpha = 0.05, + pval = TRUE, + sign = FALSE, conf = TRUE, conftype = "parametric", - conf.lev = 0.95, - pval = TRUE, ncores = NULL ) } @@ -32,8 +32,7 @@ The dimension should be the same as 'exp' except the length of 'dat_dim' and 'memb_dim'.} \item{dat_dim}{A character string indicating the name of dataset (nobs/nexp) -dimension. The default value is 'dataset'. If there is no dataset -dimension, set NULL.} +dimension. The default value is NULL (no dataset).} \item{lat_dim}{A character string indicating the name of the latitude dimension of 'exp' and 'obs' along which ACC is computed. The default value @@ -43,11 +42,6 @@ is 'lat'.} dimension of 'exp' and 'obs' along which ACC is computed. The default value is 'lon'.} -\item{space_dim}{A character string vector of 2 indicating the name of the -latitude and longitude dimensions (in order) along which ACC is computed. -The default value is c('lat', 'lon'). This argument has been deprecated. -Use 'lat_dim' and 'lon_dim' instead.} - \item{avg_dim}{A character string indicating the name of the dimension to be averaged, which is usually the time dimension. If no need to calculate mean ACC, set as NULL. The default value is 'sdate'.} @@ -67,6 +61,16 @@ NULL.} interested: c(lonmin, lonmax, latmin, latmax). The default value is NULL and the whole data will be used.} +\item{alpha}{A numeric indicating the significance level for the statistical +significance test. The default value is 0.05.} + +\item{pval}{A logical value indicating whether to compute the p-value or not. +The default value is TRUE.} + +\item{sign}{A logical value indicating whether to retrieve the statistical +significance of the test Ho: ACC = 0 based on 'alpha'. The default value is +FALSE.} + \item{conf}{A logical value indicating whether to retrieve the confidence intervals or not. The default value is TRUE.} @@ -80,12 +84,6 @@ make sure that your experiment and observation always have the same number of members. "bootstrap" requires 'memb_dim' has value. The default value is 'parametric'.} -\item{conf.lev}{A numeric indicating the confidence level for the -regression computation. The default value is 0.95.} - -\item{pval}{A logical value indicating whether to compute the p-value or not. -The default value is TRUE.} - \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -97,6 +95,12 @@ A list containing the numeric arrays:\cr exp), and nobs is the number of observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and nobs are omitted. } +\item{macc}{ + The mean anomaly correlation coefficient with dimensions + c(nexp, nobs, the rest of the dimension except lat_dim, lon_dim, memb_dim, and + avg_dim). Only present if 'avg_dim' is not NULL. If dat_dim is NULL, nexp + and nobs are omitted. +} \item{conf.lower (if conftype = "parametric") or acc_conf.lower (if conftype = "bootstrap")}{ The lower confidence interval of ACC with the same dimensions as ACC. Only @@ -109,13 +113,10 @@ A list containing the numeric arrays:\cr } \item{p.val}{ The p-value with the same dimensions as ACC. Only present if - \code{pval = TRUE} and code{conftype = "parametric"}. + \code{pval = TRUE} and \code{conftype = "parametric"}. } -\item{macc}{ - The mean anomaly correlation coefficient with dimensions - c(nexp, nobs, the rest of the dimension except lat_dim, lon_dim, memb_dim, and - avg_dim). Only present if 'avg_dim' is not NULL. If dat_dim is NULL, nexp - and nobs are omitted. +\item{$sign}{ + The statistical significance. Only present if \code{sign = TRUE}. } \item{macc_conf.lower}{ The lower confidence interval of MACC with the same dimensions as MACC. @@ -153,8 +154,9 @@ sampleData$obs <- Season(sampleData$obs, monini = 11, moninf = 12, monsup = 2) clim <- Clim(sampleData$mod, sampleData$obs) ano_exp <- Ano(sampleData$mod, clim$clim_exp) ano_obs <- Ano(sampleData$obs, clim$clim_obs) -acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat) -acc_bootstrap <- ACC(ano_exp, ano_obs, conftype = 'bootstrap', lat = sampleData$lat) +acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat, dat_dim = 'dataset') +acc_bootstrap <- ACC(ano_exp, ano_obs, conftype = 'bootstrap', + lat = sampleData$lat, dat_dim = 'dataset') # Combine acc results for PlotACC res <- array(c(acc$conf.lower, acc$acc, acc$conf.upper, acc$p.val), dim = c(dim(acc$acc), 4)) diff --git a/man/AbsBiasSS.Rd b/man/AbsBiasSS.Rd index 029101d06d6c3f31b69d8936ef3852b2702e2c8c..ac4ca4a040e76731a61fbe16cf577ed12e1f0429 100644 --- a/man/AbsBiasSS.Rd +++ b/man/AbsBiasSS.Rd @@ -12,6 +12,8 @@ AbsBiasSS( memb_dim = NULL, dat_dim = NULL, na.rm = FALSE, + sig_method.type = "two.sided.approx", + alpha = 0.05, ncores = NULL ) } @@ -45,6 +47,14 @@ The default value is NULL.} \item{na.rm}{A logical value indicating if NAs should be removed (TRUE) or kept (FALSE) for computation. The default value is FALSE.} +\item{sig_method.type}{A character string indicating the test type of the +significance method. Check \code{RandomWalkTest()} parameter +\code{test.type} for details. The default is 'two.sided.approx', which is +the default of \code{RandomWalkTest()}.} + +\item{alpha}{A numeric of the significance level to be used in the statistical +significance test. The default value is 0.05.} + \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -71,9 +81,9 @@ forecast, while negative values indicate that it has a lower skill. Examples of reference forecasts are the climatological forecast (average of the observations), a previous model version, or another model. It is computed as \code{AbsBiasSS = 1 - AbsBias_exp / AbsBias_ref}. The statistical significance -is obtained based on a Random Walk test at the 95% confidence level (DelSole -and Tippett, 2016). If there is more than one dataset, the result will be -computed for each pair of exp and obs data. +is obtained based on a Random Walk test at the confidence level specified +(DelSole and Tippett, 2016). If there is more than one dataset, the result +will be computed for each pair of exp and obs data. } \examples{ exp <- array(rnorm(1000), dim = c(dat = 1, lat = 3, lon = 5, member = 10, sdate = 50)) diff --git a/man/CDORemap.Rd b/man/CDORemap.Rd index 61f61011114884b2efb5ba07abc2971f10216f68..d7eee21405d3ac4822e88d67a19266b040a2f464 100644 --- a/man/CDORemap.Rd +++ b/man/CDORemap.Rd @@ -13,7 +13,8 @@ CDORemap( avoid_writes = TRUE, crop = TRUE, force_remap = FALSE, - write_dir = tempdir() + write_dir = tempdir(), + ncores = NULL ) } \arguments{ @@ -85,6 +86,10 @@ is already on the target grid.} \item{write_dir}{Path to the directory where to create the intermediate files for CDO to work. By default, the R session temporary directory is used (\code{tempdir()}).} + +\item{ncores}{An integer indicating the number of theads used for +interpolation (i.e., \code{-P} in cdo command.) The default value is NULL +and \code{-P} is not used.} } \value{ A list with the following components: diff --git a/man/CRPSS.Rd b/man/CRPSS.Rd index 31bf501ec19771f8a86703bfa83fc567daaf4b0f..b6091880d4d632f724aa71bbbb1e2f6d459c69c1 100644 --- a/man/CRPSS.Rd +++ b/man/CRPSS.Rd @@ -12,6 +12,9 @@ CRPSS( memb_dim = "member", dat_dim = NULL, Fair = FALSE, + clim.cross.val = TRUE, + sig_method.type = "two.sided.approx", + alpha = 0.05, ncores = NULL ) } @@ -27,9 +30,12 @@ and 'dat_dim'.} least time and member dimension. The dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should not have dataset dimension. If there is corresponding reference -for each experiement, the dataset dimension must have the same length as in +for each experiment, the dataset dimension must have the same length as in 'exp'. If 'ref' is NULL, the climatological forecast is used as reference -forecast. The default value is NULL.} +forecast. To build the climatological forecast, the observed values along +the whole time period are used as different members for all time steps. The +parameter 'clim.cross.val' controls whether to build it using +cross-validation. The default value is NULL.} \item{time_dim}{A character string indicating the name of the time dimension. The default value is 'sdate'.} @@ -46,6 +52,19 @@ The default value is NULL.} potential CRPSS that the forecast would have with an infinite ensemble size). The default value is FALSE.} +\item{clim.cross.val}{A logical indicating whether to build the climatological +forecast in cross-validation (i.e. excluding the observed value of the time +step when building the probabilistic distribution function for that +particular time step). Only used if 'ref' is NULL. The default value is TRUE.} + +\item{sig_method.type}{A character string indicating the test type of the +significance method. Check \code{RandomWalkTest()} parameter +\code{test.type} for details. The default is 'two.sided.approx', which is +the default of \code{RandomWalkTest()}.} + +\item{alpha}{A numeric of the significance level to be used in the statistical +significance test. The default value is 0.05.} + \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -70,11 +89,10 @@ worsening with respect to a reference forecast. The CRPSS ranges between minus infinite and 1. If the CRPSS is positive, it indicates that the forecast has higher skill than the reference forecast, while a negative value means that it has a lower skill. Examples of reference forecasts are the climatological -forecast (same probabilities for all categories for all time steps), -persistence, a previous model version, or another model. It is computed as -CRPSS = 1 - CRPS_exp / CRPS_ref. The statistical significance is obtained -based on a Random Walk test at the 95% confidence level (DelSole and Tippett, -2016). +forecast, persistence, a previous model version, or another model. It is +computed as 'CRPSS = 1 - CRPS_exp / CRPS_ref. The statistical significance is +obtained based on a Random Walk test at the specified confidence level +(DelSole and Tippett, 2016). } \examples{ exp <- array(rnorm(1000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50)) diff --git a/man/Corr.Rd b/man/Corr.Rd index bbb1e34d9edfbc82790b675e94c52c4a763d9a46..9fc2d3117e122fc7acbd91ff37570cac53dc0d01 100644 --- a/man/Corr.Rd +++ b/man/Corr.Rd @@ -8,7 +8,7 @@ Corr( exp, obs, time_dim = "sdate", - dat_dim = "dataset", + dat_dim = NULL, comp_dim = NULL, limits = NULL, method = "pearson", @@ -18,7 +18,6 @@ Corr( conf = TRUE, sign = FALSE, alpha = 0.05, - conf.lev = NULL, ncores = NULL ) } @@ -33,8 +32,7 @@ parameter 'exp' except along 'dat_dim' and 'memb_dim'.} which the correlations are computed. The default value is 'sdate'.} \item{dat_dim}{A character string indicating the name of dataset (nobs/nexp) -dimension. The default value is 'dataset'. If there is no dataset -dimension, set NULL.} +dimension. The default value is NULL (no dataset).} \item{comp_dim}{A character string indicating the name of dimension along which obs is taken into account only if it is complete. The default value @@ -67,8 +65,6 @@ FALSE.} \item{alpha}{A numeric indicating the significance level for the statistical significance test. The default value is 0.05.} -\item{conf.lev}{Deprecated. Use alpha now instead. alpha = 1 - conf.lev.} - \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -129,13 +125,14 @@ required_complete_row <- 3 # Discard start dates which contain any NA lead-time leadtimes_per_startdate <- 60 corr <- Corr(MeanDims(smooth_ano_exp, 'member'), MeanDims(smooth_ano_obs, 'member'), - comp_dim = 'ftime', + comp_dim = 'ftime', dat_dim = 'dataset', limits = c(ceiling((runmean_months + 1) / 2), leadtimes_per_startdate - floor(runmean_months / 2))) # Case 2: Keep member dimension -corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member') +corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', dat_dim = 'dataset') # ensemble mean -corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', memb = FALSE) +corr <- Corr(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', memb = FALSE, + dat_dim = 'dataset') } diff --git a/man/DiffCorr.Rd b/man/DiffCorr.Rd index d127af817a8543ad478fad4114df93fa95ca046b..44bd52b0198922b52c08d25fabc60b7284fe468e 100644 --- a/man/DiffCorr.Rd +++ b/man/DiffCorr.Rd @@ -12,9 +12,11 @@ DiffCorr( time_dim = "sdate", memb_dim = NULL, method = "pearson", - alpha = NULL, + alpha = 0.05, handle.na = "return.na", test.type = "two-sided", + pval = TRUE, + sign = FALSE, ncores = NULL ) } @@ -47,8 +49,7 @@ directly to the function.} computed ("pearson" or "spearman"). The default value is "pearson".} \item{alpha}{A numeric of the significance level to be used in the statistical -significance test. If it is a numeric, "sign" will be returned. If NULL, the -p-value will be returned instead. The default value is NULL.} +significance test (output "sign"). The default value is 0.05.} \item{handle.na}{A charcater string indicating how to handle missing values. If "return.na", NAs will be returned for the cases that contain at least one @@ -63,6 +64,13 @@ significantly different) or "one-sided" (to assess whether the skill of "exp" is significantly higher than that of "ref") following Steiger (1980). The default value is "two-sided".} +\item{pval}{A logical value indicating whether to return the p-value of the +significance test Ho: DiffCorr = 0. The default value is TRUE.} + +\item{sign}{A logical value indicating whether to return the statistical +significance of the test Ho: DiffCorr = 0 based on 'alpha'. The default +value is FALSE.} + \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -75,12 +83,12 @@ A list with: \item{$sign}{ A logical array of the statistical significance of the correlation differences with the same dimensions as the input arrays except "time_dim" - (and "memb_dim" if provided). Returned only if "alpha" is a numeric. + (and "memb_dim" if provided). Returned only if "sign" is TRUE. } \item{$p.val}{ A numeric array of the p-values with the same dimensions as the input arrays - except "time_dim" (and "memb_dim" if provided). Returned only if "alpha" is - NULL. + except "time_dim" (and "memb_dim" if provided). Returned only if "pval" is + TRUE. } } \description{ diff --git a/man/GetProbs.Rd b/man/GetProbs.Rd new file mode 100644 index 0000000000000000000000000000000000000000..fd84d2f878ecb208baafdf9176e67dfc145070d5 --- /dev/null +++ b/man/GetProbs.Rd @@ -0,0 +1,73 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/GetProbs.R +\name{GetProbs} +\alias{GetProbs} +\title{Compute probabilistic forecasts or the corresponding observations} +\usage{ +GetProbs( + data, + time_dim = "sdate", + memb_dim = "member", + indices_for_quantiles = NULL, + prob_thresholds = c(1/3, 2/3), + weights = NULL, + cross.val = FALSE, + ncores = NULL +) +} +\arguments{ +\item{data}{A named numerical array of the forecasts or observations with, at +least, time dimension.} + +\item{time_dim}{A character string indicating the name of the time dimension. +The default value is 'sdate'.} + +\item{memb_dim}{A character string indicating the name of the member dimension +to compute the probabilities of the forecast, or NULL if there is no member +dimension (e.g., for observations, or for forecast with only one ensemble +member). The default value is 'member'.} + +\item{indices_for_quantiles}{A vector of the indices to be taken along +'time_dim' for computing the absolute thresholds between the probabilistic +categories. If NULL, the whole period is used. The default value is NULL.} + +\item{prob_thresholds}{A numeric vector of the relative thresholds (from 0 to +1) between the categories. The default value is c(1/3, 2/3), which +corresponds to tercile equiprobable categories.} + +\item{weights}{A named numerical array of the weights for 'data' with +dimensions 'time_dim' and 'memb_dim' (if 'data' has them). The default value +is NULL. The ensemble should have at least 70 members or span at least 10 +time steps and have more than 45 members if consistency between the weighted +and unweighted methodologies is desired.} + +\item{cross.val}{A logical indicating whether to compute the thresholds +between probabilistic categories in cross-validation mode. The default value +is FALSE.} + +\item{ncores}{An integer indicating the number of cores to use for parallel +computation. The default value is NULL.} +} +\value{ +A numerical array of probabilities with dimensions c(bin, the rest dimensions +of 'data' except 'memb_dim'). 'bin' dimension has the length of probabilistic +categories, i.e., \code{length(prob_thresholds) + 1}. +} +\description{ +Compute probabilistic forecasts from an ensemble based on the relative +thresholds, or the probabilistic observations (i.e., which probabilistic +category was observed). A reference period can be specified to calculate the +absolute thresholds between each probabilistic category. The absolute +thresholds can be computed in cross-validation mode. If data is an ensemble, +the probabilities are calculated as the percentage of members that fall into +each category. For observations (or forecast without member dimension), 1 +means that the event happened, while 0 indicates that the event did not +happen. Weighted probabilities can be computed if the weights are provided for +each ensemble member and time step. +} +\examples{ +data <- array(rnorm(2000), dim = c(ensemble = 25, sdate = 20, time = 4)) +res <- GetProbs(data = data, time_dim = 'sdate', memb_dim = 'ensemble', + indices_for_quantiles = 4:17) + +} diff --git a/man/InsertDim.Rd b/man/InsertDim.Rd index 51418f0ba6624687dd12466e0edfeda3cfc98168..7a866a37c2887855ea5d5885f2d23a6569c8fbf2 100644 --- a/man/InsertDim.Rd +++ b/man/InsertDim.Rd @@ -4,7 +4,7 @@ \alias{InsertDim} \title{Add a named dimension to an array} \usage{ -InsertDim(data, posdim, lendim, name = NULL, ncores = NULL) +InsertDim(data, posdim, lendim, name = NULL) } \arguments{ \item{data}{An array to which the additional dimension to be added.} @@ -15,9 +15,6 @@ InsertDim(data, posdim, lendim, name = NULL, ncores = NULL) \item{name}{A character string indicating the name for the new dimension. The default value is NULL.} - -\item{ncores}{An integer indicating the number of cores to use for parallel -computation. The default value is NULL. This parameter is deprecated now.} } \value{ An array as parameter 'data' but with the added named dimension. diff --git a/man/Load.Rd b/man/Load.Rd index 10c03f94af836a709e9af17cfc409354681d6ee9..f91c315ca645c6c2af2aca2d3dd99242835909aa 100644 --- a/man/Load.Rd +++ b/man/Load.Rd @@ -273,6 +273,10 @@ interpolating to the specified grid.\cr If not specified and the selected output type is 'lon', 'lat' or 'lonlat', this parameter takes as default value the grid of the first experimental dataset, which is read automatically from the source files.\cr +Note that the auto-detected grid type is not guarenteed to be correct, and +it won't be correct if the netCDF file doesn't contain global domain. +Please check the warning carefully to ensure the detected grid type is +expected, or assign this parameter even regridding is not needed. The grid must be supported by 'cdo' tools. Now only supported: rNXxNY or tTRgrid.\cr Both rNXxNY and tRESgrid yield rectangular regular grids. rNXxNY yields diff --git a/man/MSE.Rd b/man/MSE.Rd new file mode 100644 index 0000000000000000000000000000000000000000..cd58402766e777dddce5b27027b6cdf2e34fc798 --- /dev/null +++ b/man/MSE.Rd @@ -0,0 +1,100 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/MSE.R +\name{MSE} +\alias{MSE} +\title{Compute mean square error} +\usage{ +MSE( + exp, + obs, + time_dim = "sdate", + dat_dim = NULL, + memb_dim = NULL, + comp_dim = NULL, + limits = NULL, + conf = TRUE, + alpha = 0.05, + ncores = NULL +) +} +\arguments{ +\item{exp}{A named numeric array of experimental data, with at least #' 'time_dim' dimension. It can also be a vector with the same length as 'obs'.} + +\item{obs}{A named numeric array of observational data, same dimensions as +parameter 'exp' except along 'dat_dim' and 'memb_dim'. It can also be a +vector with the same length as 'exp'.} + +\item{time_dim}{A character string indicating the name of dimension along +which the correlations are computed. The default value is 'sdate'.} + +\item{dat_dim}{A character string indicating the name of dataset or member +(nobs/nexp) dimension. The datasets of exp and obs will be paired and +computed MSE for each pair. The default value is NULL.} + +\item{memb_dim}{A character string indicating the name of the member dimension +to compute the ensemble mean; it should be set to NULL if the input data are +already the ensemble mean. The default value is NULL.} + +\item{comp_dim}{A character string indicating the name of dimension along which +obs is taken into account only if it is complete. The default value +is NULL.} + +\item{limits}{A vector of two integers indicating the range along comp_dim to +be completed. The default value is c(1, length(comp_dim dimension)).} + +\item{conf}{A logical value indicating whether to retrieve the confidence +intervals or not. The default value is TRUE.} + +\item{alpha}{A numeric indicating the significance level for the statistical +significance test. The default value is 0.05.} + +\item{ncores}{An integer indicating the number of cores to use for parallel +computation. The default value is NULL.} +} +\value{ +A list containing the numeric arrays with dimension:\cr + c(nexp, nobs, all other dimensions of exp except time_dim).\cr +nexp is the number of experiment (i.e., dat_dim in exp), and nobs is the +number of observation (i.e., dat_dim in obs).\cr +\item{$mse}{ + The mean square error. +} +\item{$conf.lower}{ + The lower confidence interval. Only present if \code{conf = TRUE}. +} +\item{$conf.upper}{ + The upper confidence interval. Only present if \code{conf = TRUE}. +} +} +\description{ +Compute the mean square error for an array of forecasts and an array of +observations. The MSEs are computed along time_dim, the dimension which +corresponds to the start date dimension. If comp_dim is given, the MSEs are +computed only if obs along the comp_dim dimension are complete between +limits[1] and limits[2], i.e. there are no NAs between limits[1] and +limits[2]. This option can be activated if the user wants to account only +for the forecasts for which the corresponding observations are available at +all leadtimes.\cr +The confidence interval is computed by the chi2 distribution.\cr +} +\examples{ +# Load sample data as in Load() example: +example(Load) +clim <- Clim(sampleData$mod, sampleData$obs) +ano_exp <- Ano(sampleData$mod, clim$clim_exp) +ano_obs <- Ano(sampleData$obs, clim$clim_obs) +smooth_ano_exp <- Smoothing(ano_exp, runmeanlen = 12, time_dim = 'ftime') +smooth_ano_obs <- Smoothing(ano_obs, runmeanlen = 12, time_dim = 'ftime') +res <- MSE(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', + comp_dim = 'ftime', limits = c(7, 54)) + +# Synthetic data: +exp1 <- array(rnorm(120), dim = c(dat = 3, sdate = 10, ftime = 4)) +obs1 <- array(rnorm(80), dim = c(dat = 2, sdate = 10, ftime = 4)) +res1 <- MSE(exp1, obs1, comp_dim = 'ftime', dat_dim = 'dat') + +exp2 <- array(rnorm(20), dim = c(sdate = 5, member = 4)) +obs2 <- array(rnorm(10), dim = c(sdate = 5, member = 2)) +res2 <- MSE(exp2, obs2, memb_dim = 'member') + +} diff --git a/man/MSSS.Rd b/man/MSSS.Rd new file mode 100644 index 0000000000000000000000000000000000000000..33df4501aac5e5137446830419d51860471fa797 --- /dev/null +++ b/man/MSSS.Rd @@ -0,0 +1,117 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/MSSS.R +\name{MSSS} +\alias{MSSS} +\title{Compute mean square error skill score} +\usage{ +MSSS( + exp, + obs, + ref = NULL, + time_dim = "sdate", + dat_dim = NULL, + memb_dim = NULL, + pval = TRUE, + sign = FALSE, + alpha = 0.05, + sig_method = "one-sided Fisher", + sig_method.type = NULL, + ncores = NULL +) +} +\arguments{ +\item{exp}{A named numeric array of experimental data which contains at least +time dimension (time_dim). It can also be a vector with the same length as +'obs', then the vector will automatically be 'time_dim'.} + +\item{obs}{A named numeric array of observational data which contains at least +time dimension (time_dim). The dimensions should be the same as parameter +'exp' except the length of 'dat_dim' and 'memb_dim' dimension. It can also +be a vector with the same length as 'exp', then the vector will +automatically be 'time_dim'.} + +\item{ref}{A named numerical array of the reference forecast data with at +least time dimension, or 0 (typical climatological forecast) or 1 +(normalized climatological forecast). If it is an array, the dimensions must +be the same as 'exp' except 'memb_dim' and 'dat_dim'. If there is only one +reference dataset, it should not have dataset dimension. If there is +corresponding reference for each experiment, the dataset dimension must +have the same length as in 'exp'. If 'ref' is NULL, the typical +climatological forecast is used as reference forecast (equivalent to 0.) +The default value is NULL.} + +\item{time_dim}{A character string indicating the name of dimension along +which the MSSS are computed. The default value is 'sdate'.} + +\item{dat_dim}{A character string indicating the name of dataset (nobs/nexp) +dimension. The default value is NULL.} + +\item{memb_dim}{A character string indicating the name of the member dimension +to compute the ensemble mean; it should be set to NULL if the data are +already the ensemble mean. The default value is NULL.} + +\item{pval}{A logical value indicating whether to compute or not the p-value +of the test Ho: MSSS = 0. The default value is TRUE.} + +\item{sign}{A logical value indicating whether to compute or not the +statistical significance of the test Ho: MSSS = 0. The default value is +FALSE.} + +\item{alpha}{A numeric of the significance level to be used in the +statistical significance test. The default value is 0.05.} + +\item{sig_method}{A character string indicating the significance method. The +options are "one-sided Fisher" (default) and "Random Walk".} + +\item{sig_method.type}{A character string indicating the test type of the +significance method. Check \code{RandomWalkTest()} parameter +\code{test.type} for details if parameter "sig_method" is "Random Walk". The +default is NULL (since "one-sided Fisher" doesn't have different test +types.)} + +\item{ncores}{An integer indicating the number of cores to use for parallel +computation. The default value is NULL.} +} +\value{ +A list containing the numeric arrays with dimension:\cr + c(nexp, nobs, all other dimensions of exp except time_dim).\cr +nexp is the number of experiment (i.e., dat_dim in exp), and nobs is the +number of observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and +nobs are omitted.\cr +\item{$msss}{ + A numerical array of the mean square error skill score. +} +\item{$p.val}{ + A numerical array of the p-value with the same dimensions as $msss. + Only present if \code{pval = TRUE}. +} +\item{sign}{ + A logical array of the statistical significance of the MSSS with the same + dimensions as $msss. Only present if \code{sign = TRUE}. +} +} +\description{ +Compute the mean square error skill score (MSSS) between an array of forecast +'exp' and an array of observation 'obs'. The two arrays should have the same +dimensions except along 'dat_dim' and 'memb_dim'. The MSSSs are computed along +'time_dim', the dimension which corresponds to the start date dimension. +MSSS computes the mean square error skill score of each exp in 1:nexp +against each obs in 1:nobs which gives nexp * nobs MSSS for each grid point +of the array.\cr +The p-value and significance test are optionally provided by an one-sided +Fisher test or Random Walk test.\cr +} +\examples{ +# Load sample data as in Load() example: +example(Load) +clim <- Clim(sampleData$mod, sampleData$obs) +ano_exp <- Ano(sampleData$mod, clim$clim_exp) +ano_obs <- Ano(sampleData$obs, clim$clim_obs) +rmsss <- MSSS(ano_exp, ano_obs, dat_dim = 'dataset', memb_dim = 'member') + +# Synthetic data: +exp <- array(rnorm(30), dim = c(dataset = 2, time = 3, memb = 5)) +obs <- array(rnorm(15), dim = c(time = 3, dataset = 1)) +res <- MSSS(exp, obs, time_dim = 'time', dat_dim = 'dataset', memb_dim = 'memb') + +} diff --git a/man/Plot2VarsVsLTime.Rd b/man/Plot2VarsVsLTime.Rd index 46b9cd50b91f9dae2bbeab55e059d7714150e798..9eeb928263365b81e3364e405c4a2ccbfa96e72e 100644 --- a/man/Plot2VarsVsLTime.Rd +++ b/man/Plot2VarsVsLTime.Rd @@ -115,15 +115,17 @@ required_complete_row <- 'ftime' # discard startdates for which there are NA le leadtimes_per_startdate <- 60 rms <- RMS(MeanDims(smooth_ano_exp, dim_to_mean), MeanDims(smooth_ano_obs, dim_to_mean), - comp_dim = required_complete_row, + comp_dim = required_complete_row, dat_dim = 'dataset', limits = c(ceiling((runmean_months + 1) / 2), - leadtimes_per_startdate - floor(runmean_months / 2))) + leadtimes_per_startdate - floor(runmean_months / 2))) smooth_ano_exp_m_sub <- smooth_ano_exp - InsertDim(MeanDims(smooth_ano_exp, 'member', na.rm = TRUE), posdim = 3, lendim = dim(smooth_ano_exp)['member'], name = 'member') +suppressWarnings({ spread <- Spread(smooth_ano_exp_m_sub, compute_dim = c('member', 'sdate')) +}) #Combine rms outputs into one array rms_combine <- abind::abind(rms$conf.lower, rms$rms, rms$conf.upper, along = 0) rms_combine <- Reorder(rms_combine, c(2, 3, 1, 4)) diff --git a/man/PlotACC.Rd b/man/PlotACC.Rd index 2764de3487103e4408f0da7d036e6abf5452beab..9a32f8b841b32b031581510a9299e9b2ab718a2b 100644 --- a/man/PlotACC.Rd +++ b/man/PlotACC.Rd @@ -110,8 +110,9 @@ sampleData$obs <- Season(sampleData$obs, monini = 11, moninf = 12, monsup = 2) clim <- Clim(sampleData$mod, sampleData$obs) ano_exp <- Ano(sampleData$mod, clim$clim_exp) ano_obs <- Ano(sampleData$obs, clim$clim_obs) -acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat) -acc_bootstrap <- ACC(ano_exp, ano_obs, lat = sampleData$lat, conftype = 'bootstrap') +acc <- ACC(ano_exp, ano_obs, lat = sampleData$lat, dat_dim = 'dataset') +acc_bootstrap <- ACC(ano_exp, ano_obs, lat = sampleData$lat, conftype = 'bootstrap', + dat_dim = 'dataset') # Combine acc results for PlotACC res <- array(c(acc$conf.lower, acc$acc, acc$conf.upper, acc$p.val), dim = c(dim(acc$acc), 4)) diff --git a/man/PlotEquiMap.Rd b/man/PlotEquiMap.Rd index 19ff838922f4926f1fc093ed9c5a40b2185d2c26..5d3739a0349323304fcd881a804bda4a0fbf161c 100644 --- a/man/PlotEquiMap.Rd +++ b/man/PlotEquiMap.Rd @@ -404,5 +404,5 @@ sampleData <- s2dv:::.LoadSampleData('tos', c('experiment'), } PlotEquiMap(sampleData$mod[1, 1, 1, 1, , ], sampleData$lon, sampleData$lat, toptitle = 'Predicted sea surface temperature for Nov 1960 from 1st Nov', - sizetit = 0.5) + title_scale = 0.5) } diff --git a/man/PlotVsLTime.Rd b/man/PlotVsLTime.Rd index 05e2b422189793b9c359177ae3ef77ae34cc6ad4..21cfe5301919296c2fdacdb9d5b84c17acf39673 100644 --- a/man/PlotVsLTime.Rd +++ b/man/PlotVsLTime.Rd @@ -129,11 +129,12 @@ required_complete_row <- 'ftime' # discard startdates for which there are NA le leadtimes_per_startdate <- 60 corr <- Corr(MeanDims(smooth_ano_exp, dim_to_mean), MeanDims(smooth_ano_obs, dim_to_mean), - comp_dim = required_complete_row, + comp_dim = required_complete_row, dat_dim = 'dataset', limits = c(ceiling((runmean_months + 1) / 2), leadtimes_per_startdate - floor(runmean_months / 2))) # Combine corr results for plotting -corr_combine <- abind::abind(corr$conf.lower, corr$corr, corr$conf.upper, corr$p.val, along = 0) +corr_combine <- abind::abind(corr$conf.lower, corr$corr, corr$conf.upper, corr$p.val, + along = 0) corr_combine <- Reorder(corr_combine, c(2, 3, 1, 4)) \donttest{ PlotVsLTime(corr_combine, toptitle = "correlations", ytitle = "correlation", diff --git a/man/RMS.Rd b/man/RMS.Rd index 4391df47947a48b80c855d95f058d5e93034be3a..57473544cacaaa0f10629dd331dd830d2de79b38 100644 --- a/man/RMS.Rd +++ b/man/RMS.Rd @@ -8,30 +8,33 @@ RMS( exp, obs, time_dim = "sdate", - dat_dim = "dataset", + memb_dim = NULL, + dat_dim = NULL, comp_dim = NULL, limits = NULL, conf = TRUE, - conf.lev = 0.95, + alpha = 0.05, ncores = NULL ) } \arguments{ -\item{exp}{A named numeric array of experimental data, with at least two -dimensions 'time_dim' and 'dat_dim'. It can also be a vector with the -same length as 'obs', then the vector will automatically be 'time_dim' and -'dat_dim' will be 1.} +\item{exp}{A named numeric array of experimental data, with at least +'time_dim' dimension. It can also be a vector with the same length as 'obs'.} \item{obs}{A named numeric array of observational data, same dimensions as -parameter 'exp' except along dat_dim. It can also be a vector with the same -length as 'exp', then the vector will automatically be 'time_dim' and -'dat_dim' will be 1.} +parameter 'exp' except along 'dat_dim' and 'memb_dim'. It can also be a +vector with the same length as 'exp'.} \item{time_dim}{A character string indicating the name of dimension along which the correlations are computed. The default value is 'sdate'.} -\item{dat_dim}{A character string indicating the name of member (nobs/nexp) -dimension. The default value is 'dataset'.} +\item{memb_dim}{A character string indicating the name of the member dimension +to compute the ensemble mean; it should be set to NULL if the input data are +already the ensemble mean. The default value is NULL.} + +\item{dat_dim}{A character string indicating the name of dataset or member +(nobs/nexp) dimension. The datasets of exp and obs will be paired and +computed RMS for each pair. The default value is NULL.} \item{comp_dim}{A character string indicating the name of dimension along which obs is taken into account only if it is complete. The default value @@ -43,8 +46,8 @@ be completed. The default value is c(1, length(comp_dim dimension)).} \item{conf}{A logical value indicating whether to retrieve the confidence intervals or not. The default value is TRUE.} -\item{conf.lev}{A numeric indicating the confidence level for the -regression computation. The default value is 0.95.} +\item{alpha}{A numeric indicating the significance level for the statistical +significance test. The default value is 0.05.} \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} @@ -53,7 +56,8 @@ computation. The default value is NULL.} A list containing the numeric arrays with dimension:\cr c(nexp, nobs, all other dimensions of exp except time_dim).\cr nexp is the number of experiment (i.e., dat_dim in exp), and nobs is the -number of observation (i.e., dat_dim in obs).\cr +number of observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and +nobs are omitted.\cr \item{$rms}{ The root mean square error. } @@ -67,7 +71,7 @@ number of observation (i.e., dat_dim in obs).\cr \description{ Compute the root mean square error for an array of forecasts and an array of observations. The RMSEs are computed along time_dim, the dimension which -corresponds to the startdate dimension. If comp_dim is given, the RMSEs are +corresponds to the start date dimension. If comp_dim is given, the RMSEs are computed only if obs along the comp_dim dimension are complete between limits[1] and limits[2], i.e. there are no NAs between limits[1] and limits[2]. This option can be activated if the user wishes to account only @@ -77,14 +81,24 @@ The confidence interval is computed by the chi2 distribution.\cr } \examples{ # Load sample data as in Load() example: - set.seed(1) - exp1 <- array(rnorm(120), dim = c(dataset = 3, sdate = 5, ftime = 2, lon = 1, lat = 4)) - set.seed(2) - obs1 <- array(rnorm(80), dim = c(dataset = 2, sdate = 5, ftime = 2, lon = 1, lat = 4)) - set.seed(2) - na <- floor(runif(10, min = 1, max = 80)) - obs1[na] <- NA - res <- RMS(exp1, obs1, comp_dim = 'ftime') - # Renew example when Ano and Smoothing are ready +example(Load) +clim <- Clim(sampleData$mod, sampleData$obs) +ano_exp <- Ano(sampleData$mod, clim$clim_exp) +ano_obs <- Ano(sampleData$obs, clim$clim_obs) +smooth_ano_exp <- Smoothing(ano_exp, runmeanlen = 12, time_dim = 'ftime') +smooth_ano_obs <- Smoothing(ano_obs, runmeanlen = 12, time_dim = 'ftime') +res <- RMS(smooth_ano_exp, smooth_ano_obs, memb_dim = 'member', + comp_dim = 'ftime', limits = c(7, 54)) +# Synthetic data: +exp1 <- array(rnorm(120), dim = c(dat = 3, sdate = 10, ftime = 4)) +obs1 <- array(rnorm(80), dim = c(dat = 2, sdate = 10, ftime = 4)) +na <- floor(runif(10, min = 1, max = 80)) +obs1[na] <- NA +res1 <- RMS(exp1, obs1, comp_dim = 'ftime', dat_dim = 'dat') + +exp2 <- array(rnorm(20), dim = c(sdate = 5, member = 4)) +obs2 <- array(rnorm(10), dim = c(sdate = 5, member = 2)) +res2 <- RMS(exp2, obs2, memb_dim = 'member') + } diff --git a/man/RMSSS.Rd b/man/RMSSS.Rd index bcf221c351b37ffd12105716e9d1edba51b92130..7b31e26f2810fbab6ad23e9d4df0b1302e02d28e 100644 --- a/man/RMSSS.Rd +++ b/man/RMSSS.Rd @@ -9,27 +9,26 @@ RMSSS( obs, ref = NULL, time_dim = "sdate", - dat_dim = "dataset", + dat_dim = NULL, memb_dim = NULL, pval = TRUE, sign = FALSE, alpha = 0.05, sig_method = "one-sided Fisher", + sig_method.type = NULL, ncores = NULL ) } \arguments{ \item{exp}{A named numeric array of experimental data which contains at least -two dimensions for dat_dim and time_dim. It can also be a vector with the -same length as 'obs', then the vector will automatically be 'time_dim' and -'dat_dim' will be 1.} +time dimension (time_dim). It can also be a vector with the same length as +'obs', then the vector will automatically be 'time_dim'.} \item{obs}{A named numeric array of observational data which contains at least -two dimensions for dat_dim and time_dim. The dimensions should be the same -as paramter 'exp' except the length of 'dat_dim' dimension. The order of -dimension can be different. It can also be a vector with the same length as -'exp', then the vector will automatically be 'time_dim' and 'dat_dim' will -be 1.} +time dimension (time_dim). The dimensions should be the same as parameter +'exp' except the length of 'dat_dim' and 'memb_dim' dimension. It can also +be a vector with the same length as 'exp', then the vector will +automatically be 'time_dim'.} \item{ref}{A named numerical array of the reference forecast data with at least time dimension, or 0 (typical climatological forecast) or 1 @@ -38,18 +37,18 @@ be the same as 'exp' except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should not have dataset dimension. If there is corresponding reference for each experiment, the dataset dimension must have the same length as in 'exp'. If 'ref' is NULL, the typical -climatological forecast is used as reference forecast (equivelant to 0.) +climatological forecast is used as reference forecast (equivalent to 0.) The default value is NULL.} \item{time_dim}{A character string indicating the name of dimension along which the RMSSS are computed. The default value is 'sdate'.} \item{dat_dim}{A character string indicating the name of dataset (nobs/nexp) -dimension. The default value is 'dataset'.} +dimension. The default value is NULL.} \item{memb_dim}{A character string indicating the name of the member dimension -to compute the ensemble mean; it should be set to NULL if the parameter 'exp' -and 'ref' are already the ensemble mean. The default value is NULL.} +to compute the ensemble mean; it should be set to NULL if the data are +already the ensemble mean. The default value is NULL.} \item{pval}{A logical value indicating whether to compute or not the p-value of the test Ho: RMSSS = 0. The default value is TRUE.} @@ -64,6 +63,12 @@ statistical significance test. The default value is 0.05.} \item{sig_method}{A character string indicating the significance method. The options are "one-sided Fisher" (default) and "Random Walk".} +\item{sig_method.type}{A character string indicating the test type of the +significance method. Check \code{RandomWalkTest()} parameter +\code{test.type} for details if parameter "sig_method" is "Random Walk". The +default is NULL (since "one-sided Fisher" doesn't have different test +types.)} + \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -88,22 +93,35 @@ nobs are omitted.\cr \description{ Compute the root mean square error skill score (RMSSS) between an array of forecast 'exp' and an array of observation 'obs'. The two arrays should -have the same dimensions except along dat_dim, where the length can be -different, with the number of experiments/models (nexp) and the number of -observational datasets (nobs).\cr -RMSSS computes the root mean square error skill score of each jexp in 1:nexp -against each job in 1:nobs which gives nexp * nobs RMSSS for each grid point +have the same dimensions except along 'dat_dim' and 'memb_dim'. The RMSSSs +are computed along 'time_dim', the dimension which corresponds to the start +date dimension. +RMSSS computes the root mean square error skill score of each exp in 1:nexp +against each obs in 1:nobs which gives nexp * nobs RMSSS for each grid point of the array.\cr -The RMSSS are computed along the time_dim dimension which should correspond -to the start date dimension.\cr The p-value and significance test are optionally provided by an one-sided Fisher test or Random Walk test.\cr } \examples{ +# Load sample data as in Load() example: +example(Load) +clim <- Clim(sampleData$mod, sampleData$obs) +ano_exp <- Ano(sampleData$mod, clim$clim_exp) +ano_obs <- Ano(sampleData$obs, clim$clim_obs) +rmsss <- RMSSS(ano_exp, ano_obs, dat_dim = 'dataset', memb_dim = 'member') + set.seed(1) -exp <- array(rnorm(30), dim = c(dataset = 2, time = 3, memb = 5)) +exp1 <- array(rnorm(30), dim = c(dataset = 2, time = 3, memb = 5)) set.seed(2) -obs <- array(rnorm(15), dim = c(time = 3, memb = 5, dataset = 1)) -res <- RMSSS(exp, obs, time_dim = 'time', dat_dim = 'dataset') +obs1 <- array(rnorm(15), dim = c(time = 3, memb = 5, dataset = 1)) +res1 <- RMSSS(exp1, obs1, time_dim = 'time', dat_dim = 'dataset') + +exp2 <- array(rnorm(30), dim = c(lat = 2, time = 3, memb = 5)) +obs2 <- array(rnorm(15), dim = c(time = 3, lat = 2)) +res2 <- RMSSS(exp2, obs2, time_dim = 'time', memb_dim = 'memb') + +exp3 <- array(rnorm(30), dim = c(lat = 2, time = 3)) +obs3 <- array(rnorm(15), dim = c(time = 3, lat = 2)) +res3 <- RMSSS(exp3, obs3, time_dim = 'time') } diff --git a/man/ROCSS.Rd b/man/ROCSS.Rd index 1f4951781913d190e70b74642558dcbee4ab7073..7480f632ac1e6f9825da51c9cb0df7096e7fca13 100644 --- a/man/ROCSS.Rd +++ b/man/ROCSS.Rd @@ -10,6 +10,7 @@ ROCSS( ref = NULL, time_dim = "sdate", memb_dim = "member", + cat_dim = NULL, dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, @@ -18,27 +19,36 @@ ROCSS( ) } \arguments{ -\item{exp}{A named numerical array of the forecast with at least time and -member dimension.} +\item{exp}{A named numerical array of either the forecast with at least time +and member dimensions, or the probabilities with at least time and category +dimensions. The probabilities can be generated by \code{s2dv::GetProbs}.} -\item{obs}{A named numerical array of the observation with at least time -dimension. The dimensions must be the same as 'exp' except 'memb_dim' and -'dat_dim'.} +\item{obs}{A named numerical array of either the observations with at least +time dimension, or the probabilities with at least time and category +dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The +dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'.} -\item{ref}{A named numerical array of the reference forecast data with at -least time and member dimension. The dimensions must be the same as 'exp' -except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, -it should not have dataset dimension. If there is corresponding reference -for each experiement, the dataset dimension must have the same length as in -'exp'. If 'ref' is NULL, the random forecast is used as reference forecast. -The default value is NULL.} +\item{ref}{A named numerical array of the reference forecast with at least +time and member dimensions, or the probabilities with at least time and +category dimensions. The probability can be generated by +\code{s2dv::GetProbs}. The dimensions must be the same as 'exp' except +'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should +not have dataset dimension. If there is corresponding reference for each +experiment, the dataset dimension must have the same length as in 'exp'. +If 'ref' is NULL, the random forecast is used as reference forecast. The +default value is NULL.} \item{time_dim}{A character string indicating the name of the time dimension. The default value is 'sdate'.} \item{memb_dim}{A character string indicating the name of the member dimension to compute the probabilities of the forecast and the reference forecast. The -default value is 'member'.} +default value is 'member'. If the data are probabilities, set memb_dim as +NULL.} + +\item{cat_dim}{A character string indicating the name of the category +dimension that is needed when exp, obs, and ref are probabilities. The +default value is NULL, which means that the data are not probabilities.} \item{dat_dim}{A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. @@ -60,11 +70,12 @@ FALSE.} computation. The default value is NULL.} } \value{ -A numerical array of ROCSS with the same dimensions as 'exp' excluding -'time_dim' and 'memb_dim' dimensions and including 'cat' dimension, which is -each category. The length if 'cat' dimension corresponds to the number of -probabilistic categories, i.e., 1 + length(prob_thresholds). If there are -multiple datasets, two additional dimensions 'nexp' and 'nobs' are added. +A numerical array of ROCSS with dimensions c(nexp, nobs, cat, the rest +dimensions of 'exp' except 'time_dim' and 'memb_dim' dimensions). nexp is the +number of experiment (i.e., dat_dim in exp), and nobs is the number of +observation (i.e., dat_dim in obs). If dat_dim is NULL, nexp and nobs are +omitted. dimension 'cat' refers to the probabilistic category, i.e., +\code{1 + length(prob_thresholds)}. } \description{ The Relative Operating Characteristic Skill Score (ROCSS; Kharin and Zwiers, @@ -73,14 +84,23 @@ against the false-alarm rates for a particular category or event. The ROC curve can be summarized with the area under the ROC curve, known as the ROC score, to provide a skill value for each category. The ROCSS ranges between minus infinite and 1. A positive ROCSS value indicates that the forecast has -higher skill than the reference forecasts, meaning the contrary otherwise. +higher skill than the reference forecast, meaning the contrary otherwise.\cr +The function accepts either the data or the probabilities of each data as +inputs. If there is more than one dataset, RPSS will be computed for each pair +of exp and obs data. } \examples{ +# Use data as input exp <- array(rnorm(1000), dim = c(lon = 3, lat = 2, sdate = 60, member = 10)) ref <- array(rnorm(1000), dim = c(lon = 3, lat = 2, sdate = 60, member = 10)) obs <- array(rnorm(1000), dim = c(lon = 3, lat = 2, sdate = 60)) ROCSS(exp = exp, obs = obs) ## random forecast as reference forecast ROCSS(exp = exp, obs = obs, ref = ref) ## ref as reference forecast +# Use probs as input +exp_probs <- GetProbs(exp, memb_dim = 'member') +obs_probs <- GetProbs(obs, memb_dim = NULL) +ref_probs <- GetProbs(ref, memb_dim = 'member') +ROCSS(exp = exp_probs, obs = obs_probs, ref = ref_probs, memb_dim = NULL, cat_dim = 'bin') } \references{ diff --git a/man/RPS.Rd b/man/RPS.Rd index 813c12f6e58e5cc6b33e9ffc95ebcd9555b4050d..041ca0779961570b72cb3349dc8669f3aff0b525 100644 --- a/man/RPS.Rd +++ b/man/RPS.Rd @@ -9,28 +9,37 @@ RPS( obs, time_dim = "sdate", memb_dim = "member", + cat_dim = NULL, dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, Fair = FALSE, weights = NULL, cross.val = FALSE, + na.rm = FALSE, ncores = NULL ) } \arguments{ -\item{exp}{A named numerical array of the forecast with at least time and -member dimension.} +\item{exp}{A named numerical array of either the forecasts with at least time +and member dimensions, or the probabilities with at least time and category +dimensions. The probabilities can be generated by \code{s2dv::GetProbs}.} -\item{obs}{A named numerical array of the observation with at least time -dimension. The dimensions must be the same as 'exp' except 'memb_dim' and -'dat_dim'.} +\item{obs}{A named numerical array of either the observation with at least +time dimension, or the probabilities with at least time and category +dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The +dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'.} \item{time_dim}{A character string indicating the name of the time dimension. The default value is 'sdate'.} \item{memb_dim}{A character string indicating the name of the member dimension -to compute the probabilities of the forecast. The default value is 'member'.} +to compute the probabilities of the forecast. The default value is 'member'. +If the data are probabilities, set memb_dim as NULL.} + +\item{cat_dim}{A character string indicating the name of the category +dimension that is needed when the exp and obs are probabilities. The default +value is NULL, which means that the data are not probabilities.} \item{dat_dim}{A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. @@ -48,16 +57,21 @@ the whole period is used. The default value is NULL.} potential RPS that the forecast would have with an infinite ensemble size). The default value is FALSE.} -\item{weights}{A named numerical array of the weights for 'exp'. If 'dat_dim' -is NULL, the dimension should include 'memb_dim' and 'time_dim'. Else, the -dimension should also include 'dat_dim'. The default value is NULL. The -ensemble should have at least 70 members or span at least 10 time steps and -have more than 45 members if consistency between the weighted and unweighted -methodologies is desired.} +\item{weights}{A named numerical array of the weights for 'exp' probability +calculation. If 'dat_dim' is NULL, the dimensions should include 'memb_dim' +and 'time_dim'. Else, the dimension should also include 'dat_dim'. The +default value is NULL. The ensemble should have at least 70 members or span +at least 10 time steps and have more than 45 members if consistency between +the weighted and unweighted methodologies is desired.} -\item{cross.val}{A logical indicating whether to compute the thresholds between -probabilistic categories in cross-validation. -The default value is FALSE.} +\item{cross.val}{A logical indicating whether to compute the thresholds +between probabilistic categories in cross-validation. The default value is +FALSE.} + +\item{na.rm}{A logical or numeric value between 0 and 1. If it is numeric, it +means the lower limit for the fraction of the non-NA values. 1 is equal to +FALSE (no NA is acceptable), 0 is equal to TRUE (all NAs are acceptable). +than na.rm. Otherwise, RPS will be calculated. The default value is FALSE.} \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} @@ -77,14 +91,24 @@ of multi-categorical probabilistic forecasts. The RPS ranges between 0 (perfect forecast) and n-1 (worst possible forecast), where n is the number of categories. In the case of a forecast divided into two categories (the lowest number of categories that a probabilistic forecast can have), the RPS -corresponds to the Brier Score (BS; Wilks, 2011), therefore, ranges between 0 -and 1. If there is more than one dataset, RPS will be computed for each pair -of exp and obs data. +corresponds to the Brier Score (BS; Wilks, 2011), therefore ranging between 0 +and 1.\cr +The function first calculates the probabilities for forecasts and observations, +then use them to calculate RPS. Or, the probabilities of exp and obs can be +provided directly to compute the score. If there is more than one dataset, RPS +will be computed for each pair of exp and obs data. The fraction of acceptable +NAs can be adjusted. } \examples{ +# Use synthetic data exp <- array(rnorm(1000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50)) obs <- array(rnorm(1000), dim = c(lat = 3, lon = 2, sdate = 50)) res <- RPS(exp = exp, obs = obs) +# Use probabilities as inputs +exp_probs <- GetProbs(exp, time_dim = 'sdate', memb_dim = 'member') +obs_probs <- GetProbs(obs, time_dim = 'sdate', memb_dim = NULL) +res2 <- RPS(exp = exp_probs, obs = obs_probs, memb_dim = NULL, cat_dim = 'bin') + } \references{ diff --git a/man/RPSS.Rd b/man/RPSS.Rd index d70425e62f7df67e3a3c5ef75b0ff9163fb98389..4b5b52250ab32f830a289d8a3c47122cfffecd2a 100644 --- a/man/RPSS.Rd +++ b/man/RPSS.Rd @@ -10,39 +10,51 @@ RPSS( ref = NULL, time_dim = "sdate", memb_dim = "member", + cat_dim = NULL, dat_dim = NULL, prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, Fair = FALSE, - weights = NULL, weights_exp = NULL, weights_ref = NULL, cross.val = FALSE, + na.rm = FALSE, + sig_method.type = "two.sided.approx", + alpha = 0.05, ncores = NULL ) } \arguments{ -\item{exp}{A named numerical array of the forecast with at least time and -member dimension.} +\item{exp}{A named numerical array of either the forecast with at least time +and member dimensions, or the probabilities with at least time and category +dimensions. The probabilities can be generated by \code{s2dv::GetProbs}.} -\item{obs}{A named numerical array of the observation with at least time -dimension. The dimensions must be the same as 'exp' except 'memb_dim' and -'dat_dim'.} +\item{obs}{A named numerical array of either the observation with at least +time dimension, or the probabilities with at least time and category +dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The +dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'.} -\item{ref}{A named numerical array of the reference forecast data with at -least time and member dimension. The dimensions must be the same as 'exp' -except 'memb_dim' and 'dat_dim'. If there is only one reference dataset, -it should not have dataset dimension. If there is corresponding reference -for each experiement, the dataset dimension must have the same length as in -'exp'. If 'ref' is NULL, the climatological forecast is used as reference -forecast. The default value is NULL.} +\item{ref}{A named numerical array of either the reference forecast with at +least time and member dimensions, or the probabilities with at least time and +category dimensions. The probabilities can be generated by +\code{s2dv::GetProbs}. The dimensions must be the same as 'exp' except +'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should +not have dataset dimension. If there is corresponding reference for each +experiment, the dataset dimension must have the same length as in 'exp'. If +'ref' is NULL, the climatological forecast is used as reference forecast. +The default value is NULL.} \item{time_dim}{A character string indicating the name of the time dimension. The default value is 'sdate'.} \item{memb_dim}{A character string indicating the name of the member dimension to compute the probabilities of the forecast and the reference forecast. The -default value is 'member'.} +default value is 'member'. If the data are probabilities, set memb_dim as +NULL.} + +\item{cat_dim}{A character string indicating the name of the category +dimension that is needed when exp, obs, and ref are probabilities. The +default value is NULL, which means that the data are not probabilities.} \item{dat_dim}{A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. @@ -60,22 +72,32 @@ the whole period is used. The default value is NULL.} potential RPSS that the forecast would have with an infinite ensemble size). The default value is FALSE.} -\item{weights}{Deprecated and will be removed in the next release. Please use -'weights_exp' and 'weights_ref' instead.} - -\item{weights_exp}{A named numerical array of the forecast ensemble weights. -The dimension should include 'memb_dim', 'time_dim' and 'dat_dim' if there -are multiple datasets. All dimension lengths must be equal to 'exp' -dimension lengths. The default value is NULL, which means no weighting is -applied. The ensemble should have at least 70 members or span at least 10 -time steps and have more than 45 members if consistency between the weighted - and unweighted methodologies is desired.} +\item{weights_exp}{A named numerical array of the forecast ensemble weights +for probability calculation. The dimension should include 'memb_dim', +'time_dim' and 'dat_dim' if there are multiple datasets. All dimension +lengths must be equal to 'exp' dimension lengths. The default value is NULL, +which means no weighting is applied. The ensemble should have at least 70 +members or span at least 10 time steps and have more than 45 members if +consistency between the weighted and unweighted methodologies is desired.} \item{weights_ref}{Same as 'weights_exp' but for the reference forecast.} -\item{cross.val}{A logical indicating whether to compute the thresholds between -probabilistics categories in cross-validation. -The default value is FALSE.} +\item{cross.val}{A logical indicating whether to compute the thresholds +between probabilistics categories in cross-validation. The default value is +FALSE.} + +\item{na.rm}{A logical or numeric value between 0 and 1. If it is numeric, it +means the lower limit for the fraction of the non-NA values. 1 is equal to +FALSE (no NA is acceptable), 0 is equal to TRUE (all NAs are acceptable). +than na.rm. Otherwise, RPS will be calculated. The default value is FALSE.} + +\item{sig_method.type}{A character string indicating the test type of the +significance method. Check \code{RandomWalkTest()} parameter +\code{test.type} for details. The default is 'two.sided.approx', which is +the default of \code{RandomWalkTest()}.} + +\item{alpha}{A numeric of the significance level to be used in the statistical +significance test. The default value is 0.05.} \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} @@ -98,14 +120,20 @@ based on the Ranked Probability Score (RPS; Wilks, 2011). It can be used to assess whether a forecast presents an improvement or worsening with respect to a reference forecast. The RPSS ranges between minus infinite and 1. If the RPSS is positive, it indicates that the forecast has higher skill than the -reference forecast, while a negative value means that it has a lower skill. +reference forecast, while a negative value means that it has a lower skill.\cr Examples of reference forecasts are the climatological forecast (same probabilities for all categories for all time steps), persistence, a previous model version, and another model. It is computed as \code{RPSS = 1 - RPS_exp / RPS_ref}. The statistical significance is obtained -based on a Random Walk test at the 95% confidence level (DelSole and Tippett, -2016). If there is more than one dataset, RPS will be computed for each pair -of exp and obs data. +based on a Random Walk test at the specified confidence level (DelSole and +Tippett, 2016).\cr +The function accepts either the ensemble members or the probabilities of +each data as inputs. If there is more than one dataset, RPSS will be +computed for each pair of exp and obs data. The NA ratio of data will be +examined before the calculation. If the ratio is higher than the threshold +(assigned by parameter \code{na.rm}), NA will be returned directly. NAs are +counted by per-pair method, which means that only the time steps that all the +datasets have values count as non-NA values. } \examples{ set.seed(1) @@ -119,9 +147,19 @@ weights <- sapply(1:dim(exp)['sdate'], function(i) { n/sum(n) }) dim(weights) <- c(member = 10, sdate = 50) +# Use data as input res <- RPSS(exp = exp, obs = obs) ## climatology as reference forecast res <- RPSS(exp = exp, obs = obs, ref = ref) ## ref as reference forecast res <- RPSS(exp = exp, obs = obs, ref = ref, weights_exp = weights, weights_ref = weights) +res <- RPSS(exp = exp, obs = obs, alpha = 0.01, sig_method.type = 'two.sided') + +# Use probs as input +exp_probs <- GetProbs(exp, memb_dim = 'member') +obs_probs <- GetProbs(obs, memb_dim = NULL) +ref_probs <- GetProbs(ref, memb_dim = 'member') +res <- RPSS(exp = exp_probs, obs = obs_probs, ref = ref_probs, memb_dim = NULL, + cat_dim = 'bin') + } \references{ Wilks, 2011; https://doi.org/10.1016/B978-0-12-385022-5.00008-7 diff --git a/man/RatioPredictableComponents.Rd b/man/RatioPredictableComponents.Rd index 3e7fbad6c000b42fc6fb1bd689db8007d250b853..8e6dbb7cf38e642b56194b2c0618cbae53ed458b 100644 --- a/man/RatioPredictableComponents.Rd +++ b/man/RatioPredictableComponents.Rd @@ -8,22 +8,22 @@ RatioPredictableComponents( exp, obs, time_dim = "year", - member_dim = "member", + memb_dim = "member", na.rm = FALSE, ncores = NULL ) } \arguments{ -\item{exp}{A numerical array with, at least, 'time_dim' and 'member_dim' +\item{exp}{A numerical array with, at least, 'time_dim' and 'memb_dim' dimensions.} \item{obs}{A numerical array with the same dimensions than 'exp' except the -'member_dim' dimension.} +'memb_dim' dimension.} \item{time_dim}{A character string indicating the name of the time dimension. The default value is 'year'.} -\item{member_dim}{A character string indicating the name of the member +\item{memb_dim}{A character string indicating the name of the member dimension. The default value is 'member'.} \item{na.rm}{A logical value indicating whether to remove NA values during @@ -34,7 +34,7 @@ computation. The default value is NULL.} } \value{ An array of the ratio of the predictable components. it has the same - dimensions as 'exp' except 'time_dim' and 'member_dim' dimensions. + dimensions as 'exp' except 'time_dim' and 'memb_dim' dimensions. } \description{ This function computes the ratio of predictable components (RPC; Eade et al., 2014). diff --git a/man/RatioSDRMS.Rd b/man/RatioSDRMS.Rd index f1f6f3ddea92b777910d2fdc0176e9fd02a5f67a..07afc461c64a380bc0bea9e1eb9fe297a39c7082 100644 --- a/man/RatioSDRMS.Rd +++ b/man/RatioSDRMS.Rd @@ -7,7 +7,7 @@ RatioSDRMS( exp, obs, - dat_dim = "dataset", + dat_dim = NULL, memb_dim = "member", time_dim = "sdate", pval = TRUE, @@ -23,8 +23,7 @@ dimensions 'memb_dim' and 'time_dim'. It should have the same dimensions as parameter 'exp' except along 'dat_dim' and 'memb_dim'.} \item{dat_dim}{A character string indicating the name of dataset (nobs/nexp) -dimension. If there is no dataset dimension, set as NULL. The default value -is 'dataset'.} +dimension. The default value is NULL (no dataset).} \item{memb_dim}{A character string indicating the name of the member dimension. It must be one dimension in 'exp' and 'obs'. The default value @@ -43,13 +42,12 @@ computation. The default value is NULL.} A list of two arrays with dimensions c(nexp, nobs, the rest of dimensions of 'exp' and 'obs' except memb_dim and time_dim), which nexp is the length of dat_dim of 'exp' and nobs is the length of dat_dim of 'obs'. -(only present if \code{pval = TRUE}) of the one-sided Fisher test with -Ho: SD/RMSE = 1.\cr\cr + If dat_dim is NULL, nexp and nobs are omitted. \cr \item{$ratio}{ The ratio of the ensemble spread and RMSE. } \item{$p_val}{ - The p-value of the one-sided Fisher test with Ho: SD/RMSE = 1. Only present + The p-value of the one-sided Fisher's test with Ho: SD/RMSE = 1. Only present if \code{pval = TRUE}. } } @@ -57,12 +55,12 @@ Ho: SD/RMSE = 1.\cr\cr Compute the ratio between the standard deviation of the members around the ensemble mean in experimental data and the RMSE between the ensemble mean of experimental and observational data. The p-value is provided by a one-sided -Fischer test. +Fisher's test. } \examples{ # Load sample data as in Load() example: example(Load) -rsdrms <- RatioSDRMS(sampleData$mod, sampleData$obs) +rsdrms <- RatioSDRMS(sampleData$mod, sampleData$obs, dat_dim = 'dataset') # Reorder the data in order to plot it with PlotVsLTime rsdrms_plot <- array(dim = c(dim(rsdrms$ratio)[1:2], 4, dim(rsdrms$ratio)[3])) rsdrms_plot[, , 2, ] <- rsdrms$ratio diff --git a/man/Regression.Rd b/man/Regression.Rd index 8e27295175b9a357bad33c7c6ff28c09eb06ead4..9ac0c9460a910722d0b1b86584ac39f24ee3b955 100644 --- a/man/Regression.Rd +++ b/man/Regression.Rd @@ -11,7 +11,8 @@ Regression( formula = y ~ x, pval = TRUE, conf = TRUE, - conf.lev = 0.95, + sign = FALSE, + alpha = 0.05, na.action = na.omit, ncores = NULL ) @@ -34,8 +35,11 @@ or not. The default value is TRUE.} \item{conf}{A logical value indicating whether to retrieve the confidence intervals or not. The default value is TRUE.} -\item{conf.lev}{A numeric indicating the confidence level for the -regression computation. The default value is 0.95.} +\item{sign}{A logical value indicating whether to compute or not the +statistical significance of the test The default value is FALSE.} + +\item{alpha}{A numeric of the significance level to be used in the +statistical significance test. The default value is 0.05.} \item{na.action}{A function or an integer. A function (e.g., na.omit, na.exclude, na.fail, na.pass) indicates what should happen when the data @@ -75,6 +79,10 @@ A list containing: A numeric array with same dimensions as parameter 'daty' and 'datax' except the 'reg_dim' dimension, The array contains the p-value. } +\item{sign}{ + A logical array of the statistical significance of the regression with the + same dimensions as $regression. Only present if \code{sign = TRUE}. +} \item{$filtered}{ A numeric array with the same dimension as paramter 'datay' and 'datax', the filtered datay from the regression onto datax along the 'reg_dim' @@ -98,6 +106,6 @@ names(dim(datay)) <- c('sdate', 'ftime') datax <- sampleData$obs[, 1, , ] names(dim(datax)) <- c('sdate', 'ftime') res1 <- Regression(datay, datax, formula = y~poly(x, 2, raw = TRUE)) -res2 <- Regression(datay, datax, conf.lev = 0.9) +res2 <- Regression(datay, datax, alpha = 0.1) } diff --git a/man/ResidualCorr.Rd b/man/ResidualCorr.Rd index fe7dd1012f5f81a4feaf574891b2468b79e88572..ad40f636d61257616ccf6fef3ca9295988eb04a9 100644 --- a/man/ResidualCorr.Rd +++ b/man/ResidualCorr.Rd @@ -12,8 +12,10 @@ ResidualCorr( time_dim = "sdate", memb_dim = NULL, method = "pearson", - alpha = NULL, + alpha = 0.05, handle.na = "return.na", + pval = TRUE, + sign = FALSE, ncores = NULL ) } @@ -47,8 +49,7 @@ computed ("pearson", "kendall", or "spearman"). The default value is "pearson".} \item{alpha}{A numeric of the significance level to be used in the statistical -significance test. If it is a numeric, "sign" will be returned. If NULL, the -p-value will be returned instead. The default value is NULL.} +significance test (output "sign"). The default value is 0.05.} \item{handle.na}{A charcater string indicating how to handle missing values. If "return.na", NAs will be returned for the cases that contain at least one @@ -57,6 +58,13 @@ steps with no missing values in all "exp", "ref", and "obs" will be used. If "na.fail", an error will arise if any of "exp", "ref", or "obs" contains any NA. The default value is "return.na".} +\item{pval}{A logical value indicating whether to return the p-value of the +significance test Ho: DiffCorr = 0. The default value is TRUE.} + +\item{sign}{A logical value indicating whether to return the statistical +significance of the test Ho: DiffCorr = 0 based on 'alpha'. The default +value is FALSE.} + \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -69,12 +77,12 @@ A list with: \item{$sign}{ A logical array indicating whether the residual correlation is statistically significant or not with the same dimensions as the input arrays except "time_dim" - (and "memb_dim" if provided). Returned only if "alpha" is a numeric. + (and "memb_dim" if provided). Returned only if "sign" is TRUE. } \item{$p.val}{ A numeric array of the p-values with the same dimensions as the input arrays - except "time_dim" (and "memb_dim" if provided). Returned only if "alpha" is - NULL. + except "time_dim" (and "memb_dim" if provided). Returned only if "pval" is + TRUE. } } \description{ diff --git a/man/Spectrum.Rd b/man/Spectrum.Rd index 84b39c0cf44cc4165c25b354680ec226b8bd668d..18671e5cb1defc78f69817388f2f1e8ccd2e35dc 100644 --- a/man/Spectrum.Rd +++ b/man/Spectrum.Rd @@ -4,7 +4,7 @@ \alias{Spectrum} \title{Estimate frequency spectrum} \usage{ -Spectrum(data, time_dim = "ftime", conf.lev = 0.95, ncores = NULL) +Spectrum(data, time_dim = "ftime", alpha = 0.05, ncores = NULL) } \arguments{ \item{data}{A vector or numeric array of which the frequency spectrum is @@ -15,8 +15,8 @@ evenly spaced in time.} \item{time_dim}{A character string indicating the dimension along which to compute the frequency spectrum. The default value is 'ftime'.} -\item{conf.lev}{A numeric indicating the confidence level for the Monte-Carlo -significance test. The default value is 0.95.} +\item{alpha}{A numeric indicating the significance level for the Monte-Carlo +significance test. The default value is 0.05.} \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} diff --git a/man/Spread.Rd b/man/Spread.Rd index e26bc14551fcbdbdfa5eaad2a9628119c9b65210..d3f93bbb229cafcac4235025fe11a93a52fd68b3 100644 --- a/man/Spread.Rd +++ b/man/Spread.Rd @@ -10,7 +10,7 @@ Spread( compute_dim = "member", na.rm = TRUE, conf = TRUE, - conf.lev = 0.95, + alpha = 0.05, ncores = NULL ) } @@ -27,8 +27,8 @@ kept (FALSE) for computation. The default value is TRUE.} \item{conf}{A logical value indicating whether to compute the confidence intervals or not. The default value is TRUE.} -\item{conf.lev}{A numeric value of the confidence level for the computation. -The default value is 0.95.} +\item{alpha}{A numeric of the significance level to be used in the +statistical significance test. The default value is 0.05.} \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} @@ -72,7 +72,9 @@ smooth_ano_exp_m_sub <- smooth_ano_exp - InsertDim(MeanDims(smooth_ano_exp, 'mem posdim = 3, lendim = dim(smooth_ano_exp)['member'], name = 'member') +suppressWarnings({ spread <- Spread(smooth_ano_exp_m_sub, compute_dim = c('member', 'sdate')) +}) \dontrun{ PlotVsLTime(Reorder(spread$iqr, c('dataset', 'stats', 'ftime')), diff --git a/man/StatSeasAtlHurr.Rd b/man/StatSeasAtlHurr.Rd index 965732291cd23fd966a15710a7446a366266dde7..16154675526e990ba618691882ddb81fc4899595 100644 --- a/man/StatSeasAtlHurr.Rd +++ b/man/StatSeasAtlHurr.Rd @@ -38,7 +38,7 @@ the tropical Atlantic (bounded by 10N-25N and 80W-20W) and the tropics at large (bounded by 30N-30S). The anomalies are for the JJASON season.\cr The estimated seasonal average is either 1) number of hurricanes, 2) number of tropical cyclones with lifetime >=48h or 3) power dissipation index -(PDI; in 10^11 m^3 s^{-2}).\cr +(PDI; in 10^11 m^3 s^(-2)).\cr The statistical models used in this function are described in references. } \examples{ diff --git a/man/Trend.Rd b/man/Trend.Rd index 7623c3613149b891a5bf83c26bb32fb56716c541..012474870ccfdd8d0b3777145f9b5df2cfbc9f12 100644 --- a/man/Trend.Rd +++ b/man/Trend.Rd @@ -9,9 +9,10 @@ Trend( time_dim = "ftime", interval = 1, polydeg = 1, + alpha = 0.05, conf = TRUE, - conf.lev = 0.95, pval = TRUE, + sign = FALSE, ncores = NULL ) } @@ -28,15 +29,18 @@ points along 'time_dim' dimension. The default value is 1.} \item{polydeg}{A positive integer indicating the degree of polynomial regression. The default value is 1.} +\item{alpha}{A numeric indicating the significance level for the statistical +significance test. The default value is 0.05.} + \item{conf}{A logical value indicating whether to retrieve the confidence intervals or not. The default value is TRUE.} -\item{conf.lev}{A numeric indicating the confidence level for the -regression computation. The default value is 0.95.} - \item{pval}{A logical value indicating whether to compute the p-value or not. The default value is TRUE.} +\item{sign}{A logical value indicating whether to retrieve the statistical +significance based on 'alpha'. The default value is FALSE.} + \item{ncores}{An integer indicating the number of cores to use for parallel computation. The default value is NULL.} } @@ -53,7 +57,7 @@ A list containing: A numeric array with the first dimension 'stats', followed by the same dimensions as parameter 'data' except the 'time_dim' dimension. The length of the 'stats' dimension should be \code{polydeg + 1}, containing the - lower limit of the \code{conf.lev}\% confidence interval for all the + lower limit of the \code{(1-alpha)}\% confidence interval for all the regression coefficients with the same order as \code{$trend}. Only present \code{conf = TRUE}. } @@ -61,7 +65,7 @@ A list containing: A numeric array with the first dimension 'stats', followed by the same dimensions as parameter 'data' except the 'time_dim' dimension. The length of the 'stats' dimension should be \code{polydeg + 1}, containing the - upper limit of the \code{conf.lev}\% confidence interval for all the + upper limit of the \code{(1-alpha)}\% confidence interval for all the regression coefficients with the same order as \code{$trend}. Only present \code{conf = TRUE}. } @@ -70,6 +74,9 @@ A list containing: 'stats' is 1, followed by the same dimensions as parameter 'data' except the 'time_dim' dimension. Only present if \code{pval = TRUE}. } +\item{$sign}{ + The statistical significance. Only present if \code{sign = TRUE}. +} \item{$detrended}{ A numeric array with the same dimensions as paramter 'data', containing the detrended values along the 'time_dim' dimension. diff --git a/man/UltimateBrier.Rd b/man/UltimateBrier.Rd index 0dfa772e00dd9d9efa0679b781fee56a76ae855f..a20412e3976fe2aa0afde3d7cef877e8cfd603d9 100644 --- a/man/UltimateBrier.Rd +++ b/man/UltimateBrier.Rd @@ -7,7 +7,7 @@ UltimateBrier( exp, obs, - dat_dim = "dataset", + dat_dim = NULL, memb_dim = "member", time_dim = "sdate", quantile = TRUE, @@ -28,7 +28,7 @@ dimensions that at least include 'time_dim'. If it has 'exp' except 'dat_dim' and 'memb_dim'. It can be provided by \code{Ano()}.} \item{dat_dim}{A character string indicating the name of the dataset -dimension in 'exp' and 'obs'. The default value is 'dataset'. If there is no dataset +dimension in 'exp' and 'obs'. The default value is NULL (no dataset). dimension, set NULL.} \item{memb_dim}{A character string indicating the name of the member @@ -109,7 +109,7 @@ sampleData$obs <- Season(sampleData$obs, monini = 11, moninf = 12, monsup = 2) clim <- Clim(sampleData$mod, sampleData$obs) exp <- Ano(sampleData$mod, clim$clim_exp) obs <- Ano(sampleData$obs, clim$clim_obs) -bs <- UltimateBrier(exp, obs) -bss <- UltimateBrier(exp, obs, type = 'BSS') +bs <- UltimateBrier(exp, obs, dat_dim = 'dataset') +bss <- UltimateBrier(exp, obs, type = 'BSS', dat_dim = 'dataset') } diff --git a/s2dv-manual.pdf b/s2dv-manual.pdf deleted file mode 100644 index b4929e91355acf41086488ce94a23527e278f884..0000000000000000000000000000000000000000 Binary files a/s2dv-manual.pdf and /dev/null differ diff --git a/tests/testthat/test-ACC.R b/tests/testthat/test-ACC.R index 6431a9c05155fc76a94e4d92f271df482a3bcc38..544a23551c4fcd5d94cba55aefb3e9f9538fb025 100644 --- a/tests/testthat/test-ACC.R +++ b/tests/testthat/test-ACC.R @@ -1,18 +1,20 @@ -context("s2dv::ACC tests") - ############################################## ##NOTE: bootstrap is not tested because sample() is used inside. # dat1 set.seed(1) - exp1 <- array(rnorm(60), dim = c(dataset = 1, member = 2, sdate = 5, + exp1 <- array(rnorm(60), dim = c(member = 2, sdate = 5, ftime = 1, lat = 2, lon = 3)) set.seed(2) - obs1 <- array(rnorm(30), dim = c(dataset = 1, member = 1, sdate = 5, + obs1 <- array(rnorm(30), dim = c(member = 1, sdate = 5, ftime = 1, lat = 2, lon = 3)) lat1 <- c(30, 35) lon1 <- c(0, 5, 10) + + set.seed(2) + obs1_2 <- array(rnorm(30), dim = c(sdate = 5, ftime = 1, lat = 2, lon = 3)) + # dat2 set.seed(1) exp2 <- array(rnorm(60), dim = c(dataset = 2, sdate = 5, @@ -26,6 +28,13 @@ context("s2dv::ACC tests") lat2 <- c(30, 35) lon2 <- c(0, 5, 10) + # dat3 + set.seed(1) + exp3 <- array(rnorm(72), dim = c(dat = 2, member = 3, sdate = 3, lat = 2, lon = 2)) + set.seed(2) + obs3 <- array(rnorm(12), dim = c(dat = 1, sdate = 3, lat = 2, lon = 2)) + lat3 <- c(0, 10) + ############################################## test_that("1. Input checks", { # exp and obs (1) @@ -46,10 +55,6 @@ test_that("1. Input checks", { ACC(array(1:10, dim = c(2, 5)), array(1:4, dim = c(2, 2))), "Parameter 'exp' and 'obs' must have dimension names." ) - expect_error( - ACC(array(1:10, dim = c(a = 2, c = 5)), array(1:4, dim = c(a = 2, b = 2))), - "Parameter 'exp' and 'obs' must have same dimension name" - ) # dat_dim expect_error( ACC(exp1, obs1, dat_dim = 1), @@ -59,11 +64,6 @@ test_that("1. Input checks", { ACC(exp1, obs1, dat_dim = 'a'), "Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension." ) - # space_dim (deprecated) - expect_warning( - ACC(exp1, obs1, space_dim = c('lat', 'lon'), lat = c(1, 2)), - "! Warning: Parameter 'space_dim' is deprecated. Use 'lat_dim' and 'lon_dim'\n! instead." - ) # lat_dim expect_error( ACC(exp1, obs1, lat_dim = 1), @@ -98,7 +98,7 @@ test_that("1. Input checks", { ) expect_error( ACC(exp1, obs1, memb_dim = 'memb'), - "Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension." + "Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. Set it as NULL if there is no member dimension." ) # lat expect_error( @@ -131,10 +131,10 @@ test_that("1. Input checks", { ACC(exp1, obs1, lat = c(1, 2), lon = c(1, 2, 3), memb_dim = NULL, conftype = 'bootstrap'), "Parameter 'memb_dim' cannot be NULL when parameter 'conftype' is 'bootstrap'." ) - # conf.lev + # alpha expect_error( - ACC(exp1, obs1, lat = c(1, 2), lon = c(1, 2, 3), conf.lev = -1), - "Parameter 'conf.lev' must be a numeric number between 0 and 1." + ACC(exp1, obs1, lat = c(1, 2), lon = c(1, 2, 3), alpha = -1), + "Parameter 'alpha' must be a numeric number between 0 and 1." ) # pval expect_error( @@ -162,11 +162,11 @@ test_that("2. Output checks: dat1", { expect_equal( dim(ACC(exp1, obs1, lat = lat1, lon = lon1)$acc), - c(nexp = 1, nobs = 1, sdate = 5, ftime = 1) + c(sdate = 5, ftime = 1) ) expect_equal( names(ACC(exp1, obs1, lat = lat1)), - c("acc", "conf.lower", "conf.upper", "p.val", "macc") + c("acc", "macc", "conf.lower", "conf.upper", "p.val") ) expect_equal( as.vector(ACC(exp1, obs1, lat = lat1)$acc), @@ -194,22 +194,22 @@ test_that("2. Output checks: dat1", { ) expect_equal( dim(ACC(exp1, obs1, lat = lat1, dat_dim = 'member', memb_dim = NULL)$acc), - c(nexp = 2, nobs = 1, sdate = 5, dataset = 1, ftime = 1) + c(nexp = 2, nobs = 1, sdate = 5, ftime = 1) ) expect_equal( names(ACC(exp1, obs1, lat = lat1, conf = FALSE)), - c("acc", "p.val", "macc") + c("acc", "macc", "p.val") ) expect_equal( names(ACC(exp1, obs1, lat = lat1, pval = FALSE)), - c("acc", "conf.lower", "conf.upper", "macc") + c("acc", "macc", "conf.lower", "conf.upper") ) expect_equal( names(ACC(exp1, obs1, lat = lat1, conf = FALSE, pval = FALSE)), c("acc", "macc") ) expect_equal( - as.vector(ACC(exp1, obs1, lat = lat1, conf = FALSE, avg_dim = NULL, conf.lev = 0.9)$p.val), + as.vector(ACC(exp1, obs1, lat = lat1, conf = FALSE, avg_dim = NULL, alpha = 0.1)$p.val), c(0.6083998, 0.6083998, 0.6083998, 0.6083998, 0.6083998), tolerance = 0.00001 ) @@ -219,6 +219,46 @@ test_that("2. Output checks: dat1", { tolerance = 0.00001 ) + # bootstrap + expect_equal( + names(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap')), + c('acc', 'acc_conf.lower', 'acc_conf.upper', 'macc', 'macc_conf.lower', 'macc_conf.upper') + ) + expect_equal( + dim(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap')$acc), + c(sdate = 5, ftime = 1) + ) + expect_equal( + dim(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap')$acc_conf.lower), + c(sdate = 5, ftime = 1) + ) + expect_equal( + dim(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap')$macc_conf.lower), + c(ftime = 1) + ) + expect_equal( + dim(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap')$macc), + c(ftime = 1) + ) + # boostrap, avg_time is NULL + expect_equal( + names(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap', avg_dim = NULL)), + c('acc', 'acc_conf.lower', 'acc_conf.upper', 'macc', 'macc_conf.lower', 'macc_conf.upper') + ) + expect_equal( + dim(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap', avg_dim = NULL)$acc), + c(sdate = 5, ftime = 1) + ) + expect_equal( + dim(ACC(exp1, obs1, lat = lat1, lon = lon1, conftype = 'bootstrap', avg_dim = NULL)$acc_conf.lower), + c(sdate = 5, ftime = 1) + ) + + # obs1_2, no memb_dim + expect_equal( + ACC(exp1, obs1, lat = lat1), + ACC(exp1, obs1_2, lat = lat1) + ) }) @@ -227,18 +267,81 @@ test_that("2. Output checks: dat1", { test_that("3. Output checks: dat2", { expect_equal( - dim(ACC(exp2, obs2, lat = lat2, lon = lon2, memb_dim = NULL)$acc), + dim(ACC(exp2, obs2, lat = lat2, lon = lon2, memb_dim = NULL, dat_dim = 'dataset')$acc), c(nexp = 2, nobs = 1, sdate = 5, ftime = 1) ) expect_equal( - as.vector(ACC(exp2, obs2, lat = lat2, memb_dim = NULL)$acc)[3:7], + as.vector(ACC(exp2, obs2, lat = lat2, memb_dim = NULL, dat_dim = 'dataset')$acc)[3:7], c(-0.3601880, -0.5624773, -0.4603762, -0.6997169, -0.1336961), tolerance = 0.00001 ) expect_equal( - mean(ACC(exp2, obs2, lat = lat2, memb_dim = NULL)$acc), + mean(ACC(exp2, obs2, lat = lat2, memb_dim = NULL, dat_dim = 'dataset')$acc), -0.1484762, tolerance = 0.00001 ) }) + + +############################################## + +test_that("4. Output checks: dat3", { + +expect_equal( +names(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat")), +c('acc', 'macc', 'conf.lower', 'conf.upper', 'p.val') +) +expect_equal( +names(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", avg_dim = NULL)), +c('acc', 'conf.lower', 'conf.upper', 'p.val') +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat")$acc), +c(nexp = 2, nobs = 1, sdate = 3) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", avg_dim = NULL)$acc), +c(nexp = 2, nobs = 1, sdate = 3) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat")$p.val), +c(nexp = 2, nobs = 1, sdate = 3) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat")$conf.upper), +c(nexp = 2, nobs = 1, sdate = 3) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat")$macc), +c(nexp = 2, nobs = 1) +) + +expect_equal( +ACC(exp3, array(obs3, c(member = 1, dim(obs3))), lat = lat3, memb_dim = "member", dat_dim = "dat"), +ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat") +) + +# bootstrap +expect_equal( +names(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", conftype = 'bootstrap')), +c('acc', 'acc_conf.lower', 'acc_conf.upper', 'macc', 'macc_conf.lower', 'macc_conf.upper') +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", conftype = 'bootstrap')$acc), +c(nexp = 2, nobs = 1, sdate = 3) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", conftype = 'bootstrap')$acc_conf.lower), +c(nexp = 2, nobs = 1, sdate = 3) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", conftype = 'bootstrap')$macc), +c(nexp = 2, nobs = 1) +) +expect_equal( +dim(ACC(exp3, obs3, lat = lat3, memb_dim = "member", dat_dim = "dat", conftype = 'bootstrap')$macc_conf.upper), +c(nexp = 2, nobs = 1) +) + +}) diff --git a/tests/testthat/test-AMV.R b/tests/testthat/test-AMV.R index 9adfaefa51211858c4b475b5446b1ea4ef0298c2..f2cc6d0750ac4df616effb023a2e738a11ea955b 100644 --- a/tests/testthat/test-AMV.R +++ b/tests/testthat/test-AMV.R @@ -1,5 +1,3 @@ -context("s2dv::AMV tests") - ############################################## lat <- seq(-90, 90, 10) lon <- seq(0, 360, 10) diff --git a/tests/testthat/test-AbsBiasSS.R b/tests/testthat/test-AbsBiasSS.R index 08a4a83d147be31fe3be3d69c796c55c59940c20..1df34c48ec6e0678cfb21b9f1ccbcee309dc73b3 100644 --- a/tests/testthat/test-AbsBiasSS.R +++ b/tests/testthat/test-AbsBiasSS.R @@ -1,5 +1,3 @@ -context("s2dv::AbsBiasSS tests") - ############################################## # dat1 @@ -263,6 +261,20 @@ test_that("5. Output checks: dat4", { c(-0.213733950, -0.214240924, 0.110399615, -0.009733463, 0.264602089) ) + # sig_method.type + expect_equal( + c(AbsBiasSS(exp4, obs4, memb_dim = 'member', dat_dim = 'dataset')$sign), + c(F, T, F, F, F, F, T, F, F, F, F, F) + ) + expect_equal( + c(AbsBiasSS(exp4, obs4, memb_dim = 'member', dat_dim = 'dataset', sig_method.type = 'two.sided', alpha = 0.01)$sign), + rep(F, 12) + ) + expect_equal( + c(AbsBiasSS(exp4, obs4, memb_dim = 'member', dat_dim = 'dataset', sig_method.type = 'less', alpha = 0.1)$sign), + c(F, T, F, T, F, F, T, rep(F, 5)) + ) + }) ############################################## diff --git a/tests/testthat/test-Ano.R b/tests/testthat/test-Ano.R index a74c7c946778287ac84451319457c36782362401..bfcf0cb9e7e6f89b15a13ebd16fd9dfb0d005d66 100644 --- a/tests/testthat/test-Ano.R +++ b/tests/testthat/test-Ano.R @@ -1,5 +1,3 @@ -context("s2dv::Ano test") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Ano_CrossValid.R b/tests/testthat/test-Ano_CrossValid.R index 2d7c00c127b75ad9e240630bce109fd9b001d904..0e2b44256b7f66ee134cdc79ae96e0201112664a 100644 --- a/tests/testthat/test-Ano_CrossValid.R +++ b/tests/testthat/test-Ano_CrossValid.R @@ -1,12 +1,22 @@ -context("s2dv::Ano_CrossValid tests") - ############################################## - # dat1 +# dat1 set.seed(1) exp1 <- array(rnorm(60), dim = c(dataset = 2, member = 3, sdate = 5, ftime = 2)) set.seed(2) obs1 <- array(rnorm(20), dim = c(dataset = 1, member = 2, sdate = 5, ftime = 2)) +## different member and dat dim +obs1_2 <- obs1 +dim(obs1_2) <- c(member = 2, sdate = 5, ftime = 2) +obs1_3 <- obs1[1,1,,] +obs1_4 <- obs1[, 1, , ]; dim(obs1_4) <- c(dataset = 1, dim(obs1_4)) + +exp1_2 <- exp1[,1,,] + +## not usual dimension order +exp1_5 <- aperm(exp1, 4:1) +obs1_5 <- aperm(obs1, c(3, 4, 2, 1)) + # dat2 set.seed(1) exp2 <- array(rnorm(30), dim = c(member = 3, ftime = 2, sdate = 5)) @@ -19,6 +29,8 @@ exp3 <- array(rnorm(30), dim = c(ftime = 2, sdate = 5)) set.seed(2) obs3 <- array(rnorm(20), dim = c(ftime = 2, sdate = 5)) +# dat4: not usual dimension order + ############################################## test_that("1. Input checks", { @@ -57,7 +69,7 @@ test_that("1. Input checks", { ) expect_error( Ano_CrossValid(exp1, obs1, dat_dim = 'dat'), - "Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension. Set it as NULL if there is no dataset dimension." + "Parameter 'dat_dim' is not found in 'exp' nor 'obs' dimension. Set it as NULL if there is no dataset dimension." ) # memb expect_error( @@ -71,7 +83,7 @@ test_that("1. Input checks", { ) expect_error( Ano_CrossValid(exp1, obs1, memb = FALSE, memb_dim = 'memb'), - "Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension." + "Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. Set it as NULL if there is no member dimension." ) expect_error( Ano_CrossValid(exp1, obs1, memb = FALSE, memb_dim = 'ftime'), @@ -117,6 +129,59 @@ test_that("2. dat1", { tolerance = 0.0001 ) + expect_equal( + dim(Ano_CrossValid(exp1, obs1_2)$obs), + c(sdate = 5, member = 2, ftime = 2) + ) + expect_equal( + Ano_CrossValid(exp1, obs1)$exp, + Ano_CrossValid(exp1, obs1_2)$exp + ) + expect_equal( + c(Ano_CrossValid(exp1, obs1)$obs), + c(Ano_CrossValid(exp1, obs1_2)$obs) + ) + + expect_equal( + Ano_CrossValid(exp1, obs1)$exp, + Ano_CrossValid(exp1, obs1_3)$exp + ) + expect_equal( + dim(Ano_CrossValid(exp1, obs1_3)$obs), + c(sdate = 5, ftime = 2) + ) + expect_equal( + c(Ano_CrossValid(exp1, obs1_3)$obs), + c(Ano_CrossValid(exp1, obs1)$obs[, 1, 1, ]) + ) + + expect_equal( + dim(Ano_CrossValid(exp1, obs1_4)$obs), + c(sdate = 5, dataset = 1, ftime = 2) + ) + expect_equal( + Ano_CrossValid(exp1, obs1_4)$exp, + Ano_CrossValid(exp1, obs1)$exp + ) + expect_equal( + c(Ano_CrossValid(exp1, obs1_4)$obs), + c(Ano_CrossValid(exp1, obs1)$obs[, , 1, ]) + ) + + expect_equal( + dim(Ano_CrossValid(exp1_2, obs1)$exp), + c(sdate = 5, dataset = 2, ftime = 2) + ) + expect_equal( + c(Ano_CrossValid(exp1_2, obs1)$exp), + c(Ano_CrossValid(exp1, obs1)$exp[,,1,]) + ) + + expect_equal( + Ano_CrossValid(exp1, obs1), + Ano_CrossValid(exp1_5, obs1_5) + ) + }) ############################################## diff --git a/tests/testthat/test-Bias.R b/tests/testthat/test-Bias.R index 842ecc2c2b626649603f3d964ca20adee2fd22df..4c6cc99f39cf5e981bad69ada0c44c599d97c5ee 100644 --- a/tests/testthat/test-Bias.R +++ b/tests/testthat/test-Bias.R @@ -1,5 +1,3 @@ -context("s2dv::Bias tests") - ############################################## # dat1 diff --git a/tests/testthat/test-BrierScore.R b/tests/testthat/test-BrierScore.R index e2c34f95744114b72ff7f5ac89131bf29f5da250..3f02ac5798f8e203566c61437f4b92256c19b367 100644 --- a/tests/testthat/test-BrierScore.R +++ b/tests/testthat/test-BrierScore.R @@ -1,5 +1,3 @@ -context("s2dv::BrierScore tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-CDORemap.R b/tests/testthat/test-CDORemap.R index 1ace086d7fd7117c7d1ca978f309614ae84c3e22..5492d51f6b3611e04b293b2922750636aa23a502 100644 --- a/tests/testthat/test-CDORemap.R +++ b/tests/testthat/test-CDORemap.R @@ -1,5 +1,3 @@ -context("s2dv::CDORemap tests") - # data1: regular grid data1 <- array(1:360*181*2, dim = c(lon = 360, lat = 181, time = 2)) lons1 <- seq(0, 359) diff --git a/tests/testthat/test-CRPS.R b/tests/testthat/test-CRPS.R index 972eb45c7a7993ad3b7c535886b8f596ac09a56c..417b3be8fe4eae13553beb4fcdf77ac4ec595354 100644 --- a/tests/testthat/test-CRPS.R +++ b/tests/testthat/test-CRPS.R @@ -1,5 +1,3 @@ -context("s2dv::CRPS tests") - ############################################## # dat1 diff --git a/tests/testthat/test-CRPSS.R b/tests/testthat/test-CRPSS.R index db0eecd632c1494a332987d6bc0f31f079ae7ed3..f06919140c68b691ba45dec492d536ece65c0e9c 100644 --- a/tests/testthat/test-CRPSS.R +++ b/tests/testthat/test-CRPSS.R @@ -1,5 +1,3 @@ -context("s2dv::CRPSS tests") - ############################################## # dat1 @@ -103,6 +101,11 @@ test_that("1. Input checks", { CRPSS(exp1, obs1, Fair = 1), "Parameter 'Fair' must be either TRUE or FALSE." ) + # clim.cross.val + expect_error( + CRPSS(exp1, obs1, clim.cross.val = NA), + "Parameter 'clim.cross.val' must be either TRUE or FALSE." + ) # ncores expect_error( CRPSS(exp2, obs2, ncores = 1.5), @@ -146,7 +149,7 @@ test_that("2. Output checks: dat1", { ) expect_equal( as.vector(CRPSS(exp1, obs1)$sign), - c(FALSE, FALSE), + c(FALSE, FALSE) ) expect_equal( as.vector(CRPSS(exp1, obs1, Fair = T)$crpss), @@ -178,6 +181,34 @@ test_that("2. Output checks: dat1", { c(0.3491793, 0.3379610), tolerance = 0.0001 ) + # clim.cross.val + expect_equal( + as.vector(CRPSS(exp1, obs1, ref = NULL, clim.cross.val = F)$crpss), + c(-0.1582765, -0.2390707), + tolerance = 0.0001 + ) + + # sig_method.type + expect_equal( + as.vector(CRPSS(exp1, obs1, sig_method.type = "two.sided", alpha = 0.15)$sign), + c(FALSE, FALSE) + ) + expect_equal( + as.vector(CRPSS(exp1, obs1, sig_method.type = "two.sided", alpha = 0.4)$sign), + c(FALSE, TRUE) + ) + expect_equal( + as.vector(CRPSS(exp1, obs1, ref1, sig_method.type = "two.sided", alpha = 0.15)$sign), + c(TRUE, FALSE) + ) + expect_equal( + as.vector(CRPSS(exp1, obs1, ref1, sig_method.type = "two.sided", alpha = 0.4)$sign), + c(TRUE, FALSE) + ) + expect_equal( + as.vector(CRPSS(exp1, obs1, ref1, sig_method.type = "less", alpha = 0.15)$sign), + c(FALSE, FALSE) + ) }) @@ -222,7 +253,7 @@ test_that("3. Output checks: dat2", { expect_equal( as.vector(CRPSS(exp2, obs2)$sign), - FALSE, + FALSE ) expect_equal( as.vector(CRPSS(exp2, obs2, Fair = T)$crpss), @@ -275,7 +306,7 @@ test_that("4. Output checks: dat3", { ) expect_equal( as.vector(CRPSS(exp3, obs3, dat_dim = 'dataset')$sign), - rep(FALSE, 6), + rep(FALSE, 6) ) expect_equal( mean(CRPSS(exp3, obs3, dat_dim = 'dataset', Fair = T)$crpss), @@ -299,6 +330,17 @@ test_that("4. Output checks: dat3", { as.vector(CRPSS(exp2, obs2)$crpss) ) + # sig_method.type + expect_equal( + as.vector(CRPSS(exp3, obs3, dat_dim = 'dataset', sig_method.type = "two.sided", alpha = 0.5)$sign), + rep(F, 6) + ) + expect_equal( + as.vector(CRPSS(exp3, obs3, dat_dim = 'dataset', sig_method.type = "less", alpha = 0.5)$sign), + rep(T, 6) + ) + + }) ############################################## diff --git a/tests/testthat/test-Clim.R b/tests/testthat/test-Clim.R index f5e288ec2c4b956a44c2494d8ae86f280288a2a1..1a0443713802191cd0dffea739e0d4536c166253 100644 --- a/tests/testthat/test-Clim.R +++ b/tests/testthat/test-Clim.R @@ -1,5 +1,3 @@ -context("s2dv::Clim tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Cluster.R b/tests/testthat/test-Cluster.R index 13297d80850198518b3baba264f4dcfe900989a8..b5fe6cfe3fa3203b226dc284e6d69d5016351ecc 100644 --- a/tests/testthat/test-Cluster.R +++ b/tests/testthat/test-Cluster.R @@ -1,5 +1,3 @@ -context("s2dv::Cluster tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Composite.R b/tests/testthat/test-Composite.R index acc7fe64e975b3c41bfa986b36a18a7347d72715..cb948ef09b6b7f9c093f509c2abb8c1d7a97e705 100644 --- a/tests/testthat/test-Composite.R +++ b/tests/testthat/test-Composite.R @@ -1,5 +1,3 @@ -context("s2dv::Composite tests") - ############################################## # dat1 x1 <- array(0, dim = c(20, 10, 30)) diff --git a/tests/testthat/test-Consist_Trend.R b/tests/testthat/test-Consist_Trend.R index 91dacf7085035a0b16a6ab9689309a517c5ad5a6..7f5b5ab68661ee858db76c1ff9899683b13b3b59 100644 --- a/tests/testthat/test-Consist_Trend.R +++ b/tests/testthat/test-Consist_Trend.R @@ -1,5 +1,3 @@ -context("s2dv::Consist_Trend tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Corr.R b/tests/testthat/test-Corr.R index ef013cab372b62984c359d70dc569cf9321e83b9..4cc57d4c7a2d2efac22af729653dcda2ec72459a 100644 --- a/tests/testthat/test-Corr.R +++ b/tests/testthat/test-Corr.R @@ -1,5 +1,3 @@ -context("s2dv::Corr tests") - ############################################## # dat1: memb_dim is NULL set.seed(1) @@ -13,6 +11,13 @@ context("s2dv::Corr tests") na <- floor(runif(10, min = 1, max = 120)) obs1[na] <- NA + set.seed(2) + obs1_2 <- array(rnorm(120), dim = c(dataset = 1, sdate = 5, + ftime = 3, lat = 2, lon = 4)) + set.seed(2) + na <- floor(runif(10, min = 1, max = 120)) + obs1_2[na] <- NA + # dat2: memb_dim = member set.seed(1) exp2 <- array(rnorm(180), dim = c(member = 3, dataset = 2, sdate = 5, @@ -47,12 +52,10 @@ context("s2dv::Corr tests") # dat5: exp and obs have memb_dim and dataset = NULL set.seed(1) - exp5 <- array(rnorm(90), dim = c(member = 3, sdate = 5, - lat = 2, lon = 3)) + exp5 <- array(rnorm(90), dim = c(member = 3, sdate = 5, lat = 2, lon = 3)) set.seed(2) - obs5 <- array(rnorm(30), dim = c(member = 1, sdate = 5, - lat = 2, lon = 3)) + obs5 <- array(rnorm(30), dim = c(member = 1, sdate = 5, lat = 2, lon = 3)) # dat6: exp and obs have memb_dim = NULL and dataset = NULL set.seed(1) @@ -89,10 +92,6 @@ test_that("1. Input checks", { "Parameter 'exp' and 'obs' must have dimension names." ) expect_error( - Corr(array(1:10, dim = c(a = 2, c = 5)), array(1:4, dim = c(a = 2, b = 2))), - "Parameter 'exp' and 'obs' must have same dimension name" - ) - expect_error( Corr(exp1, obs1, dat_dim = 1), "Parameter 'dat_dim' must be a character string." ) @@ -139,7 +138,7 @@ test_that("1. Input checks", { ) expect_error( Corr(exp1, obs1, memb_dim = 'memb'), - "Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension." + "Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. Set it as NULL if there is no member dimension." ) expect_error( Corr(exp2, obs2, memb_dim = 'member', memb = 1), @@ -164,7 +163,7 @@ test_that("1. Input checks", { ) expect_error( Corr(exp = array(1:10, dim = c(sdate = 2, dataset = 5, a = 1)), - obs = array(1:4, dim = c(a = 1, sdate = 2, dataset = 2))), + obs = array(1:4, dim = c(a = 1, sdate = 2, dataset = 2)), dat_dim = 'dataset'), "The length of time_dim must be at least 3 to compute correlation." ) @@ -174,80 +173,93 @@ test_that("1. Input checks", { test_that("2. Output checks: dat1", { suppressWarnings( expect_equal( - dim(Corr(exp1, obs1)$corr), + dim(Corr(exp1, obs1, dat_dim = 'dataset')$corr), c(nexp = 2, nobs = 1, member = 1, ftime = 3, lat = 2, lon = 4) ) ) suppressWarnings( expect_equal( - Corr(exp1, obs1)$corr[1:6], + Corr(exp1, obs1, dat_dim = 'dataset')$corr[1:6], c(0.11503859, -0.46959987, -0.64113021, 0.09776572, -0.32393603, 0.27565829), tolerance = 0.001 ) ) suppressWarnings( expect_equal( - length(which(is.na(Corr(exp1, obs1)$p.val))), + length(which(is.na(Corr(exp1, obs1, dat_dim = 'dataset')$p.val))), 2 ) ) suppressWarnings( expect_equal( - max(Corr(exp1, obs1)$conf.lower, na.rm = T), + max(Corr(exp1, obs1, dat_dim = 'dataset')$conf.lower, na.rm = T), 0.6332941, tolerance = 0.001 ) ) suppressWarnings( expect_equal( - length(which(is.na(Corr(exp1, obs1, comp_dim = 'ftime')$corr))), + length(which(is.na(Corr(exp1, obs1, dat_dim = 'dataset', comp_dim = 'ftime')$corr))), 6 ) ) suppressWarnings( expect_equal( - length(which(is.na(Corr(exp1, obs1, comp_dim = 'ftime', limits = c(2, 3))$corr))), + length(which(is.na(Corr(exp1, obs1, dat_dim = 'dataset', comp_dim = 'ftime', limits = c(2, 3))$corr))), 2 ) ) suppressWarnings( expect_equal( - min(Corr(exp1, obs1, alpha = 0.01)$conf.upper, na.rm = TRUE), + min(Corr(exp1, obs1, alpha = 0.01, dat_dim = 'dataset')$conf.upper, na.rm = TRUE), 0.2747904, tolerance = 0.0001 ) ) suppressWarnings( expect_equal( - length(Corr(exp1, obs1, conf = FALSE, pval = FALSE)), + length(Corr(exp1, obs1, conf = FALSE, pval = FALSE, dat_dim = 'dataset')), 1 ) ) suppressWarnings( expect_equal( - length(Corr(exp1, obs1, conf = FALSE)), + length(Corr(exp1, obs1, conf = FALSE, dat_dim = 'dataset')), 2 ) ) suppressWarnings( expect_equal( - length(Corr(exp1, obs1, pval = FALSE)), + length(Corr(exp1, obs1, pval = FALSE, dat_dim = 'dataset')), 3 ) ) suppressWarnings( expect_equal( - Corr(exp1, obs1, method = 'spearman')$corr[1:6], + Corr(exp1, obs1, method = 'spearman', dat_dim = 'dataset')$corr[1:6], c(-0.3, -0.4, -0.6, 0.3, -0.3, 0.2) ) ) suppressWarnings( expect_equal( - range(Corr(exp1, obs1, method = 'spearman', comp_dim = 'ftime')$p.val, na.rm = T), + range(Corr(exp1, obs1, method = 'spearman', comp_dim = 'ftime', dat_dim = 'dataset')$p.val, na.rm = T), c(0.0, 0.5), tolerance = 0.001 ) ) +# obs1_2, no memb_dim +suppressWarnings( + expect_equal( + Corr(exp1, obs1, dat_dim = 'dataset', memb_dim = 'member'), + Corr(exp1, obs1_2, dat_dim = 'dataset', memb_dim = 'member') + ) +) +suppressWarnings( + expect_equal( + Corr(exp1, obs1, dat_dim = 'dataset', memb_dim = 'member', memb = F), + Corr(exp1, obs1_2, dat_dim = 'dataset', memb_dim = 'member', memb = F) + ) +) }) @@ -255,113 +267,113 @@ suppressWarnings( test_that("3. Output checks: dat2", { # individual member expect_equal( - dim(Corr(exp2, obs2, memb_dim = 'member')$corr), + dim(Corr(exp2, obs2, memb_dim = 'member', dat_dim = 'dataset')$corr), c(nexp = 2, nobs = 1, exp_memb = 3, obs_memb = 1, lat = 2, lon = 3) ) expect_equal( - dim(Corr(exp2, obs2, memb_dim = 'member')$corr), - dim(Corr(exp2, obs2, memb_dim = 'member')$p) + dim(Corr(exp2, obs2, memb_dim = 'member', dat_dim = 'dataset')$corr), + dim(Corr(exp2, obs2, memb_dim = 'member', dat_dim = 'dataset')$p) ) expect_equal( - names(Corr(exp2, obs2, memb_dim = 'member')), + names(Corr(exp2, obs2, memb_dim = 'member', dat_dim = 'dataset')), c("corr", "p.val", "conf.lower", "conf.upper") ) expect_equal( - names(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE)), + names(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')), c("corr") ) expect_equal( - names(Corr(exp2, obs2, memb_dim = 'member', conf = FALSE)), + names(Corr(exp2, obs2, memb_dim = 'member', conf = FALSE, dat_dim = 'dataset')), c("corr", "p.val") ) expect_equal( - names(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE)), + names(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, dat_dim = 'dataset')), c("corr", "conf.lower", "conf.upper") ) expect_equal( - names(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, sign = T)), + names(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, sign = T, dat_dim = 'dataset')), c("corr", "conf.lower", "conf.upper", "sign") ) expect_equal( - mean(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + mean(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.01645575, tolerance = 0.0001 ) expect_equal( - median(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + median(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.03024513, tolerance = 0.0001 ) expect_equal( - max(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + max(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.9327993, tolerance = 0.0001 ) expect_equal( - min(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + min(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), -0.9361258, tolerance = 0.0001 ) expect_equal( - Corr(exp2, obs2, memb_dim = 'member', conf = FALSE)$p.val[1:5], + Corr(exp2, obs2, memb_dim = 'member', conf = FALSE, dat_dim = 'dataset')$p.val[1:5], c(0.24150854, 0.21790352, 0.04149139, 0.49851332, 0.19859843), tolerance = 0.0001 ) expect_equal( - Corr(exp2, obs2, memb_dim = 'member', pval = FALSE)$conf.lower[1:5], + Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, dat_dim = 'dataset')$conf.lower[1:5], c(-0.9500121, -0.9547642, -0.9883400, -0.8817478, -0.6879465), tolerance = 0.0001 ) expect_equal( - which(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = F, sign = T)$sign), + which(Corr(exp2, obs2, memb_dim = 'member', pval = FALSE, conf = F, sign = T, dat_dim = 'dataset')$sign), c(3, 6, 12, 17, 23, 34) ) # ensemble mean expect_equal( - dim(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE)$corr), + dim(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, dat_dim = 'dataset')$corr), c(nexp = 2, nobs = 1, lat = 2, lon = 3) ) expect_equal( - mean(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + mean(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.02939929, tolerance = 0.0001 ) expect_equal( - median(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + median(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.03147432, tolerance = 0.0001 ) expect_equal( - max(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + max(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.8048901, tolerance = 0.0001 ) expect_equal( - min(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + min(Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), -0.6839388, tolerance = 0.0001 ) expect_equal( - Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, conf = FALSE)$p.val[1:5], + Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, conf = FALSE, dat_dim = 'dataset')$p.val[1:5], c(0.1999518, 0.2776874, 0.3255444, 0.2839667, 0.1264518), tolerance = 0.0001 ) expect_equal( - Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE)$conf.lower[1:5], + Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, dat_dim = 'dataset')$conf.lower[1:5], c(-0.9582891, -0.7668065, -0.9316879, -0.9410621, -0.5659657), tolerance = 0.0001 ) # exp2_2 expect_equal( - which(is.na(Corr(exp2_2, obs2_2, memb_dim = 'member', memb = FALSE, pval = FALSE)$corr)), + which(is.na(Corr(exp2_2, obs2_2, memb_dim = 'member', memb = FALSE, pval = FALSE, dat_dim = 'dataset')$corr)), 1:2 ) expect_equal( - Corr(exp2_2, obs2_2, memb_dim = 'member', memb = FALSE, pval = FALSE)$corr[-c(1:2)], - Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE)$corr[-c(1:2)] + Corr(exp2_2, obs2_2, memb_dim = 'member', memb = FALSE, pval = FALSE, dat_dim = 'dataset')$corr[-c(1:2)], + Corr(exp2, obs2, memb_dim = 'member', memb = FALSE, pval = FALSE, dat_dim = 'dataset')$corr[-c(1:2)] ) }) @@ -370,56 +382,56 @@ test_that("3. Output checks: dat2", { test_that("4. Output checks: dat3", { # individual member expect_equal( - dim(Corr(exp3, obs3, memb_dim = 'member')$corr), + dim(Corr(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset')$corr), c(nexp = 2, nobs = 2, exp_memb = 3, obs_memb = 2, lat = 2, lon = 3) ) expect_equal( - names(Corr(exp3, obs3, memb_dim = 'member')), + names(Corr(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset')), c("corr", "p.val", "conf.lower", "conf.upper") ) expect_equal( - mean(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + mean(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.006468017, tolerance = 0.0001 ) expect_equal( - median(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + median(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.03662394, tolerance = 0.0001 ) expect_equal( - max(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + max(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.9798228, tolerance = 0.0001 ) expect_equal( - min(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + min(Corr(exp3, obs3, memb_dim = 'member', pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), -0.9464891, tolerance = 0.0001 ) # ensemble mean expect_equal( - dim(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE)$corr), + dim(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, dat_dim = 'dataset')$corr), c(nexp = 2, nobs = 2, lat = 2, lon = 3) ) expect_equal( - mean(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + mean(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), -0.01001896, tolerance = 0.0001 ) expect_equal( - median(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + median(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), -0.01895816, tolerance = 0.0001 ) expect_equal( - max(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + max(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), 0.798233, tolerance = 0.0001 ) expect_equal( - min(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE)$corr), + min(Corr(exp3, obs3, memb_dim = 'member', memb = FALSE, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), -0.6464809, tolerance = 0.0001 ) @@ -429,17 +441,17 @@ test_that("4. Output checks: dat3", { test_that("5. Output checks: dat4", { # no member expect_equal( - dim(Corr(exp4, obs4)$corr), + dim(Corr(exp4, obs4, dat_dim = 'dataset')$corr), c(nexp = 1, nobs = 1, member = 1, lat = 2) ) # individual member expect_equal( - dim(Corr(exp4, obs4, memb_dim = 'member')$corr), + dim(Corr(exp4, obs4, memb_dim = 'member', dat_dim = 'dataset')$corr), c(nexp = 1, nobs = 1, exp_memb = 1, obs_memb = 1, lat = 2) ) # ensemble expect_equal( - dim(Corr(exp4, obs4, memb_dim = 'member', memb = FALSE)$corr), + dim(Corr(exp4, obs4, memb_dim = 'member', memb = FALSE, dat_dim = 'dataset')$corr), c(nexp = 1, nobs = 1, lat = 2) ) @@ -460,7 +472,7 @@ test_that("6. Output checks: dat5", { c("corr") ) expect_equal( - mean(Corr(exp5, obs5, dat_dim = NULL, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr), + mean(Corr(exp5, obs5, dat_dim = NULL, memb_dim = 'member', pval = FALSE, conf = FALSE)$corr, dat_dim = 'dataset'), 0.1880204, tolerance = 0.0001 ) @@ -506,7 +518,7 @@ test_that("7. Output checks: dat6", { test_that("8. Output checks: dat6 and dat7", { expect_equal( mean(Corr(exp6, obs6, dat_dim = NULL, memb_dim = NULL, pval = FALSE, conf = FALSE)$corr), - mean(Corr(exp7, obs7, memb_dim = NULL, pval = FALSE, conf = FALSE)$corr), + mean(Corr(exp7, obs7, memb_dim = NULL, pval = FALSE, conf = FALSE, dat_dim = 'dataset')$corr), tolerance = 0.0001 ) }) diff --git a/tests/testthat/test-DiffCorr.R b/tests/testthat/test-DiffCorr.R index f47ac1bfec639e6d59f6fb69d2bdc8760ef17203..32f6625a990383dd66d16cf950165ca430b86250 100644 --- a/tests/testthat/test-DiffCorr.R +++ b/tests/testthat/test-DiffCorr.R @@ -1,5 +1,3 @@ -context("s2dv::DiffCorr tests") - ############################################## # dat1 @@ -82,8 +80,8 @@ test_that("1. Input checks", { ) # alpha expect_error( - DiffCorr(exp2, obs2, ref2, alpha = 1), - 'Parameter "alpha" must be NULL or a number between 0 and 1.' + DiffCorr(exp2, obs2, ref2, alpha = 1, sign = T), + 'Parameter "alpha" must be a number between 0 and 1.' ) # handle.na expect_error( @@ -130,7 +128,7 @@ c(0.26166060, 0.15899774, 0.39264452, 0.27959883, 0.34736305, 0.07479832), tolerance = 0.0001 ) expect_equal( -names(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05)), +names(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, sign = T, pval = F)), c("diff.corr", "sign") ) expect_equal( @@ -143,11 +141,11 @@ as.vector(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, test.type as.vector(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, test.type = "two-sided")$diff.corr) ) expect_equal( -as.vector(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05)$sign), +as.vector(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, sign = T)$sign), rep(FALSE, 6) ) expect_equal( -as.vector(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, test.type = "one-sided")$sign), +as.vector(DiffCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, sign = T, test.type = "one-sided")$sign), rep(FALSE, 6) ) expect_equal( @@ -228,11 +226,11 @@ DiffCorr(exp2, obs2, ref2, test.type = 'one-sided')$p, tolerance = 0.0001 ) expect_equal( -DiffCorr(exp2, obs2, ref2, test.type = 'one-sided', alpha = 0.7)$sign, +DiffCorr(exp2, obs2, ref2, test.type = 'one-sided', alpha = 0.7, sign = T)$sign, FALSE ) expect_equal( -DiffCorr(exp2, obs2, ref2, test.type = 'two-sided', alpha = 0.7)$sign, +DiffCorr(exp2, obs2, ref2, test.type = 'two-sided', alpha = 0.7, sign = T)$sign, TRUE ) expect_equal( diff --git a/tests/testthat/test-EOF.R b/tests/testthat/test-EOF.R index 01428e6eb73bd736c9ca7b7b6c5da83fa6c4f678..828bd5242651c70e5cfbe7bf3034dd4d3130c6c2 100644 --- a/tests/testthat/test-EOF.R +++ b/tests/testthat/test-EOF.R @@ -1,5 +1,3 @@ -context("s2dv::EOF tests") - ############################################## # dat1 set.seed(1) @@ -74,7 +72,7 @@ test_that("1. Input checks", { "length as the longitude dimension of 'ano'.") ) expect_warning( - EOF(dat1, lat = lat1, lon = c(350, 370)), + EOF(dat1, lat = lat1, lon = c(350, 370), neofs = 8), "Some 'lon' is out of the range \\[-360, 360\\]." ) # neofs diff --git a/tests/testthat/test-Eno.R b/tests/testthat/test-Eno.R index 08fda85e39b82cc160590760b29d31cd9540cf45..b69d11f386f6053512c87e71d1eea2eddb781d07 100644 --- a/tests/testthat/test-Eno.R +++ b/tests/testthat/test-Eno.R @@ -1,5 +1,3 @@ -context("s2dv::Eno tests") - ############################################## set.seed(1) dat1 <- array(rnorm(800), dim = c(dataset = 1, member = 2, sdate = 4, diff --git a/tests/testthat/test-EuroAtlanticTC.R b/tests/testthat/test-EuroAtlanticTC.R index 6e3ac4bad6b2173bdc033dec4e92ab697b9b8354..c6899942dea9bf7e9bbc1fc9457b7bf1e8c4da5c 100644 --- a/tests/testthat/test-EuroAtlanticTC.R +++ b/tests/testthat/test-EuroAtlanticTC.R @@ -1,5 +1,3 @@ -context("s2dv::EuroAtlanticTC tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Filter.R b/tests/testthat/test-Filter.R index cf271e1aa87f86bdaa9e787ab52c0bf63ba30fd8..b387cfc0c7b84e42d86ee2b4f3a4782545b73cae 100644 --- a/tests/testthat/test-Filter.R +++ b/tests/testthat/test-Filter.R @@ -1,5 +1,3 @@ -context("s2dv::Filter tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-GMST.R b/tests/testthat/test-GMST.R index 01ab792c26a2145d7794659a81922d4de933ef39..0fbfa419599fb4fae55aef6ace245c5b067e3ec0 100644 --- a/tests/testthat/test-GMST.R +++ b/tests/testthat/test-GMST.R @@ -1,5 +1,3 @@ -context("s2dv::GMST tests") - ############################################## lat <- seq(-90, 90, 10) lon <- seq(0, 360, 10) diff --git a/tests/testthat/test-GSAT.R b/tests/testthat/test-GSAT.R index 2d7d7e00c20f20d16288208e9ce72f12e2fbe454..2eb5d687ae5d89a962f3fbc30e4ac6e28200f3fc 100644 --- a/tests/testthat/test-GSAT.R +++ b/tests/testthat/test-GSAT.R @@ -1,5 +1,3 @@ -context("s2dv::GSAT tests") - ############################################## lat <- seq(-90, 90, 10) lon <- seq(0, 360, 10) diff --git a/tests/testthat/test-GetProbs.R b/tests/testthat/test-GetProbs.R new file mode 100644 index 0000000000000000000000000000000000000000..f1958dc22c21bf35c227df8a5159040147bdf8db --- /dev/null +++ b/tests/testthat/test-GetProbs.R @@ -0,0 +1,257 @@ +############################################## + +# dat1 +set.seed(1) +data1 <- array(rnorm(60), dim = c(member = 3, sdate = 10, time = 2)) +set.seed(2) +weights1 <- array(abs(rnorm(30)), dim = c(member = 3, sdate = 10)) + +# dat2 +set.seed(1) +data2 <- array(rnorm(20), dim = c(sdate = 10, time = 2)) +set.seed(2) +weights2 <- array(abs(rnorm(10)), dim = c(sdate = 10)) + +############################################## + +test_that("1. Input checks", { + # data + expect_error( + GetProbs(c()), + "Parameter 'data' cannot be NULL." + ) + expect_error( + GetProbs(c(data2)), + "Parameter 'data' must have dimension names." + ) + # time_dim + expect_error( + GetProbs(data1, time_dim = 1), + 'Parameter "time_dim" must be a character string.' + ) + expect_error( + GetProbs(data1, time_dim = 'ftime'), + "Parameter 'time_dim' is not found in 'data' dimensions." + ) + # memb_dim + expect_error( + GetProbs(data1, memb_dim = TRUE), + "Parameter 'memb_dim' must be a character string." + ) + expect_error( + GetProbs(data1, memb_dim = 'memb'), + "Parameter 'memb_dim' is not found in 'data' dimensions." + ) + # prob_thresholds + expect_error( + GetProbs(data1, prob_thresholds = 1), + "Parameter 'prob_thresholds' must be a numeric vector between 0 and 1." + ) + # indices_for_clim + expect_error( + GetProbs(data1, indices_for_quantiles = array(1:6, dim = c(2, 3))), + "Parameter 'indices_for_quantiles' must be NULL or a numeric vector." + ) + expect_error( + GetProbs(data1, indices_for_quantiles = 3:11), + "Parameter 'indices_for_quantiles' should be the indices of 'time_dim'." + ) + # cross.val + expect_error( + GetProbs(data1, cross.val = 1), + "Parameter 'cross.val' must be either TRUE or FALSE." + ) + # weights + expect_error( + GetProbs(data1, weights = c(0, 1)), + "Parameter 'weights' must be a named numeric array." + ) + expect_error( + GetProbs(data1, weights = array(1, dim = c(member = 3, time = 10))), + "Parameter 'weights' must have dimension sdate and member." + ) + expect_error( + GetProbs(data1, weights = array(1, dim = c(member = 3, sdate = 1))), + "Parameter 'weights' must have the same dimension length as sdate and member dimension in 'data'." + ) +# expect_error( +# GetProbs(data3, weights = array(1, dim = c(member = 3, time = 10, dataset = 3)), dat_dim = 'dataset'), +# "Parameter 'weights' must have three dimensions with the names of 'memb_dim', 'time_dim' and 'dat_dim'." +# ) + # ncores + expect_error( + GetProbs(data1, ncores = 1.5), + "Parameter 'ncores' must be either NULL or a positive integer." + ) + +}) + +############################################## +test_that("2. Output checks: dat1", { + +expect_equal( +dim(GetProbs(data1)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data1)[, 10, 2]), +c(0.3333333, 0.3333333, 0.3333333), +tolerance = 0.0001 +) +expect_equal( +c(GetProbs(data1)[, 2, 2]), +c(0.6666667, 0.3333333, 0.0000000), +tolerance = 0.0001 +) + +# indices_for_quantiles +expect_equal( +dim(GetProbs(data1, indices_for_quantiles = 4:7)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data1, indices_for_quantiles = 4:7)[, 10, 2]), +c(0.3333333, 0.6666667, 0.0000000), +tolerance = 0.0001 +) + +# prob_thresholds +expect_equal( +dim(GetProbs(data1, prob_thresholds = c(0.25, 0.5, 0.75))), +c(bin = 4, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data1, prob_thresholds = c(0.25, 0.5, 0.75))[, 10, 2]), +c(0.3333333, 0.3333333, 0.3333333, 0.0000000), +tolerance = 0.0001 +) +expect_equal( +c(GetProbs(data1, prob_thresholds = c(0.25, 0.5, 0.75))[, 3, 2]), +c(0.0000000, 0.6666667, 0.0000000, 0.3333333), +tolerance = 0.0001 +) + +# weights +expect_equal( +dim(GetProbs(data1, weights = weights1)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data1, weights = weights1)[, 10, 2]), +c(0.3327220, 0.5296149, 0.1376631), +tolerance = 0.0001 +) +expect_equal( +sum(c(GetProbs(data1, weights = weights1))), +20 +) + +# cross.val +expect_equal( +dim(GetProbs(data1, cross.val = T)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data1, cross.val = T)[, 10, 2]), +c(0.3333333, 0.3333333, 0.3333333), +tolerance = 0.0001 +) +expect_equal( +c(GetProbs(data1, cross.val = T)[, 4, 2]), +c(0.0000000, 0.6666667, 0.3333333), +tolerance = 0.0001 +) + +# cross.val + weights +expect_equal( +dim(GetProbs(data1, cross.val = T, weights = weights1)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data1, cross.val = T, weights = weights1)[, 10, 2]), +c(0.3335612, 0.5277459, 0.1386929), +tolerance = 0.0001 +) + +}) + + +############################################## +test_that("3. Output checks: dat2", { + +expect_equal( +dim(GetProbs(data2, memb_dim = NULL)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL)[, 10, 2]), +c(0, 1, 0) +) +expect_equal( +unique(c(GetProbs(data2, memb_dim = NULL))), +c(1, 0) +) + +# indices_for_quantiles +expect_equal( +dim(GetProbs(data2, memb_dim = NULL, indices_for_quantiles = 4:7)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, indices_for_quantiles = 4:7)[, 10, 2]), +c(0, 0, 1) +) + +# prob_thresholds +expect_equal( +dim(GetProbs(data2, memb_dim = NULL, prob_thresholds = c(0.25, 0.5, 0.75))), +c(bin = 4, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, prob_thresholds = c(0.25, 0.5, 0.75))[, 10, 2]), +c(0, 0, 1, 0) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, prob_thresholds = c(0.25, 0.5, 0.75))[, 3, 2]), +c(1, 0, 0, 0) +) + +# weights +expect_equal( +dim(GetProbs(data2, memb_dim = NULL, weights = weights2)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, weights = weights2)[, 10, 2]), +c(0, 1, 0) +) +expect_equal( +sum(c(GetProbs(data2, memb_dim = NULL, weights = weights2))), +20 +) + +# cross.val +expect_equal( +dim(GetProbs(data2, memb_dim = NULL, cross.val = T)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, cross.val = T)[, 10, 2]), +c(0, 1, 0) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, cross.val = T)[, 4, 2]), +c(1, 0, 0) +) + +# cross.val + weights +expect_equal( +dim(GetProbs(data2, memb_dim = NULL, cross.val = T, weights = weights2)), +c(bin = 3, sdate = 10, time = 2) +) +expect_equal( +c(GetProbs(data2, memb_dim = NULL, cross.val = T, weights = weights2)[, 10, 2]), +c(0, 1, 0) +) + +}) diff --git a/tests/testthat/test-Histo2Hindcast.R b/tests/testthat/test-Histo2Hindcast.R index 025f0035677e712d259954b11fe009353bb19626..1bd69dafda95b344026e3d5e04d9d1536af7dbca 100644 --- a/tests/testthat/test-Histo2Hindcast.R +++ b/tests/testthat/test-Histo2Hindcast.R @@ -1,5 +1,3 @@ -context("s2dv::Histo2Hindcast tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-InsertDim.R b/tests/testthat/test-InsertDim.R index 876e7e36c2e0c74f0a4a2017d35f4857e2109816..1e401a9c582c4d14aaeb0d609f56971b19091473 100644 --- a/tests/testthat/test-InsertDim.R +++ b/tests/testthat/test-InsertDim.R @@ -1,5 +1,3 @@ -context("s2dv::InsertDim tests") - ############################################## dat1 <- array(c(1:26), dim = c(dat = 1, sdate = 13, ftime = 2)) dat2 <- array(c(1:24), dim = c(2, 3, c = 4)) diff --git a/tests/testthat/test-Load.R b/tests/testthat/test-Load.R index 826613920b57c4306db09918997185a524f85414..5e2e72392d7f03d8cb3c769f939cb6b535d8a909 100644 --- a/tests/testthat/test-Load.R +++ b/tests/testthat/test-Load.R @@ -1,5 +1,3 @@ -context("s2dv::Load tests") - ############################################## test_that("1-1.", { @@ -178,5 +176,45 @@ c(rep(NA, 4), 101250, rep(NA, 5), 100940, NA), tolerance = 0.0001 ) +}) + +test_that("1-4.", { + +exp <- list(list( + name = "system5_m1", + path = file.path("/esarchive/exp/ecmwf/system5_m1/monthly_mean/$VAR_NAME$_f6h/$VAR_NAME$_$START_DATE$.nc") + )) + +suppressWarnings( +res <- Load( + var = "tas", + exp = exp, + obs = NULL, + sdates = c('19930201'), + output = "lonlat", + leadtimemin = 1, + leadtimemax = 1, + nmember = 2, + latmin = -90, #10, + latmax = 90, #12, + lonmin = 0, + lonmax = 359.9, + grid = 'r1296x640', #'t426grid', + dimnames = list(lon='longitude', lat='latitude', member='ensemble'), + nprocs = 1) +) +expect_equal( +dim(res$mod), +c(dataset = 1, member = 2, sdate = 1, ftime = 1, lat = 640, lon = 1296) +) +expect_equal( +as.vector(res$mod[1, 1, 1, 1, 100, 1:4]), +c(277.38, 277.38, 277.37, 277.37), +tolerance = 0.0001 +) +expect_equal( +any(is.na(res$mod)), +FALSE +) }) diff --git a/tests/testthat/test-MSE.R b/tests/testthat/test-MSE.R new file mode 100644 index 0000000000000000000000000000000000000000..05bba2d8da007ed2f544b59e6b0a163930d50174 --- /dev/null +++ b/tests/testthat/test-MSE.R @@ -0,0 +1,259 @@ +############################################## + # dat1 + set.seed(1) + exp1 <- array(rnorm(120), dim = c(dataset = 3, sdate = 5, ftime = 2, lon = 1, lat = 4)) + set.seed(2) + obs1 <- array(rnorm(80), dim = c(dataset = 2, sdate = 5, ftime = 2, lon = 1, lat = 4)) + set.seed(2) + na <- floor(runif(10, min = 1, max = 80)) + obs1[na] <- NA + + # dat 2: vector + set.seed(5) + exp2 <- rnorm(10) + set.seed(6) + obs2 <- rnorm(10) + + # dat3 + set.seed(1) + exp3 <- array(rnorm(120), dim = c(sdate = 5, ftime = 2, member = 3)) + set.seed(2) + obs3 <- array(rnorm(80), dim = c(sdate = 5, ftime = 2, member = 2)) + + # dat4 + set.seed(1) + exp4 <- array(rnorm(120), dim = c(dataset = 2, sdate = 5, time = 1)) + set.seed(2) + obs4 <- array(rnorm(80), dim = c(dataset = 1, sdate = 5, member = 2, time = 1)) + +############################################## +test_that("1. Input checks", { + + expect_error( + MSE(c(), c()), + "Parameter 'exp' and 'obs' cannot be NULL." + ) + expect_error( + MSE(c('b'), c('a')), + "Parameter 'exp' and 'obs' must be a numeric array." + ) + expect_error( + MSE(c(1:10), c(2:4)), + "Parameter 'exp' and 'obs' must be array with as least two dimensions time_dim and dat_dim, or vector of same length." + ) + expect_error( + MSE(array(1:10, dim = c(2, 5)), array(1:4, dim = c(2, 2))), + "Parameter 'exp' and 'obs' must have dimension names." + ) + expect_error( + MSE(array(1:20, dim = c(dat = 2, sdate = 5, member = 2)), + array(1:10, dim = c(dat = 2, sdate = 5, time = 1))), + "Parameter 'exp' and 'obs' must have the same dimension names." + ) + expect_error( + MSE(exp1, obs1, dat_dim = 1), + "Parameter 'dat_dim' must be a character string." + ) + expect_error( + MSE(exp1, obs1, dat_dim = 'a'), + "Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension." + ) + expect_error( + MSE(exp1, obs1, time_dim = c('sdate', 'a')), + "Parameter 'time_dim' must be a character string." + ) + expect_error( + MSE(exp1, obs1, time_dim = 'a'), + "Parameter 'time_dim' is not found in 'exp' or 'obs' dimension." + ) + expect_error( + MSE(exp1, obs1, comp_dim = c('sdate', 'ftime')), + "Parameter 'comp_dim' must be a character string." + ) + expect_error( + MSE(exp1, obs1, comp_dim = 'a'), + "Parameter 'comp_dim' is not found in 'exp' or 'obs' dimension." + ) + expect_error( + MSE(exp1, obs1, limits = c(1,3)), + "Paramter 'comp_dim' cannot be NULL if 'limits' is assigned." + ) + expect_error( + MSE(exp1, obs1, comp_dim = 'ftime', limits = c(1)), + paste0("Parameter 'limits' must be a vector of two positive ", + "integers smaller than the length of paramter 'comp_dim'.") + ) + expect_error( + MSE(exp1, obs1, alpha = -1), + "Parameter 'alpha' must be a numeric number between 0 and 1." + ) + expect_error( + MSE(exp1, obs1, conf = 1), + "Parameter 'conf' must be one logical value." + ) + expect_error( + MSE(exp1, obs1, ncores = 1.5), + "Parameter 'ncores' must be a positive integer." + ) + expect_error( + MSE(exp = array(1:10, dim = c(dataset = 1, member = 5, sdate = 1)), + obs = array(1:4, dim = c(dataset = 2, member = 2, sdate = 2))), + "Parameter 'exp' and 'obs' must have same length of all dimensions except 'dat_dim' and 'memb_dim'." + ) + expect_error( + MSE(exp = array(1:5, dim = c(sdate = 1, dataset = 5, a = 1)), + obs = array(1:2, dim = c(a = 1, sdate = 1, dataset = 2)), dat_dim = "dataset"), + "The length of time_dim must be at least 2 to compute MSE." + ) + + + +}) + +############################################## +test_that("2. Output checks: dat1", { +suppressWarnings( + expect_equal( + dim(MSE(exp1, obs1, dat_dim = 'dataset')$mse), + c(nexp = 3, nobs = 2, ftime = 2, lon = 1, lat = 4) + ) +) +suppressWarnings( + expect_equal( + MSE(exp1, obs1, dat_dim = 'dataset')$mse[1:6], + c(1.2815677, 2.0832803, 1.1894637, 1.3000403, 1.4053807, 0.8157563)^2, + tolerance = 0.001 + ) +) +suppressWarnings( + expect_equal( + length(which(is.na(MSE(exp1, obs1, dat_dim = 'dataset')$conf.lower))), + 4 + ) +) +suppressWarnings( + expect_equal( + c(MSE(exp1, obs1, dat_dim = 'dataset')$conf.lower[2,1,,1,2:3]), + c(1.8869268, 0.4418298, 0.2694637, 0.8215383), + tolerance = 0.001 + ) +) +suppressWarnings( + expect_equal( + length(which(is.na(MSE(exp1, obs1, dat_dim = 'dataset', comp_dim = 'ftime')$mse))), + 0 + ) +) +suppressWarnings( + expect_equal( + length(which(is.na(MSE(exp1, obs1, dat_dim = 'dataset', comp_dim = 'ftime')$conf.upper))), + 8 + ) +) +suppressWarnings( + expect_equal( + length(which(is.na(MSE(exp1, obs1, dat_dim = 'dataset', comp_dim = 'lat')$conf.lower))), + 36 + ) +) +suppressWarnings( + expect_equal( + length(which(is.na(MSE(exp1, obs1, dat_dim = 'dataset', comp_dim = 'lat', limits = c(1, 2))$conf.lower))), + 21 + ) +) +suppressWarnings( + expect_equal( + c(MSE(exp1, obs1, dat_dim = 'dataset', alpha = 0.01)$conf.upper[2,1,,1,2:3]), + c(13.844841, 5.044269, 1.977121, 6.027826), + tolerance = 0.0001 + ) +) +suppressWarnings( + expect_equal( + length(MSE(exp1, obs1, dat_dim = 'dataset', conf = FALSE)), + 1 + ) +) + + +}) + +############################################## +test_that("3. Output checks: dat2", { + + expect_equal( + dim(MSE(exp2, obs2)$mse), + NULL + ) + + expect_equal( + as.vector(MSE(exp2, obs2)$mse), + 1.429815^2, + tolerance = 0.00001 + ) +}) + +############################################## +test_that("4. Output checks: dat3", { + + expect_equal( + dim(MSE(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$mse), + c(ftime = 2) + ) + + expect_equal( + as.vector(MSE(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$mse), + c(0.6191331, 0.7133894)^2, + tolerance = 0.00001 + ) + expect_equal( + names(MSE(exp3, obs3, dat_dim = NULL, memb_dim = 'member', conf = FALSE)), + c("mse") + ) + expect_equal( + names(MSE(exp3, obs3, dat_dim = NULL, memb_dim = 'member', conf = TRUE)), + c('mse', 'conf.lower', 'conf.upper') + ) + expect_equal( + c(MSE(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$conf.lower), + c(0.2567713, 0.3409037), + tolerance = 0.0001 + ) + expect_equal( + c(MSE(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$conf.upper), + c(1.231523, 1.635038), + tolerance = 0.0001 + ) + +}) + +############################################## + +test_that("5. Output checks: dat4", { + + expect_equal( + dim(MSE(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$mse), + c(nexp = 2, nobs = 1, time = 1) + ) + expect_equal( + c(MSE(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$mse), + c(0.6775320, 0.8954404)^2, + tolerance = 0.0001 + ) + expect_equal( + c(MSE(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$conf.lower), + c(0.3074949, 0.5370958), + tolerance = 0.0001 + ) + expect_equal( + c(MSE(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$conf.upper), + c(1.474804, 2.576013), + tolerance = 0.0001 + ) + + +}) + +############################################## + diff --git a/tests/testthat/test-MSSS.R b/tests/testthat/test-MSSS.R new file mode 100644 index 0000000000000000000000000000000000000000..29952fae9e7c54f95f4acb051e437e8854a64ecc --- /dev/null +++ b/tests/testthat/test-MSSS.R @@ -0,0 +1,265 @@ +############################################## + # case 1 + set.seed(1) + exp1 <- array(rnorm(15), dim = c(sdate = 3, dataset = 5)) + set.seed(2) + obs1 <- array(rnorm(6), dim = c(sdate = 3, dataset = 2)) + set.seed(3) + ref1_1 <- array(rnorm(15), dim = c(sdate = 3, dataset = 5)) + ref1_2 <- exp1[, 3] + dim(ref1_2) <- c(sdate = 3) + + # case 2 + set.seed(3) + exp2 <- array(rnorm(30), dim = c(dataset = 3, sdate = 5, member = 3)) + set.seed(4) + obs2 <- array(rnorm(20), dim = c(dataset = 2, sdate = 5, member = 2)) + set.seed(5) + ref2 <- array(rnorm(15), dim = c(sdate = 5, member = 3)) + + # case 3: vector + set.seed(5) + exp3 <- rnorm(10) + set.seed(6) + obs3 <- rnorm(10) + + +############################################## + +test_that("1. Input checks", { + ## exp and obs (1) + expect_error( + MSSS(c(), c()), + "Parameter 'exp' and 'obs' cannot be NULL." + ) + expect_error( + MSSS('exp', 'obs'), + "Parameter 'exp' and 'obs' must be a numeric array." + ) + expect_error( + MSSS(c(1:10), c(2:4)), + paste0("Parameter 'exp' and 'obs' must be array with as least two dimensions ", + "time_dim and dat_dim, or vector of same length.") + ) + expect_error( + MSSS(array(1:10, dim = c(2, 5)), array(1:10, dim = c(2, 5))), + "Parameter 'exp' and 'obs' must have dimension names." + ) + expect_error( + MSSS(array(1:20, dim = c(dat = 2, sdate = 5, member = 2)), + array(1:10, dim = c(dat = 2, sdate = 5, time = 1))), + "Parameter 'exp' and 'obs' must have the same dimension names." + ) + ## time_dim + expect_error( + MSSS(exp1, obs1, time_dim = 1), + "Parameter 'time_dim' must be a character string." + ) + expect_error( + MSSS(exp1, obs1, time_dim = 'a'), + "Parameter 'time_dim' is not found in 'exp' or 'obs' dimension." + ) + ## dat_dim + expect_error( + MSSS(exp1, obs1, dat_dim = NA), + "Parameter 'dat_dim' must be a character string." + ) + expect_error( + MSSS(exp1, obs1, dat_dim = 'memb'), + paste0("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + " Set it as NULL if there is no dataset dimension.") + ) + ## pval + expect_error( + MSSS(exp1, obs1, pval = c(T, T)), + "Parameter 'pval' must be one logical value." + ) + ## sign + expect_error( + MSSS(exp1, obs1, sign = 0.05), + "Parameter 'sign' must be one logical value." + ) + ## alpha + expect_error( + MSSS(exp1, obs1, alpha = T), + "Parameter 'alpha' must be one numeric value." + ) + ## ncores + expect_error( + MSSS(exp1, obs1, ncores = 1.4), + "Parameter 'ncores' must be a positive integer." + ) + ## exp and obs (2) + expect_error( + MSSS(exp = array(1:10, dim = c(dataset = 1, member = 5, sdate = 1)), + obs = array(1:4, dim = c(dataset = 2, member = 2, sdate = 2))), + "Parameter 'exp' and 'obs' must have same length of all dimensions except 'dat_dim' and 'memb_dim'." + ) + expect_error( + MSSS(exp = array(1:10, dim = c(sdate = 1, dataset = 5, a = 1)), + obs = array(1:4, dim = c(a = 1, sdate = 1, dataset = 2)), dat_dim = "dataset"), + "The length of time_dim must be more than 2 to compute MSSS." + ) +}) + +############################################## +test_that("2. Output checks: case 1", { + + res1_1 <- MSSS(exp1, obs1, dat_dim = 'dataset') + expect_equal( + dim(res1_1$msss), + c(nexp = 5, nobs = 2) + ) + expect_equal( + dim(res1_1$p.val), + c(nexp = 5, nobs = 2) + ) + expect_equal( + c(res1_1$msss)[3:8], + c(0.03359106, -0.05535409, -0.80010171, 0.03151828, -5.53371892, -1.67639444), + tolerance = 0.00001 + ) + expect_equal( + as.vector(res1_1$p.val)[3:7], + c(0.4829225, 0.5269121, 0.7641713, 0.4839926, 0.9771112), + tolerance = 0.001 + ) + + exp1_2 <- exp1; exp1_2[2:4] <- NA + obs1_2 <- obs1; obs1_2[1:2] <- NA + res1_2 <- MSSS(exp1_2, obs1_2, dat_dim = 'dataset', pval = T, sign = T) + + expect_equal( + names(res1_2), + c("msss", "p.val", "sign") + ) + expect_equal( + c(res1_2$msss), + c(rep(NA, 7), -1.676394, -1.520840, -3.455754), + tolerance = 0.0001 + ) + expect_equal( + c(res1_2$p.val), + c(rep(NA, 7), 0.8774973, 0.8640313, 0.9520470), + tolerance = 0.0001 + ) + expect_equal( + c(res1_2$sign), + c(rep(NA, 7), rep(FALSE, 3)) + ) + + #ref + res1_3 <- MSSS(exp1, obs1, ref = ref1_1, dat_dim = 'dataset') + expect_equal( + dim(res1_3$msss), + c(nexp = 5, nobs = 2) + ) + expect_equal( + dim(res1_3$p.val), + c(nexp = 5, nobs = 2) + ) + expect_equal( + as.vector(res1_3$msss[2, ]), + c(-3.828708, -96.610622), + tolerance = 0.0001 + ) + expect_equal( + as.vector(res1_3$p.val[2, ]), + c(0.9588755, 0.9998951), + tolerance = 0.0001 + ) + res1_4 <- MSSS(exp1, obs1, ref = ref1_2, dat_dim = 'dataset', sign = T, alpha = 0.3) + expect_equal( + dim(res1_4$msss), + c(nexp = 5, nobs = 2) + ) + expect_equal( + dim(res1_4$sign), + c(nexp = 5, nobs = 2) + ) + expect_equal( + as.vector(res1_4$msss[2, ]), + c(-2.705537, -1.441239), + tolerance = 0.0001 + ) + expect_equal( + as.vector(res1_4$p[2, ]), + c(0.9321160, 0.8563146), + tolerance = 0.0001 + ) + expect_equal( + as.vector(res1_4$sign), + c(rep(F, 5), T, rep(F, 4)) + ) + + # Random Walk + suppressWarnings({ + res1_5 <- MSSS(exp1, obs1, ref = ref1_1, dat_dim = 'dataset', sig_method = "Random Walk", pval = F, sign = T, sig_method.type = 'two.sided') + }) + expect_equal( + as.vector(res1_5$sign), + rep(F, 10) + ) + suppressWarnings({ + res1_6 <- MSSS(exp1, obs1, ref = NULL, dat_dim = 'dataset', sig_method = "Random Walk", pval = T, sign = T, sig_method.type = 'two.sided') + }) + expect_equal( + as.vector(res1_6$p), + c(1, 1, 1, 1, 1, 1, 0.25, 0.25, 1, 1) + ) + expect_equal( + as.vector(res1_6$sign), + rep(F, 10) + ) + +}) + + +############################################## +test_that("3. Output checks: case 2", { + res1 <- MSSS(exp2, obs2, ref2, dat_dim = "dataset", memb_dim = 'member', sign = T) + expect_equal( + dim(res1$msss), + c(nexp = 3, nobs = 2) + ) + expect_equal( + dim(res1$sign), + c(nexp = 3, nobs = 2) + ) + expect_equal( + dim(res1$p), + c(nexp = 3, nobs = 2) + ) + expect_equal( + c(res1$msss), + c(-0.18155696, 0.51980557, -0.66121915, -0.08753543, 0.62908575, -0.95419385), + tolerance = 0.0001 + ) + expect_equal( + c(res1$p), + c(0.62284746, 0.09217502, 0.82539496, 0.56264155, 0.04034091, 0.88868245), + tolerance = 0.0001 + ) + + res2 <- MSSS(apply(exp2, 1:2, mean), apply(obs2, 1:2, mean), array(apply(ref2, 1, mean), dim = c(sdate = 5)), dat_dim = "dataset", sign = T) + expect_equal( + res1, res2 + ) + +}) + +############################################## + +test_that("4. Output checks: case 3", { + + expect_equal( + dim(MSSS(exp3, obs3)$msss), + NULL + ) + expect_equal( + as.vector(MSSS(exp3, obs3)$msss), + -1.653613, + tolerance = 0.00001 + ) + +}) diff --git a/tests/testthat/test-MeanDims.R b/tests/testthat/test-MeanDims.R index c043c784b8f302ae8c5ab53f5a00a206601ba4fc..502c8285927e23b6e5d865b5bdb90a663bbc727a 100644 --- a/tests/testthat/test-MeanDims.R +++ b/tests/testthat/test-MeanDims.R @@ -1,5 +1,3 @@ -context("s2dv::MeanDims tests") - ############################################## # dat1 dat1 <- array(c(1:20), diff --git a/tests/testthat/test-NAO.R b/tests/testthat/test-NAO.R index 05fcd22caec34d412f254f64e9819484a0269e04..91b994329ed8f3051dc2ce9a2150562069bfe640 100644 --- a/tests/testthat/test-NAO.R +++ b/tests/testthat/test-NAO.R @@ -1,5 +1,3 @@ -context("s2dv::NAO tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Persistence.R b/tests/testthat/test-Persistence.R index 4eabe836734c3ecf8cd636facb8a9e4b69ffb6ff..a28a2f30d6e2563a151652db9361f910ddbb5c30 100644 --- a/tests/testthat/test-Persistence.R +++ b/tests/testthat/test-Persistence.R @@ -1,5 +1,3 @@ -context("s2dv::Persistence tests") - ############################################## #dat1: year set.seed(1) diff --git a/tests/testthat/test-ProbBins.R b/tests/testthat/test-ProbBins.R index 4b3d0ecde379c9543d2485e054bc15153d22d90c..5b286e0cc4616e9088e9d7b1d5bd84e09391cce7 100644 --- a/tests/testthat/test-ProbBins.R +++ b/tests/testthat/test-ProbBins.R @@ -1,5 +1,3 @@ -context("s2dv::ProbBins tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-ProjectField.R b/tests/testthat/test-ProjectField.R index f3f05ce618df9777393a911eaae2f02c9e428b9e..8781c082f4df5086e8ae720634c03aef93abde5b 100644 --- a/tests/testthat/test-ProjectField.R +++ b/tests/testthat/test-ProjectField.R @@ -1,5 +1,3 @@ -context("s2dv::ProjectField tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-REOF.R b/tests/testthat/test-REOF.R index 39747e2fafd0cf1efa758eb3ef5cb2fd10ee3267..862e10ae810aa5386a5822094000d5999d52f2d1 100644 --- a/tests/testthat/test-REOF.R +++ b/tests/testthat/test-REOF.R @@ -1,5 +1,3 @@ -context("s2dv::REOF tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-RMS.R b/tests/testthat/test-RMS.R index bf059efa722e33951bbc0955189791abfe282e24..660f4e7f31ceca39a8038d326717f5048bf0c3bc 100644 --- a/tests/testthat/test-RMS.R +++ b/tests/testthat/test-RMS.R @@ -1,10 +1,7 @@ -context("s2dv::RMS tests") - ############################################## # dat1 set.seed(1) exp1 <- array(rnorm(120), dim = c(dataset = 3, sdate = 5, ftime = 2, lon = 1, lat = 4)) - set.seed(2) obs1 <- array(rnorm(80), dim = c(dataset = 2, sdate = 5, ftime = 2, lon = 1, lat = 4)) set.seed(2) @@ -19,17 +16,15 @@ context("s2dv::RMS tests") # dat3 set.seed(1) - exp3 <- array(rnorm(120), dim = c(sdate = 5, ftime = 2, lon = 1, lat = 4)) - + exp3 <- array(rnorm(120), dim = c(sdate = 5, ftime = 2, member = 3)) set.seed(2) - obs3 <- array(rnorm(80), dim = c(sdate = 5, ftime = 2, lon = 1, lat = 4)) + obs3 <- array(rnorm(80), dim = c(sdate = 5, ftime = 2, member = 2)) # dat4 set.seed(1) - exp4 <- array(rnorm(120), dim = c(sdates = 5, ftimes = 2, lon = 1, lat = 1)) - + exp4 <- array(rnorm(120), dim = c(dataset = 2, sdate = 5, time = 1)) set.seed(2) - obs4 <- array(rnorm(80), dim = c(sdates = 5, ftimes = 2, lon = 1, lat = 1)) + obs4 <- array(rnorm(80), dim = c(dataset = 1, sdate = 5, member = 2, time = 1)) ############################################## test_that("1. Input checks", { @@ -51,8 +46,9 @@ test_that("1. Input checks", { "Parameter 'exp' and 'obs' must have dimension names." ) expect_error( - RMS(array(1:10, dim = c(a = 2, c = 5)), array(1:4, dim = c(a = 2, b = 2))), - "Parameter 'exp' and 'obs' must have same dimension name" + RMS(array(1:20, dim = c(dat = 2, sdate = 5, member = 2)), + array(1:10, dim = c(dat = 2, sdate = 5, time = 1))), + "Parameter 'exp' and 'obs' must have the same dimension names." ) expect_error( RMS(exp1, obs1, dat_dim = 1), @@ -88,8 +84,8 @@ test_that("1. Input checks", { "integers smaller than the length of paramter 'comp_dim'.") ) expect_error( - RMS(exp1, obs1, conf.lev = -1), - "Parameter 'conf.lev' must be a numeric number between 0 and 1." + RMS(exp1, obs1, alpha = -1), + "Parameter 'alpha' must be a numeric number between 0 and 1." ) expect_error( RMS(exp1, obs1, conf = 1), @@ -100,13 +96,13 @@ test_that("1. Input checks", { "Parameter 'ncores' must be a positive integer." ) expect_error( - RMS(exp = array(1:10, dim = c(sdate = 1, dataset = 5, a = 1)), - obs = array(1:4, dim = c(a = 1, sdate = 2, dataset = 2))), - "Parameter 'exp' and 'obs' must have same length of all dimension except 'dat_dim'." + RMS(exp = array(1:10, dim = c(dataset = 1, member = 5, sdate = 1)), + obs = array(1:4, dim = c(dataset = 2, member = 2, sdate = 2))), + "Parameter 'exp' and 'obs' must have same length of all dimensions except 'dat_dim' and 'memb_dim'." ) expect_error( RMS(exp = array(1:5, dim = c(sdate = 1, dataset = 5, a = 1)), - obs = array(1:2, dim = c(a = 1, sdate = 1, dataset = 2))), + obs = array(1:2, dim = c(a = 1, sdate = 1, dataset = 2)), dat_dim = "dataset"), "The length of time_dim must be at least 2 to compute RMS." ) @@ -118,64 +114,64 @@ test_that("1. Input checks", { test_that("2. Output checks: dat1", { suppressWarnings( expect_equal( - dim(RMS(exp1, obs1)$rms), + dim(RMS(exp1, obs1, dat_dim = 'dataset')$rms), c(nexp = 3, nobs = 2, ftime = 2, lon = 1, lat = 4) ) ) suppressWarnings( expect_equal( - RMS(exp1, obs1)$rms[1:6], + RMS(exp1, obs1, dat_dim = 'dataset')$rms[1:6], c(1.2815677, 2.0832803, 1.1894637, 1.3000403, 1.4053807, 0.8157563), tolerance = 0.001 ) ) suppressWarnings( expect_equal( - length(which(is.na(RMS(exp1, obs1)$conf.lower))), + length(which(is.na(RMS(exp1, obs1, dat_dim = 'dataset')$conf.lower))), 4 ) ) suppressWarnings( expect_equal( - max(RMS(exp1, obs1)$conf.lower, na.rm = T), + max(RMS(exp1, obs1, dat_dim = 'dataset')$conf.lower, na.rm = T), 1.399509, tolerance = 0.001 ) ) suppressWarnings( expect_equal( - length(which(is.na(RMS(exp1, obs1, comp_dim = 'ftime')$rms))), + length(which(is.na(RMS(exp1, obs1, dat_dim = 'dataset', comp_dim = 'ftime')$rms))), 0 ) ) suppressWarnings( expect_equal( - length(which(is.na(RMS(exp1, obs1, comp_dim = 'ftime')$conf.upper))), + length(which(is.na(RMS(exp1, obs1, dat_dim = 'dataset', comp_dim = 'ftime')$conf.upper))), 8 ) ) suppressWarnings( expect_equal( - length(which(is.na(RMS(exp1, obs1, comp_dim = 'lat')$conf.lower))), + length(which(is.na(RMS(exp1, obs1, dat_dim = 'dataset', comp_dim = 'lat')$conf.lower))), 36 ) ) suppressWarnings( expect_equal( - length(which(is.na(RMS(exp1, obs1, comp_dim = 'lat', limits = c(1, 2))$conf.lower))), + length(which(is.na(RMS(exp1, obs1, dat_dim = 'dataset', comp_dim = 'lat', limits = c(1, 2))$conf.lower))), 21 ) ) suppressWarnings( expect_equal( - min(RMS(exp1, obs1, conf.lev = 0.99)$conf.upper, na.rm = TRUE), + min(RMS(exp1, obs1, dat_dim = 'dataset', alpha = 0.01)$conf.upper, na.rm = TRUE), 1.406368, tolerance = 0.0001 ) ) suppressWarnings( expect_equal( - length(RMS(exp1, obs1, conf = FALSE)), + length(RMS(exp1, obs1, dat_dim = 'dataset', conf = FALSE)), 1 ) ) @@ -188,7 +184,7 @@ test_that("3. Output checks: dat2", { expect_equal( dim(RMS(exp2, obs2)$rms), - c(nexp = 1, nobs = 1) + NULL ) expect_equal( @@ -202,23 +198,62 @@ test_that("3. Output checks: dat2", { test_that("4. Output checks: dat3", { expect_equal( - dim(RMS(exp3, obs3, dat_dim = NULL)$rms), - c(ftime = 2, lon = 1, lat = 4) + dim(RMS(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$rms), + c(ftime = 2) ) expect_equal( - as.vector(RMS(exp3, obs3, dat_dim = NULL)$rms), - c(1.6458118, 0.8860392, 0.8261295, 1.1681939, 2.1693538, 1.3064454, 0.5384229, 1.1215333), + as.vector(RMS(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$rms), + c(0.6191331, 0.7133894), tolerance = 0.00001 ) expect_equal( - dim(RMS(exp3, obs3, dat_dim = NULL, conf = FALSE)$rms), - c(ftime = 2, lon = 1, lat = 4) + names(RMS(exp3, obs3, dat_dim = NULL, memb_dim = 'member', conf = FALSE)), + c("rms") + ) + expect_equal( + names(RMS(exp3, obs3, dat_dim = NULL, memb_dim = 'member', conf = TRUE)), + c('rms', 'conf.lower', 'conf.upper') + ) + expect_equal( + c(RMS(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$conf.lower), + c(0.4147271, 0.4778648), + tolerance = 0.0001 + ) + expect_equal( + c(RMS(exp3, obs3, dat_dim = NULL, memb_dim = 'member')$conf.upper), + c(1.989109, 2.291930), + tolerance = 0.0001 + ) + +}) + +############################################## + +test_that("5. Output checks: dat4", { + + expect_equal( + dim(RMS(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$rms), + c(nexp = 2, nobs = 1, time = 1) + ) + expect_equal( + c(RMS(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$rms), + c(0.6775320, 0.8954404), + tolerance = 0.0001 ) expect_equal( - dim(RMS(exp4, obs4, time_dim = 'sdates', dat_dim = NULL, conf = TRUE)$rms), - c(ftimes = 2, lon = 1, lat = 1) + c(RMS(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$conf.lower), + c(0.4538456, 0.5998118), + tolerance = 0.0001 ) + expect_equal( + c(RMS(exp4, obs4, dat_dim = 'dataset', memb_dim = 'member')$conf.upper), + c(2.176729, 2.876811), + tolerance = 0.0001 + ) + + }) -############################################## \ No newline at end of file +############################################## + diff --git a/tests/testthat/test-RMSSS.R b/tests/testthat/test-RMSSS.R index fa019e60ab94ba053a54ecad7ce5289297e10ad2..a364b40c50cd917002560c6e0b0613e7e3091a4a 100644 --- a/tests/testthat/test-RMSSS.R +++ b/tests/testthat/test-RMSSS.R @@ -1,5 +1,3 @@ -context("s2dv::RMSSS tests") - ############################################## # case 1 set.seed(1) @@ -13,9 +11,11 @@ context("s2dv::RMSSS tests") # case 2 set.seed(3) - exp2 <- array(rnorm(120), dim = c(time = 10, dat = 1, lon = 3, lat = 2, dataset = 2)) + exp2 <- array(rnorm(30), dim = c(dataset = 3, sdate = 5, member = 3)) set.seed(4) - obs2 <- array(rnorm(60), dim = c(dat = 1, time = 10, dataset = 1, lat = 2, lon = 3)) + obs2 <- array(rnorm(20), dim = c(dataset = 2, sdate = 5, member = 2)) + set.seed(5) + ref2 <- array(rnorm(15), dim = c(sdate = 5, member = 3)) # case 3: vector set.seed(5) @@ -23,20 +23,6 @@ context("s2dv::RMSSS tests") set.seed(6) obs3 <- rnorm(10) - # case 4 - set.seed(7) - exp4 <- array(rnorm(60), dim = c(sdate = 10, lon = 3, lat = 2)) - set.seed(8) - obs4 <- array(exp4 + rnorm(60) / 2, dim = dim(exp4)) - - # case 5: memb_dim - set.seed(1) - exp5 <- array(rnorm(45), dim = c(sdate = 3, dataset = 5, member = 3)) - set.seed(2) - obs5 <- array(rnorm(3), dim = c(sdate = 3, dataset = 1, member = 1)) - set.seed(3) - ref5 <- array(rnorm(6), dim = c(sdate = 3, member = 2)) - ############################################## @@ -60,8 +46,9 @@ test_that("1. Input checks", { "Parameter 'exp' and 'obs' must have dimension names." ) expect_error( - RMSSS(array(1:10, dim = c(a = 3, c = 5)), array(1:4, dim = c(a = 3, b = 5)), time_dim = 'a', dat_dim = NULL), - "Parameter 'exp' and 'obs' must have same dimension name" + RMSSS(array(1:20, dim = c(dat = 2, sdate = 5, member = 2)), + array(1:10, dim = c(dat = 2, sdate = 5, time = 1))), + "Parameter 'exp' and 'obs' must have the same dimension names." ) ## time_dim expect_error( @@ -104,13 +91,13 @@ test_that("1. Input checks", { ) ## exp and obs (2) expect_error( - RMSSS(exp = array(1:10, dim = c(sdate = 1, dataset = 5, a = 1)), - obs = array(1:4, dim = c(a = 1, sdate = 2, dataset = 2))), - "Parameter 'exp' and 'obs' must have same length of all dimension except 'memb_dim' and 'dat_dim'." + RMSSS(exp = array(1:10, dim = c(dataset = 1, member = 5, sdate = 1)), + obs = array(1:4, dim = c(dataset = 2, member = 2, sdate = 2))), + "Parameter 'exp' and 'obs' must have same length of all dimensions except 'dat_dim' and 'memb_dim'." ) expect_error( RMSSS(exp = array(1:10, dim = c(sdate = 1, dataset = 5, a = 1)), - obs = array(1:4, dim = c(a = 1, sdate = 1, dataset = 2))), + obs = array(1:4, dim = c(a = 1, sdate = 1, dataset = 2)), dat_dim = "dataset"), "The length of time_dim must be more than 2 to compute RMSSS." ) }) @@ -128,8 +115,8 @@ test_that("2. Output checks: case 1", { c(nexp = 5, nobs = 2) ) expect_equal( - mean(res1_1$rmsss), - -0.5449538, + c(res1_1$rmsss)[3:8], + c(0.01693900, -0.02730428, -0.34167869, 0.01588531, -1.55611403, -0.63596896), tolerance = 0.00001 ) expect_equal( @@ -138,20 +125,27 @@ test_that("2. Output checks: case 1", { tolerance = 0.001 ) - exp1_2 <- exp1 - exp1_2[2:4] <- NA - obs1_2 <- obs1 - obs1_2[1:2] <- NA - res1_2 <- RMSSS(exp1_2, obs1_2, dat_dim = 'dataset', pval = TRUE) + exp1_2 <- exp1; exp1_2[2:4] <- NA + obs1_2 <- obs1; obs1_2[1:2] <- NA + res1_2 <- RMSSS(exp1_2, obs1_2, dat_dim = 'dataset', pval = T, sign = T) expect_equal( - length(res1_2$rmsss[which(is.na(res1_2$rmsss))]), - 7 + names(res1_2), + c("rmsss", "p.val", "sign") ) expect_equal( - range(res1_2$p.val, na.rm = T), - c(0.7159769, 0.8167073), - tolerance = 0.00001 + c(res1_2$rmsss), + c(rep(NA, 7), -0.6359690, -0.5877153, -1.1108657), + tolerance = 0.0001 + ) + expect_equal( + c(res1_2$p.val), + c(rep(NA, 7), 0.7279944, 0.7159769, 0.8167073), + tolerance = 0.0001 + ) + expect_equal( + c(res1_2$sign), + c(rep(NA, 7), rep(FALSE, 3)) ) #ref @@ -200,113 +194,89 @@ test_that("2. Output checks: case 1", { # Random Walk suppressWarnings({ - res1_5 <- RMSSS(exp1, obs1, ref = ref1_1, dat_dim = 'dataset', sig_method = "Random Walk", pval = F, sign = T) + res1_5 <- RMSSS(exp1, obs1, ref = ref1_1, dat_dim = 'dataset', sig_method = "Random Walk", pval = F, sign = T, sig_method.type = 'two.sided.approx') }) expect_equal( as.vector(res1_5$sign), rep(F, 10) ) suppressWarnings({ - res1_6 <- RMSSS(exp1, obs1, ref = NULL, dat_dim = 'dataset', sig_method = "Random Walk", pval = F, sign = T) + res1_6 <- RMSSS(exp1, obs1, ref = NULL, dat_dim = 'dataset', sig_method = "Random Walk", pval = F, sign = T, sig_method.type = 'two.sided.approx') }) expect_equal( as.vector(res1_6$sign), rep(F, 10) ) -}) - - -############################################## -test_that("3. Output checks: case 2", { - + suppressWarnings({ + res1_7 <- RMSSS(exp1, obs1, ref = NULL, dat_dim = 'dataset', sig_method = "Random Walk", pval = T, sign = T, sig_method.type = 'two.sided', alpha = 0.4) + }) expect_equal( - dim(RMSSS(exp2, obs2, time_dim = 'time')$rmsss), - c(nexp = 2, nobs = 1, dat = 1, lon = 3, lat = 2) + names(res1_7), + c('rmsss', 'p.val', 'sign') ) expect_equal( - mean(RMSSS(exp2, obs2, time_dim = 'time')$rmsss), - -0.3912208, - tolerance = 0.00001 + res1_7$rmsss, + res1_6$rmsss ) expect_equal( - range(RMSSS(exp2, obs2, time_dim = 'time')$p.val), - c(0.2627770, 0.9868412), - tolerance = 0.00001 - ) - -}) - -############################################## - -test_that("4. Output checks: case 3", { - - expect_equal( - dim(RMSSS(exp3, obs3)$rmsss), - c(nexp = 1, nobs = 1) + c(res1_7$p[, 2]), + c(1, 0.25, 0.25, 1, 1), + tolerance = 0.0001 ) expect_equal( - as.vector(RMSSS(exp3, obs3)$rmsss), - -0.6289915, - tolerance = 0.00001 + c(res1_7$sign[, 2]), + c(F, T, T, F, F) ) - }) -############################################## -test_that("5. Output checks: case 4", { +############################################## +test_that("3. Output checks: case 2", { + res1 <- RMSSS(exp2, obs2, ref2, dat_dim = "dataset", memb_dim = 'member', sign = T) expect_equal( - dim(RMSSS(exp4, obs4, dat_dim = NULL)$rmsss), - c(lon = 3, lat = 2) - ) - expect_equal( - dim(RMSSS(exp4, obs4, dat_dim = NULL)$p.val), - c(lon = 3, lat = 2) - ) - expect_equal( - as.vector(RMSSS(exp4, obs4, dat_dim = NULL)$rmsss), - c(0.5393823, 0.6818405, 0.4953423, 0.4093817, 0.5972085, 0.5861135), - tolerance = 0.00001 - ) - expect_equal( - as.vector(RMSSS(exp4, obs4, dat_dim = NULL)$p.val), - c(0.015203983, 0.001091360, 0.026987112, 0.066279877, 0.006161059, 0.007437649), - tolerance = 0.00001 + dim(res1$rmsss), + c(nexp = 3, nobs = 2) ) expect_equal( - names(RMSSS(exp4, obs4, dat_dim = NULL)), - c('rmsss', 'p.val') + dim(res1$sign), + c(nexp = 3, nobs = 2) ) - expect_equal( - names(RMSSS(exp4, obs4, dat_dim = NULL, sign = T, pval = F)), - c('rmsss', 'sign') + expect_equal( + dim(res1$p), + c(nexp = 3, nobs = 2) ) expect_equal( - names(RMSSS(exp4, obs4, dat_dim = NULL, sign = T, pval = T)), - c('rmsss', 'p.val', 'sign') + c(res1$rmsss), + c(-0.08699446, 0.30703938, -0.28888291, -0.04284967, 0.39097270, -0.39792484), + tolerance = 0.0001 ) expect_equal( - as.vector(RMSSS(exp4, obs4, dat_dim = NULL, sign = T, pval = F)$sign), - c(T, T, T, F, T, T) + c(res1$p), + c(0.5622736, 0.2474466, 0.6825138, 0.5314309, 0.1799964, 0.7338230), + tolerance = 0.0001 ) - expect_equal( - as.vector(RMSSS(exp4, obs4, dat_dim = NULL, sign = T, pval = F, alpha = 0.01)$sign), - c(F, T, F, F, T, T) + + + res2 <- RMSSS(apply(exp2, 1:2, mean), apply(obs2, 1:2, mean), array(apply(ref2, 1, mean), dim = c(sdate = 5)),dat_dim = "dataset", sign = T) + expect_equal( + res1, res2 ) - + }) ############################################## -test_that("6. Output checks: case 5", { - res5_1 <- RMSSS(exp5, obs5, ref = ref5, dat_dim = 'dataset', memb_dim = 'member') - res5_2 <- RMSSS(s2dv::MeanDims(exp5, 'member'), s2dv::MeanDims(obs5, 'member'), - ref = s2dv::MeanDims(ref5, 'member'), dat_dim = 'dataset') + +test_that("4. Output checks: case 3", { + expect_equal( - res5_1, - res5_2 + dim(RMSSS(exp3, obs3)$rmsss), + NULL + ) + expect_equal( + as.vector(RMSSS(exp3, obs3)$rmsss), + -0.6289915, + tolerance = 0.00001 ) - }) - diff --git a/tests/testthat/test-ROCSS.R b/tests/testthat/test-ROCSS.R index a95d0ba1b80f637556f4d0b014880d8d930a0b84..5b35246a5e976322bc5e6717dc76fd5ab48cab13 100644 --- a/tests/testthat/test-ROCSS.R +++ b/tests/testthat/test-ROCSS.R @@ -1,5 +1,3 @@ -context("s2dv::ROCSS tests") - ############################################## # dat1 @@ -10,6 +8,11 @@ obs1 <- array(rnorm(20), dim = c(sdate = 10, lat = 2)) set.seed(3) ref1 <- array(rnorm(40), dim = c(member = 2, sdate = 10, lat = 2)) +# dat1_2 +exp1_2 <- GetProbs(exp1, memb_dim = 'member') +obs1_2 <- GetProbs(obs1, memb_dim = NULL) +ref1_2 <- GetProbs(ref1, memb_dim = 'member') + # dat2 set.seed(1) exp2 <- array(rnorm(30), dim = c(member = 3, sdate = 10)) @@ -151,6 +154,16 @@ c(0.5238095, 0.5357143), tolerance = 0.0001 ) +# dat1_2 +expect_equal( + ROCSS(exp1, obs1), + ROCSS(exp1_2, obs1_2, memb_dim = NULL, cat_dim = 'bin') +) +expect_equal( + ROCSS(exp1, obs1, ref1), + ROCSS(exp1_2, obs1_2, ref1_2, memb_dim = NULL, cat_dim = 'bin') +) + }) ############################################## diff --git a/tests/testthat/test-RPS.R b/tests/testthat/test-RPS.R index 51ba992d93b3e1ee4828bb095bbe363126a8b716..d87da87eb4785fa64604671f36d129f676e82823 100644 --- a/tests/testthat/test-RPS.R +++ b/tests/testthat/test-RPS.R @@ -1,5 +1,3 @@ -context("s2dv::RPS tests") - ############################################## # dat1 @@ -10,6 +8,10 @@ obs1 <- array(rnorm(20), dim = c(sdate = 10, lat = 2)) set.seed(3) weights1 <- array(abs(rnorm(30)), dim = c(member = 3, sdate = 10)) +# dat1_2: probabilites +exp1_2 <- GetProbs(exp1, memb_dim = 'member') +obs1_2 <- GetProbs(obs1, memb_dim = NULL) + # dat2 set.seed(1) exp2 <- array(rnorm(20), dim = c(member = 2, sdate = 10)) @@ -26,10 +28,18 @@ obs2_1 <- array(rnorm(10), dim = c(member = 1, sdate = 10)) set.seed(1) exp3 <- array(rnorm(40), dim = c(member = 2, sdate = 10, dataset = 2)) set.seed(2) -obs3 <- array(rnorm(30), dim = c(member = 1, sdate = 10, dataset = 3)) +obs3 <- array(rnorm(60), dim = c(member = 3, sdate = 10, dataset = 3)) set.seed(3) weights3 <- array(abs(rnorm(30)), dim = c(member = 2, sdate = 10, dataset = 2)) +# dat4 +exp4 <- exp3 +obs4 <- obs3 +obs4[2, 1, 1] <- NA +obs4[3, 2, 1] <- NA +exp4[1, 1, 2] <- NA +weights4 <- weights3 + ############################################## test_that("1. Input checks", { # exp and obs (1) @@ -60,6 +70,12 @@ test_that("1. Input checks", { RPS(exp1, obs1, memb_dim = 'memb'), "Parameter 'memb_dim' is not found in 'exp' dimension." ) + # cat_dim + expect_error( + RPS(exp1_2, obs1_2, memb_dim = NULL), + "Only one of the two parameters 'memb_dim' and 'cat_dim' can have value." + ) + # exp, ref, and obs (2) expect_error( RPS(exp1, array(1:9, dim = c(sdate = 9))), @@ -160,6 +176,10 @@ test_that("2. Output checks: dat1", { c(0.3559286, 0.6032109), tolerance = 0.0001 ) + expect_equal( + as.vector(RPS(exp1, obs1)), + as.vector(RPS(exp1_2, obs1_2, memb_dim = NULL, cat_dim = 'bin')) + ) }) @@ -187,7 +207,7 @@ test_that("3. Output checks: dat2", { }) ############################################## -test_that("3. Output checks: dat3", { +test_that("4. Output checks: dat3", { expect_equal( dim(RPS(exp3, obs3, dat_dim = 'dataset')), @@ -195,12 +215,12 @@ test_that("3. Output checks: dat3", { ) expect_equal( as.vector(RPS(exp3, obs3, dat_dim = 'dataset')), - c(0.75, 0.65, 0.75, 0.85, 0.75, 0.55), + c(0.3388889, 0.3388889, 0.2944444, 0.3277778, 0.3388889, 0.3388889), tolerance = 0.0001 ) expect_equal( as.vector(RPS(exp3, obs3, dat_dim = 'dataset', indices_for_clim = 2:5, prob_thresholds = seq(0.1, 0.9, 0.1))), - c(2.75, 2.45, 2.55, 2.55, 2.65, 2.15), + c(1.394444, 1.394444, 1.250000, 1.316667, 1.394444, 1.394444), tolerance = 0.0001 ) # weights @@ -210,8 +230,77 @@ test_that("3. Output checks: dat3", { ) expect_equal( as.vector(RPS(exp3, obs3, weights = weights3, dat_dim = 'dataset')), - c(0.7365024, 0.8316852, 0.6778686, 1.0256509, 0.8406320, 0.6385640), + c(0.3255765, 0.4290578, 0.2917297, 0.3554689, 0.3255765, 0.4290578), tolerance = 0.0001 ) }) + +############################################## +test_that("5. Output checks: dat4", { + + res1 <- RPS(exp4, obs4, dat_dim = 'dataset') + + expect_equal( + which(is.na(res1)), + c(1, 2, 4, 6) + ) + expect_equal( + res1[1, 2:3], + RPS(exp3, obs3, dat_dim = 'dataset')[1, 2:3] + ) + + res2 <- RPS(exp4, obs4, dat_dim = 'dataset', na.rm = T) + + expect_equal( + res2, + RPS(exp4, obs4, dat_dim = 'dataset', na.rm = 0) + ) + expect_equal( + which(is.na(res2)), + integer(0) + ) + expect_equal( + c(res2), + c(0.3472222, 0.3680556, 0.2944444, 0.3333333, 0.3388889, 0.2222222), + tolerance = 0.0001 + ) + + expect_equal( + RPS(exp4, obs4, dat_dim = 'dataset', na.rm = 0.8), + res2 + ) + expect_equal( + RPS(exp4, obs4, dat_dim = 'dataset', na.rm = 0.95), + res1 + ) + expect_equal( + which(is.na(RPS(exp4, obs4, dat_dim = 'dataset', na.rm = 0.9))), + c(1, 2) + ) + + # weights + res3 <- RPS(exp4, obs4, weights = weights4, dat_dim = 'dataset') + expect_equal( + which(is.na(res3)), + c(1, 2, 4, 6) + ) + expect_equal( + res3[1, 2:3], + RPS(exp3, obs3, dat_dim = 'dataset',weights = weights3)[1, 2:3] + ) + + res4 <- RPS(exp4, obs4, weights = weights4, dat_dim = 'dataset', na.rm = 0) + expect_equal( + c(res4), + c(0.3865228, 0.4885273, 0.2917297, 0.4143631, 0.3255765, 0.4028817), + tolerance = 0.0001 + ) + + expect_equal( + which(is.na(RPS(exp4, obs4, weights = weights4, dat_dim = 'dataset', na.rm = 0.9))), + c(1, 2) + ) + +}) + diff --git a/tests/testthat/test-RPSS.R b/tests/testthat/test-RPSS.R index 36efee8df0c317e06a46439643da8dc81a0561ee..5e90e954ff48770d4f1fd848ae27b082e196093c 100644 --- a/tests/testthat/test-RPSS.R +++ b/tests/testthat/test-RPSS.R @@ -1,5 +1,3 @@ -context("s2dv::RPSS tests") - ############################################## # dat1 @@ -12,6 +10,16 @@ ref1 <- array(rnorm(40), dim = c(member = 2, sdate = 10, lat = 2)) set.seed(4) weights1 <- array(abs(rnorm(30)), dim = c(member = 3, sdate = 10)) +# dat1_2 +exp1_2 <- GetProbs(exp1, memb_dim = 'member') +obs1_2 <- GetProbs(obs1, memb_dim = NULL) +ref1_2 <- GetProbs(ref1, memb_dim = 'member') + +# dat1_3: NAs +exp1_3 <- exp1; exp1_3[1, 2, 1] <- NA +obs1_3 <- obs1; obs1_3[2, 1] <- NA +ref1_3 <- ref1; ref1_3[1, 3, 1] <- NA + # dat2 set.seed(1) exp2 <- array(rnorm(20), dim = c(member = 2, sdate = 10)) @@ -46,6 +54,12 @@ weights_exp4 <- array(abs(rnorm(60)), dim = c(member = 2, sdate = 10, dataset = set.seed(5) weights_ref4 <- array(abs(rnorm(20)), dim = c(member = 2, sdate = 10)) +# dat4_2: NAs +exp4_2 <- exp4; exp4_2[1, 2, 1, 1] <- NA +obs4_2 <- obs4; obs4_2[1, 1:4, 1, 1] <- NA +ref4_2 <- ref4 + + ############################################## test_that("1. Input checks", { # exp and obs (1) @@ -185,7 +199,7 @@ test_that("2. Output checks: dat1", { ) expect_equal( as.vector(RPSS(exp1, obs1)$sign), - c(FALSE, FALSE), + c(FALSE, FALSE) ) expect_equal( as.vector(RPSS(exp1, obs1, Fair = T)$rpss), @@ -255,6 +269,65 @@ test_that("2. Output checks: dat1", { tolerance = 0.0001 ) + # dat1_2 + expect_equal( + RPSS(exp1, obs1), + RPSS(exp1_2, obs1_2, memb_dim = NULL, cat_dim = 'bin') + ) + expect_equal( + RPSS(exp1, obs1, ref1), + RPSS(exp1_2, obs1_2, ref1_2, memb_dim = NULL, cat_dim = 'bin') + ) + + # dat1_3 + expect_equal( + as.vector(RPSS(exp1_3, obs1_3)$rpss), + c(NA, -0.05263158), + tolerance = 0.0001 + ) + expect_equal( + as.vector(RPSS(exp1_3, obs1_3)$sign), + c(NA, FALSE) + ) + expect_equal( + as.vector(RPSS(exp1_3, obs1_3, na.rm = T)$rpss), + c(0.16666667, -0.05263158), + tolerance = 0.0001 + ) + expect_equal( + as.vector(RPSS(exp1_3, obs1_3, na.rm = T)$sign), + c(F, F) + ) + expect_equal( + as.vector(RPSS(exp1_3, obs1_3, na.rm = 0.9)$sign), + c(F, F) + ) + + # sig_method.type + expect_equal( + c(RPSS(exp1, obs1, alpha = 0.05, sig_method.type = "two.sided")$sign), + c(F, F) + ) + expect_equal( + c(RPSS(exp1 + 1:60, obs = obs1 + 1:20, alpha = 0.025, sig_method.type = "two.sided")$sign), + c(T, T) + ) + expect_equal( + c(RPSS(exp1 + 1:60, obs = obs1 + 1:20, alpha = 0.01, sig_method.type = "two.sided")$sign), + c(F, F) + ) + expect_equal( + c(RPSS(exp1, obs1, alpha = 0.01, sig_method.type = "greater")$sign), + c(F, F) + ) + expect_equal( + c(RPSS(exp1, obs1, alpha = 0.4, sig_method.type = "greater")$sign), + c(F, F) + ) + expect_equal( + c(RPSS(exp1, obs1, alpha = 0.4, sig_method.type = "less")$sign), + c(T, T) + ) }) @@ -293,7 +366,7 @@ test_that("3. Output checks: dat2", { ) expect_equal( as.vector(RPSS(exp2, obs2)$sign), - FALSE, + FALSE ) expect_equal( as.vector(RPSS(exp2, obs2, Fair = T)$rpss), @@ -363,7 +436,7 @@ test_that("4. Output checks: dat3", { ) expect_equal( as.vector(RPSS(exp3, obs3, dat_dim = 'dataset')$sign)[1:3], - c(FALSE, FALSE, TRUE), + c(FALSE, FALSE, TRUE) ) expect_equal( mean(RPSS(exp3, obs3, dat_dim = 'dataset', weights_exp = weights3, Fair = T)$rpss), @@ -393,6 +466,21 @@ test_that("4. Output checks: dat3", { RPSS(exp3, obs3, dat_dim = 'dataset')$rpss[1], RPSS(exp2, obs2)$rpss ) + + # sig_method.type + expect_equal( + c(RPSS(exp3, obs3, dat_dim = 'dataset', alpha = 0.05, sig_method.type = "two.sided")$sign), + c(F,F,T,F,F,F) + ) + expect_equal( + c(RPSS(exp3, obs3, dat_dim = 'dataset', alpha = 0.01, sig_method.type = "two.sided")$sign), + rep(F, 6) + ) + expect_equal( + c(RPSS(exp3, obs3, dat_dim = 'dataset', alpha = 0.05, sig_method.type = "greater")$sign), + rep(F, 6) + ) + }) ############################################## @@ -421,4 +509,38 @@ test_that("5. Output checks: dat4", { RPSS(exp3, obs3, weights_exp = weights3, dat_dim = 'dataset')$rpss ) + # dat4_2: NAs + expect_equal( + as.vector(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset')$rpss[, , 1]), + c(NA, NA, NA, NA, c(-0.3076923, 0.1538462)), + tolerance = 0.0001 + ) + expect_equal( + as.vector(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset')$rpss[, , 2]), + c(0, 0.1333333, -0.4, 0.1176471, 0, 0.3529412), + tolerance = 0.0001 + ) + expect_equal( + which(is.na(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset')$rpss)), + which(is.na(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset')$sign)) + ) + expect_equal( + as.vector(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset', na.rm = 0.9)$rpss[, , 1]), + c(NA, NA, NA, -0.3333333, -0.3076923, 0.1538462), + tolerance = 0.0001 + ) + expect_equal( + which(is.na(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset', na.rm = 0.9)$rpss)), + which(is.na(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset', na.rm = 0.9)$sign)) + ) + expect_equal( + as.vector(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset', na.rm = 0.6)$rpss[, , 1]), + c(0.25, 0.1666667, -0.1666667, -0.3333333, -0.3076923, 0.1538462), + tolerance = 0.0001 + ) + expect_equal( + which(is.na(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset', na.rm = 0.6)$rpss)), + which(is.na(RPSS(exp4_2, obs4_2, ref4_2, dat_dim = 'dataset', na.rm = 0.6)$sign)) + ) + }) diff --git a/tests/testthat/test-RandomWalkTest.R b/tests/testthat/test-RandomWalkTest.R index a0462c890c4052fada9b9516afd636e94f0baf3c..7ef24b5a014e63beca310092346b7faf3748f90c 100644 --- a/tests/testthat/test-RandomWalkTest.R +++ b/tests/testthat/test-RandomWalkTest.R @@ -1,5 +1,3 @@ -context("s2dv::RandomWalkTest tests") - ############################################## #dat1 set.seed(1) diff --git a/tests/testthat/test-RatioPredictableComponents.R b/tests/testthat/test-RatioPredictableComponents.R index b6a080835fa7fbfd2ccd51c756c2c1b9124cd1fb..7edaf2012cafe12ffdd35bff9126e4aa18c50b82 100644 --- a/tests/testthat/test-RatioPredictableComponents.R +++ b/tests/testthat/test-RatioPredictableComponents.R @@ -1,5 +1,3 @@ -context("RatioPredictableComponents test") - ############################################## # dat1 set.seed(1) @@ -37,8 +35,8 @@ test_that("1. Input checks", { "'exp' must have 'time_dim' dimension." ) expect_error( - RatioPredictableComponents(exp1, obs1, member_dim = 'ens'), - "'exp' must have 'member_dim' dimension." + RatioPredictableComponents(exp1, obs1, memb_dim = 'ens'), + "'exp' must have 'memb_dim' dimension." ) expect_error( RatioPredictableComponents(exp1, array(rnorm(6), dim = c(sdate = 3, time = 2))), diff --git a/tests/testthat/test-RatioRMS.R b/tests/testthat/test-RatioRMS.R index 11df46ccf04d14e305059163e5c5d670906e57bc..35bb651ffcabaa5b73546f7715013fe4308ec72b 100644 --- a/tests/testthat/test-RatioRMS.R +++ b/tests/testthat/test-RatioRMS.R @@ -1,5 +1,3 @@ -context("s2dv::RatioRMS tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-RatioSDRMS.R b/tests/testthat/test-RatioSDRMS.R index 5dbc171337d4bec4fe9db6e2c491ec527511efc3..53be127b509d90e8197848718e45dc9da6800067 100644 --- a/tests/testthat/test-RatioSDRMS.R +++ b/tests/testthat/test-RatioSDRMS.R @@ -1,11 +1,11 @@ -context("s2dv::RatioSDRMS tests") - ############################################## # dat1 set.seed(1) exp1 <- array(rnorm(40), dim = c(dataset = 2, member = 2, sdate = 5, ftime = 2)) set.seed(2) obs1 <- array(rnorm(10), dim = c(dataset = 1, member = 1, sdate = 5, ftime = 2)) + obs1_2 <- obs1 + dim(obs1_2) <- dim(obs1_2)[-2] # dat2 exp2 <- exp1 @@ -39,10 +39,6 @@ test_that("1. Input checks", { "Parameter 'exp' and 'obs' must have dimension names." ) expect_error( - RatioSDRMS(array(1:10, dim = c(a = 2, c = 5)), array(1:4, dim = c(a = 2, b = 2))), - "Parameter 'exp' and 'obs' must have the same dimension names." - ) - expect_error( RatioSDRMS(exp1, obs1, dat_dim = 1), "Parameter 'dat_dim' must be a character string." ) @@ -51,27 +47,27 @@ test_that("1. Input checks", { "Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension." ) expect_error( - RatioSDRMS(exp1, obs1, memb_dim = 1), + RatioSDRMS(exp1, obs1, memb_dim = 1, dat_dim = 'dataset'), "Parameter 'memb_dim' must be a character string." ) expect_error( - RatioSDRMS(exp1, obs1, memb_dim = 'a'), - "Parameter 'memb_dim' is not found in 'exp' or 'obs' dimension." + RatioSDRMS(exp1, obs1, memb_dim = 'a', dat_dim = 'dataset'), + "Parameter 'memb_dim' is not found in 'exp' nor 'obs' dimension. Set it as NULL if there is no member dimension." ) expect_error( - RatioSDRMS(exp1, obs1, time_dim = c('sdate', 'a')), + RatioSDRMS(exp1, obs1, time_dim = c('sdate', 'a'), dat_dim = 'dataset'), "Parameter 'time_dim' must be a character string." ) expect_error( - RatioSDRMS(exp1, obs1, time_dim = 'a'), + RatioSDRMS(exp1, obs1, time_dim = 'a', dat_dim = 'dataset'), "Parameter 'time_dim' is not found in 'exp' or 'obs' dimension." ) expect_error( - RatioSDRMS(exp1, obs1, pval = 1), + RatioSDRMS(exp1, obs1, pval = 1, dat_dim = 'dataset'), "Parameter 'pval' must be one logical value." ) expect_error( - RatioSDRMS(exp1, obs1, ncores = 1.5), + RatioSDRMS(exp1, obs1, ncores = 1.5, dat_dim = 'dataset'), "Parameter 'ncores' must be a positive integer." ) expect_error( @@ -85,52 +81,56 @@ test_that("1. Input checks", { ############################################## test_that("2. Output checks: dat1", { expect_equal( -names(RatioSDRMS(exp1, obs1)), +names(RatioSDRMS(exp1, obs1, dat_dim = 'dataset')), c('ratio', 'p.val') ) expect_equal( -dim(RatioSDRMS(exp1, obs1)$ratio), +dim(RatioSDRMS(exp1, obs1, dat_dim = 'dataset')$ratio), c(nexp = 2, nobs = 1, ftime = 2) ) expect_equal( -dim(RatioSDRMS(exp1, obs1)$p.val), +dim(RatioSDRMS(exp1, obs1, dat_dim = 'dataset')$p.val), c(nexp = 2, nobs = 1, ftime = 2) ) expect_equal( -as.vector(RatioSDRMS(exp1, obs1)$ratio), +as.vector(RatioSDRMS(exp1, obs1, dat_dim = 'dataset')$ratio), c(0.7198164, 0.6525068, 0.6218262, 0.6101527), tolerance = 0.0001 ) expect_equal( -as.vector(RatioSDRMS(exp1, obs1)$p.val), +as.vector(RatioSDRMS(exp1, obs1, dat_dim = 'dataset')$p.val), c(0.8464094, 0.8959219, 0.9155102, 0.9224119), tolerance = 0.0001 ) expect_equal( -names(RatioSDRMS(exp1, obs1, pval = F)), +names(RatioSDRMS(exp1, obs1, pval = F, dat_dim = 'dataset')), c('ratio') ) expect_equal( -as.vector(RatioSDRMS(exp1, obs1)$ratio), -as.vector(RatioSDRMS(exp1, obs1, pval = F)$ratio) +as.vector(RatioSDRMS(exp1, obs1, dat_dim = 'dataset')$ratio), +as.vector(RatioSDRMS(exp1, obs1, pval = F, dat_dim = 'dataset')$ratio) ) +expect_equal( +RatioSDRMS(exp1, obs1, dat_dim = 'dataset'), +RatioSDRMS(exp1, obs1_2, dat_dim = 'dataset') +) }) ############################################## test_that("3. Output checks: dat2", { expect_equal( -dim(RatioSDRMS(exp2, obs2)$ratio), +dim(RatioSDRMS(exp2, obs2, dat_dim = 'dataset')$ratio), c(nexp = 2, nobs = 1, ftime = 2) ) expect_equal( -as.vector(RatioSDRMS(exp2, obs2)$ratio), +as.vector(RatioSDRMS(exp2, obs2, dat_dim = 'dataset')$ratio), c(0.7635267, 0.6525068, 0.6218262, 0.6101527), tolerance = 0.0001 ) expect_equal( -as.vector(RatioSDRMS(exp1, obs1)$p.val), -c(0.8464094, 0.8959219, 0.9155102, 0.9224119), +as.vector(RatioSDRMS(exp2, obs2, dat_dim = 'dataset')$p.val), +c(0.7970868, 0.8959219, 0.9155102, 0.9224119), tolerance = 0.0001 ) diff --git a/tests/testthat/test-Regression.R b/tests/testthat/test-Regression.R index a29076f5ed9be19cb0cb7b776fc9d7c2344ac349..a94a864b0b3bdc41ab151ff3b14acbf8e8f8160b 100644 --- a/tests/testthat/test-Regression.R +++ b/tests/testthat/test-Regression.R @@ -1,5 +1,3 @@ -context("s2dv::Regression tests") - ############################################## # dat1 set.seed(1) @@ -83,8 +81,8 @@ test_that("1. Input checks", { "Parameter 'conf' must be one logical value." ) expect_error( - Regression(datay1, datax1, conf.lev = 1.5), - "Parameter 'conf.lev' must be a numeric number between 0 and 1." + Regression(datay1, datax1, alpha = 2), + "Parameter 'alpha' must be a numeric number between 0 and 1." ) expect_error( Regression(datay1, datax1, ncores = T), @@ -127,7 +125,11 @@ test_that("2. Output checks: dat1", { 2 ) expect_equal( - range(Regression(datay1, datax1, conf.lev = 0.99)$conf.low, na.rm = T), + length(Regression(datay1, datax1, pval = F, sign = T)), + 5 + ) + expect_equal( + range(Regression(datay1, datax1, alpha = 0.01)$conf.low, na.rm = T), c(-380.888744, 0.220794), tolerance = 0.001 ) @@ -136,6 +138,10 @@ test_that("2. Output checks: dat1", { 0.005335, tolerance = 0.0001 ) + expect_equal( + c(Regression(datay1, datax1, sign = T, conf = F)$sign[1, 2, ]), + c(TRUE, FALSE, FALSE, FALSE) + ) expect_equal( mean(Regression(datay1, datax1, formula = y~poly(x, 2, raw = T))$p.val, na.rm = TRUE), 0.3407307, diff --git a/tests/testthat/test-Reorder.R b/tests/testthat/test-Reorder.R index b17259e672e21443ab010c03a6a97ded3c8c0505..7401b2a6357bdf53ac43008682700d8e5e65acf9 100644 --- a/tests/testthat/test-Reorder.R +++ b/tests/testthat/test-Reorder.R @@ -1,5 +1,3 @@ -context("s2dv::Reorder tests") - ############################################## # dat1 dat1 <- array(c(1:30), dim = c(dat = 1, sdate = 3, ftime = 2, lon = 5)) diff --git a/tests/testthat/test-ResidualCorr.R b/tests/testthat/test-ResidualCorr.R index be71b4748295f70ddc43f7c3d3bccdf1d1f60100..f4b6a49a9d8c40881b9dca2d2e3bb78138393b08 100644 --- a/tests/testthat/test-ResidualCorr.R +++ b/tests/testthat/test-ResidualCorr.R @@ -1,5 +1,3 @@ -context("s2dv::ResidualCorr tests") - ############################################## # dat1 @@ -75,8 +73,8 @@ test_that("1. Input checks", { ) # alpha expect_error( - ResidualCorr(exp2, obs2, ref2, alpha = 1), - 'Parameter "alpha" must be NULL or a number between 0 and 1.' + ResidualCorr(exp2, obs2, ref2, alpha = 1, sign = T), + 'Parameter "alpha" must be a number between 0 and 1.' ) # handle.na expect_error( @@ -116,7 +114,7 @@ c(0.49695468, 0.05446055, 0.25203961, 0.23522967, 0.16960864, 0.10618145), tolerance = 0.0001 ) expect_equal( -names(ResidualCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05)), +names(ResidualCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, sign = T, pval = F)), c("res.corr", "sign") ) expect_equal( @@ -125,7 +123,7 @@ c(0.002784318, 0.537697647, -0.240071018, 0.258706464, 0.338160748, 0.432107476) tolerance = 0.0001 ) expect_equal( -as.vector(ResidualCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05)$sign), +as.vector(ResidualCorr(exp1, obs1, ref1, memb_dim = 'memb', alpha = 0.05, sign = T)$sign), rep(FALSE, 6) ) expect_equal( diff --git a/tests/testthat/test-SPOD.R b/tests/testthat/test-SPOD.R index 8b56c721dc618409e5c14084262eb71aa07566af..91fe8cc82c678e4b1c6bd5a257f27c36f65c0a9c 100644 --- a/tests/testthat/test-SPOD.R +++ b/tests/testthat/test-SPOD.R @@ -1,5 +1,3 @@ -context("s2dv::SPOD tests") - ############################################## lat <- seq(-90, 90, 10) lon <- seq(0, 360, 10) diff --git a/tests/testthat/test-Season.R b/tests/testthat/test-Season.R index f9bfc43cf17e5564c9ccaba71212af5dc9e03fed..f33a96276f843cc837e9d529875d8fe41e580f21 100644 --- a/tests/testthat/test-Season.R +++ b/tests/testthat/test-Season.R @@ -1,5 +1,3 @@ -context("s2dv::Season tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-SignalNoiseRatio.R b/tests/testthat/test-SignalNoiseRatio.R index 9fc5ce35478b9988d35523c4f19be5562f4185cf..be4749201d7cff48c8689274a4390a215caf6b8f 100644 --- a/tests/testthat/test-SignalNoiseRatio.R +++ b/tests/testthat/test-SignalNoiseRatio.R @@ -1,6 +1,3 @@ - -context("SignalNoiseRatio test") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Smoothing.R b/tests/testthat/test-Smoothing.R index ad3dcb9599a82bc32ffa5e0ddc7caba460adbd15..51a956a256167c49b339b23645e970270c53a9ed 100644 --- a/tests/testthat/test-Smoothing.R +++ b/tests/testthat/test-Smoothing.R @@ -1,5 +1,3 @@ -context("s2dv::Smoothing test") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-Spectrum.R b/tests/testthat/test-Spectrum.R index caf53d395d7a3a3c1d374fa6ab1abbd80decbbc9..721c9b17cad6b5f1ae48e38ca64f7126964aa351 100644 --- a/tests/testthat/test-Spectrum.R +++ b/tests/testthat/test-Spectrum.R @@ -1,5 +1,3 @@ -context("s2dv::Spectrum tests") - ############################################## # dat1 set.seed(1) @@ -38,10 +36,10 @@ test_that("1. Input checks", { Spectrum(dat1, time_dim = 2), "Parameter 'time_dim' must be a character string." ) - # conf.lev + # alpha expect_error( - Spectrum(dat1, conf.lev = -1), - "Parameter 'conf.lev' must be a numeric number between 0 and 1.", + Spectrum(dat1, alpha = -1), + "Parameter 'alpha' must be a numeric number between 0 and 1.", fixed = T ) # ncores diff --git a/tests/testthat/test-Spread.R b/tests/testthat/test-Spread.R index 1d299a6f41149744d600648699f6eea70d718dae..dab8744d9e6970dcc8b1623e7e70770bcc588795 100644 --- a/tests/testthat/test-Spread.R +++ b/tests/testthat/test-Spread.R @@ -1,5 +1,3 @@ -context("s2dv::Spread test") - ############################################## # dat1 set.seed(1) @@ -45,10 +43,10 @@ test_that("1. Input checks", { Spread(dat1, conf = 0.1), "Parameter 'conf' must be one logical value." ) - # conf.lev + # alpha expect_error( - Spread(dat1, conf.lev = c(0.05, 0.95)), - "Parameter 'conf.lev' must be a numeric number between 0 and 1." + Spread(dat1, alpha = c(0.05, 0.95)), + "Parameter 'alpha' must be a numeric number between 0 and 1." ) # ncores expect_error( diff --git a/tests/testthat/test-StatSeasAtlHurr.R b/tests/testthat/test-StatSeasAtlHurr.R index 82ef308e03d777bedeebd7334ca5c42386b0c812..448eadfc52e5b4048f6ca625ea3a2f3d006e9fe9 100644 --- a/tests/testthat/test-StatSeasAtlHurr.R +++ b/tests/testthat/test-StatSeasAtlHurr.R @@ -1,5 +1,3 @@ -context("s2dv::StatSeaAtlHurr tests") - ############################################## # dat1 set.seed(1) diff --git a/tests/testthat/test-TPI.R b/tests/testthat/test-TPI.R index b663c40bd1e36b4792da5adb4fa95e7ea1aae7a8..bef5ef69d988fb199f83accccdf47d3bb38b7225 100644 --- a/tests/testthat/test-TPI.R +++ b/tests/testthat/test-TPI.R @@ -1,5 +1,3 @@ -context("s2dv::TPI tests") - ############################################## lat <- seq(-90, 90, 10) lon <- seq(0, 360, 10) diff --git a/tests/testthat/test-Trend.R b/tests/testthat/test-Trend.R index 07da6ce45c6b687c8e0ef9a7f746be1794d83b50..385534b67e4eaa0231a7edfa87377882ba2b1676 100644 --- a/tests/testthat/test-Trend.R +++ b/tests/testthat/test-Trend.R @@ -1,5 +1,3 @@ -context("s2dv::Trend tests") - ############################################## # dat1 dat1 <- array(c(-5, -7, -10:10, 12, 11, 7, 16), @@ -65,12 +63,12 @@ test_that("1. Input checks", { "Parameter 'conf' must be one logical value." ) expect_error( - Trend(dat1, conf.lev = 3), - "Parameter 'conf.lev' must be a numeric number between 0 and 1." + Trend(dat1, alpha = 3), + "Parameter 'alpha' must be a numeric number between 0 and 1." ) expect_error( - Trend(dat1, conf.lev = TRUE), - "Parameter 'conf.lev' must be a numeric number between 0 and 1." + Trend(dat1, alpha = TRUE), + "Parameter 'alpha' must be a numeric number between 0 and 1." ) expect_error( Trend(dat1, pval = 0.95), @@ -88,7 +86,14 @@ test_that("1. Input checks", { ############################################## test_that("2. Output checks: dat1", { - + expect_equal( + names(Trend(dat1)), + c('trend', 'conf.lower', 'conf.upper', 'p.val', 'detrended') + ) + expect_equal( + names(Trend(dat1, conf = F, sign = T)), + c('trend', 'p.val', 'sign', 'detrended') + ) expect_equal( Trend(dat1)$trend, array(c(-9.7692308, 0.6593407, 0.9615385, 0.7967033), @@ -107,6 +112,10 @@ test_that("2. Output checks: dat1", { dim = c(stats = 1, dat = 1, sdate = 2)), tolerance = 0.0001 ) + expect_equal( + Trend(dat1, sign = T)$sign, + array(c(T, T), dim = c(stats = 1, dat = 1, sdate = 2)) + ) expect_equal( median(Trend(dat1)$detrended, na.rm = TRUE), 0.1153846, diff --git a/tests/testthat/test-UltimateBrier.R b/tests/testthat/test-UltimateBrier.R index 09412f00cf1ce52265328f9da89f2f404081877a..28ccb79c6e3b3c4d363a8b7ad0bb677bfb45723c 100644 --- a/tests/testthat/test-UltimateBrier.R +++ b/tests/testthat/test-UltimateBrier.R @@ -1,5 +1,3 @@ -context("s2dv::UltimateBrier tests") - ############################################## # dat1 set.seed(1) @@ -60,46 +58,46 @@ test_that("1. Input checks", { ) # exp and obs (2) expect_error( - UltimateBrier(exp1, array(1:10, dim = c(dataset = 1, sdate = 6, ftime = 2))), + UltimateBrier(exp1, array(1:10, dim = c(dataset = 1, sdate = 6, ftime = 2)), dat_dim = 'dataset'), paste0("Parameter 'exp' and 'obs' must have the same names and lengths ", "of all the dimensions except 'dat_dim' and 'memb_dim'.") ) expect_error( - UltimateBrier(exp1, array(1:10, dim = c(dataset = 1, sdate = 5, time = 2))), + UltimateBrier(exp1, array(1:10, dim = c(dataset = 1, sdate = 5, time = 2)), dat_dim = 'dataset'), paste0("Parameter 'exp' and 'obs' must have the same names and lengths ", "of all the dimensions except 'dat_dim' and 'memb_dim'.") ) # quantile expect_error( - UltimateBrier(exp1, obs1, quantile = c(0.05, 0.95)), + UltimateBrier(exp1, obs1, quantile = c(0.05, 0.95), dat_dim = 'dataset'), "Parameter 'quantile' must be one logical value." ) expect_error( - UltimateBrier(exp1, obs1, quantile = FALSE, thr = 1:3, type = 'FairEnsembleBS'), + UltimateBrier(exp1, obs1, quantile = FALSE, thr = 1:3, type = 'FairEnsembleBS', dat_dim = 'dataset'), "Parameter 'quantile' must be TRUE if 'type' is 'FairEnsembleBSS' or 'FairEnsembleBS'." ) # thr expect_error( - UltimateBrier(exp1, obs1, thr = TRUE), + UltimateBrier(exp1, obs1, thr = TRUE, dat_dim = 'dataset'), "Parameter 'thr' must be a numeric vector." ) expect_error( - UltimateBrier(exp1, obs1, quantile = TRUE, thr = 1:3), + UltimateBrier(exp1, obs1, quantile = TRUE, thr = 1:3, dat_dim = 'dataset'), "Parameter 'thr' must be between 0 and 1 when quantile is TRUE." ) # type expect_error( - UltimateBrier(exp1, obs1, type = 'UltimateBrier'), + UltimateBrier(exp1, obs1, type = 'UltimateBrier', dat_dim = 'dataset'), "Parameter 'type' must be one of 'BS', 'BSS', 'FairEnsembleBS', 'FairEnsembleBSS', 'FairStartDatesBS' or 'FairStartDatesBSS'." ) # decomposition expect_error( - UltimateBrier(exp1, obs1, decomposition = 1), + UltimateBrier(exp1, obs1, decomposition = 1, dat_dim = 'dataset'), "Parameter 'decomposition' must be one logical value." ) # ncores expect_error( - UltimateBrier(exp1, obs1, ncores = 0), + UltimateBrier(exp1, obs1, ncores = 0, dat_dim = 'dataset'), "Parameter 'ncores' must be a positive integer." ) @@ -111,130 +109,130 @@ test_that("2. Output checks: dat1", { # 'BS' expect_equal( - is.list(UltimateBrier(exp1, obs1)), + is.list(UltimateBrier(exp1, obs1, dat_dim = 'dataset')), TRUE ) expect_equal( - names(UltimateBrier(exp1, obs1)), + names(UltimateBrier(exp1, obs1, dat_dim = 'dataset')), c('bs', 'rel', 'res', 'unc') ) expect_equal( - is.list(UltimateBrier(exp1, obs1, decomposition = FALSE)), + is.list(UltimateBrier(exp1, obs1, decomposition = FALSE, dat_dim = 'dataset')), FALSE ) expect_equal( - dim(UltimateBrier(exp1, obs1, decomposition = FALSE)), + dim(UltimateBrier(exp1, obs1, decomposition = FALSE, dat_dim = 'dataset')), c(nexp = 1, nobs = 1, bin = 3, ftime = 2) ) expect_equal( - dim(UltimateBrier(exp1, obs1, decomposition = FALSE, thr = c(0.25, 0.5, 0.75))), + dim(UltimateBrier(exp1, obs1, dat_dim = 'dataset', decomposition = FALSE, thr = c(0.25, 0.5, 0.75))), c(nexp = 1, nobs = 1, bin = 4, ftime = 2) ) expect_equal( - UltimateBrier(exp1, obs1)$bs, - UltimateBrier(exp1, obs1, decomposition = FALSE) + UltimateBrier(exp1, obs1, dat_dim = 'dataset')$bs, + UltimateBrier(exp1, obs1, decomposition = FALSE, dat_dim = 'dataset') ) expect_equal( - as.vector(UltimateBrier(exp1, obs1)$bs), + as.vector(UltimateBrier(exp1, obs1, dat_dim = 'dataset')$bs), c(0.42222222, 0.44444444, 0.02222222, 0.48888889, 0.37777778, 0.02222222), tolerance = 0.0001 ) expect_equal( - as.vector(UltimateBrier(exp1, obs1)$rel), + as.vector(UltimateBrier(exp1, obs1, dat_dim = 'dataset')$rel), c(0.22222222, 0.31111111, 0.02222222, 0.28888889, 0.24444444, 0.02222222), tolerance = 0.0001 ) expect_equal( - as.vector(UltimateBrier(exp1, obs1)$res), + as.vector(UltimateBrier(exp1, obs1, dat_dim = 'dataset')$res), c(0.0400000, 0.1066667, 0.0000000, 0.0400000, 0.1066667, 0.0000000), tolerance = 0.0001 ) expect_equal( - as.vector(UltimateBrier(exp1, obs1)$unc), + as.vector(UltimateBrier(exp1, obs1, dat_dim = 'dataset')$unc), c(0.24, 0.24, 0.00, 0.24, 0.24, 0.00), tolerance = 0.0001 ) # 'BSS' expect_equal( - dim(UltimateBrier(exp1, obs1, type = 'BSS')), + dim(UltimateBrier(exp1, obs1, type = 'BSS', dat_dim = 'dataset')), c(nexp = 1, nobs = 1, bin = 3, ftime = 2) ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'BSS')), + as.vector(UltimateBrier(exp1, obs1, type = 'BSS', dat_dim = 'dataset')), c(-0.7592593, -0.8518519, -Inf, -1.0370370, -0.5740741, -Inf), tolerance = 0.0001 ) # 'FairStartDatesBS' expect_equal( - is.list(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')), + is.list(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')), TRUE ) expect_equal( - names(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')), + names(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')), c('bs', 'rel', 'res', 'unc') ) expect_equal( - is.list(UltimateBrier(exp1, obs1, decomposition = FALSE, type = 'FairStartDatesBS')), + is.list(UltimateBrier(exp1, obs1, decomposition = FALSE, type = 'FairStartDatesBS', dat_dim = 'dataset')), FALSE ) expect_equal( - dim(UltimateBrier(exp1, obs1, decomposition = FALSE, type = 'FairStartDatesBS')), + dim(UltimateBrier(exp1, obs1, decomposition = FALSE, type = 'FairStartDatesBS', dat_dim = 'dataset')), c(nexp = 1, nobs = 1, bin = 3, ftime = 2) ) expect_equal( - UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')$bs, - UltimateBrier(exp1, obs1, decomposition = FALSE, type = 'FairStartDatesBS') + UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')$bs, + UltimateBrier(exp1, obs1, decomposition = FALSE, type = 'FairStartDatesBS', dat_dim = 'dataset') ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')$bs), + as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')$bs), c(0.42222222, 0.44444444, 0.02222222, 0.48888889, 0.37777778, 0.02222222), tolerance = 0.0001 ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')$rel), + as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')$rel), c(0.22222222, 0.31111111, 0.02222222, 0.28888889, 0.24444444, 0.02222222), tolerance = 0.0001 ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')$res), + as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')$res), c(0.0400000, 0.1066667, 0.0000000, 0.0400000, 0.1066667, 0.0000000), tolerance = 0.0001 ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS')$unc), + as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBS', dat_dim = 'dataset')$unc), c(0.24, 0.24, 0.00, 0.24, 0.24, 0.00), tolerance = 0.0001 ) # 'FairStartDatesBSS' expect_equal( - dim(UltimateBrier(exp1, obs1, type = 'FairStartDatesBSS')), + dim(UltimateBrier(exp1, obs1, type = 'FairStartDatesBSS', dat_dim = 'dataset')), c(nexp = 1, nobs = 1, bin = 3, ftime = 2) ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBSS')), + as.vector(UltimateBrier(exp1, obs1, type = 'FairStartDatesBSS', dat_dim = 'dataset')), c(-0.7592593, -0.8518519, -Inf, -1.0370370, -0.5740741, -Inf), tolerance = 0.0001 ) # 'FairEnsembleBS' expect_equal( - dim(UltimateBrier(exp1, obs1, type = 'FairEnsembleBS')), + dim(UltimateBrier(exp1, obs1, type = 'FairEnsembleBS', dat_dim = 'dataset')), c(nexp = 1, nobs = 1, bin = 3, ftime = 2) ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairEnsembleBS')), + as.vector(UltimateBrier(exp1, obs1, type = 'FairEnsembleBS', dat_dim = 'dataset')), c(0.1333333, 0.2000000, 0.2000000, 0.1333333, 0.4000000, 0.2000000), tolerance = 0.0001 ) # 'FairEnsembleBSS' expect_equal( - dim(UltimateBrier(exp1, obs1, type = 'FairEnsembleBSS')), + dim(UltimateBrier(exp1, obs1, type = 'FairEnsembleBSS', dat_dim = 'dataset')), c(nexp = 1, nobs = 1, bin = 3, ftime = 2) ) expect_equal( - as.vector(UltimateBrier(exp1, obs1, type = 'FairEnsembleBSS')), + as.vector(UltimateBrier(exp1, obs1, type = 'FairEnsembleBSS', dat_dim = 'dataset')), c(-0.1111111, -0.6666667, -0.6666667, 0.2592593, -1.2222222, -0.6666667), tolerance = 0.0001 ) diff --git a/vignettes/NAOindex_81to91.png b/vignettes/Figures/NAOindex_81to91.png similarity index 100% rename from vignettes/NAOindex_81to91.png rename to vignettes/Figures/NAOindex_81to91.png diff --git a/vignettes/NAOpredictions.png b/vignettes/Figures/NAOpredictions.png similarity index 100% rename from vignettes/NAOpredictions.png rename to vignettes/Figures/NAOpredictions.png diff --git a/vignettes/RMSSSforNAOprediction.png b/vignettes/Figures/RMSSSforNAOprediction.png similarity index 100% rename from vignettes/RMSSSforNAOprediction.png rename to vignettes/Figures/RMSSSforNAOprediction.png diff --git a/vignettes/ex_ano_expA_obsX.png b/vignettes/Figures/ex_ano_expA_obsX.png similarity index 100% rename from vignettes/ex_ano_expA_obsX.png rename to vignettes/Figures/ex_ano_expA_obsX.png diff --git a/vignettes/ex_ano_expB_obsX.png b/vignettes/Figures/ex_ano_expB_obsX.png similarity index 100% rename from vignettes/ex_ano_expB_obsX.png rename to vignettes/Figures/ex_ano_expB_obsX.png diff --git a/vignettes/ex_clim_expA_expB_obsX.png b/vignettes/Figures/ex_clim_expA_expB_obsX.png similarity index 100% rename from vignettes/ex_clim_expA_expB_obsX.png rename to vignettes/Figures/ex_clim_expA_expB_obsX.png diff --git a/vignettes/ex_corr_expA_expB_obsX.png b/vignettes/Figures/ex_corr_expA_expB_obsX.png similarity index 100% rename from vignettes/ex_corr_expA_expB_obsX.png rename to vignettes/Figures/ex_corr_expA_expB_obsX.png diff --git a/vignettes/ex_raw_expA_obsX.png b/vignettes/Figures/ex_raw_expA_obsX.png similarity index 100% rename from vignettes/ex_raw_expA_obsX.png rename to vignettes/Figures/ex_raw_expA_obsX.png diff --git a/vignettes/ex_raw_expB_obsX.png b/vignettes/Figures/ex_raw_expB_obsX.png similarity index 100% rename from vignettes/ex_raw_expB_obsX.png rename to vignettes/Figures/ex_raw_expB_obsX.png diff --git a/vignettes/s2dv_modules.png b/vignettes/Figures/s2dv_modules.png similarity index 100% rename from vignettes/s2dv_modules.png rename to vignettes/Figures/s2dv_modules.png diff --git a/vignettes/snip1_equi_map_raw_all.png b/vignettes/Figures/snip1_equi_map_raw_all.png similarity index 100% rename from vignettes/snip1_equi_map_raw_all.png rename to vignettes/Figures/snip1_equi_map_raw_all.png diff --git a/vignettes/snip2_anim_corr_expA_obsX.gif b/vignettes/Figures/snip2_anim_corr_expA_obsX.gif similarity index 100% rename from vignettes/snip2_anim_corr_expA_obsX.gif rename to vignettes/Figures/snip2_anim_corr_expA_obsX.gif diff --git a/vignettes/snip2_anim_corr_expB_obsX.gif b/vignettes/Figures/snip2_anim_corr_expB_obsX.gif similarity index 100% rename from vignettes/snip2_anim_corr_expB_obsX.gif rename to vignettes/Figures/snip2_anim_corr_expB_obsX.gif diff --git a/vignettes/snip2_equimap_corr_raw_expA_obsX.png b/vignettes/Figures/snip2_equimap_corr_raw_expA_obsX.png similarity index 100% rename from vignettes/snip2_equimap_corr_raw_expA_obsX.png rename to vignettes/Figures/snip2_equimap_corr_raw_expA_obsX.png diff --git a/vignettes/snip2_equimap_corr_raw_expB_obsX.png b/vignettes/Figures/snip2_equimap_corr_raw_expB_obsX.png similarity index 100% rename from vignettes/snip2_equimap_corr_raw_expB_obsX.png rename to vignettes/Figures/snip2_equimap_corr_raw_expB_obsX.png diff --git a/vignettes/stat_ano_expA_Y_obsX.png b/vignettes/Figures/stat_ano_expA_Y_obsX.png similarity index 100% rename from vignettes/stat_ano_expA_Y_obsX.png rename to vignettes/Figures/stat_ano_expA_Y_obsX.png diff --git a/vignettes/stat_ano_expA_obsX.png b/vignettes/Figures/stat_ano_expA_obsX.png similarity index 100% rename from vignettes/stat_ano_expA_obsX.png rename to vignettes/Figures/stat_ano_expA_obsX.png diff --git a/vignettes/stat_ano_expB_obsX.png b/vignettes/Figures/stat_ano_expB_obsX.png similarity index 100% rename from vignettes/stat_ano_expB_obsX.png rename to vignettes/Figures/stat_ano_expB_obsX.png diff --git a/vignettes/stat_clim_expA_expB_obsX.png b/vignettes/Figures/stat_clim_expA_expB_obsX.png similarity index 100% rename from vignettes/stat_clim_expA_expB_obsX.png rename to vignettes/Figures/stat_clim_expA_expB_obsX.png diff --git a/vignettes/stat_detr_ano_expA_obsX.png b/vignettes/Figures/stat_detr_ano_expA_obsX.png similarity index 100% rename from vignettes/stat_detr_ano_expA_obsX.png rename to vignettes/Figures/stat_detr_ano_expA_obsX.png diff --git a/vignettes/stat_filter_ano_expA.png b/vignettes/Figures/stat_filter_ano_expA.png similarity index 100% rename from vignettes/stat_filter_ano_expA.png rename to vignettes/Figures/stat_filter_ano_expA.png diff --git a/vignettes/stat_raw_expA_obsX.png b/vignettes/Figures/stat_raw_expA_obsX.png similarity index 100% rename from vignettes/stat_raw_expA_obsX.png rename to vignettes/Figures/stat_raw_expA_obsX.png diff --git a/vignettes/stat_raw_expB_obsX.png b/vignettes/Figures/stat_raw_expB_obsX.png similarity index 100% rename from vignettes/stat_raw_expB_obsX.png rename to vignettes/Figures/stat_raw_expB_obsX.png diff --git a/vignettes/stat_season_mam_expA.png b/vignettes/Figures/stat_season_mam_expA.png similarity index 100% rename from vignettes/stat_season_mam_expA.png rename to vignettes/Figures/stat_season_mam_expA.png diff --git a/vignettes/stat_season_mam_obsX.png b/vignettes/Figures/stat_season_mam_obsX.png similarity index 100% rename from vignettes/stat_season_mam_obsX.png rename to vignettes/Figures/stat_season_mam_obsX.png diff --git a/vignettes/stat_smooth_ano_expA_obsX.png b/vignettes/Figures/stat_smooth_ano_expA_obsX.png similarity index 100% rename from vignettes/stat_smooth_ano_expA_obsX.png rename to vignettes/Figures/stat_smooth_ano_expA_obsX.png diff --git a/vignettes/stat_toy_forecast_ano.png b/vignettes/Figures/stat_toy_forecast_ano.png similarity index 100% rename from vignettes/stat_toy_forecast_ano.png rename to vignettes/Figures/stat_toy_forecast_ano.png diff --git a/vignettes/stat_trend_expA_expB.png b/vignettes/Figures/stat_trend_expA_expB.png similarity index 100% rename from vignettes/stat_trend_expA_expB.png rename to vignettes/Figures/stat_trend_expA_expB.png diff --git a/vignettes/vis_acc_expA_expB_obsX.png b/vignettes/Figures/vis_acc_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_acc_expA_expB_obsX.png rename to vignettes/Figures/vis_acc_expA_expB_obsX.png diff --git a/vignettes/vis_anim_clim_expA.gif b/vignettes/Figures/vis_anim_clim_expA.gif similarity index 100% rename from vignettes/vis_anim_clim_expA.gif rename to vignettes/Figures/vis_anim_clim_expA.gif diff --git a/vignettes/vis_anim_clim_expA_world.gif b/vignettes/Figures/vis_anim_clim_expA_world.gif similarity index 100% rename from vignettes/vis_anim_clim_expA_world.gif rename to vignettes/Figures/vis_anim_clim_expA_world.gif diff --git a/vignettes/vis_anim_clim_expB.gif b/vignettes/Figures/vis_anim_clim_expB.gif similarity index 100% rename from vignettes/vis_anim_clim_expB.gif rename to vignettes/Figures/vis_anim_clim_expB.gif diff --git a/vignettes/vis_anim_clim_obsX.gif b/vignettes/Figures/vis_anim_clim_obsX.gif similarity index 100% rename from vignettes/vis_anim_clim_obsX.gif rename to vignettes/Figures/vis_anim_clim_obsX.gif diff --git a/vignettes/vis_anim_clim_obsX_world.gif b/vignettes/Figures/vis_anim_clim_obsX_world.gif similarity index 100% rename from vignettes/vis_anim_clim_obsX_world.gif rename to vignettes/Figures/vis_anim_clim_obsX_world.gif diff --git a/vignettes/vis_ano_exp_obs.png b/vignettes/Figures/vis_ano_exp_obs.png similarity index 100% rename from vignettes/vis_ano_exp_obs.png rename to vignettes/Figures/vis_ano_exp_obs.png diff --git a/vignettes/vis_ano_exp_points.png b/vignettes/Figures/vis_ano_exp_points.png similarity index 100% rename from vignettes/vis_ano_exp_points.png rename to vignettes/Figures/vis_ano_exp_points.png diff --git a/vignettes/vis_clim_expA_expB_obsX.png b/vignettes/Figures/vis_clim_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_clim_expA_expB_obsX.png rename to vignettes/Figures/vis_clim_expA_expB_obsX.png diff --git a/vignettes/vis_conf_interval_exp.png b/vignettes/Figures/vis_conf_interval_exp.png similarity index 100% rename from vignettes/vis_conf_interval_exp.png rename to vignettes/Figures/vis_conf_interval_exp.png diff --git a/vignettes/vis_corr_expA_expB_obsX.png b/vignettes/Figures/vis_corr_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_corr_expA_expB_obsX.png rename to vignettes/Figures/vis_corr_expA_expB_obsX.png diff --git a/vignettes/vis_corr_rms_expA_expB_obsX.png b/vignettes/Figures/vis_corr_rms_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_corr_rms_expA_expB_obsX.png rename to vignettes/Figures/vis_corr_rms_expA_expB_obsX.png diff --git a/vignettes/vis_eno_expA_expB.png b/vignettes/Figures/vis_eno_expA_expB.png similarity index 100% rename from vignettes/vis_eno_expA_expB.png rename to vignettes/Figures/vis_eno_expA_expB.png diff --git a/vignettes/vis_equimap_box_expA.png b/vignettes/Figures/vis_equimap_box_expA.png similarity index 100% rename from vignettes/vis_equimap_box_expA.png rename to vignettes/Figures/vis_equimap_box_expA.png diff --git a/vignettes/vis_equimap_cols_raw_expA.png b/vignettes/Figures/vis_equimap_cols_raw_expA.png similarity index 100% rename from vignettes/vis_equimap_cols_raw_expA.png rename to vignettes/Figures/vis_equimap_cols_raw_expA.png diff --git a/vignettes/vis_equimap_cols_raw_obsX.png b/vignettes/Figures/vis_equimap_cols_raw_obsX.png similarity index 100% rename from vignettes/vis_equimap_cols_raw_obsX.png rename to vignettes/Figures/vis_equimap_cols_raw_obsX.png diff --git a/vignettes/vis_equimap_contour_raw_expA.png b/vignettes/Figures/vis_equimap_contour_raw_expA.png similarity index 100% rename from vignettes/vis_equimap_contour_raw_expA.png rename to vignettes/Figures/vis_equimap_contour_raw_expA.png diff --git a/vignettes/vis_equimap_contour_raw_obsX.png b/vignettes/Figures/vis_equimap_contour_raw_obsX.png similarity index 100% rename from vignettes/vis_equimap_contour_raw_obsX.png rename to vignettes/Figures/vis_equimap_contour_raw_obsX.png diff --git a/vignettes/vis_equimap_raw_expA.png b/vignettes/Figures/vis_equimap_raw_expA.png similarity index 100% rename from vignettes/vis_equimap_raw_expA.png rename to vignettes/Figures/vis_equimap_raw_expA.png diff --git a/vignettes/vis_equimap_raw_obsX.png b/vignettes/Figures/vis_equimap_raw_obsX.png similarity index 100% rename from vignettes/vis_equimap_raw_obsX.png rename to vignettes/Figures/vis_equimap_raw_obsX.png diff --git a/vignettes/vis_error_bar.png b/vignettes/Figures/vis_error_bar.png similarity index 100% rename from vignettes/vis_error_bar.png rename to vignettes/Figures/vis_error_bar.png diff --git a/vignettes/vis_iqr_expA_expB.png b/vignettes/Figures/vis_iqr_expA_expB.png similarity index 100% rename from vignettes/vis_iqr_expA_expB.png rename to vignettes/Figures/vis_iqr_expA_expB.png diff --git a/vignettes/vis_layout_complex.png b/vignettes/Figures/vis_layout_complex.png similarity index 100% rename from vignettes/vis_layout_complex.png rename to vignettes/Figures/vis_layout_complex.png diff --git a/vignettes/vis_layout_equimap_expA.png b/vignettes/Figures/vis_layout_equimap_expA.png similarity index 100% rename from vignettes/vis_layout_equimap_expA.png rename to vignettes/Figures/vis_layout_equimap_expA.png diff --git a/vignettes/vis_mad_expA_expB.png b/vignettes/Figures/vis_mad_expA_expB.png similarity index 100% rename from vignettes/vis_mad_expA_expB.png rename to vignettes/Figures/vis_mad_expA_expB.png diff --git a/vignettes/vis_maxmin_expA_expB.png b/vignettes/Figures/vis_maxmin_expA_expB.png similarity index 100% rename from vignettes/vis_maxmin_expA_expB.png rename to vignettes/Figures/vis_maxmin_expA_expB.png diff --git a/vignettes/vis_ratiorms_expA_expB_obsX.png b/vignettes/Figures/vis_ratiorms_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_ratiorms_expA_expB_obsX.png rename to vignettes/Figures/vis_ratiorms_expA_expB_obsX.png diff --git a/vignettes/vis_ratiosdrms_expA_expB_obsX.png b/vignettes/Figures/vis_ratiosdrms_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_ratiosdrms_expA_expB_obsX.png rename to vignettes/Figures/vis_ratiosdrms_expA_expB_obsX.png diff --git a/vignettes/vis_ratiosdrms_expA_obsX_obsXrnorm.png b/vignettes/Figures/vis_ratiosdrms_expA_obsX_obsXrnorm.png similarity index 100% rename from vignettes/vis_ratiosdrms_expA_obsX_obsXrnorm.png rename to vignettes/Figures/vis_ratiosdrms_expA_obsX_obsXrnorm.png diff --git a/vignettes/vis_raw_expA_obsX.png b/vignettes/Figures/vis_raw_expA_obsX.png similarity index 100% rename from vignettes/vis_raw_expA_obsX.png rename to vignettes/Figures/vis_raw_expA_obsX.png diff --git a/vignettes/vis_raw_expB_obsX.png b/vignettes/Figures/vis_raw_expB_obsX.png similarity index 100% rename from vignettes/vis_raw_expB_obsX.png rename to vignettes/Figures/vis_raw_expB_obsX.png diff --git a/vignettes/vis_regression_expA_expB.png b/vignettes/Figures/vis_regression_expA_expB.png similarity index 100% rename from vignettes/vis_regression_expA_expB.png rename to vignettes/Figures/vis_regression_expA_expB.png diff --git a/vignettes/vis_rms_expA_expB_obsX.png b/vignettes/Figures/vis_rms_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_rms_expA_expB_obsX.png rename to vignettes/Figures/vis_rms_expA_expB_obsX.png diff --git a/vignettes/vis_rmsss_expA_expB_obsX.png b/vignettes/Figures/vis_rmsss_expA_expB_obsX.png similarity index 100% rename from vignettes/vis_rmsss_expA_expB_obsX.png rename to vignettes/Figures/vis_rmsss_expA_expB_obsX.png diff --git a/vignettes/vis_sd_expA_expB.png b/vignettes/Figures/vis_sd_expA_expB.png similarity index 100% rename from vignettes/vis_sd_expA_expB.png rename to vignettes/Figures/vis_sd_expA_expB.png diff --git a/vignettes/vis_stereomap_raw_expA.png b/vignettes/Figures/vis_stereomap_raw_expA.png similarity index 100% rename from vignettes/vis_stereomap_raw_expA.png rename to vignettes/Figures/vis_stereomap_raw_expA.png diff --git a/vignettes/vis_stereomap_raw_obsX.png b/vignettes/Figures/vis_stereomap_raw_obsX.png similarity index 100% rename from vignettes/vis_stereomap_raw_obsX.png rename to vignettes/Figures/vis_stereomap_raw_obsX.png diff --git a/vignettes/vis_trend_expA_expB.png b/vignettes/Figures/vis_trend_expA_expB.png similarity index 100% rename from vignettes/vis_trend_expA_expB.png rename to vignettes/Figures/vis_trend_expA_expB.png diff --git a/vignettes/ScoringForecast.md b/vignettes/ScoringForecast.md index 37c53e0cc1b11dcf86b6ccc69723bccf08cc0788..275619c31218b172cee6a9c11e2d54c73b75140a 100644 --- a/vignettes/ScoringForecast.md +++ b/vignettes/ScoringForecast.md @@ -103,7 +103,7 @@ PlotBoxWhisker(t(nao_exp_n), nao_obs_n, toptitle = "NAO index, DJF", legend(x = 3.8, y = 2.6, c('EUROSIP', 'EraInterim'), col = c(2, 4), pch = 15) ``` - + The figure above does not represent a good agreement between observations (blue line) and forecast (whisker boxes) due to the large dispersion through the 51 model members. The NAO signal is too weak due to the large dispersion among ensemble members thus almost disappearing (close to 0). @@ -149,7 +149,7 @@ legend(x = 4.95, y = 2.4, c('EUROSIP', 'EraInterim'), col = c(2, 4), pch = 15, cex = 0.9, lty = 0) ``` - + The above figure shows very different RMSSS for different members (left plot). Most of them have RMSSS close to 0, thus the prediction error is close to the system variability. **The RMSSS for the whole ensemble is 0.091**, what means a not very useful ensemble prediction. @@ -227,7 +227,7 @@ title('Predictions for selected-members ensemble') ``` - + For the all-members ensemble, the results are: diff --git a/vignettes/example.md b/vignettes/example.md index 9ac470da402a64a77690a8dccf52079fe3451e9d..3a71e98ec31668d96e421025c67d7968b4ca12c4 100644 --- a/vignettes/example.md +++ b/vignettes/example.md @@ -130,7 +130,7 @@ PlotEquiMap(data$mod[1, 1, 1, 1, , ], data$lon, data$lat) PlotEquiMap(data$mod[2, 1, 1, 1, , ], data$lon, data$lat) PlotEquiMap(data$obs[1, 1, 1, 1, , ], data$lon, data$lat) ``` - + See the full code used to obtain this figure in [**Snippet 1**](snippets.md#snippet1). @@ -150,8 +150,8 @@ PlotAno(mod, obs, gsub('1101', '1201', sdates), fileout = c('ex_raw_expA_obsX.eps', 'ex_raw_expB_obsX.eps')) ``` - - + + Each coloured region represents data corresponding to a single starting date. The bold line represents the mean value and the thin lines represent the values @@ -205,7 +205,7 @@ PlotClim(clim$clim_exp, clim$clim_obs, monini = 12, ytitle = "K", fileout = 'ex_clim_expA_expB_obsX.eps') ``` - + Each line in this plot represents the climatology of each member of the corresponding dataset. A single climatology of the ensemble mean could be @@ -225,8 +225,8 @@ PlotAno(ano_mod, ano_obs, gsub('1101', '1201', sdates), fileout = c('ex_ano_expA_obsX.eps', 'ex_ano_expB_obsX.eps')) ``` - - + + To fulfill the bias correction we would need to add the observed climatologies to these anomalies. The working units of the package, however, are the @@ -255,7 +255,7 @@ PlotVsLTime(corr, toptitle = "Correlations with Observation X over North Pacific fileout = 'ex_corr_expA_expB_obsX.eps') ``` - + See [**Verification**](verification.md) for a detailed explanation of the available deterministic and probabilistic scores or diff --git a/vignettes/snippets.md b/vignettes/snippets.md index 40929c2bd62acc0aa0bbb2564abb4d4e040c50d0..9f81a4bce9b01e6e0c45dc0ecb5fa762303cabf5 100644 --- a/vignettes/snippets.md +++ b/vignettes/snippets.md @@ -56,7 +56,7 @@ PlotEquiMap(data$obs[1,1,1,1,,], data$lon, data$lat, brks = brks, cols = cols, d ColorBar(brks, cols, vert = FALSE, subsampleg = 5) dev.off() ``` - + Snippet 2 --------- @@ -94,8 +94,8 @@ PlotEquiMap(corr[2, 1, 2, 1, , ], map_data$lon, map_data$lat, dots = t(corr[2, 1, 2, 1, , ] > corr[2, 1, 4, 1, , ])) dev.off() ``` - - + + And generates the animations of the actual time correlations of Experiment A and B against Observation X over the Atlantic, with black dots on values that @@ -112,6 +112,6 @@ AnimVsLTime(corr, map_data$lon, map_data$lat, monini = 12, "snip2_anim_corr_expB_obsX")) ``` - - + + diff --git a/vignettes/statistics.md b/vignettes/statistics.md index ff127c4c9fe5550da5a650954f06c8c4fe845063..e455d2c652627a9299b74d8053126cc911018422 100644 --- a/vignettes/statistics.md +++ b/vignettes/statistics.md @@ -138,14 +138,14 @@ PlotClim(clim$exp, clim$obs, monini = 12, listobs = c('Observation X'), fileout = "stat_clim_expA_expB_obsX.eps") ``` - - + + Each coloured curve in the `PlotAno()` figures corresponds to a starting date, with the various ensemble members and the ensemble mean in bold. The coloured area is delimited by the minimum and maximum ensemble values. - + Each plot in the `PlotClim()` figure corresponds to the climatology of a member of the corresponding experiment or observation. @@ -198,8 +198,8 @@ PlotAno(ano$exp, ano$obs, selected_sdates, ytitle = c("K", "K"), linezero = TRUE, fileout = paste0("stat_ano_exp", c("A", "B"), "_obsX.eps")) ``` - - + + To fulfill the bias correction of the forecasts, i.e. transforming the forecast data from the biased model mean state to the real observed mean state, the @@ -240,7 +240,7 @@ PlotVsLTime(trend_exp$trend, listexp = c('Experiment A', 'Experiment B'), fileout = 'stat_trend_expA_expB.eps') ``` - + In this case the slopes of the trends are nearly zero at all lead-times. The raw anomalies of the experiment A and observations are plotted next, side to @@ -261,8 +261,8 @@ PlotAno(InsertDim(plyr::take(trend_exp$detrended, 1, 1), 2, 1), ytitle = "K", linezero = TRUE, fileout = paste0("stat_detr_ano_expA_obsX.eps")) ``` - - + + Since the anomaly members have been averaged to compute the trend, the provided detrended data by `Trend()` is also an ensemble average. @@ -410,7 +410,7 @@ PlotAno(plyr::take(smoothed_ano_exp, 1, 1), smoothed_ano_obs, ytitle = "K", linezero = TRUE, fileout = "stat_smooth_ano_expA_obsX.eps") ``` - + ### Frequency filtering `Filter()` filters a specified frequency from the input data. The filtering is @@ -439,7 +439,7 @@ PlotAno(InsertDim(ens_mean_ano_expA, 2, 1), ano$obs, ytitle = "K", linezero = TRUE, fileout = "stat_filter_ano_expA.eps") ``` - + Generating derivative fields ---------------------------- @@ -487,8 +487,8 @@ PlotEquiMap(mam_clim_obs[1, 1, , ], data_map$lon, data_map$lat, units = "K", brks = brks, cols = cols, subsampleg = 10) dev.off() ``` - - + + ### Cathegorizing data `ProbBins()` @@ -545,7 +545,7 @@ PlotAno(ano_toy$ano_exp, ano_toy$ano_obs, sdates_toy, ytitle = "units", linezero = TRUE, fileout = "stat_toy_forecast_ano.eps") ``` - + It is possible, however, to generate model data from observational data from `Load()`. The only required parameters are, then, the predictability, error diff --git a/vignettes/visualisation.md b/vignettes/visualisation.md index 2c526ca0406f9b1300d5a14ebf2b573d080c9992..07e04fc1bb2a5b9be23fdc9bfc8c7a5ee98be0ca 100644 --- a/vignettes/visualisation.md +++ b/vignettes/visualisation.md @@ -10,7 +10,7 @@ vignette: > Visualisation ============= -s2dverification contains a set of functions to plot data at every stage of the +s2dv contains a set of functions to plot data at every stage of the verification process, most based directly on R graphics plotting tools. These functions are essential to: - Quickly inspect the results of a newly produced experiment, i.e. to check @@ -28,7 +28,7 @@ The visualisation functions, most with a name following the pattern `PlotStereoMap()`, `AnimateMap()`, `PlotLayout()` and `PlotSection()`. To master these functions it is convenient to have in mind the common array -dimension structure used throughout in s2dverification and how it evolves as +dimension structure used throughout in s2dv and how it evolves as the data objects go through the statistics and verification stages. For that you can review the introduction in [**Data retrieval**](data_retrieval.md) and the sections [**Statistics**](statistics.md) and @@ -38,7 +38,7 @@ Next an explanation of which situations they fit the best, details of the options they provide and short examples of usage. The data used hereunder will be the same as in [**Data retrieval**](data_retrieval.md): ```r -library(s2dverification) +library(s2dv) expA <- list(name = 'experimentA', path = file.path('/path/to/experiments/$EXP_NAME$/monthly_mean', '$VAR_NAME$/$VAR_NAME$_$START_DATE$.nc')) @@ -79,7 +79,7 @@ the underlying R graphics `plot()` function for a fine tuning. **Note:** A general purpose time-series plotting function, `PlotTimeSeries()`, is currently being developed. This function will agglomerate all the functionality required to generate the plots resulting from all the currently -available functions in `s2dverification` and will be based on the `ggplot2` +available functions in `s2dv` and will be based on the `ggplot2` package. The current functions will be kept as they are but will simply be an interface to `PlotTimeSeries()`. See [**this report**](https://earth.bsc.es/gitlab/es/s2dverification/blob/develop-PlotTimeSeries/inst/doc/PlotTimeSeries/PlotTimeSeries.pdf) @@ -116,7 +116,7 @@ PlotClim(clim$exp, clim$obs, monini = 12, listobs = c('Observation X'), fileout = "vis_clim_expA_expB_obsX.png") ``` - + ### Plotting multi-member raw data or anomalies @@ -156,8 +156,8 @@ PlotAno(data$mod, data$obs, selected_sdates, ytitle = c("K", "K"), fileout = paste0("vis_raw_exp", c("A", "B"), "_obsX.png")) ``` - - + + ### Plotting statistics and scores @@ -216,8 +216,8 @@ PlotVsLTime(ano_expA_Y$regression, monini = 12, freq = 1, leg = FALSE, fileout = 'vis_regression_expA_expB.png') ``` - - + + - To plot the `Spread()` across ensemble members and starting dates of area averaged data (interquartile range, maximum minus minimum, standard deviation or median absolute deviation): @@ -249,10 +249,10 @@ PlotVsLTime(spread$iqr, listexp = c('Experiment A', 'Experiment B'), fileout = 'vis_mad_expA_expB.png') ``` - - - - + + + + - To plot the correlation (`Corr()`) and RMSE (`RMS()`) between experiments (averaged across ensemble members) and observations: @@ -274,8 +274,9 @@ PlotVsLTime(rms, listexp = c('Experiment A', 'Experiment B'), fileout = 'vis_rms_expA_expB_obsX.png') ``` - - + + + - To plot the ratio between the RMSE of the ensemble mean of two experiments with a same observation at a single grid point or area averaged: @@ -291,7 +292,8 @@ PlotVsLTime(ratio_rms2, monini = 12, freq = 1, siglev = TRUE, leg = FALSE, fileout = 'vis_ratiorms_expA_expB_obsX.png') ``` - + + - To plot the ratio between the ensemble spread of the experiments and their RMSE against the observations (`RatioSDRMS()`) at a single grid point or area averaged: @@ -307,7 +309,7 @@ PlotVsLTime(ratio_sdrms2, listexp = c('Experiment A', 'Experiment B'), fileout = 'vis_ratiosdrms_expA_expB_obsX.png') ``` - + In this example, the ratio SD / RMS is calculated for the experiment A only but against two observational datasets: @@ -327,7 +329,8 @@ PlotVsLTime(ratio_sdrms2, listobs = c('Observation X', 'Observation X + rnorm(n, 0, 0.1)'), fileout = 'vis_ratiosdrms_expA_obsX_obsXrnorm.png') ``` - + + - To plot `RMSSS()` of ensemble mean at a single grid point or area averaged: ```r @@ -341,7 +344,8 @@ PlotVsLTime(rmsss2, listexp = c('Experiment A', 'Experiment B'), fileout = 'vis_rmsss_expA_expB_obsX.png') ``` - + + - To plot effective number of independent data (`Eno()`): ```r @@ -355,7 +359,7 @@ PlotVsLTime(eno2, listexp = c('Experiment A', 'Experiment B'), fileout = 'vis_eno_expA_expB.png') ``` - + `Plot2VarsVsLTime()` allows to plot two indices or scores at a time on the same plot, each with its confidence intervals. It accepts as inputs arrays of only @@ -377,7 +381,7 @@ Plot2VarsVsLTime(corr[, 1, 1:3, ], rms[, 1, , ], listvars = c('Corr', 'RMSE'), fileout = 'vis_corr_rms_expA_expB_obsX.png') ``` - + `PlotACC()`, in contrast to `PlotVsLTime()`, accepts an additional dimension for the starting dates in the input and the dimension of the confidence @@ -416,7 +420,7 @@ PlotACC(acc$ACC, selected_sdates, legends = c('Experiment A', 'Experiment B'), fileout = 'vis_acc_expA_expB_obsX.png') ``` - + `PlotBoxWhisker()` @@ -443,7 +447,7 @@ PlotTimeSeries(ano_exp) + PlotTimeSeries(ano_obs, add = T) ``` - + -------------------- The `PlotTimeSeries` function has read the x- and y- axis labels, the title and the legend from the metadata automatically. By default, a horizontal line is plotted along `y = 0`, and this line can be shifted or removed with `intercept`. Layers can be added by selecting `add = TRUE`, as in the above example, where the dataset of observations have been added to the plot. The user can plot the geometric objects (the mean, confidence intervals, curves etc.) along any of the dimensions, as well as adding points, changing the linestyles and removing any of the objects. For example the curves for the individual members can be replaced with points, with different shapes for the different members, and the shading between the minimum and maximum can be removed as follows. @@ -453,7 +457,7 @@ PlotTimeSeries(ano_exp, minmax_along = NA, points = T, shape_along = 2, curves_a ``` - + ### Plotting scores and sample statistics @@ -466,8 +470,8 @@ PlotTimeSeries(Corr, interval_type = "line") ``` - - + + ### Conclusions @@ -482,8 +486,7 @@ Plotting maps This group of functions allows to plot grid data (i.e. defined over latitudes and longitudes) on a rectangular equidistant projection or on a stereographic -projection (as of s2dverification v2.5.0) as well as depth sections (i.e. -defined over latitudes/longitudes and depth levels). +projection as well as depth sections (i.e. defined over latitudes/longitudes and depth levels). Regarding the functions to plot maps, by default each grid point is drawn on a world map with a colour as a function of the magnitude of the provided field, @@ -523,8 +526,8 @@ PlotEquiMap(MeanDims(map_data$obs, 2)[1, 1, 1, , ], map_data$lon, map_data$lat, toptitle = "Obs. X: 'tas', 1990-12-01", units = "K", filled.continents = FALSE, fileout = 'vis_equimap_raw_obsX.png') ``` - - + + ```r PlotEquiMap(Mean1Dim(map_data$mod, 2)[1, 1, 1, , ], map_data$lon, map_data$lat, @@ -539,14 +542,14 @@ PlotEquiMap(Mean1Dim(map_data$obs, 2)[1, 1, 1, , ], map_data$lon, map_data$lat, filled.continents = FALSE, fileout = 'vis_equimap_cols_raw_obsX.png') ``` - - + + Or, as seen in the example from [**Snippet 2**](snippets.md#snippet2): - - + + `PlotEquiMap()` has some other additional features: @@ -572,8 +575,8 @@ PlotEquiMap(MeanDims(map_data$obs, 2)[1, 1, 1, , ], map_data$lon, map_data$lat, fileout = 'vis_equimap_contour_raw_obsX.png') ``` - - + + - Drawing boxes on the map: `boxlim`, `boxcol` and `boxlwd` allow to specify the position of the corners, colour and thickness of a box to be drawn @@ -589,7 +592,7 @@ PlotEquiMap(MeanDims(map_data$obs, 2)[1, 1, 1, , ], map_data$lon, map_data$lat, fileout = 'vis_equimap_box_expA.png') ``` - + - Ticks on the longitude/latitude axes can be adjusted with `axelab`, `labW`, `intylat` and `intxlon`. @@ -632,8 +635,8 @@ PlotStereoMap(MeanDims(world_data$obs, 2)[1, 1, 10, , ], units = "K", fileout = 'vis_stereomap_raw_obsX.png') ``` - - + + ### AnimateMap() @@ -688,15 +691,15 @@ AnimateMap(Subset(map_clim$exp, 'dataset', 1), units = "K", brks = brks, cols = cols, fileout = "vis_anim_clim_expA.gif") ``` - + And, as seen in [**Snippet 2**](snippets.md#snippet2), the animations of the actual time correlations of Experiment A and B against Observation X over the Atlantic, with black dots on values that reach a 95% significance level: - + - + Also the entire globe and stereographic projection maps can be animated: @@ -716,9 +719,9 @@ AnimateMap(world_clim$obs, fileout = "vis_anim_clim_obsX_world.gif") ``` - + - + ### PlotLayout() @@ -740,7 +743,7 @@ PlotLayout(PlotEquiMap, c('lat', 'lon'), fileout = "vis_layout_equimap_expA.png") ``` - + But really complex layouts can be achieved thanks to the great number of available parameters: @@ -765,7 +768,7 @@ layout <- PlotLayout(fun = c('PlotEquiMap', 'plot', 'plot', 'PlotStereoMap'), fileout = 'vis_layout_complex.png') ``` - + ### PlotSection()