diff --git a/DESCRIPTION b/DESCRIPTION index 9123c589b1411638ea21e364a1592dc7be137f30..1700491ba8fcc927dfd059484765ebb3c939211b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: CSTools Title: Assessing Skill of Climate Forecasts on Seasonal-to-Decadal Timescales -Version: 4.1.1 +Version: 5.0.0 Authors@R: c( person("Nuria", "Perez-Zanon", , "nuria.perez@bsc.es", role = "aut", comment = c(ORCID = "0000-0001-8568-3071")), person("Louis-Philippe", "Caron", , "louis-philippe.caron@bsc.es", role = "aut", comment = c(ORCID = "0000-0001-5221-0147")), @@ -42,8 +42,9 @@ Description: Exploits dynamical seasonal forecasts in order to provide contains process-based methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain - tailored products. This package was developed in the context of the - ERA4CS project MEDSCOPE and the H2020 S2S4E project. + tailored products. This package was developed in the context of the ERA4CS + project MEDSCOPE and the H2020 S2S4E project and includes contributions from + the ArticXchange project funded by EU-PolarNet 2. Pérez-Zanón et al. (2022) . Doblas-Reyes et al. (2005) . Mishra et al. (2018) . @@ -56,7 +57,7 @@ Description: Exploits dynamical seasonal forecasts in order to provide Van Schaeybroeck et al. (2019) . Yiou et al. (2013) . Depends: - R (>= 3.4.0), + R (>= 3.5.0), maps, qmap, easyVerification @@ -76,7 +77,9 @@ Imports: grDevices, stats, utils, - verification + verification, + lubridate, + scales Suggests: zeallot, testthat, @@ -85,7 +88,7 @@ Suggests: rmarkdown, startR VignetteBuilder: knitr -License: Apache License 2.0 +License: GPL-3 Encoding: UTF-8 LazyData: true RoxygenNote: 7.2.0 diff --git a/NAMESPACE b/NAMESPACE index cb900eef1b4c3d85ed441b02f9d0f0b1ca7f36f9..a2df146de80c562d887008c4185466dffb1c7021 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,10 +1,16 @@ # Generated by roxygen2: do not edit by hand export(AdamontAnalog) +export(AdamontQQCorr) export(Analogs) +export(BEI_EMWeighting) export(BEI_PDFBest) +export(BEI_ProbsWeighting) +export(BEI_TercilesWeighting) export(BEI_Weights) export(BiasCorrection) +export(CST_AdamontAnalog) +export(CST_AdamontQQCorr) export(CST_Analogs) export(CST_AnalogsPredictors) export(CST_Anomaly) @@ -14,6 +20,7 @@ export(CST_Calibration) export(CST_CategoricalEnsCombination) export(CST_DynBiasCorrection) export(CST_EnsClustering) +export(CST_InsertDim) export(CST_Load) export(CST_MergeDims) export(CST_MultiEOF) @@ -28,6 +35,7 @@ export(CST_RainFARM) export(CST_RegimesAssign) export(CST_SaveExp) export(CST_SplitDim) +export(CST_Subset) export(CST_WeatherRegimes) export(Calibration) export(CategoricalEnsCombination) @@ -36,11 +44,13 @@ export(EnsClustering) export(MergeDims) export(MultiEOF) export(MultiMetric) +export(PDFIndexHind) export(PlotCombinedMap) export(PlotForecastPDF) export(PlotMostLikelyQuantileMap) export(PlotPDFsOLE) export(PlotTriangles4Categories) +export(PlotWeeklyClim) export(Predictability) export(ProxiesAttractor) export(QuantileMapping) @@ -55,13 +65,16 @@ export(WeatherRegime) export(as.s2dv_cube) export(s2dv_cube) export(training_analogs) +import(RColorBrewer) import(abind) import(ggplot2) +import(lubridate) import(multiApply) import(ncdf4) import(qmap) import(rainfarmr) import(s2dv) +import(scales) import(stats) importFrom(ClimProjDiags,SelBox) importFrom(ClimProjDiags,Subset) diff --git a/NEWS.md b/NEWS.md index 
72362e9bbbc6092b81aca5a1a95d0ce1428ffffd..29128b8d45dc1d417349e0ca54df2b2d3f86bea9 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,139 +1,148 @@ -### CSTools 4.1.1 -**Submission date to CRAN: 10-11-2022** -- Fixes: - + CST_Analogs corrected input of ClimProjDiags::Subset() - + PlotCombinedMap corrected use of 'cex_bar_titles' parameter - + CST_Anomaly added 'memb_dim', 'dat_dim' and 'ftime_dim' and improved use for 'dim_anom' parameters - -### CSTools 4.1.0 -**Submission date to CRAN: 25-10-2022** -- New features: - + Dependency on package 's2dverification' is changed to 's2dv' - + CST_BiasCorrection new parameters 'memb_dim', 'sdate_dim', 'ncores' - + CST_Calibration is able to calibrate forecast with new parameter 'exp_cor' - + CST_QuantileMapping uses cross-validation and provides option to remove NAs; new parameters 'memb_dim', 'sdate_dim', 'window_dim' and 'na.rm'; 'sample_dim' and 'sample_length' are removed - + s2dv_cube() new parameter 'time_dim' - -- Fixes: - + as.s2dv_cube() detects latitude and longitude structure in startR_array object - + Data correction: 'lonlat_data' is renamed to 'lonlat_temp'; 'lonlat_prec' is corrected by one-day shift - + Typo and parameter correction in vignette 'MostLikelyTercile_vignette' - + Figure and result correction in vignette 'RainFARM_vignette' - + PlotMostLikelyQuantileMap() works with s2dv::PlotLayout - -### CSTools 4.0.1 -**Submission date to CRAN: 05-10-2021** - -- New features: - + Dynamical Bias Correction method: `CST_ProxiesAttractors` and `CST_DynBiasCorrection` - (optionally `Predictability`) - + CST_BiasCorrection and BiasCorrection allows to calibrate a forecast given the calibration in the hindcast by using parameter 'exp_cor'. - + Use cases - + CST_SaveExp includes parameter extra_string - + PlotCombinedMap includes parameter cex_bar_titles - -- Fixes: - + Calibration retains correlation absolute value - + Calibration fixed when cal.methodi == rpc-based, apply_to == sign, - eval.method == 'leave-one-out' and the correlation is not significant - + PlotMostLikelyQuantileMap reoder latitudes of an array provided in 'dots' parameter. - -### CSTools 4.0.0 -**Submission date to CRAN: 23-02-2021** - -- New features: - + ADAMONT downscaling method: requires CST_AdamontAnalogs and CST_AdamontQQCor functions - + Analogs method using Predictors: requires training_analogs and CST_AnalogsPredictors - + PlotPDFsOLE includes parameters to modify legend style - + CST_RFSlope handless missing values in the temporal dimension and new 'ncores' parameter allows parallel computation - + CST_RFWeights accepts s2dv_cube objects as input and new 'ncores' paramenter allows parallel computation - + RFWeights is exposed to users - + CST_RainFARM accepts multi-dimensional slopes and weights and handless missing values in sample dimensions. - + QuantileMapping is exposed to users - + CST_MultiMetric includes 'rpss' metric and it is addapted to s2dv. 
- + PlotMostLikelyQuantileMap vignette - + PlotTriangles4Categories includes two parameters to adjust axis and margins - + CategoricalEnsCombination is exposed to users - + CST_SplitDims includes parameter 'insert_ftime' - + Analogs vignette - + Data Storage and retrieval vignette - -- Fixes: - + PlotForecastPDF correctly displays terciles labels - + CST_SaveExp correctly save time units - + CST_SplitDims returns ordered output following ascending order provided in indices when it is numeric - + qmap library moved from Imports to Depends - + CST_QuantileMapping correctly handles exp_cor - + Figures resize option from vignettes has been removed - + Fix Analogs to work with three diferent criteria - + Vignette PlotForecastPDF updated plots - + Decrease package size compresing vignettes figures and removing areave_data sample - -### CSTools 3.1.0 -**Submission date to CRAN: 02-07-2020** - -- New features: - + EnsClustering vignette - + EnsClustering has a new parameter 'time_dim' - + CST_BiasCorrection has na.rm paramter - + CST_Anomaly allows to smooth the climatology with filter.span parameter - + PlotTriangles4Categories new plotting function to convert any 3-d numerical array to a grid of coloured triangles. - + CST_WeatherRegimes/WeatherRegimes and CST_RegimeAssign/RegimeAssign - + PlotPDFsOLE plots two probability density gaussian functions and the optimal linear estimation - + CST_RFTemp/RF_Temp functions available for downscaling temperature - + Weather Regimes vignette - -- Fixes - + CST_Anomaly handles exp, obs or both - + PlotForecastPDF vignette displays figures correctly - + Calibration function is exposed to users - + MultiMetric vignette fixed typo text description - + RainFARM checks 'slope' is not a vector - + DESCRIPTION specifies the minimum multiApply version required - + EnsClustering has a fixed 'closest_member' output - + PlotCombinedMap handles masks correctly - + CST_SaveExp uses multiApply and save time dimension correctly - -### CSTools 3.0.0 -**Submission date to CRAN: 10-02-2020** - -- New features: - + CST_MergeDims and MergeDims - + Version working with R 3.4.2 - + PlotForecastPDF handles independent terciles, extremes and observations for each panel -- Fixes - + CST_Calibration handles missing values - + BEI functions handle missing values - - -### CSTools 2.0.0 -**Submission date to CRAN: 25-11-2019** - -- New features: - + CST_Analogs Analogs downscaling method, - + CST_MultiEOFS for multiple variables, - + Ensemble Clustering, - + Categorical Ensemble Combination, - + new Calibration methods included in CST_Calibration, - + Best Estimated Index method, - + CST_QuantileMapping, - + CST_SplitDim to split dimension, if it is a temporal dimension, it can be split by days, months and years or other inidices, - + creation and transformation to class 's2dv_cube', - + CST_SaveExp function for saving experiments to be loadable with CST_Load, +# CSTools 5.0.0 (Release date: 05-04-2023) +**Fixes** +- Correct vignettes: Analogs, MultiModelSkill and MultivarRMSE +- Add 'ncores' to s2dv function calls in CST_Anomaly +- Reduce computing time of examples and tests and improve documentation + +**New features** +- Add dat_dim parameter in CST_BiasCorrection and CST_Calibration +- New plotting function for case studies temporal visualisation: PlotWeeklyClim +- Deprecate indices in dim_anom parameter of CST_Anomaly +- Allow memb_dim to be NULL in QuantileMapping +- Uncomment tests in CST_MultivarRMSE due to the correction of RMS in the next s2dv release (s2dv 1.4.0, released 21/03) +- New s2dv_cube object development for all the functions, unit tests, examples and vignettes +- New function CST_Subset, similar to Subset but for 's2dv_cube' objects +- Improved CST_SaveExp function with new features +- New color set in PlotForecastPDF: Vitigeoss colors +- New function CST_InsertDim + +**Other** +- Added contribution from the ArticXchange project through PlotWeeklyClim +- Update NEWS.md with the correct format +- Change licence from Apache License 2.0 to GPL-3 +
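As a quick orientation for the two new 's2dv_cube' utilities listed in the 5.0.0 notes above, here is a minimal sketch. It assumes CST_Subset mirrors the 'along'/'indices' interface of ClimProjDiags::Subset and that CST_InsertDim follows the 'posdim'/'lendim'/'name' arguments of s2dv::InsertDim; check the package manual for the exact signatures.

```r
library(CSTools)

exp <- lonlat_temp$exp  # sample hindcast cube shipped with the package

# Assumed interface: subset an s2dv_cube along named dimensions, trimming
# the data array and its coordinates together
exp_small <- CST_Subset(exp, along = c("sdate", "lat", "lon"),
                        indices = list(1:2, 1:10, 1:10))

# Assumed interface: insert a singleton dimension into an s2dv_cube
exp_tagged <- CST_InsertDim(exp_small, posdim = 2, lendim = 1, name = "var")
dim(exp_tagged$data)
```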
+# CSTools 4.1.1 (Release date: 10-11-2022) +**Fixes** +- CST_Analogs corrected input of ClimProjDiags::Subset() +- PlotCombinedMap corrected use of 'cex_bar_titles' parameter +- CST_Anomaly added 'memb_dim', 'dat_dim' and 'ftime_dim' parameters and improved the use of 'dim_anom' + +# CSTools 4.1.0 (Release date: 25-10-2022) +**New features** +- Dependency on package 's2dverification' is changed to 's2dv' +- CST_BiasCorrection new parameters 'memb_dim', 'sdate_dim', 'ncores' +- CST_Calibration is able to calibrate a forecast with the new parameter 'exp_cor' +- CST_QuantileMapping uses cross-validation and provides option to remove NAs; new parameters 'memb_dim', 'sdate_dim', 'window_dim' and 'na.rm'; 'sample_dim' and 'sample_length' are removed +- s2dv_cube() new parameter 'time_dim' + +**Fixes** +- as.s2dv_cube() detects latitude and longitude structure in startR_array object +- Data correction: 'lonlat_data' is renamed to 'lonlat_temp'; 'lonlat_prec' is corrected by one-day shift +- Typo and parameter correction in vignette 'MostLikelyTercile_vignette' +- Figure and result correction in vignette 'RainFARM_vignette' +- PlotMostLikelyQuantileMap() works with s2dv::PlotLayout +
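The 'exp_cor' entry under 4.1.0 above describes the hindcast-to-forecast calibration workflow. A minimal sketch with the package's sample cubes follows, treating a slice of the hindcast as a stand-in forecast; in real use 'exp_cor' would be an independent forecast cube, and any argument beyond exp/obs/exp_cor is assumed from this release's documentation.

```r
library(CSTools)

hcst <- lonlat_temp$exp  # hindcast s2dv_cube
obs  <- lonlat_temp$obs  # matching observations

# Pretend the last start date is an independent forecast; the correction
# trained on the hindcast/observation pairs is applied to it via 'exp_cor'
fcst <- hcst
fcst$data <- hcst$data[, , 6, , , , drop = FALSE]

cal_fcst <- CST_Calibration(exp = hcst, obs = obs, exp_cor = fcst,
                            cal.method = "mse_min")
```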
+# CSTools 4.0.1 (Release date: 05-10-2021) +**New features** +- Dynamical Bias Correction method: `CST_ProxiesAttractors` and `CST_DynBiasCorrection` (optionally `Predictability`) +- CST_BiasCorrection and BiasCorrection allow calibrating a forecast given the calibration in the hindcast by using parameter 'exp_cor'. +- Use cases +- CST_SaveExp includes parameter extra_string +- PlotCombinedMap includes parameter cex_bar_titles + +**Fixes** +- Calibration retains correlation absolute value +- Calibration fixed when cal.method == rpc-based, apply_to == sign, eval.method == 'leave-one-out' and the correlation is not significant +- PlotMostLikelyQuantileMap reorders latitudes of an array provided in 'dots' parameter. + +# CSTools 4.0.0 (Release date: 23-02-2021) +**New features** +- ADAMONT downscaling method: requires CST_AdamontAnalog and CST_AdamontQQCorr functions +- Analogs method using Predictors: requires training_analogs and CST_AnalogsPredictors +- PlotPDFsOLE includes parameters to modify legend style +- CST_RFSlope handles missing values in the temporal dimension and new 'ncores' parameter allows parallel computation +- CST_RFWeights accepts s2dv_cube objects as input and new 'ncores' parameter allows parallel computation +- RFWeights is exposed to users +- CST_RainFARM accepts multi-dimensional slopes and weights and handles missing values in sample dimensions. +- QuantileMapping is exposed to users +- CST_MultiMetric includes 'rpss' metric and it is adapted to s2dv. +- PlotMostLikelyQuantileMap vignette +- PlotTriangles4Categories includes two parameters to adjust axis and margins +- CategoricalEnsCombination is exposed to users +- CST_SplitDim includes parameter 'insert_ftime' +- Analogs vignette +- Data Storage and retrieval vignette + +**Fixes** +- PlotForecastPDF correctly displays tercile labels +- CST_SaveExp correctly saves time units +- CST_SplitDim returns ordered output following ascending order provided in indices when it is numeric +- qmap library moved from Imports to Depends +- CST_QuantileMapping correctly handles exp_cor +- Figures resize option from vignettes has been removed +- Fix Analogs to work with three different criteria +- Vignette PlotForecastPDF updated plots +- Decrease package size compressing vignettes figures and removing areave_data sample + +# CSTools 3.1.0 (Release date: 02-07-2020) +**New features** +- EnsClustering vignette +- EnsClustering has a new parameter 'time_dim' +- CST_BiasCorrection has na.rm parameter +- CST_Anomaly allows smoothing the climatology with the filter.span parameter +- PlotTriangles4Categories new plotting function to convert any 3-d numerical array to a grid of coloured triangles. +- CST_WeatherRegimes/WeatherRegimes and CST_RegimeAssign/RegimeAssign +- PlotPDFsOLE plots two probability density gaussian functions and the optimal linear estimation +- CST_RFTemp/RF_Temp functions available for downscaling temperature +- Weather Regimes vignette + +**Fixes** +- CST_Anomaly handles exp, obs or both +- PlotForecastPDF vignette displays figures correctly +- Calibration function is exposed to users +- MultiMetric vignette fixed typo text description +- RainFARM checks 'slope' is not a vector +- DESCRIPTION specifies the minimum multiApply version required +- EnsClustering has a fixed 'closest_member' output +- PlotCombinedMap handles masks correctly +- CST_SaveExp uses multiApply and saves the time dimension correctly + +# CSTools 3.0.0 (Release date: 10-02-2020) +**New features** +- CST_MergeDims and MergeDims +- Version working with R 3.4.2 +- PlotForecastPDF handles independent terciles, extremes and observations for each panel + +**Fixes** +- CST_Calibration handles missing values +- BEI functions handle missing values + +# CSTools 2.0.0 (Release date: 25-11-2019) +**New features** +- CST_Analogs Analogs downscaling method, +- CST_MultiEOF for multiple variables, +- Ensemble Clustering, +- Categorical Ensemble Combination, +- new Calibration methods included in CST_Calibration, +- Best Estimated Index method, +- CST_QuantileMapping, +- CST_SplitDim to split dimension, if it is a temporal dimension, it can be split by days, months and years or other indices, +- creation and transformation to class 's2dv_cube', +- CST_SaveExp function for saving experiments to be loadable with CST_Load, - Parallelization of RainFARM downscaling - Adding unit tests using testthat for BEI and RainFarm functions - New vignette Best Estimate Index -- Minor fix in CST_BiasCorrection when checking parameter 'obs' - Addapting CST_Load to use 'as.s2dv_cube' function -- Minor fix in data lonlat_prec to be of class 's2dv_cube' -- Minor fix in RainFARM vignette - Adding reference to S2S4E H2020 project into the DESCRIPTION file - Adding NEWS.md file +**Fixes** +- Minor fix in CST_BiasCorrection when checking parameter 'obs' +- Minor fix in data lonlat_prec to be of class 's2dv_cube' +- Minor fix in RainFARM vignette
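Several entries above (the 4.0.0 fixes, 3.0.0 and 2.0.0) touch PlotForecastPDF; for readers unfamiliar with it, here is a minimal call in the style of the package examples, with arbitrary illustration values for the tercile limits:

```r
library(CSTools)

# Two toy forecasts of ten ensemble members each
fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2))
PlotForecastPDF(fcsts, tercile.limits = c(-0.42, 0.42))
```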
-### CSTools 1.0.1 -**Release date on CRAN: 19-06-2019** - +# CSTools 1.0.1 (Release date: 19-06-2019) +**Fixes and new features** - Correcting test of PlotForecastPDF for compatibility with ggplot2 release - New function PlotCombinedMap - Adding reference to MEDSCOPE ERA4CS Project into the DESCRIPTION file @@ -141,11 +150,7 @@ - Minor fix in PlotMostLikelyQuantileMap for bar_titles - MultiModelSkill vignette updated to use PlotCombinedMap - - -### CSTools 1.0.0 -**Release date on CRAN: 24-04-2019** - +# CSTools 1.0.0 (Release date: 24-04-2019) - Features included: Load, Anomaly, MultiMetric, MultivarRMSE, Calibration, BiasCorrection, RainFARM Downscaling, PlotForecastPDF, PlotMostLikelyQuantileMap - Three sample data: lonlat_data, lonlat_prec, areave_data - Unit tests using testthat: BiasCorrection, Calibration, MultiMetric, PlotForecast
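Since every CST_ function in the notes above consumes the 's2dv_cube' class introduced in 2.0.0, a minimal hand-built cube may help orient readers. This assumes the pre-5.0.0 constructor, where 'lon' and 'lat' are passed directly as vectors (the 5.0.0 entry "New s2dv_cube object development" reworks this interface):

```r
library(CSTools)

dat <- array(rnorm(5 * 3 * 4 * 6),
             dim = c(sdate = 5, ftime = 3, lat = 4, lon = 6))
cube <- s2dv_cube(data = dat, lon = seq(0, 25, 5), lat = seq(35, 50, 5))
class(cube)  # "s2dv_cube"
```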
diff --git a/R/AnalogsPred_train.R b/R/AnalogsPred_train.R index c68c48b05cd5991de93ff5f1282fa761148693a1..468c74fa1b97c3b13b304119a990a89dc7bf17d5 100644 --- a/R/AnalogsPred_train.R +++ b/R/AnalogsPred_train.R @@ -5,235 +5,250 @@ #'@author Marta Dominguez Alonso - AEMET, \email{mdomingueza@aemet.es} #'@author Nuria Perez-Zanon - BSC, \email{nuria.perez@bsc.es} #' -#'@description This function caracterizes the synoptic situations in a past period based on -#' low resolution reanalysis data (e.g, ERAInterim 1.5º x 1.5º) and an observational high -#' resolution (HR) dataset (AEMET 5 km gridded daily precipitation and maximum and -#' minimum temperature) (Peral et al., 2017)). -#' The method uses three domains: -#' - peninsular Spain and Balearic Islands domain (5 km resolution): HR domain -#' - synoptic domain (low resolution): it should be centered over Iberian Peninsula and -#' cover enough extension to detect as much synoptic situations as possible. -#' - extended domain (low resolution): it is an extension of the synoptic -#' domain. It is used for 'slp_ext' parameter (see 'slp_lon' and 'slp_lat' below). -#'@param pred List of matrix reanalysis data in a synoptic domain. The list -#' has to contain reanalysis atmospheric variables (instantaneous 12h data) -#' that must be indentify by parenthesis name. -#' For precipitation: -#' - u component of wind at 500 hPa (u500) in m/s -#' - v component of wind at 500 hPa (v500) in m/s -#' - temperature at 500 hPa (t500) in K -#' - temperature at 850 hPa (t850) in K -#' - temperature at 1000 hPa (t1000) in K -#' - geopotential height at 500 hPa (z500) in m -#' - geopotential height at 1000 hPa (z1000) in m -#' - sea level pressure (slp) in hPa -#' - specific humidity at 700 hPa (q700) in g/kg -#' For maximum and minimum temperature: -#' - temperature at 1000 hPa (t1000) in K -#' - sea level pressure (slp) in hPa -#' All matrix must have [time,gridpoint] dimensions. -#' (time = number of training days, gridpoint = number of synoptic gridpoints). +#'@description This function characterizes the synoptic situations in a past +#'period based on low resolution reanalysis data (e.g., ERA-Interim 1.5º x 1.5º) +#'and an observational high resolution (HR) dataset (AEMET 5 km gridded daily +#'precipitation and maximum and minimum temperature) (Peral et al., 2017). +#'The method uses three domains: +#'\itemize{ +#' \item{peninsular Spain and Balearic Islands domain (5 km resolution): HR domain} +#' \item{synoptic domain (low resolution): it should be centered over the Iberian +#' Peninsula and cover a large enough extension to detect as many synoptic +#' situations as possible.} +#' \item{extended domain (low resolution): it is an extension of the synoptic +#' domain. It is used for the 'slp_ext' parameter (see 'slp_lon' and 'slp_lat' +#' below).} +#'} +#'@param pred List of matrix reanalysis data in a synoptic domain. The list has +#' to contain reanalysis atmospheric variables (instantaneous 12h data) that +#' must be identified by the name in parentheses. For precipitation: +#' \itemize{ +#' \item{u component of wind at 500 hPa (u500) in m/s} +#' \item{v component of wind at 500 hPa (v500) in m/s} +#' \item{temperature at 500 hPa (t500) in K} +#' \item{temperature at 850 hPa (t850) in K} +#' \item{temperature at 1000 hPa (t1000) in K} +#' \item{geopotential height at 500 hPa (z500) in m} +#' \item{geopotential height at 1000 hPa (z1000) in m} +#' \item{sea level pressure (slp) in hPa} +#' \item{specific humidity at 700 hPa (q700) in g/kg} +#' } +#' For maximum and minimum temperature: +#' \itemize{ +#' \item{temperature at 1000 hPa (t1000) in K} +#' \item{sea level pressure (slp) in hPa} +#' } +#' All matrices must have [time, gridpoint] dimensions. +#' (time = number of training days, gridpoint = number of synoptic gridpoints). #'@param slp_ext Matrix with atmospheric reanalysis sea level pressure -#' (instantaneous 12h data)(hPa). It has the same resolution as 'pred' parameter -#' but with an extended domain. This domain contains extra degrees (most in the -#' north and west part) compare to synoptic domain. The matrix must have -#' [time,gridpoint] dimensions. -#' (time = number of training days, gridpoint = number of extended gridpoints). +#' (instantaneous 12h data)(hPa). It has the same resolution as the 'pred' +#' parameter but with an extended domain. This domain contains extra degrees +#' (most in the north and west part) compared to the synoptic domain. The +#' matrix must have [time, gridpoint] dimensions. (time = number of training +#' days, gridpoint = number of extended gridpoints). #'@param lon Vector of the synoptic longitude (from (-180º) to 180º), -#' The vector must go from west to east. -#'@param lat Vector of the synoptic latitude. The vector must go from north to south. -#'@param slp_lon Vector of the extended longitude (from (-180º) to 180º) -#' The vector must go from west to east. -#'@param slp_lat Vector of the extended latitude. The vector must go from north to south. +#' The vector must go from west to east. +#'@param lat Vector of the synoptic latitude. The vector must go from north to +#' south. +#'@param slp_lon Vector of the extended longitude (from (-180º) to 180º). +#' The vector must go from west to east. +#'@param slp_lat Vector of the extended latitude. The vector must go from north +#' to south. #'@param var Variable name to downscale. There are two options: 'prec' for -#' precipitation and 'temp' for maximum and minimum temperature. +#' precipitation and 'temp' for maximum and minimum temperature. #'@param HR_path Local path of HR observational files (maestro and pcp/tmx-tmn). -#' For precipitation can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/v2/Serie_AEMET_v2_pcp_1951a202006_txt.tar.gz -#' For maximum and minimum temperature can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmax/Serie_AEMET_v1_tmax_1951a202006_txt.tar.gz and http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmin/Serie_AEMET_v1_tmin_1951a202006_txt.tar.gz respetively. 
-#' Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), longitude (lon), latitude (lat) and -#' altitude (alt) in columns (vector structure). -#' Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5km resolution spanish daily data -#' (precipitation or maximum and minimum temperature from january 1951 to june 2020. See README -#' file for more information. -#' IMPORTANT!: HR observational period must be the same as for reanalysis variables. -#' It is assumed that the training period is smaller than the HR original one (1951-2020), so it is -#' needed to make a new ascii file with the new period and the same structure as original, -#' specifying the training dates ('tdates' parameter) in the name -#' (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for '19810101-19961231' period). -#'@param tdates Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-19961231). -#'@return matrix list (e.g. restrain) as a result of characterize the past synoptic -#' situations and the significant predictors needed to downscale seasonal forecast variables. -#' For precipitation the output includes: -#' um: u component of geostrophic wind in all period (numeric matrix with [time,gridpoint] dimensions) -#' vm: v component of geostrophic wind in all period (numeric matrix with [time,gridpoint] dimensions) -#' nger: number of synoptic situations (integer) -#' gu92: u component of geostrophic wind for each synoptic situation (numeric matrix with -#' [nger,gridpoint] dimensions) -#' gv92: v component of geostrophic wind for each synoptic situation (numeric matrix with -#' [nger,gridpoint] dimensions) -#' gu52: u component of wind at 500 hPa for each synotic situation (numeric matrix with -#' [nger,gridpoint] dimensions) -#' gv52: v component of wind at 500 hPa for each synotic situation (numeric matrix with -#' [nger,gridpoint] dimensions) -#' neni: number of reference centers where predictors are calculated (integer) -#' vdmin: minimum distances between each HR gridpoint and the four nearest synoptic -#' gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints) -#' vref: four nearest synoptic gridpoints to each HR gridpoint (integer matrix with -#' [nptos,4] dimensions) -#' ccm: multiple correlation coeficients (numeric matrix with [nger,nptos] dimensions) -#' indices: -#' - lab_pred: numeric labels of selected predictors (integer matrix -#' with [nger,nptos,11,1] dimensions) -#' - cor_pred: partial correlation of selected predictors (numeric matrix with -#' [nger,nptos,11,2] dimensions) -#' For maximum and minimum temperature the output includes: -#' um: u component of geostrophic wind in all training period (numeric matrix with [time,gridpoint] dimensions) -#' vm: v component of geostrophic wind in all training period (numeric matrix with [time,gridpoint] dimensions) -#' insol: insolation in all training period (numeric vector with [time] dimension) -#' neni: number of reference centers where predictors are calculated (integer) -#' vdmin: minimum distances between each HR gridpoint and the four nearest synoptic -#' gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints) -#' vref: four nearest synoptic gridpoints to each HR gridpoint (integer matrix with -#' [nptos,4] dimensions) -#' -#' The output can directly use as argument to 'CST_AnalogsPredictors' function -#' (e.g. 
resdowns <- CST_AnalogsPredictors(...,restrain)) -#' +#' Precipitation and temperature files can be downloaded from the following link: +#' \url{https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2}. +#' Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), +#' longitude (lon), latitude (lat) and altitude (alt) in columns (vector +#' structure). Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5 km +#' resolution Spanish daily data (precipitation, or maximum and minimum +#' temperature, from January 1951 to June 2020). See the README file for more +#' information. IMPORTANT: the HR observational period must be the same as for +#' the reanalysis variables. It is assumed that the training period is smaller +#' than the HR original one (1951-2020), so a new ascii file with the new +#' period and the same structure as the original must be created, specifying +#' the training dates ('tdates' parameter) in the name (e.g. +#' 'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period). +#'@param tdates Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) +#' (e.g. 19810101-19961231). +#'@return A matrix list (e.g. restrain) as a result of characterizing the past +#'synoptic situations and the significant predictors needed to downscale +#'seasonal forecast variables. For precipitation the output includes: +#'\itemize{ +#' \item{'um': u component of geostrophic wind in all period (numeric matrix +#' with [time, gridpoint] dimensions).} +#' \item{'vm': v component of geostrophic wind in all period (numeric matrix +#' with [time, gridpoint] dimensions).} +#' \item{'nger': number of synoptic situations (integer).} +#' \item{'gu92': u component of geostrophic wind for each synoptic situation +#' (numeric matrix with [nger, gridpoint] dimensions).} +#' \item{'gv92': v component of geostrophic wind for each synoptic situation +#' (numeric matrix with [nger, gridpoint] dimensions).} +#' \item{'gu52': u component of wind at 500 hPa for each synoptic situation +#' (numeric matrix with [nger, gridpoint] dimensions).} +#' \item{'gv52': v component of wind at 500 hPa for each synoptic situation +#' (numeric matrix with [nger, gridpoint] dimensions).} +#' \item{'neni': number of reference centers where predictors are calculated +#' (integer).} +#' \item{'vdmin': minimum distances between each HR gridpoint and the four +#' nearest synoptic gridpoints (numeric matrix with [nptos, 4] dimensions) +#' (nptos = number of HR gridpoints).} +#' \item{'vref': four nearest synoptic gridpoints to each HR gridpoint (integer +#' matrix with [nptos, 4] dimensions).} +#' \item{'ccm': multiple correlation coefficients (numeric matrix with +#' [nger, nptos] dimensions).} +#' \item{'indices': +#' \itemize{ +#' \item{'lab_pred': numeric labels of selected predictors (integer matrix +#' with [nger,nptos,11,1] dimensions).} +#' \item{'cor_pred': partial correlation of selected predictors (numeric +#' matrix with [nger,nptos,11,2] dimensions).} +#' } +#' } +#'} +#'For maximum and minimum temperature the output includes: +#'\itemize{ +#' \item{'um': u component of geostrophic wind in all training period (numeric +#' matrix with [time, gridpoint] dimensions).} +#' \item{'vm': v component of geostrophic wind in all training period (numeric +#' matrix with [time, gridpoint] dimensions).} +#' \item{'insol': insolation in all training period (numeric vector with [time] +#' dimension).} +#' \item{'neni': number of reference centers where predictors are calculated +#' (integer).} +#' \item{'vdmin': minimum distances between each HR gridpoint and the four +#' nearest synoptic gridpoints (numeric matrix with [nptos, 4] dimensions) +#' (nptos = number of HR gridpoints).} +#' \item{'vref': four nearest synoptic gridpoints to each HR gridpoint (integer +#' matrix with [nptos, 4] dimensions).} +#'} +#'The output can be used directly as an argument to the 'CST_AnalogsPredictors' +#'function (e.g. resdowns <- CST_AnalogsPredictors(..., restrain)). #'@importFrom utils read.table -#' #'@useDynLib CSTools -#' #'@export
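Since the documentation above carries no @examples section, here is a hedged invocation sketch. The dummy matrices only satisfy the input checks visible in the function body below; the call still needs the AEMET maestro and data files at 'HR_path' to run to completion, so the path here is a placeholder:

```r
library(CSTools)
set.seed(1)

nd <- 360  # hypothetical: 360 training days on a 3x3 synoptic grid
mk <- function(n = 9) {
  m <- matrix(rnorm(nd * n), nrow = nd, ncol = n)
  names(dim(m)) <- c("time", "gridpoint")  # dimension names are required
  m
}
pred <- list(u500 = mk(), v500 = mk(), t500 = mk(), t850 = mk(),
             t1000 = mk(), z500 = mk(), z1000 = mk(), slp = mk(),
             q700 = mk())

restrain <- training_analogs(pred, slp_ext = mk(16),  # 4x4 extended grid
                             lon = c(-6, -4.5, -3), lat = c(42, 40.5, 39),
                             slp_lon = c(-7.5, -6, -4.5, -3),
                             slp_lat = c(43.5, 42, 40.5, 39),
                             var = "prec", HR_path = "/path/to/HR_obs/",
                             tdates = "19810101-19811226")  # 360-day range
```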
+training_analogs <- function(pred, slp_ext, lon, lat, slp_lon, slp_lat, var, + HR_path, tdates) { -training_analogs <- function(pred, - slp_ext, - lon, - lat, - slp_lon, - slp_lat, - var, - HR_path, - tdates) { - -if (!is.list(pred)) { + if (!is.list(pred)) { stop("Parameter 'pred' must be a list of 'matrix' objects") } -if (!(all(sapply(pred, inherits, 'matrix')))) { + if (!(all(sapply(pred, inherits, 'matrix')))) { stop("Elements of the list in parameter 'pred' must be of the class ", "'matrix'.") } -if (var == "prec") { - if (length(pred) != 9) { - stop("Parameter 'pred' must be a length of 9.") - } else { - if (is.null(names(dim(pred[[1]]))) || - is.null(names(dim(pred[[2]]))) || - is.null(names(dim(pred[[3]]))) || - is.null(names(dim(pred[[4]]))) || - is.null(names(dim(pred[[5]]))) || - is.null(names(dim(pred[[6]]))) || - is.null(names(dim(pred[[7]]))) || - is.null(names(dim(pred[[8]]))) || - is.null(names(dim(pred[[9]])))) { - stop("Parameter 'pred' should have dimmension names.") - } - if (!(any(names(pred) %in% "u500"))) { - stop("Variable 'u500' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "v500"))) { - stop("Variable 'v500' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "t500"))) { - stop("Variable 't500' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "t850"))) { - stop("Variable 't850' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "t1000"))) { - stop("Variable 't1000' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "z500"))) { - stop("Variable 'z500' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "z1000"))) { - stop("Variable 'z1000' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "slp"))) { - stop("Variable 'slp' in pred parameter is missed.") - } else if (!(any(names(pred) %in% "q700"))) { - stop("Variable 'q700' in pred parameter is missed.") + if (var == "prec") { + if (length(pred) != 9) { + stop("Parameter 'pred' must be a length of 9.") + } else { + if (is.null(names(dim(pred[[1]]))) || + is.null(names(dim(pred[[2]]))) || + is.null(names(dim(pred[[3]]))) || + is.null(names(dim(pred[[4]]))) || + is.null(names(dim(pred[[5]]))) || + is.null(names(dim(pred[[6]]))) || + is.null(names(dim(pred[[7]]))) || + is.null(names(dim(pred[[8]]))) || + is.null(names(dim(pred[[9]])))) { + stop("Parameter 'pred' should have dimmension names.") } + if (!(any(names(pred) %in% "u500"))) { + stop("Variable 'u500' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "v500"))) { + stop("Variable 'v500' in pred 
parameter is missed.") + } else if (!(any(names(pred) %in% "t500"))) { + stop("Variable 't500' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "t850"))) { + stop("Variable 't850' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "t1000"))) { + stop("Variable 't1000' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "z500"))) { + stop("Variable 'z500' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "z1000"))) { + stop("Variable 'z1000' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "slp"))) { + stop("Variable 'slp' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "q700"))) { + stop("Variable 'q700' in pred parameter is missed.") + } + } + } else { + if (length(pred) != 2) { + stop("Parameter 'pred' must be a length of 2.") + } else { + if (is.null(names(dim(pred[[1]]))) || + is.null(names(dim(pred[[2]])))) { + stop("Parameter 'pred' should have dimmension names.") + } + if (!(any(names(pred) %in% "t1000"))) { + stop("Variable 't1000' in pred parameter is missed.") + } else if (!(any(names(pred) %in% "slp"))) { + stop("Variable 'slp' in pred parameter is missed.") + } + } + } - if (all((sapply(pred,dim))==dim(pred[[1]])) & - all((sapply(pred,function(pred){names(dim(pred))}))==names(dim(pred[[1]])))) { - dim_pred <- dim(pred[[1]]) - if (!(any(names(dim_pred) %in% "time"))) { - stop("Dimension 'time' in pred parameter is missed.") - } - if (!(any(names(dim_pred) %in% "gridpoint"))) { - stop("Dimension 'gridpoint' in pred parameter is missed.") - } - if (names(dim_pred)[1] == "gridpoint") { - pred <- lapply(pred,aperm) - } else { - pred <- pred - } - } else { - stop("All 'pred' variables must have the same dimensions and name dimensions.") - } - -if (!is.vector(lon) || !is.numeric(lon)) { + if (all((sapply(pred,dim)) == dim(pred[[1]])) & + all((sapply(pred, function(pred){names(dim(pred))})) == names(dim(pred[[1]])))) { + dim_pred <- dim(pred[[1]]) + if (!(any(names(dim_pred) %in% "time"))) { + stop("Dimension 'time' in pred parameter is missed.") + } + if (!(any(names(dim_pred) %in% "gridpoint"))) { + stop("Dimension 'gridpoint' in pred parameter is missed.") + } + if (names(dim_pred)[1] == "gridpoint") { + pred <- lapply(pred,aperm) + } else { + pred <- pred + } + } else { + stop("All 'pred' variables must have the same dimensions and name dimensions.") + } + + if (!is.vector(lon) || !is.numeric(lon)) { stop("Parameter 'lon' must be a numeric vector") -} else { + } else { if (is.unsorted(lon)) { - lon <- sort(lon) - warning("'lon' vector has been sorted in increasing order") + lon <- sort(lon) + warning("'lon' vector has been sorted in increasing order") } -} + } -if (!is.vector(lat) || !is.numeric(lat)) { + if (!is.vector(lat) || !is.numeric(lat)) { stop("Parameter 'lat' must be a numeric vector") -} else { + } else { if (!is.unsorted(lat)) { - lat <- sort(lat, decreasing = TRUE) - warning("'lat' vector has been sorted in decreasing order") + lat <- sort(lat, decreasing = TRUE) + warning("'lat' vector has been sorted in decreasing order") } -} + } -if (!is.character(HR_path)) { + if (!is.character(HR_path)) { stop("Parameter 'HR_path' must be a character.") -} else { - if (!dir.exists(HR_path)) { - stop("'HR_path' directory does not exist") - } -} + } else { + if (!dir.exists(HR_path)) { + stop("'HR_path' directory does not exist") + } + } -if (!is.character(tdates)) { - stop("Parameter 'tdates' must be a character.") -} else { - if (nchar(tdates) != "17") { - stop("Parameter 
'tdates' must be a string with 17 charecters.") - } else { - dateini <- as.Date(substr(tdates,start=1,stop=8),format="%Y%m%d") - dateend <- as.Date(substr(tdates,start=10,stop=18),format="%Y%m%d") - if (dateend <= dateini) { - stop("Parameter 'tdates' must be at least of one day") - } - } -} + if (!is.character(tdates)) { + stop("Parameter 'tdates' must be a character.") + } else { + if (nchar(tdates) != 17) { + stop("Parameter 'tdates' must be a string with 17 characters.") + } else { + dateini <- as.Date(substr(tdates,start=1,stop=8),format="%Y%m%d") + dateend <- as.Date(substr(tdates,start=10,stop=18),format="%Y%m%d") + if (dateend <= dateini) { + stop("Parameter 'tdates' must be at least of one day") + } + } + } -#! REANALYSIS GRID PARAMETERS + #! REANALYSIS GRID PARAMETERS rlon <- c(lon, NA) - c(NA, lon) rlon <- rlon[!is.na(rlon)] @@ -261,9 +276,9 @@ if (!is.character(tdates)) { nlon <- ((lon[length(lon)] - lon[1]) / rlon) + 1 ic <- nlat * nlon -# slp_rlon <- c(slp_lon, NA) - c(NA, slp_lon) slp_rlon <- slp_rlon[!is.na(slp_rlon)] + if (!all(slp_rlon == slp_rlon[1])) { stop("Parameter 'slp_lon' must be in regular grid.") } else { @@ -305,230 +320,230 @@ if (!is.character(tdates)) { stop("All 'pred' variables must be in the same period.") } -#!!!!! COMPROBAR QUE SLP TAMBIEN TIENE EL MISMO NROW + #!!!!! CHECK THAT SLP ALSO HAS THE SAME NROW seqdates <- seq(as.Date(substr(tdates,start=1,stop=8),format="%Y%m%d"), - as.Date(substr(tdates,start=10,stop=18),format="%Y%m%d"),by="days") + as.Date(substr(tdates,start=10,stop=18),format="%Y%m%d"),by="days") month <- format(seqdates,format="%m") day <- format(seqdates,format="%d") -#! TRAINING REANALYSIS VARIABLES -t1000 <- pred[['t1000']] -msl_si <- pred[['slp']] -msl_lr <- slp_ext - -if (var == "prec") { -u500 <- pred[['u500']] -v500 <- pred[['v500']] -t500 <- pred[['t500']] -t850 <- pred[['t850']] -z500 <- pred[['z500']] -z1000 <- pred[['z1000']] -q700 <- pred[['q700']] -} + #! TRAINING REANALYSIS VARIABLES + t1000 <- pred[['t1000']] + msl_si <- pred[['slp']] + msl_lr <- slp_ext + + if (var == "prec") { + u500 <- pred[['u500']] + v500 <- pred[['v500']] + t500 <- pred[['t500']] + t850 <- pred[['t850']] + z500 <- pred[['z500']] + z1000 <- pred[['z1000']] + q700 <- pred[['q700']] + } -#! HIGH-RESOLUTION (HR) OBSERVATIONAL DATASET -maestro_hr_file <- paste(HR_path, "maestro_red_hr_SPAIN.txt",sep="") -if (!file.exists(maestro_hr_file)) { + #! 
HIGH-RESOLUTION (HR) OBSERVATIONAL DATASET + maestro_hr_file <- paste(HR_path, "maestro_red_hr_SPAIN.txt",sep="") + if (!file.exists(maestro_hr_file)) { stop("'maestro_red_hr_SPAIN.txt' does not exist.") -} else { + } else { maestro <- read.table(maestro_hr_file) lon_hr <- unlist(maestro[2]) lat_hr <- unlist(maestro[3]) nptos <- length(readLines(maestro_hr_file)) -} + } -if (var == "prec") { - prec_hr_file <- paste(HR_path, "pcp_red_SPAIN_",tdates,".txt",sep="") - if (!file.exists(prec_hr_file)) { - stop(sprintf("precipitation HR file for %s does not exist.",tdates)) - } else { - nd_hr <- length(readLines(prec_hr_file)) - preprec_hr <- matrix(scan(prec_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE) - prec_hr <- preprec_hr[1:nd_hr,-c(1)] - } -} else { - tmx_hr_file <- paste(HR_path, "tmx_red_SPAIN_",tdates,".txt",sep="") - tmn_hr_file <- paste(HR_path, "tmn_red_SPAIN_",tdates,".txt",sep="") - if (!file.exists(tmx_hr_file)) { - stop(sprintf("maximum temperature HR file for %s does not exist.",tdates)) - } else if (!file.exists(tmn_hr_file)) { - stop(sprintf("minimum temperature HR file for %s does not exist.",tdates)) - } else if (length(readLines(tmx_hr_file)) != length(readLines(tmn_hr_file))) { - stop("maximum and minimum temperature HR observation files must have the same period.") - } else { - nd_hr <- length(readLines(tmx_hr_file)) - pretmx_hr <- matrix(scan(tmx_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE) - tmx_hr <- pretmx_hr[1:nd_hr,-c(1)] - pretmn_hr <- matrix(scan(tmn_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE) - tmn_hr <- pretmn_hr[1:nd_hr,-c(1)] - } -} + if (var == "prec") { + prec_hr_file <- paste(HR_path, "pcp_red_SPAIN_",tdates,".txt",sep="") + if (!file.exists(prec_hr_file)) { + stop(sprintf("precipitation HR file for %s does not exist.",tdates)) + } else { + nd_hr <- length(readLines(prec_hr_file)) + preprec_hr <- matrix(scan(prec_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE) + prec_hr <- preprec_hr[1:nd_hr,-c(1)] + } + } else { + tmx_hr_file <- paste(HR_path, "tmx_red_SPAIN_",tdates,".txt",sep="") + tmn_hr_file <- paste(HR_path, "tmn_red_SPAIN_",tdates,".txt",sep="") + if (!file.exists(tmx_hr_file)) { + stop(sprintf("maximum temperature HR file for %s does not exist.",tdates)) + } else if (!file.exists(tmn_hr_file)) { + stop(sprintf("minimum temperature HR file for %s does not exist.",tdates)) + } else if (length(readLines(tmx_hr_file)) != length(readLines(tmn_hr_file))) { + stop("maximum and minimum temperature HR observation files must have the same period.") + } else { + nd_hr <- length(readLines(tmx_hr_file)) + pretmx_hr <- matrix(scan(tmx_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE) + tmx_hr <- pretmx_hr[1:nd_hr,-c(1)] + pretmn_hr <- matrix(scan(tmn_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE) + tmn_hr <- pretmn_hr[1:nd_hr,-c(1)] + } + } if (nd_hr != nd) { stop("Reanalysis variables and HR observations must have the same period.") } -#! OTHER PARAMETERS that should not be changed -#! Number of analog situations to consider -nanx <- 155 -#! Number of predictors -npx <- 11 - -#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
- -if (var == "prec") { - - prePro <- .Fortran("training_part1_prec", - u500 = as.numeric(u500), - v500 = as.numeric(v500), - t1000 = as.numeric(t1000), - z500 = as.numeric(z500), - z1000 = as.numeric(z1000), - msl_si = as.numeric(msl_si), - msl_lr = as.numeric(msl_lr), - ngridd = as.integer(ngridd), - nlat = as.integer(nlat), - nlon = as.integer(nlon), - ic = as.integer(ic), - nlatt = as.integer(nlatt), - nlont = as.integer(nlont), - id = as.integer(id), - slat = as.numeric(slat), - slon = as.numeric(slon), - rlat = as.numeric(rlat), - rlon = as.numeric(rlon), - slatt = as.numeric(slatt), - slont = as.numeric(slont), - nd = as.integer(nd), - um = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - vm = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - gu92 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - gv92 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - gu52 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - gv52 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - nger = as.integer(1), - PACKAGE = 'CSTools') - - a <- prePro$um - b <- prePro$vm - c <- prePro$gu92[1:prePro$nger,] - d <- prePro$gv92[1:prePro$nger,] - e <- prePro$gu52[1:prePro$nger,] - f <- prePro$gv52[1:prePro$nger,] - - g <- prePro$nger - - predSig <- .Fortran("training_part2_prec", - u500 = as.numeric(u500), - v500 = as.numeric(v500), - t500 = as.numeric(t500), - t850 = as.numeric(t850), - msl_si = as.numeric(msl_si), - q700 = as.numeric(q700), - lon_hr = as.numeric(lon_hr), - lat_hr = as.numeric(lat_hr), - prec_hr = as.numeric(prec_hr), - nanx = as.integer(nanx), - nlat = as.integer(nlat), - nlon = as.integer(nlon), - ic = as.integer(ic), - nlatt = as.integer(nlatt), - nlont = as.integer(nlont), - id = as.integer(id), - slat = as.numeric(slat), - slon = as.numeric(slon), - rlat = as.numeric(rlat), - rlon = as.numeric(rlon), - slatt = as.numeric(slatt), - slont = as.numeric(slont), - nd = as.integer(nd), - um = as.double(a), - vm = as.double(b), - gu92 = as.double(c), - gv92 = as.double(d), - gu52 = as.double(e), - gv52 = as.double(f), - nger = as.integer(g), - vdmin = matrix(as.double(seq(1,nptos*4)),c(nptos,4)), - vref = matrix(as.integer(seq(1,nptos*4)),c(nptos,4)), - neni = as.integer(1), - mi = matrix(as.integer(seq(1,prePro$nger*nptos)),c(prePro$nger,nptos)), - ccm = matrix(as.double(seq(1,prePro$nger*nptos)),c(prePro$nger,nptos)), - lab_pred = matrix(as.integer(seq(1,prePro$nger*nptos*npx)),c(prePro$nger,nptos,npx)), - cor_pred = matrix(as.double(seq(1,prePro$nger*nptos*npx)),c(prePro$nger,nptos,npx)), - PACKAGE = 'CSTools') - - h <- predSig$mi - i <- predSig$ccm - j <- predSig$lab_pred - k <- predSig$cor_pred - l <- predSig$vdmin - m <- predSig$vref - - indices <- array(c(j,k),c(g,nptos,npx,2)) - dimnames(indices)[[4]] <- c("lab_pred","cor_pred") - - output <- list("um" = a, - "vm" = b, - "nger" = g, - "gu92" = c, - "gv92" = d, - "gu52" = e, - "gv52" = f, - "neni" = predSig$neni, - "vdmin" = l, - "vref" = m, - "ccm" = i, - "indices" = indices) -} else { - - prePro <- .Fortran("training_temp", - t1000 = as.numeric(t1000), - msl_si = as.numeric(msl_si), - msl_lr = as.numeric(msl_lr), - lon_hr = as.numeric(lon_hr), - lat_hr = as.numeric(lat_hr), - ngridd = as.integer(ngridd), - nlat = as.integer(nlat), - nlon = as.integer(nlon), - ic = as.integer(ic), - nlatt = as.integer(nlatt), - nlont = as.integer(nlont), - id = as.integer(id), - slat = as.numeric(slat), - slon = as.numeric(slon), - rlat = as.numeric(rlat), - rlon = as.numeric(rlon), - slatt = as.numeric(slatt), - slont = as.numeric(slont), - nd = as.integer(nd), - day = as.integer(day), - 
month = as.integer(month), - um = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - vm = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), - insol = vector(mode="double",length=nd), - vdmin = matrix(as.double(seq(1,nptos*4)),c(nptos,4)), - vref = matrix(as.integer(seq(1,nptos*4)),c(nptos,4)), - neni = as.integer(1), - PACKAGE = 'CSTools') - - a <- prePro$um - b <- prePro$vm - c <- prePro$insol - d <- prePro$vdmin - e <- prePro$vref - f <- prePro$neni - - output <- list("um" = a, - "vm" = b, - "insol" = c, - "vdmin" = d, - "vref" = e, - "neni" = f) - -} + #! OTHER PARAMETERS that should not be changed + #! Number of analog situations to consider + nanx <- 155 + #! Number of predictors + npx <- 11 + + #!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + + if (var == "prec") { + + prePro <- .Fortran("training_part1_prec", + u500 = as.numeric(u500), + v500 = as.numeric(v500), + t1000 = as.numeric(t1000), + z500 = as.numeric(z500), + z1000 = as.numeric(z1000), + msl_si = as.numeric(msl_si), + msl_lr = as.numeric(msl_lr), + ngridd = as.integer(ngridd), + nlat = as.integer(nlat), + nlon = as.integer(nlon), + ic = as.integer(ic), + nlatt = as.integer(nlatt), + nlont = as.integer(nlont), + id = as.integer(id), + slat = as.numeric(slat), + slon = as.numeric(slon), + rlat = as.numeric(rlat), + rlon = as.numeric(rlon), + slatt = as.numeric(slatt), + slont = as.numeric(slont), + nd = as.integer(nd), + um = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + vm = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + gu92 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + gv92 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + gu52 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + gv52 = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + nger = as.integer(1), + PACKAGE = 'CSTools') + + a <- prePro$um + b <- prePro$vm + c <- prePro$gu92[1:prePro$nger,] + d <- prePro$gv92[1:prePro$nger,] + e <- prePro$gu52[1:prePro$nger,] + f <- prePro$gv52[1:prePro$nger,] + + g <- prePro$nger + + predSig <- .Fortran("training_part2_prec", + u500 = as.numeric(u500), + v500 = as.numeric(v500), + t500 = as.numeric(t500), + t850 = as.numeric(t850), + msl_si = as.numeric(msl_si), + q700 = as.numeric(q700), + lon_hr = as.numeric(lon_hr), + lat_hr = as.numeric(lat_hr), + prec_hr = as.numeric(prec_hr), + nanx = as.integer(nanx), + nlat = as.integer(nlat), + nlon = as.integer(nlon), + ic = as.integer(ic), + nlatt = as.integer(nlatt), + nlont = as.integer(nlont), + id = as.integer(id), + slat = as.numeric(slat), + slon = as.numeric(slon), + rlat = as.numeric(rlat), + rlon = as.numeric(rlon), + slatt = as.numeric(slatt), + slont = as.numeric(slont), + nd = as.integer(nd), + um = as.double(a), + vm = as.double(b), + gu92 = as.double(c), + gv92 = as.double(d), + gu52 = as.double(e), + gv52 = as.double(f), + nger = as.integer(g), + vdmin = matrix(as.double(seq(1,nptos*4)),c(nptos,4)), + vref = matrix(as.integer(seq(1,nptos*4)),c(nptos,4)), + neni = as.integer(1), + mi = matrix(as.integer(seq(1,prePro$nger*nptos)),c(prePro$nger,nptos)), + ccm = matrix(as.double(seq(1,prePro$nger*nptos)),c(prePro$nger,nptos)), + lab_pred = matrix(as.integer(seq(1,prePro$nger*nptos*npx)),c(prePro$nger,nptos,npx)), + cor_pred = matrix(as.double(seq(1,prePro$nger*nptos*npx)),c(prePro$nger,nptos,npx)), + PACKAGE = 'CSTools') + + h <- predSig$mi + i <- predSig$ccm + j <- predSig$lab_pred + k <- predSig$cor_pred + l <- predSig$vdmin + m <- predSig$vref + + indices <- array(c(j,k),c(g,nptos,npx,2)) + dimnames(indices)[[4]] <- c("lab_pred","cor_pred") + + output <- list("um" = a, + "vm" = b, + "nger" = g, 
+ "gu92" = c, + "gv92" = d, + "gu52" = e, + "gv52" = f, + "neni" = predSig$neni, + "vdmin" = l, + "vref" = m, + "ccm" = i, + "indices" = indices) + } else { + + prePro <- .Fortran("training_temp", + t1000 = as.numeric(t1000), + msl_si = as.numeric(msl_si), + msl_lr = as.numeric(msl_lr), + lon_hr = as.numeric(lon_hr), + lat_hr = as.numeric(lat_hr), + ngridd = as.integer(ngridd), + nlat = as.integer(nlat), + nlon = as.integer(nlon), + ic = as.integer(ic), + nlatt = as.integer(nlatt), + nlont = as.integer(nlont), + id = as.integer(id), + slat = as.numeric(slat), + slon = as.numeric(slon), + rlat = as.numeric(rlat), + rlon = as.numeric(rlon), + slatt = as.numeric(slatt), + slont = as.numeric(slont), + nd = as.integer(nd), + day = as.integer(day), + month = as.integer(month), + um = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + vm = matrix(as.double(seq(1,nd*ic)),c(nd,ic)), + insol = vector(mode="double",length=nd), + vdmin = matrix(as.double(seq(1,nptos*4)),c(nptos,4)), + vref = matrix(as.integer(seq(1,nptos*4)),c(nptos,4)), + neni = as.integer(1), + PACKAGE = 'CSTools') + + a <- prePro$um + b <- prePro$vm + c <- prePro$insol + d <- prePro$vdmin + e <- prePro$vref + f <- prePro$neni + + output <- list("um" = a, + "vm" = b, + "insol" = c, + "vdmin" = d, + "vref" = e, + "neni" = f) + + } - return(output) + return(output) } diff --git a/R/BEI_PDFBest.R b/R/BEI_PDFBest.R index 46833b2c57bc716f8459c90bfebb1fac9a189350..d2d98af8ccf2b47cc048c1d232c712d2099066f0 100644 --- a/R/BEI_PDFBest.R +++ b/R/BEI_PDFBest.R @@ -6,66 +6,68 @@ #'Probability Density Functions (PDFs) (e.g. NAO index) obtained to combining #'the Index PDFs for two Seasonal Forecast Systems (SFSs), the Best Index #'estimation (see Sanchez-Garcia, E. et al (2019), -#'https://doi.org/10.5194/asr-16-165-2019 for more details about the +#'\doi{10.5194/asr-16-165-2019} for more details about the #'methodology applied to estimate the Best Index). #' #'@references Regionally improved seasonal forecast of precipitation through #'Best estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#' Adv. Sci. Res., 16, 165174, 2019, \doi{10.5194/asr-16-165-2019} #' #'@param index_obs Index (e.g. NAO index) array from an observational database -#' or reanalysis with at least a temporal dimension (by default 'time'), -#' which must be greater than 2. +#' or reanalysis with at least a temporal dimension (by default 'time'), +#' which must be greater than 2. #'@param index_hind1 Index (e.g. NAO index) array from a SFS (named SFS1) -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be greater than 2. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' at the same time. +#' with at least two dimensions (time , member) or (time, statistic). +#' The temporal dimension, by default 'time', must be greater than 2. +#' The dimension 'member' must be greater than 1. The dimension 'statistic' +#' must be equal to 2, for containing the two paramenters of a normal +#' distribution (mean and sd) representing the ensemble of a SFS. It is not +#' possible to have the dimension 'member' and 'statistic' +#' at the same time. #'@param index_hind2 Index (e.g. 
NAO index) array from a SFS (named SFS2) -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be greater than 2. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. -#'@param index_fcst1 (optional, default = NULL) Index (e.g. NAO index) array from forescating of SFS1 -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be equal to 1, the forecast year target. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. -#'@param index_fcst2 (optional, default = NULL) Index (e.g. NAO index) array from forescating of SFS2 -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be equal to 1, the forecast year target. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. +#' with at least two dimensions (time, member) or (time, statistic). +#' The temporal dimension, by default 'time', must be greater than 2. +#' The dimension 'member' must be greater than 1. +#' The dimension 'statistic' must be equal to 2, for containing the two +#' parameters of a normal distribution (mean and sd) representing the ensemble +#' of a SFS. It is not possible to have the dimension 'member' and 'statistic' +#' together. +#'@param index_fcst1 (optional, default = NULL) Index (e.g. NAO index) array +#' from the forecast of SFS1 with at least two dimensions (time, member) or +#' (time, statistic). The temporal dimension, by default 'time', must be equal +#' to 1, the forecast year target. The dimension 'member' must be greater than +#' 1. The dimension 'statistic' must be equal to 2, for containing the two +#' parameters of a normal distribution (mean and sd) representing the ensemble +#' of a SFS. It is not possible to have the dimension 'member' and 'statistic' +#' together. +#'@param index_fcst2 (optional, default = NULL) Index (e.g. NAO index) array +#' from the forecast of SFS2 with at least two dimensions (time, member) or +#' (time, statistic). The temporal dimension, by default 'time', must be equal +#' to 1, the forecast year target. The dimension 'member' must be greater than +#' 1. The dimension 'statistic' must be equal to 2, for containing the two +#' parameters of a normal distribution (mean and sd) representing the ensemble +#' of a SFS. It is not possible to have the dimension 'member' and 'statistic' +#' together. #'@param method_BC A character vector of maximun length 2 indicating the bias -#'correction methodology to be applied on each SFS. If it is 'none' or any of -#'its elements is 'none', the bias correction won't be applied. 
-#'Available methods developped are "ME" (a bias correction scheme based on the -#'mean error or bias between observation and predictions to correct the -#'predicted values), and "LMEV" (a bias correction scheme based on a linear -#'model using ensemble variance of index as predictor). (see Sanchez-Garcia, -#'E. et al (2019), https://doi.org/10.5194/asr-16-165-2019 for more details). +#' correction methodology to be applied on each SFS. If it is 'none' or any of +#' its elements is 'none', the bias correction won't be applied. Available +#' methods developped are "ME" (a bias correction scheme based on the mean +#' error or bias between observation and predictions to correct the predicted +#' values), and "LMEV" (a bias correction scheme based on a linear model using +#' ensemble variance of index as predictor). (see Sanchez-Garcia, E. et al +#' (2019), \doi{10.5194/asr-16-165-2019} for more details). #'@param time_dim_name A character string indicating the name of the temporal -#'dimension, by default 'time'. +#' dimension, by default 'time'. #'@param na.rm Logical (default = FALSE). Should missing values be removed? #' -#' @return BEI_PDFBest() returns an array with the parameters that caracterize -#' the PDFs, with at least a temporal dimension, by default 'time' and dimension -#' 'statistic' equal to 2. -#' The firt statistic is the parameter 'mean' of the PDF for the best estimation -#' combining the two SFSs PDFs. -#' The second statistic is the parameter 'standard deviation' of the PDF for -#' the best estimation combining the two SFSs PDFs. -#' If index_fcst1 and/or index_fcst2 are null, returns the values for hindcast period. -#' Otherwise, it returns the values for a forecast year. -#'@import multiApply +#'@return BEI_PDFBest() returns an array with the parameters that caracterize +#'the PDFs, with at least a temporal dimension, by default 'time' and dimension +#''statistic' equal to 2. The firt statistic is the parameter 'mean' of the PDF +#'for the best estimation combining the two SFSs PDFs. The second statistic is +#'the parameter 'standard deviation' of the PDF for the best estimation +#'combining the two SFSs PDFs. If index_fcst1 and/or index_fcst2 are null, +#'returns the values for hindcast period. Otherwise, it returns the values for a +#'forecast year. 
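For orientation, a minimal standalone sketch of how the returned 'statistic' dimension is typically consumed (toy data, not part of the package; it assumes the (time, statistic) output order shown in the roxygen examples):

# Toy hindcast: 5 years, two SFSs with 4 and 6 members
index_obs <- rnorm(5)
dim(index_obs) <- c(time = 5)
index_hind1 <- rnorm(5 * 4)
dim(index_hind1) <- c(time = 5, member = 4)
index_hind2 <- rnorm(5 * 6)
dim(index_hind2) <- c(time = 5, member = 6)
res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, method_BC = 'none')
# res has dimensions (time = 5, statistic = 2): slot 1 = mean, slot 2 = sd
best_mean <- res[, 1]
best_sd <- res[, 2]
# e.g. probability of a positive index in year 1 under the combined PDF
1 - pnorm(0, mean = best_mean[1], sd = best_sd[1])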
#' #'@examples #' # Example 1 for the BEI_PDFBest function @@ -81,10 +83,7 @@ #' dim(index_fcst2) <- c(time = 1, member = 9, season = 2) #' method_BC <- 'ME' #' res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1, -#' index_fcst2, method_BC) -#' dim(res) -#' # time statistic season -#' # 1 2 2 +#' index_fcst2, method_BC) #' # Example 2 for the BEI_PDFBest function #' index_obs<- rnorm(10, sd = 3) #' dim(index_obs) <- c(time = 5, season = 2) @@ -97,15 +96,14 @@ #' index_fcst2 <- rnorm(18, mean = -0.5, sd = 4) #' dim(index_fcst2) <- c(time = 1, member = 9, season = 2) #' method_BC <- c('LMEV', 'ME') -#' res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1, index_fcst2, method_BC) -#' dim(res) -#' # time statistic season -#' # 1 2 2 +#' res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1, +#' index_fcst2, method_BC) +#'@import multiApply +#'@importFrom verification verify +#'@export - -BEI_PDFBest <- function(index_obs, index_hind1, index_hind2, - index_fcst1 = NULL, index_fcst2 = NULL, method_BC = 'none', - time_dim_name = 'time', na.rm = FALSE) { +BEI_PDFBest <- function(index_obs, index_hind1, index_hind2, index_fcst1 = NULL, + index_fcst2 = NULL, method_BC = 'none', + time_dim_name = 'time', na.rm = FALSE) { if (!is.logical(na.rm)) { stop("Parameter 'na.rm' must be a logical value.") @@ -122,7 +120,7 @@ BEI_PDFBest <- function(index_obs, index_hind1, index_hind2, if (!is.character(method_BC) || !is.vector(method_BC)){ stop("Parameter 'method_BC' must be a character vector.") } - if (!(length(method_BC) == 1 || length(method_BC) == 2)){ + if (!(length(method_BC) == 1 || length(method_BC) == 2)) { stop("Length of parameter 'method_BC' must be 1 or 2.") } if(!all(method_BC %in% c('ME', 'LMEV', 'none'))){ @@ -340,34 +338,28 @@ BEI_PDFBest <- function(index_obs, index_hind1, index_hind2, #' Atomic BEI_PDFBest #'@param pdf_1 Statistics array for the first SFS PDF with one dimension -#' 'statistic' equal to 4. -#' @param pdf_2 Statistics array for the second SFS PDF with one dimension -#' 'statistic' equal to 4. -#' @param bc_dataset1 Logical (default = TRUE). -#' If TRUE the Index PDFs for the first SFS has been computed -#' with bias corrected. -#' @param bc_dataset2 Logical (default = TRUE). -#' If TRUE the Index PDFs for the second SFS has been computed -#' with bias corrected. +#' 'statistic' equal to 4. +#'@param pdf_2 Statistics array for the second SFS PDF with one dimension +#' 'statistic' equal to 4. +#'@param bc_dataset1 Logical (default = TRUE). +#' If TRUE the Index PDFs for the first SFS have been computed with bias +#' correction. +#'@param bc_dataset2 Logical (default = TRUE). If TRUE the Index PDFs for the +#' second SFS have been computed with bias correction. #' -#' @return .BEI_PDFBest returns an array with dimensions (statistic = 2). -#' The firt statistic is the parameter 'mean' of the PDF for the best estimation -#' combining the two SFSs PDF. -#' The second statistic is the parameter 'standard deviation' of the PDF for -#' the best estimation combining the two SFSs PDF. +#'@return .BEI_PDFBest returns an array with dimensions (statistic = 2). +#'The first statistic is the parameter 'mean' of the PDF for the best estimation +#'combining the two SFSs PDF. The second statistic is the parameter 'standard +#'deviation' of the PDF for the best estimation combining the two SFSs PDF.
#' -#' @examples +#'@examples #' # Example for the Atomic BEI_PDFBest function #' pdf_1 <- c(1.1,0.6,1.6,0.9) #' dim(pdf_1) <- c(statistic = 4) #' pdf_2 <- c(1,0.5,1.5,0.8) #' dim(pdf_2) <- c(statistic = 4) #' res <- .BEI_PDFBest(pdf_1, pdf_2, bc_dataset1 = TRUE, bc_dataset2 = FALSE) -#' str(res) -#' dim(res) -#' # statistic -#' # 2 -#' @noRd +#'@noRd .BEI_PDFBest <- function(pdf_1, pdf_2, bc_dataset1 = TRUE, bc_dataset2 = TRUE) { if(bc_dataset1){ # apply bias correction to model 1 @@ -404,33 +396,34 @@ BEI_PDFBest <- function(index_obs, index_hind1, index_hind2, #' (e.g. NAO index) to improve the index estimate from SFSs for a hindcast period. #' #'@references Regionally improved seasonal forecast of precipitation through Best -#' estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#'estimation of winter NAO, Sanchez-Garcia, E. et al., +#'Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} #' #'@param index_hind Index (e.g. NAO index) array from SFSs -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be greater than 2. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. +#' with at least two dimensions (time, member) or (time, statistic). +#' The temporal dimension, by default 'time', must be greater than 2. +#' The dimension 'member' must be greater than 1. +#' The dimension 'statistic' must be equal to 2, for containing the two +#' parameters of a normal distribution (mean and sd) representing the ensemble +#' of a SFS. It is not possible to have the dimension 'member' and 'statistic' +#' together. #'@param index_obs Index (e.g. NAO index) array from an observational database -#' or reanalysis with at least a temporal dimension (by default 'time'), -#' which must be greater than 2. +#' or reanalysis with at least a temporal dimension (by default 'time'), +#' which must be greater than 2. #'@param method A character string indicating which methodology is applied -#' to compute the PDFs. One of "ME" (default) or "LMEV". -#'@param time_dim_name A character string indicating the name of the temporal dimension, by default 'time'. +#' to compute the PDFs. One of "ME" (default) or "LMEV". +#'@param time_dim_name A character string indicating the name of the temporal +#' dimension, by default 'time'. #'@param na.rm Logical (default = FALSE). Should missing values be removed? #' -#'@return an array with at least two dimensions (time, statistic = 4). -#' The firt statistic is the parameter 'mean' of the PDF with not bias corrected -#' The second statistic is the parameter 'standard deviation' of the PDF -#' with not bias corrected -#' The third statistic is the parameter 'mean' of the PDF with bias corrected -#' The fourth statistic is the parameter 'standard deviation' of the PDF -#' with bias corrected -#' @import multiApply -#' @examples +#'@return An array with at least two dimensions (time, statistic = 4). The first +#'statistic is the parameter 'mean' of the PDF without bias correction. +#'The second statistic is the parameter 'standard deviation' of the PDF without +#'bias correction. The third statistic is the parameter 'mean' of the PDF with +#'bias correction.
The fourth statistic is the parameter 'standard deviation' of +#'the PDF with bias correction. +#'@import multiApply +#'@examples #' # Example for the PDFIndexHind function #' # Example 1 #' index_obs <- 1 : (5 * 3 ) @@ -447,10 +440,9 @@ BEI_PDFBest <- function(index_obs, index_hind1, index_hind2, #' index_hind <- 1 : (5 * 2 * 3) #' dim(index_hind) <- c(time = 5, statistic = 2, season = 3) #' res <- PDFIndexHind(index_hind, index_obs) -#' dim(res) -#' # time statistic season -#' # 5 4 3 -#'@noRd +#'@import multiApply +#'@importFrom verification verify +#'@export PDFIndexHind <- function(index_hind, index_obs, method ='ME', time_dim_name = 'time', na.rm = FALSE) { if (!is.character(time_dim_name)) { @@ -529,42 +521,35 @@ PDFIndexHind <- function(index_hind, index_obs, method ='ME', return(PDFstatistics$output1) } - - #' Atomic PDFIndexHind -#' @param index_hind Index (e.g. NAO index) array from a SFS with dimensions -#' (time, member) or (time, statistic) for a hindcast period. -#' The temporal dimension, by default 'time', must be greater -#' than 2. -#' @param index_obs Index (e.g. NAO index) array from an observational dataset -#' or reanalysis with dimension (time). The temporal dimension, -#' by default 'time', must be greater than 2. -#' @param method A character string indicating which methodology is applied -#' to compute the PDF. One of "ME" (default) or "LMEV". -#' @param time_dim_name A character string indicating the name of the temporal dimension, by default 'time'. -#' @param na.rm Logical. Should missing values be removed? -#' @return .PDFIndexHind returns an array with dimensions (time, statistic = 4). -#' The firt statistic is the parameter 'mean' of the PDF with not bias corrected -#' for the hindcast period. -#' The second statistic is the parameter 'standard deviation' of the PDF -#' with not bias corrected for the hindcast period. -#' The third statistic is the parameter 'mean' of the PDF with bias corrected -#' for the hindcast period. -#' The fourth statistic is the parameter 'standard deviation' of the PDF -#' with bias corrected for the hindcast period. -#' @import multiApply -#' @importFrom verification verify -#' @examples +#'@param index_hind Index (e.g. NAO index) array from a SFS with dimensions +#' (time, member) or (time, statistic) for a hindcast period. +#' The temporal dimension, by default 'time', must be greater than 2. +#'@param index_obs Index (e.g. NAO index) array from an observational dataset +#' or reanalysis with dimension (time). The temporal dimension, +#' by default 'time', must be greater than 2. +#'@param method A character string indicating which methodology is applied +#' to compute the PDF. One of "ME" (default) or "LMEV". +#'@param time_dim_name A character string indicating the name of the temporal +#' dimension, by default 'time'. +#'@param na.rm Logical. Should missing values be removed? +#'@return .PDFIndexHind returns an array with dimensions (time, statistic = 4). +#'The first statistic is the parameter 'mean' of the PDF without bias correction +#'for the hindcast period. The second statistic is the parameter 'standard +#'deviation' of the PDF without bias correction for the hindcast period. +#'The third statistic is the parameter 'mean' of the PDF with bias correction +#'for the hindcast period. The fourth statistic is the parameter 'standard +#'deviation' of the PDF with bias correction for the hindcast period.
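Since PDFIndexHind is now exported, a minimal usage sketch of reading its four 'statistic' slots (toy data; it assumes the (time, statistic) dimension order shown in the roxygen examples):

# Six hindcast years, three members
index_obs <- rnorm(6)
dim(index_obs) <- c(time = 6)
index_hind <- rnorm(6 * 3)
dim(index_hind) <- c(time = 6, member = 3)
pdfs <- PDFIndexHind(index_hind, index_obs, method = 'ME')
# 'statistic' slots: 1 = mean and 2 = sd without bias correction,
# 3 = mean and 4 = sd with bias correction
raw_pdf <- pdfs[, 1:2]
corrected_pdf <- pdfs[, 3:4]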
+#'@examples #' # Example for the Atomic PDFIndexHind function #' index_obs <- 1 : 10 #' dim(index_obs) <- c(time = length(index_obs)) #' index_hind <- 1 : (10 * 3) #' dim(index_hind) <- c(time = 10, member = 3) #' res <- .PDFIndexHind(index_hind, index_obs) -#' dim(res) -#' # time statistic -#' # 10 4 -#' @noRd +#'@import multiApply +#'@importFrom verification verify +#'@noRd .PDFIndexHind <- function(index_hind, index_obs, method = 'ME', time_dim_name = 'time', na.rm = FALSE) { dimnameshind <- names(dim(index_hind)) @@ -630,39 +615,40 @@ PDFIndexHind <- function(index_hind, index_obs, method ='ME', #' #'@references Regionally improved seasonal forecast of precipitation through Best #' estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#' Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} #' #'@param index_hind Index (e.g. NAO index) array from SFSs -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be greater than 2. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. +#' with at least two dimensions (time, member) or (time, statistic). +#' The temporal dimension, by default 'time', must be greater than 2. +#' The dimension 'member' must be greater than 1. +#' The dimension 'statistic' must be equal to 2, for containing the two +#' parameters of a normal distribution (mean and sd) representing the ensemble +#' of a SFS. It is not possible to have the dimension 'member' and 'statistic' +#' together. #'@param index_obs Index (e.g. NAO index) array from an observational database -#' or reanalysis with at least a temporal dimension (by default 'time'), -#' which must be greater than 2. +#' or reanalysis with at least a temporal dimension (by default 'time'), +#' which must be greater than 2. -#'@param index_fcst Index (e.g. NAO index) array from SFSs -#' with at least two dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be equal to 1, the forecast year target. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. +#'@param index_fcst Index (e.g. NAO index) array from SFSs with at least two +#' dimensions (time, member) or (time, statistic). The temporal dimension, by +#' default 'time', must be equal to 1, the forecast year target. The dimension +#' 'member' must be greater than 1. The dimension 'statistic' must be equal to +#' 2, for containing the two parameters of a normal distribution (mean and sd) +#' representing the ensemble of a SFS. It is not possible to have the dimension +#' 'member' and 'statistic' together. #'@param method A character string indicating which methodology is applied -#' to compute the PDFs. One of "ME" (default) or "LMEV".
+#'@param time_dim_name A character string indicating the name of the temporal +#' dimension, by default 'time'. #'@param na.rm Logical (default = FALSE). Should missing values be removed? #' -#'@return an array with at least two dimensions (time = 1, statistic = 4). -#' The firt statistic is the parameter 'mean' of the PDF with not bias corrected -#' The second statistic is the parameter 'standard deviation' of the PDF -#' with not bias corrected -#' The third statistic is the parameter 'mean' of the PDF with bias corrected -#' The fourth statistic is the parameter 'standard deviation' of the PDF -#' with bias corrected -#' @import multiApply -#' @examples +#'@return An array with at least two dimensions (time = 1, statistic = 4). +#'The first statistic is the parameter 'mean' of the PDF without bias correction. +#'The second statistic is the parameter 'standard deviation' of the PDF without +#'bias correction. The third statistic is the parameter 'mean' of the PDF with +#'bias correction. The fourth statistic is the parameter 'standard deviation' of +#'the PDF with bias correction. +#'@import multiApply +#'@examples #' # Example for the PDFIndexFcst function #' index_fcst <- 1 : (8 * 4) #' dim(index_fcst) <- c(time = 1, member = 8, season = 4) @@ -787,38 +773,35 @@ PDFIndexFcst <- function(index_hind, index_obs, index_fcst, return(PDFstatistics$output1) } -#' Atomic PDFIndexFcst -#' @param index_hind Index (e.g. NAO index) array from a SFS with dimensions -#' (time, member) or (time, statistic) for a hindcast period. -#' The temporal dimension, by default 'time', must be greater -#' than 2. -#' @param index_obs Index (e.g. NAO index) array from an observational dataset -#' or reanalysis with dimension (time). The temporal dimension, -#' by default 'time', must be greater than 2. -#' @param index_fcst Index (e.g. NAO index) array from SFSs -#' with dimensions (time , member) or (time, statistic). -#' The temporal dimension, by default 'time', must be equal to 1, -#' the forecast year target. -#' The dimension 'member' must be greater than 1. -#' The dimension 'statistic' must be equal to 2, for containing the two paramenters of -#' a normal distribution (mean and sd) representing the ensemble of a SFS. -#' It is not possible to have the dimension 'member' and 'statistic' together. -#' @param method A character string indicating which methodology is applied -#' to compute the PDF. One of "ME" (default) or "LMEV". -#' @param time_dim_name A character string indicating the name of the temporal dimension, by default 'time'. -#' @param na.rm Logical. Should missing values be removed? -#' @return .PDFIndexFcst returns an array with dimensions (time = 1, statistic=4). -#' The firt statistic is the parameter 'mean' of the PDF with not bias corrected -#' for the forecast year. -#' The second statistic is the parameter 'standard deviation' of the PDF -#' with not bias corrected for the forecast year. -#' The third statistic is the parameter 'mean' of the PDF with bias corrected -#' for the forecast year. -#' The fourth statistic is the parameter 'standard deviation' of the PDF -#' with bias corrected for the forecast year. -#' @import multiApply -#' @importFrom verification verify -#' @examples +#'Atomic PDFIndexFcst +#'@param index_hind Index (e.g. NAO index) array from a SFS with dimensions +#' (time, member) or (time, statistic) for a hindcast period. The temporal +#' dimension, by default 'time', must be greater than 2. +#'@param index_obs Index (e.g.
NAO index) array from an observational dataset +#' or reanalysis with dimension (time). The temporal dimension, by default +#' 'time', must be greater than 2. +#'@param index_fcst Index (e.g. NAO index) array from SFSs with dimensions +#' (time, member) or (time, statistic). The temporal dimension, by default +#' 'time', must be equal to 1, the forecast year target. The dimension 'member' +#' must be greater than 1. The dimension 'statistic' must be equal to 2, for +#' containing the two parameters of a normal distribution (mean and sd) +#' representing the ensemble of a SFS. It is not possible to have the dimension +#' 'member' and 'statistic' together. +#'@param method A character string indicating which methodology is applied +#' to compute the PDF. One of "ME" (default) or "LMEV". +#'@param time_dim_name A character string indicating the name of the temporal +#' dimension, by default 'time'. +#'@param na.rm Logical. Should missing values be removed? +#'@return .PDFIndexFcst returns an array with dimensions +#'(time = 1, statistic = 4). The first statistic is the parameter 'mean' of the +#'PDF without bias correction for the forecast year. The second statistic is the +#'parameter 'standard deviation' of the PDF without bias correction for the +#'forecast year. The third statistic is the parameter 'mean' of the PDF with +#'bias correction for the forecast year. The fourth statistic is the parameter +#''standard deviation' of the PDF with bias correction for the forecast year. +#'@import multiApply +#'@importFrom verification verify +#'@examples #' # Example 1 for the Atomic PDFIndexFcst function #' index_fcst <- 1 : (1 * 6) #' dim(index_fcst) <- c(time = 1, member = 6) diff --git a/R/BEI_Weights.R b/R/BEI_Weights.R index 63fb22c311618ecbe8e9259203ccc4796c7d9d9c..e550af1eba5d70846b46528f4fe30cc42955ca4d 100644 --- a/R/BEI_Weights.R +++ b/R/BEI_Weights.R @@ -3,32 +3,30 @@ #'@author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} #' #'@description This function implements the computation to obtain the -#' normalized weights for each member of each Seasonal Forecast Systems (SFS) -#' or dataset using the Probability Density Functions (PDFs) indicated by the -#' parameter 'pdf_weight' (for instance the Best Index estimation obtained -#' using the 'PDFBest' function). The weight of each member is proportional to -#' the probability of its index calculated with the PDF "pdf_weight". +#'normalized weights for each member of each Seasonal Forecast Systems (SFS) +#'or dataset using the Probability Density Functions (PDFs) indicated by the +#'parameter 'pdf_weight' (for instance the Best Index estimation obtained +#'using the 'PDFBest' function). The weight of each member is proportional to +#'the probability of its index calculated with the PDF "pdf_weight". #' -#'@references Regionally improved seasonal forecast of precipitation through Best -#' estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#'@references Regionally improved seasonal forecast of precipitation through +#'Best estimation of winter NAO, Sanchez-Garcia, E. et al., +#'Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} #' #'@param index_weight Index (e.g. NAO index) array, from a dataset of SFSs -#' for a period of years, with at least dimensions 'member'. -#' Additional dimensions, for instance, a temporal dimension as 'time', -#' must have the same lenght in both parameters, -#' 'index_weight' and 'pdf_weight'.
-#' @param pdf_weight Statistics array to define a Gaussian PDF with at least -#' dimensions 'statistic'. -#' The firt statistic is the parameter 'mean' of the PDF and -#' the second statistic is the parameter 'standard deviation' of the PDF. -#' @param time_dim_name A character string indicating the name of the temporal dimension, by default 'time'. +#' for a period of years, with at least dimensions 'member'. +#' Additional dimensions, for instance, a temporal dimension as 'time', +#' must have the same length in both parameters, 'index_weight' and +#' 'pdf_weight'. +#'@param pdf_weight Statistics array to define a Gaussian PDF with at least +#' dimensions 'statistic'. The first statistic is the parameter 'mean' of the PDF +#' and the second statistic is the parameter 'standard deviation' of the PDF. +#'@param time_dim_name A character string indicating the name of the temporal +#' dimension, by default 'time'. #' #'@return BEI_Weights() returns a normalized weights array with the same #' dimensions that index_weight. #' -#'@import multiApply -#' #'@examples #' # Example for the BEI_Weights function #' index_weight <- 1 : (10 * 3 * 5 * 1) @@ -40,6 +38,7 @@ #' # sdate dataset member season #' # 10 3 5 1 #' +#'@import multiApply #'@export BEI_Weights <- function(index_weight, pdf_weight, time_dim_name = 'time') { @@ -98,16 +97,16 @@ BEI_Weights <- function(index_weight, pdf_weight, time_dim_name = 'time') { return(aweights) } -#' Atomic BEI_Weights -#' @param index_weight Index (e.g. NAO index) array from a SFS with dimensions -#' (member) -#' @param pdf_weight Statistics array to define a Gaussian PDF with dimensions -#' (statistic = 2). -#' The firt statistic is the parameter 'mean' of the PDF and -#' the second statistic is the parameter 'standard deviation' of the PDF. -#' @return .BEI_Weights returns an array of with dimensions (member), -#' the normalized weights for each member of a SFS using a Best NAO PDF. -#' @examples +#'Atomic BEI_Weights +#'@param index_weight Index (e.g. NAO index) array from a SFS with dimensions +#' (member) +#'@param pdf_weight Statistics array to define a Gaussian PDF with dimensions +#' (statistic = 2). +#' The first statistic is the parameter 'mean' of the PDF and +#' the second statistic is the parameter 'standard deviation' of the PDF. +#'@return .BEI_Weights returns an array with dimensions (member), +#'the normalized weights for each member of a SFS using a Best NAO PDF.
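For a single ensemble the weighting described above reduces to evaluating each member's index under the Gaussian best-estimate PDF and normalizing; a standalone sketch of the equivalent arithmetic (toy values, not the package internals verbatim):

index_weight <- c(1.3, 3, -1)  # index value of each member
pdf_weight <- c(0.9, 1.5)      # best-estimate PDF: mean = 0.9, sd = 1.5
w <- dnorm(index_weight, mean = pdf_weight[1], sd = pdf_weight[2])
w / sum(w)                     # normalized weights, summing to 1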
+#'@examples #' # Example for the Atomic BEI_Weights function #' index_weight <- c(1.3,3,-1) #' dim(index_weight) <- c(member = 3) @@ -117,7 +116,7 @@ BEI_Weights <- function(index_weight, pdf_weight, time_dim_name = 'time') { #' dim(res) #' # member #' # 3 -#' @noRd +#'@noRd .BEI_Weights <- function(index_weight, pdf_weight) { aweights <- apply(index_weight, 1, dnorm, mean = pdf_weight[1], sd = pdf_weight[2]) dim(aweights) <- dim(index_weight) diff --git a/R/CST_AdamontAnalog.R b/R/CST_AdamontAnalog.R index 4020d1c265f1d6fc33ecfc8f26e6b1764a4743b8..23bdb5315f7b0b0f25ea67b64823a22eb4c35b07 100644 --- a/R/CST_AdamontAnalog.R +++ b/R/CST_AdamontAnalog.R @@ -7,154 +7,192 @@ #'@author Paola Marson, \email{paola.marson@meteo.fr} for PROSNOW version #'@author Lauriane Batté, \email{lauriane.batte@meteo.fr} for CSTools adaptation #' -#'@param exp \itemize{ -#'\item\code{CST_AdamontAnalog}{experiment data an object of class \code{s2dv_cube}, can be output -#'from quantile correction using CST_AdamontQQCorr} -#'\item\code{AdamontAnalog}{experiment data array with named dimension}} -#'@param wt_exp corresponding weather types (same dimensions as \code{exp$data} -#'but lat/lon) -#'@param obs \itemize{ -#'\item\code{CST_AdamontAnalog}{reference data, also of class \code{s2dv_cube}.} -#'\item\code{AdamontAnalog}{reference data array with named dimension.}} -#'Note that lat/lon dimensions need to be the same as \code{exp} -#'@param wt_obs corresponding weather types (same dimensions as \code{obs$data} -#'but lat/lon) -#'@param nanalogs integer defining the number of analog values to return -#'(default: 5) -#'@param method a character string indicating the method used for analog -#'definition -#' Coded are 'pattcorr': pattern correlation -#' 'rain1' (for precip patterns): rain occurrence consistency -#' 'rain01' (for precip patterns): rain occurrence/non -#' occurrence consistency -#'@param thres real number indicating the threshold to define rain -#'occurrence/non occurrence in rain(0)1 -#'@param search_obsdims list of dimensions in \code{obs} along which analogs are -#'searched for -#'@param londim name of longitude dimension -#'@param latdim name of latitude dimension -#'@return analog_vals -#'\itemize{ -#'\item\code{CST_AdamontAnalog}{an object of class \code{s2dv_cube} containing nanalogs -#'analog values for each value of \code{exp} input data} -#'\item\code{AdamontAnalog}{an array containing nanalogs analog values}} +#'@param exp Experiment data an object of class \code{s2dv_cube}, can be output +#' from quantile correction using CST_AdamontQQCorr. +#'@param wt_exp Corresponding weather types (same dimensions as \code{exp$data} +#' but lat/lon). +#'@param obs Reference data, also of class \code{s2dv_cube}. Note that lat/lon +#' dimensions need to be the same as \code{exp}. +#'@param wt_obs Corresponding weather types (same dimensions as \code{obs$data} +#' but lat/lon) +#'@param nanalogs Integer defining the number of analog values to return +#' (default: 5). +#'@param method A character string indicating the method used for analog +#' definition. It can be: +#' \itemize{ +#' \item{'pattcorr': pattern correlation.} +#' \item{'rain1' (for precip patterns): rain occurrence consistency.} +#' \item{'rain01' (for precip patterns): rain occurrence/non occurrence +#' consistency} +#' } +#'@param thres Real number indicating the threshold to define rain +#' occurrence/non occurrence in rain (0)1. +#'@param search_obsdims List of dimensions in \code{obs} along which analogs are +#' searched for. 
+#'@param londim Name of longitude dimension. +#'@param latdim Name of latitude dimension. +#'@return analog_vals An object of class \code{s2dv_cube} containing +#' nanalogs analog values for each value of \code{exp} input data. +#'@examples +#'wt_exp <- sample(1:3, 15*6*3, replace = TRUE) +#'dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3) +#'wt_obs <- sample(1:3, 6*3, replace = TRUE) +#'dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3) +#'exp <- NULL +#'exp$data <- 1 : c(1 * 15 * 6 * 3 * 8 * 8) +#'dim(exp$data) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3, +#' lat = 8, lon = 8) +#'class(exp) <- 's2dv_cube' +#'obs <- NULL +#'obs$data <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8) +#'dim(obs$data) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3, +#' lat = 8, lon = 8) +#'class(obs) <- 's2dv_cube' +#'analog_vals <- CST_AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp, +#' wt_obs = wt_obs, nanalogs = 2) #'@import multiApply #'@importFrom ClimProjDiags Subset -#'@examples -#'\dontrun{ -#'wt_exp <- sample(1:3, 15*6*3, replace=T) -#'dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -#'wt_obs <- sample(1:3, 6*3, replace=T) -#'dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -# analog_vals <- CST_AdamontAnalog(exp=lonlat_temp$exp, obs=lonlat_temp$obs, wt_exp=wt_exp, wt_obs=wt_obs, nanalogs=2) -#'} -#'\dontrun{ -#'wt_exp <- sample(1:3, 15*6*3, replace=T) -#'dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -#'wt_obs <- sample(1:3, 6*3, replace=T) -#'dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -# analog_vals <- AdamontAnalog(exp=lonlat_temp$exp$data, -#' obs=lonlat_temp$obs$data, wt_exp=wt_exp, wt_obs=wt_obs, nanalogs=2) -#'} - +#'@export CST_AdamontAnalog <- function(exp, obs, wt_exp, wt_obs, nanalogs, - method = 'pattcorr', thres = NULL, - search_obsdims = c('member', 'sdate', 'ftime'), - londim = 'lon', latdim = 'lat') { - + method = 'pattcorr', thres = NULL, + search_obsdims = c('member', 'sdate', 'ftime'), + londim = 'lon', latdim = 'lat') { dimnames <- names(dim(obs$data)) dimnamesexp <- names(dim(exp$data)) + if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) { - stop("Inputs 'exp' and 'obs' must be of class 's2dv_cube', ", - "as output by CSTools::CST_Load.") + stop("Inputs 'exp' and 'obs' must be of class 's2dv_cube', ", + "as output by CSTools::CST_Load.") } if (!(method %in% c('pattcorr','rain1','rain01'))) { - stop("Input parameter 'method' must be 'pattcorr', 'rain1', or 'rain01'") + stop("Input parameter 'method' must be 'pattcorr', 'rain1', or 'rain01'") } - if (is.null(nanalogs)){ + if (is.null(nanalogs)) { nanalogs <- 5 } if (!(latdim %in% dimnames) || !(londim %in% dimnames)){ - stop("'londim' or 'latdim' input doesn't match with 'obs$data' dimension", - " names") + stop("'londim' or 'latdim' input doesn't match with 'obs$data' dimension", + " names") } if (!(latdim %in% dimnamesexp) || !(londim %in% dimnamesexp)){ - stop("'londim' or 'latdim' input doesn't match with 'exp$data' dimension", - " names") + stop("'londim' or 'latdim' input doesn't match with 'exp$data' dimension", + " names") } - if (!all(search_obsdims %in% dimnames)){ - stop("Names in parameter 'search_obsdims' should match 'obs$data' ", - "dimension names.") + if (!all(search_obsdims %in% dimnames)) { + stop("Names in parameter 'search_obsdims' should match 'obs$data' ", + "dimension names.") } - if (!all(dim(wt_exp) %in% dim(exp$data))){ - stop("Dimensions for 'wt_exp' should match 'exp$data' except lat/lon") + if (!all(dim(wt_exp) 
%in% dim(exp$data))) { + stop("Dimensions for 'wt_exp' should match 'exp$data' except lat/lon") } - if (!all(dim(wt_obs) %in% dim(obs$data))){ - stop("Dimensions for 'wt_obs' should match 'obs$data' except lat/lon") + if (!all(dim(wt_obs) %in% dim(obs$data))) { + stop("Dimensions for 'wt_obs' should match 'obs$data' except lat/lon") } - plat_exp <- which(dimnamesexp==latdim) - plon_exp <- which(dimnamesexp==londim) - plat_obs <- which(dimnames==latdim) - plon_obs <- which(dimnames==londim) - if ((dim(obs$data)[plon_obs]!=dim(exp$data)[plon_exp]) || - (dim(obs$data)[plat_obs]!=dim(exp$data)[plat_exp])){ + plat_exp <- which(dimnamesexp == latdim) + plon_exp <- which(dimnamesexp == londim) + plat_obs <- which(dimnames == latdim) + plon_obs <- which(dimnames == londim) + if ((dim(obs$data)[plon_obs] != dim(exp$data)[plon_exp]) || + (dim(obs$data)[plat_obs] != dim(exp$data)[plat_exp])){ stop("Element 'data' from parameters 'obs' and 'exp' should have", - "same lon / lat dimensions if working with regular grids.") + "same lon / lat dimensions if working with regular grids.") } # End of sanity checks; call AdamontAnalog function analog_vals <- AdamontAnalog(exp = exp$data, obs = obs$data, wt_exp = wt_exp, - wt_obs=wt_obs, nanalogs = nanalogs, - method = method, thres = thres, - search_obsdims = search_obsdims, londim = londim, - latdim = latdim ) - + wt_obs = wt_obs, nanalogs = nanalogs, + method = method, thres = thres, + search_obsdims = search_obsdims, londim = londim, + latdim = latdim ) return(analog_vals) } - -#'AdamontAnalog finds analogous data in the reference dataset to experiment data -#'based on weather types +#'AdamontAnalog finds analogous data in the reference dataset to experiment +#'data based on weather types #' +#'@description This function searches for analogs in a reference dataset for +#'experiment data, based on corresponding weather types. The experiment data is +#'typically a hindcast, observations are typically provided by reanalysis data. +#'@author Paola Marson, \email{paola.marson@meteo.fr} for PROSNOW version +#'@author Lauriane Batté, \email{lauriane.batte@meteo.fr} for CSTools adaptation +#' +#' +#'@param exp A multidimensional array with named dimensions containing the +#' experiment data. +#'@param wt_exp Corresponding weather types (same dimensions as \code{exp$data} +#' but lat/lon). +#'@param obs A multidimensional array with named dimensions containing the +#' reference data. Note that lat/lon dimensions need to be the same as +#' \code{exp}. +#'@param wt_obs Corresponding weather types (same dimensions as \code{obs$data} +#' but lat/lon). +#'@param nanalogs Integer defining the number of analog values to return +#' (default: 5). +#'@param method A character string indicating the method used for analog +#' definition. It can be: +#' \itemize{ +#' \item{'pattcorr': pattern correlation.} +#' \item{'rain1' (for precip patterns): rain occurrence consistency.} +#' \item{'rain01' (for precip patterns): rain occurrence/non occurrence +#' consistency} +#' } +#'@param thres Real number indicating the threshold to define rain +#' occurrence/non occurrence in rain (0)1. +#'@param search_obsdims List of dimensions in \code{obs} along which analogs are +#' searched for. +#'@param londim Name of longitude dimension. +#'@param latdim Name of latitude dimension. +#'@return analog_vals An array containing nanalogs analog values. 
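Note that the analogs come back along a new 'analog' dimension of length nanalogs (the searched 'time' dimension is renamed, as the reshaping code below shows). A toy check, reusing the shapes of the roxygen example that follows:

set.seed(1)  # for reproducibility of the sampled weather types
wt_exp <- sample(1:3, 15 * 6 * 3, replace = TRUE)
dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3)
wt_obs <- sample(1:3, 6 * 3, replace = TRUE)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3)
exp <- array(rnorm(15 * 6 * 3 * 8 * 8),
             c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 8, lon = 8))
obs <- array(rnorm(6 * 3 * 8 * 8),
             c(dataset = 1, member = 1, sdate = 6, ftime = 3, lat = 8, lon = 8))
analog_vals <- AdamontAnalog(exp, obs, wt_exp = wt_exp, wt_obs = wt_obs,
                             nanalogs = 2)
names(dim(analog_vals))  # includes 'analog', here of length 2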
+#'@examples +#'wt_exp <- sample(1:3, 15*6*3, replace = TRUE) +#'dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3) +#'wt_obs <- sample(1:3, 6*3, replace = TRUE) +#'dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3) +#'exp <- 1 : c(1 * 15 * 6 * 3 * 8 * 8) +#'dim(exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 8, lon = 8) +#'obs <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8) +#'dim(obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3, lat = 8, lon = 8) +#'analog_vals <- AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp, +#' wt_obs = wt_obs, nanalogs = 2) #'@import multiApply #'@importFrom ClimProjDiags Subset #'@rdname CST_AdamontAnalog #'@export -AdamontAnalog <- function(exp, obs, wt_exp, wt_obs, nanalogs=5, - method = 'pattcorr', thres = NULL, - search_obsdims = c('member', 'sdate', 'ftime'), - londim = 'lon', latdim = 'lat') { +AdamontAnalog <- function(exp, obs, wt_exp, wt_obs, nanalogs = 5, + method = 'pattcorr', thres = NULL, + search_obsdims = c('member', 'sdate', 'ftime'), + londim = 'lon', latdim = 'lat') { # exp: lat, lon, sdate, ftime, member # obs: lat, lon, dims for searching 'sdate' 'ftime'... # wt_exp: sdate, ftime, member # wt_obs: the dims for searching dimnames <- names(dim(obs)) dimnamesexp <- names(dim(exp)) - if (method %in% c('rain1','rain01') & is.null(thres)){ - stop("Threshold 'thres' must be defined with methods 'rain1' and 'rain01'") + if (method %in% c('rain1','rain01') & is.null(thres)) { + stop("Threshold 'thres' must be defined with methods 'rain1' and 'rain01'") } - if (method == 'pattcorr' & !is.null(thres)){ - warning("Parameter 'thres' is not used with method 'pattcorr'.") + if (method == 'pattcorr' & !is.null(thres)) { + warning("Parameter 'thres' is not used with method 'pattcorr'.") } - if (!(latdim %in% dimnamesexp) || !(londim %in% dimnamesexp)){ - stop("'londim' or 'latdim' input doesn't match with 'exp' dimension names") + if (!(latdim %in% dimnamesexp) || !(londim %in% dimnamesexp)) { + stop("'londim' or 'latdim' input doesn't match with 'exp' dimension names") } # Position of lat/lon dimensions in exp data poslatexp <- which(dimnamesexp == latdim) poslonexp <- which(dimnamesexp == londim) poslatobs <- which(dimnames == latdim) poslonobs <- which(dimnames == londim) - if (!all(search_obsdims %in% dimnames)){ - stop("Names in parameter 'search_obsdims' should match 'obs' ", - "dimension names.") + if (!all(search_obsdims %in% dimnames)) { + stop("Names in parameter 'search_obsdims' should match 'obs' ", + "dimension names.") } if (!all(dim(wt_exp) %in% dim(exp))){ - stop("Dimensions for 'wt_exp' should match 'exp' except lat/lon") + stop("Dimensions for 'wt_exp' should match 'exp' except lat/lon") } if (!all(dim(wt_obs) %in% dim(obs))){ - stop("Dimensions for 'wt_obs' should match 'obs' except lat/lon") + stop("Dimensions for 'wt_obs' should match 'obs' except lat/lon") } if ((dim(obs)[poslonobs]!=dim(exp)[poslonexp]) || (dim(obs)[poslatobs]!=dim(exp)[poslatexp])){ - stop("Parameters 'obs' and 'exp' should have same lon / lat dimensions.") + stop("Parameters 'obs' and 'exp' should have same lon / lat dimensions.") } ## Reshaping obs: @@ -162,9 +200,9 @@ AdamontAnalog <- function(exp, obs, wt_exp, wt_obs, nanalogs=5, if (length(search_obsdims) > 1) { for (i in 1:(length(search_obsdims) - 1)) { obs <- MergeDims(obs, search_obsdims[i:(i + 1)], - rename_dim = search_obsdims[i + 1]) + rename_dim = search_obsdims[i + 1]) wt_obs <- MergeDims(wt_obs, search_obsdims[i:(i + 1)], - rename_dim = search_obsdims[i + 1]) + 
rename_dim = search_obsdims[i + 1]) } } names(dim(obs))[which(names(dim(obs)) == search_obsdims[length(search_obsdims)])] <- 'time' @@ -177,20 +215,20 @@ AdamontAnalog <- function(exp, obs, wt_exp, wt_obs, nanalogs=5, target_dims = list(c(londim, latdim), c(londim, latdim, 'time', 'type'), NULL), - .analogs, method = method, thres = thres)$output1 + .aanalogs, method = method, thres = thres)$output1 # Reshaping output: - analog_vals <- Subset(analog_vals,along='type',indices=1,drop='selected') + analog_vals <- Subset(analog_vals, along = 'type', indices = 1, drop = 'selected') poslat <- which(names(dim(analog_vals)) == latdim) poslon <- which(names(dim(analog_vals)) == londim) postime <- which(names(dim(analog_vals)) == 'time') # Dimension with N analogs pos <- 1:length(dim(analog_vals)) if (poslatexp > poslonexp){ analog_vals <- aperm(analog_vals,c(pos[-c(poslon,poslat,postime)], - postime,poslon,poslat)) + postime,poslon,poslat)) } else { analog_vals <- aperm(analog_vals,c(pos[-c(poslon,poslat,postime)], - postime,poslat,poslon)) + postime,poslat,poslon)) } # Renaming 'time' dim to 'analog' names(dim(analog_vals))[which(names(dim(analog_vals)) == 'time')] <- 'analog' @@ -198,23 +236,24 @@ AdamontAnalog <- function(exp, obs, wt_exp, wt_obs, nanalogs=5, } -.analogs <- function(exp, obs, wt_exp, nanalogs = 5, method = 'pattcorr', - thres = NULL, londimexp = 'lon', latdimexp = 'lat', - londimobs = 'lon', latdimobs = 'lat') { +.aanalogs <- function(exp, obs, wt_exp, nanalogs = 5, method = 'pattcorr', + thres = NULL, londimexp = 'lon', latdimexp = 'lat', + londimobs = 'lon', latdimobs = 'lat') { # exp: lon, lat # obs: lon, lat, time, wt # wt_exp: wt single scalar search_analog <- switch(method, 'rain1' = .rain1, 'rain01' = .rain01, - 'pattcorr' = .pattcor, + 'pattcorr' = .pattcor, stop(paste0("Adamont Analog function only supports ", - "methods 'rain1', 'rain01', 'pattcorr'"))) + "methods 'rain1', 'rain01', 'pattcorr'"))) obs <- Subset(obs, along = 'type', indices = wt_exp) accuracy <- Apply(list(exp, obs), target_dims = list(c(londimexp, latdimexp), c(londimobs, latdimobs)), search_analog, thres = thres)$output1 - obs <- Subset(obs, along = 'time', indices = order(accuracy, decreasing = TRUE)[1:nanalogs]) + obs <- Subset(obs, along = 'time', + indices = order(accuracy, decreasing = TRUE)[1:nanalogs]) return(obs) } diff --git a/R/CST_AdamontQQCorr.R b/R/CST_AdamontQQCorr.R index 71a768e21437283e5134241aab9eefd61d8e4734..3309180358435ca8793c437c698c239b81c4be4e 100644 --- a/R/CST_AdamontQQCorr.R +++ b/R/CST_AdamontQQCorr.R @@ -8,94 +8,106 @@ #'@author Paola Marson, \email{paola.marson@meteo.fr} #'@author Gildas Dayon, \email{gildas.dayon@meteo.fr} #' -#'@param exp experiment data an object of class \code{s2dv_cube} -#'@param wt_exp corresponding weather types (same dimensions as \code{exp$data} -#' but lat/lon) -#'@param obs reference data, also of class \code{s2dv_cube}. 
lat/lon dimensions -#' can differ from \code{exp} if non rectilinear latlon grids are used, -#' in which case regrid should be set to TRUE and .NearestNeighbors \code{NN} -#' output should be provided -#'@param wt_obs corresponding weather types (same dimensions as \code{obs} but -#'lat/lon) -#'@param corrdims list of dimensions in \code{exp} for which quantile mapping -#' correction is applied -#'@param londim character name of longitude dimension in \code{exp} and -#' \code{obs} -#'@param latdim character name of latitude dimension in \code{exp} and -#' \code{obs} +#'@param exp Experiment data an object of class \code{s2dv_cube}. +#'@param wt_exp Corresponding weather types (same dimensions as \code{exp$data} +#' but lat/lon). +#'@param obs Reference data, also of class \code{s2dv_cube}. lat/lon dimensions +#' can differ from \code{exp} if non rectilinear latlon grids are used, +#' in which case regrid should be set to TRUE and .NearestNeighbors \code{NN} +#' output should be provided. +#'@param wt_obs Corresponding weather types (same dimensions as \code{obs} but +#' lat/lon). +#'@param corrdims List of dimensions in \code{exp} for which quantile mapping +#' correction is applied. +#'@param londim Character name of longitude dimension in \code{exp} and +#' \code{obs}. +#'@param latdim Character name of latitude dimension in \code{exp} and +#' \code{obs}. #' -#'@return an object of class \code{s2dv_cube} containing experiment data on the -#' lat/lon grid of \code{obs} input data, corrected by quantile mapping -#' depending on the weather types \code{wt_exp} +#'@return An object of class \code{s2dv_cube} containing experiment data on the +#'lat/lon grid of \code{obs} input data, corrected by quantile mapping +#'depending on the weather types \code{wt_exp}. 
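As the revised body below makes explicit, the function now returns the input cube with only its $data slot replaced, so the s2dv_cube class and metadata are preserved. A quick check, assuming the toy exp/obs cubes constructed in the @examples block that follows:

exp_corr <- CST_AdamontQQCorr(exp = exp, wt_exp = wt_exp, obs = obs,
                              wt_obs = wt_obs,
                              corrdims = c('dataset', 'member', 'sdate', 'ftime'))
class(exp_corr)     # still "s2dv_cube"
dim(exp_corr$data)  # same named dimensions as exp$data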
#' +#'@examples +#'wt_exp <- c(1,1,2,3,3,2,2,1,1,2,2,3) +#'dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +#'wt_obs <- c(3,3,1,2,2,2,2,1,3,1,1,2) +#'dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +#'exp <- NULL +#'exp$data <- 1 : c(1 * 1 * 4 * 3 * 4 * 4) +#'dim(exp$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, +#' lat = 4, lon = 4) +#'class(exp) <- 's2dv_cube' +#'obs <- NULL +#'obs$data <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4) +#'dim(obs$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, +#' lat = 4, lon = 4) +#'class(obs) <- 's2dv_cube' +#'exp_corr <- CST_AdamontQQCorr(exp = exp, wt_exp = wt_exp, +#' obs = obs, wt_obs = wt_obs, +#' corrdims = c('dataset','member','sdate','ftime')) #'@import qmap #'@importFrom ClimProjDiags Subset #'@import multiApply #'@import abind -#'@examples -#'\dontrun{ -#'wt_exp <- sample(1:3, 15*6*3, replace=T) -#'dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -#'wt_obs <- sample(1:3, 6*3, replace=T) -#'dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -#'exp_corr <- CST_AdamontQQCorr(exp = lonlat_temp$exp, wt_exp = wt_exp, -#' obs=lonlat_temp$obs, wt_obs = wt_obs, -#' corrdims = c('dataset','member','sdate','ftime')) -#'} +#'@export CST_AdamontQQCorr <- function(exp, wt_exp, obs, wt_obs, - corrdims = c('member','sdate','ftime'), - londim='lon', latdim='lat') { + corrdims = c('member', 'sdate', 'ftime'), + londim = 'lon', latdim = 'lat') { - if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')){ - stop("Inputs 'exp' and 'obs' must be of class 's2dv_cube', ", - "as output by CSTools::CST_Load.") - } - dimnames <- names(dim(obs$data)) - dimnamesexp <- names(dim(exp$data)) - if (!(latdim %in% dimnames) || !(londim %in% dimnames)){ - stop("'londim' or 'latdim' input doesn't match with 'obs$data' dimension", - " names") - } - if (!(latdim %in% dimnamesexp) || !(londim %in% dimnamesexp)){ - stop("'londim' or 'latdim' input doesn't match with 'exp$data' dimension", - " names") - } - if (!(('time' %in% corrdims) || ('ftime' %in% corrdims))){ - warning("Forecast time should be one of the dimensions for the correction - specified in corrdims input list") - } - if (!all(corrdims %in% dimnamesexp)){ - stop("Names in parameter 'corrdims' should match input dimension names.") - } - if (!all(dim(wt_exp) %in% dim(exp$data))){ - stop("Dimensions for 'wt_exp' should match 'exp$data' except lat/lon") - } - if (!all(dim(wt_obs) %in% dim(obs$data))){ - stop("Dimensions for 'wt_obs' should match 'obs$data' except lat/lon") - } - if ((length(dim(exp$lon))==2) || (length(dim(obs$lon))==2)){ - myNN <- .NearestNeighbors(exp=exp, obs=obs, method='ADA') - exp_corr <- AdamontQQCorr(exp=exp$data, wt_exp=wt_exp, obs=obs$data, - wt_obs=wt_obs, corrdims=corrdims, - londim=londim, latdim=latdim, - regrid=TRUE, NN=myNN) - } else { - ## If not (standard case) - ## exp$data lat/lon dimensions should match obs$data - plat_exp <- which(dimnamesexp==latdim) - plon_exp <- which(dimnamesexp==londim) - plat_obs <- which(dimnames==latdim) - plon_obs <- which(dimnames==londim) - if ((dim(obs$data)[plon_obs]!=dim(exp$data)[plon_exp]) || - (dim(obs$data)[plat_obs]!=dim(exp$data)[plat_exp])){ - stop("Element 'data' from parameters 'obs' and 'exp' should have", - "same lon / lat dimensions if working with regular grids.") - } - exp_corr <- AdamontQQCorr(exp=exp$data, wt_exp=wt_exp, obs=obs$data, - wt_obs=wt_obs, corrdims=corrdims, - londim=londim, latdim=latdim, regrid=FALSE) - } - return(exp_corr) + if (!inherits(exp, 's2dv_cube') || 
!inherits(obs, 's2dv_cube')){ + stop("Inputs 'exp' and 'obs' must be of class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + } + dimnames <- names(dim(obs$data)) + dimnamesexp <- names(dim(exp$data)) + if (!(latdim %in% dimnames) || !(londim %in% dimnames)) { + stop("'londim' or 'latdim' input doesn't match with 'obs$data' dimension", + " names") + } + if (!(latdim %in% dimnamesexp) || !(londim %in% dimnamesexp)) { + stop("'londim' or 'latdim' input doesn't match with 'exp$data' dimension", + " names") + } + if (!(('time' %in% corrdims) || ('ftime' %in% corrdims))) { + warning("Forecast time should be one of the dimensions for the correction ", + "specified in corrdims input list") + } + if (!all(corrdims %in% dimnamesexp)) { + stop("Names in parameter 'corrdims' should match input dimension names.") + } + if (!all(dim(wt_exp) %in% dim(exp$data))) { + stop("Dimensions for 'wt_exp' should match 'exp$data' except lat/lon") + } + if (!all(dim(wt_obs) %in% dim(obs$data))) { + stop("Dimensions for 'wt_obs' should match 'obs$data' except lat/lon") + } + if ((length(dim(exp$coords[[londim]])) == 2) || + (length(dim(obs$coords[[londim]])) == 2)) { + myNN <- .NearestNeighbors(exp = exp, obs = obs, method = 'ADA') + exp_corr <- AdamontQQCorr(exp = exp$data, wt_exp = wt_exp, obs = obs$data, + wt_obs = wt_obs, corrdims = corrdims, + londim = londim, latdim = latdim, regrid = TRUE, + NN = myNN) + } else { + ## If not (standard case) + ## exp$data lat/lon dimensions should match obs$data + plat_exp <- which(dimnamesexp == latdim) + plon_exp <- which(dimnamesexp == londim) + plat_obs <- which(dimnames == latdim) + plon_obs <- which(dimnames == londim) + if ((dim(obs$data)[plon_obs] != dim(exp$data)[plon_exp]) || + (dim(obs$data)[plat_obs] != dim(exp$data)[plat_exp])) { + stop("Element 'data' from parameters 'obs' and 'exp' should have ", + "same lon / lat dimensions if working with regular grids.") + } + exp_corr <- AdamontQQCorr(exp = exp$data, wt_exp = wt_exp, obs = obs$data, + wt_obs = wt_obs, corrdims = corrdims, + londim = londim, latdim = latdim, + regrid = FALSE) + } + exp$data <- exp_corr + return(exp) } @@ -108,286 +120,335 @@ CST_AdamontQQCorr <- function(exp, wt_exp, obs, wt_obs, #'@author Paola Marson, \email{paola.marson@meteo.fr} for PROSNOW version #'@author Lauriane Batté, \email{lauriane.batte@meteo.fr} for CSTools adaptation #' -#'@param exp array with named dimensions (such as \code{$data} array of -#'experiment data from an object of class \code{s2dv_cube}) -#'@param wt_exp corresponding weather types (same dimensions as \code{exp} but -#'lat/lon) -#'@param obs array with named dimensions with reference data (can also be -#'\code{$data} array of class \code{s2dv_cube}). 
lat/lon dimensions can differ -#'from \code{exp} if non rectilinear latlon grids are used, in which case -#'regrid should be set to TRUE and .NearestNeighbors \code{NN} output should be -#'provided -#'@param wt_obs corresponding weather types (same dimensions as \code{obs} but -#'lat/lon) -#'@param corrdims list of dimensions in \code{exp} for which quantile mapping -#'correction is applied -#'@param londim character name of longitude dimension in \code{exp} and -#'\code{obs} -#'@param latdim character name of latitude dimension in \code{exp} and -#'\code{obs} -#'@param regrid (optional) boolean indicating whether .NearestNeighbors -#'regridding is needed -#'@param NN (optional, if regrid=TRUE) list (output from .NearestNeighbors) -#'maps (nlat, nlon) onto (nlat_o, nlon_o) +#'@param exp Array with named dimensions (such as \code{$data} array of +#' experiment data from an object of class \code{s2dv_cube}). +#'@param wt_exp Corresponding weather types (same dimensions as \code{exp} but +#' lat/lon). +#'@param obs Array with named dimensions with reference data (can also be +#' \code{$data} array of class \code{s2dv_cube}). lat/lon dimensions can differ +#' from \code{exp} if non rectilinear latlon grids are used, in which case +#' regrid should be set to TRUE and .NearestNeighbors \code{NN} output should +#' be provided. +#'@param wt_obs Corresponding weather types (same dimensions as \code{obs} but +#' lat/lon). +#'@param corrdims List of dimensions in \code{exp} for which quantile mapping +#' correction is applied. +#'@param londim Character name of longitude dimension in \code{exp} and +#' \code{obs}. +#'@param latdim Character name of latitude dimension in \code{exp} and +#' \code{obs}. +#'@param regrid (optional) Boolean indicating whether .NearestNeighbors +#' regridding is needed. +#'@param NN (optional, if regrid = TRUE) List (output from .NearestNeighbors) +#' maps (nlat, nlon) onto (nlat_o, nlon_o). 
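The per-weather-type correction in the function body below rests on collapsing the correction dimensions into one 'time' axis with MergeDims and then splitting that axis by weather type with SplitDim. A tiny standalone sketch of the reshaping (toy array; equal counts per type are used so no NA padding is needed):

x <- array(rnorm(24), c(sdate = 4, ftime = 6))
wt <- rep(1:3, 8)              # eight samples of each weather type
dim(wt) <- c(sdate = 4, ftime = 6)
xm <- MergeDims(x, c('sdate', 'ftime'), rename_dim = 'time')
xs <- SplitDim(xm, split_dim = 'time', indices = as.vector(wt),
               new_dim_name = 'type')
dim(xs)  # expected c(time = 8, type = 3): one slice per weather type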
#' -#'@return an array (such as \code{$data} array from an object of class +#'@return An array (such as \code{$data} array from an object of class #'\code{s2dv_cube}) with named dimensions, containing experiment data on the #'lat/lon grid of \code{obs} array, corrected by quantile mapping depending on #'the weather types \code{wt_exp} #' +#'@examples +#'wt_exp <- c(1,1,2,3,3,2,2,1,1,2,2,3) +#'dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +#'wt_obs <- c(3,3,1,2,2,2,2,1,3,1,1,2) +#'dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +#'exp <- 1 : c(1 * 1 * 4 * 3 * 4 * 4) +#'dim(exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, +#' lat = 4, lon = 4) +#'obs <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4) +#'dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, +#' lat = 4, lon = 4) +#'exp_corr <- AdamontQQCorr(exp = exp, wt_exp = wt_exp, +#' obs = obs, wt_obs = wt_obs, +#' corrdims = c('dataset', 'member', 'sdate', 'ftime')) #'@import qmap #'@importFrom ClimProjDiags Subset #'@import multiApply #'@import abind -#'@examples -#'\dontrun{ -#'wt_exp <- sample(1:3, 15*6*3, replace=T) -#'dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -#'wt_obs <- sample(1:3, 6*3, replace=T) -#'dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -#'exp_corr <- AdamontQQCorr(exp=lonlat_temp$exp$data, wt_exp=wt_exp, -#' obs=lonlat_temp$obs$data, wt_obs=wt_obs, -#' corrdims = c('dataset','member','sdate','ftime')) -#'} +#'@export AdamontQQCorr <- function(exp, wt_exp, obs, wt_obs, - corrdims = c('member', 'sdate', 'ftime'), - londim='lon', latdim='lat', regrid=FALSE, NN=NULL) { + corrdims = c('member', 'sdate', 'ftime'), + londim = 'lon', latdim = 'lat', regrid = FALSE, + NN = NULL) { - dimnames <- names(dim(obs)) - dimnamesexp <- names(dim(exp)) - if (!(latdim %in% dimnames) || !(londim %in% dimnames)){ - stop("'londim' or 'latdim' input doesn't match with 'obs' dimension names") - } - if (!(('time' %in% corrdims) || ('ftime' %in% corrdims))){ - warning("Forecast time should be one of the dimensions for the correction", - " specified in corrdims input list") - } - if (!all(corrdims %in% dimnamesexp)){ - stop("Names in parameter 'corrdims' should match input dimension names.") - } - if (!all(dim(wt_exp) %in% dim(exp))){ - stop("Dimensions for 'wt_exp' should match 'exp' except lat/lon") - } - if (!all(dim(wt_obs) %in% dim(obs))){ - stop("Dimensions for 'wt_obs' should match 'obs' except lat/lon") - } - if ((regrid == 'TRUE') & is.null(NN)){ - stop("regrid set to TRUE: provide nearest neighbors input NN") - } - # The regridding part should only be done if lat/lon dimensions of obs and - # exp differ. 
- if (regrid == 'TRUE'){ - obsdims <- names(dim(obs)) - poslat <- which(obsdims == latdim) - poslon <- which(obsdims == londim) - nlat_o <- dim(obs)[poslat] - nlon_o <- dim(obs)[poslon] - ilat_o <- array(c(1:nlat_o)) - names(dim(ilat_o))[1] <- latdim - ilon_o <- array(c(1:nlon_o)) - names(dim(ilon_o))[1] <- londim - ## First step if obs data is higher resolution than exp data is to use - ## nearest neighbor to compute downscaling of exp data - exp_corr <- Apply(list(exp,ilat_o,ilon_o), - target_dims=list(c(latdim,londim),latdim,londim), - .getNN,NN=NN)$output1 + dimnames <- names(dim(obs)) + dimnamesexp <- names(dim(exp)) - ## Reorder exp_corr dimensions to match exp dimensions - dexpc <- match(names(dim(exp)), names(dim(exp_corr))) - exp_corr <- aperm(exp_corr,dexpc) - dimnames(exp_corr) <- dimnames(exp)[dexpc] - ## Keep original wt_exp for remapping data - wt_exp2 <- wt_exp - ## Both exp and obs data are now on the same grid - } else { - ## exp lat/lon dimensions should match obs - plat_exp <- which(dimnamesexp==latdim) - plon_exp <- which(dimnamesexp==londim) - plat_obs <- which(dimnames==latdim) - plon_obs <- which(dimnames==londim) - if ((dim(obs)[plon_obs]!=dim(exp)[plon_exp]) || - (dim(obs)[plat_obs]!=dim(exp)[plat_exp])){ - stop("Parameters 'obs' and 'exp' should have same lon / lat", - " dimensions if regrid set to 'FALSE' (regular grid case).") - } - exp_corr <- exp - ## Keep original wt_exp for remapping data - wt_exp2 <- wt_exp - } + if (!(latdim %in% dimnames) || !(londim %in% dimnames)) { + stop("'londim' or 'latdim' input doesn't match with 'obs' dimension names") + } + if (!(('time' %in% corrdims) || ('ftime' %in% corrdims))) { + warning("Forecast time should be one of the dimensions for the correction", + " specified in corrdims input list") + } + if (!all(corrdims %in% dimnamesexp)) { + stop("Names in parameter 'corrdims' should match input dimension names.") + } + if (!all(dim(wt_exp) %in% dim(exp))) { + stop("Dimensions for 'wt_exp' should match 'exp' except lat/lon") + } + if (!all(dim(wt_obs) %in% dim(obs))) { + stop("Dimensions for 'wt_obs' should match 'obs' except lat/lon") + } + if ((regrid == 'TRUE') & is.null(NN)) { + stop("regrid set to TRUE: provide nearest neighbors input NN") + } + # The regridding part should only be done if lat/lon dimensions of obs and + # exp differ. 
+ if (regrid == 'TRUE') { + obsdims <- names(dim(obs)) + poslat <- which(obsdims == latdim) + poslon <- which(obsdims == londim) + nlat_o <- dim(obs)[poslat] + nlon_o <- dim(obs)[poslon] + ilat_o <- array(c(1:nlat_o)) + names(dim(ilat_o))[1] <- latdim + ilon_o <- array(c(1:nlon_o)) + names(dim(ilon_o))[1] <- londim + ## First step if obs data is higher resolution than exp data is to use + ## nearest neighbor to compute downscaling of exp data + exp_corr <- Apply(list(exp, ilat_o, ilon_o), + target_dims = list(c(latdim,londim), latdim, londim), .getNN, NN = NN)$output1 + ## Reorder exp_corr dimensions to match exp dimensions + dexpc <- match(names(dim(exp)), names(dim(exp_corr))) + exp_corr <- aperm(exp_corr, dexpc) + dimnames(exp_corr) <- dimnames(exp)[dexpc] + ## Keep original wt_exp for remapping data + wt_exp2 <- wt_exp + ## Both exp and obs data are now on the same grid + } else { + ## exp lat/lon dimensions should match obs + plat_exp <- which(dimnamesexp == latdim) + plon_exp <- which(dimnamesexp == londim) + plat_obs <- which(dimnames == latdim) + plon_obs <- which(dimnames == londim) + if ((dim(obs)[plon_obs] != dim(exp)[plon_exp]) || + (dim(obs)[plat_obs] != dim(exp)[plat_exp])) { + stop("Parameters 'obs' and 'exp' should have same lon / lat ", + "dimensions if regrid set to 'FALSE' (regular grid case).") + } + exp_corr <- exp + ## Keep original wt_exp for remapping data + wt_exp2 <- wt_exp + } - ## Use CST_QuantileMapping function for quantile mapping - ## depending on weather type - for (i in 1:(length(corrdims) - 1)) { - obs <- MergeDims(obs, corrdims[i:(i+1)], rename_dim=corrdims[i+1]) - wt_obs <- MergeDims(wt_obs, corrdims[i:(i+1)], rename_dim=corrdims[i+1]) - exp_corr <- MergeDims(exp_corr, corrdims[i:(i+1)], rename_dim=corrdims[i+1]) - wt_exp2 <- MergeDims(wt_exp2, corrdims[i:(i+1)], rename_dim=corrdims[i+1]) - } - names(dim(obs))[which(names(dim(obs)) == corrdims[length(corrdims)])] <- 'time' - names(dim(wt_obs))[which(names(dim(wt_obs)) == corrdims[length(corrdims)])] <- 'time' - names(dim(exp_corr))[which(names(dim(exp_corr)) == corrdims[length(corrdims)])] <- 'time' - names(dim(wt_exp2))[which(names(dim(wt_exp2)) == corrdims[length(corrdims)])] <- 'time' - # Split 'time' dim in weather types - obs <- SplitDim(obs, split_dim='time',indices=as.vector(wt_obs), - new_dim_name='type') - exp_corr <- SplitDim(exp_corr, split_dim='time',indices=as.vector(wt_exp2), - new_dim_name='type') - ## Add NAs to exp_corr if needed to have compatible sample dimensions - numtobs <- dim(obs)[which(names(dim(obs))=='time')] - numtexp <- dim(exp_corr)[which(names(dim(exp_corr))=='time')] - if (numtexp%%numtobs > 0){ - ## Create extra dimension and include NAs - ndimexp <- names(dim(exp_corr)) - ndimobs <- names(dim(obs)) - postime <- which(ndimexp=='time') - dimadd <- dim(exp_corr) - dimadd[postime] <- ceiling(numtexp/numtobs)*numtobs-numtexp - exp_corr <- abind::abind(exp_corr,array(NA,dimadd),along=postime) - names(dim(exp_corr)) <- ndimexp - exp_corr <- SplitDim(exp_corr,'time',freq=numtobs,indices=NULL) - dimobs <- c(dim(obs),1) - dim(obs) <- dimobs - names(dim(obs)) <- c(ndimobs,'index') - res <- QuantileMapping(exp=exp_corr,obs=obs,sample_dims=c('time','index'), - method='RQUANT') - res <- MergeDims(res,c('time','index')) - ## Remove the extra NA values added previously - res <- Subset(res,along = 'time', indices = 1:numtexp) - } else { - ## Apply QuantileMapping to exp_corr depending on weather type - res <- QuantileMapping(exp = exp_corr, obs = obs, sample_dims = 'time', - 
samplemethod = 'RQUANT')
-  }
-  rm(exp_corr) # Save space in memory
-  ## Reshape exp_corr data onto time dimension before 'Split'
-  rep_pos <- array(NA,c(time=length(wt_exp2)))
-  pos_time <- which(names(dim(res)) == 'time')
-  pos_type <- which(names(dim(res)) == 'type')
-  for (x in unique(wt_exp2)){
-    rep_pos[which(wt_exp2==x)]<-1:length(which(wt_exp2==x))
-  }
-  exp_corr <- .unsplit_wtype(exp=res,wt_exp=wt_exp2,rep_pos=rep_pos,
-                             pos_time=pos_time)
-  # Now reshape exp_corr data onto original dimensions
-  dim(exp_corr) <- c(dim(wt_exp), dim(exp_corr)[-c(pos_time,pos_type)])
-  return(exp_corr)
+  ## Use the QuantileMapping function for quantile mapping
+  ## depending on weather type
+  for (i in 1:(length(corrdims) - 1)) {
+    obs <- MergeDims(obs, corrdims[i:(i+1)], rename_dim = corrdims[i+1])
+    wt_obs <- MergeDims(wt_obs, corrdims[i:(i+1)], rename_dim = corrdims[i+1])
+    exp_corr <- MergeDims(exp_corr, corrdims[i:(i+1)], rename_dim = corrdims[i+1])
+    wt_exp2 <- MergeDims(wt_exp2, corrdims[i:(i+1)], rename_dim = corrdims[i+1])
+  }
+
+  names(dim(obs))[which(names(dim(obs)) == corrdims[length(corrdims)])] <- 'time'
+  names(dim(wt_obs))[which(names(dim(wt_obs)) == corrdims[length(corrdims)])] <- 'time'
+  names(dim(exp_corr))[which(names(dim(exp_corr)) == corrdims[length(corrdims)])] <- 'time'
+  names(dim(wt_exp2))[which(names(dim(wt_exp2)) == corrdims[length(corrdims)])] <- 'time'
+
+  # Split 'time' dim in weather types
+  obs <- SplitDim(obs, split_dim = 'time', indices = as.vector(wt_obs),
+                  new_dim_name = 'type')
+  exp_corr <- SplitDim(exp_corr, split_dim = 'time', indices = as.vector(wt_exp2),
+                       new_dim_name = 'type')
+  ## Add NAs to exp_corr if needed to have compatible sample dimensions
+  numtobs <- dim(obs)[which(names(dim(obs)) == 'time')]
+  numtexp <- dim(exp_corr)[which(names(dim(exp_corr)) == 'time')]
+
+  if (numtexp %% numtobs > 0) {
+    ## Create extra dimension and include NAs
+    ndimexp <- names(dim(exp_corr))
+    ndimobs <- names(dim(obs))
+    postime <- which(ndimexp == 'time')
+    dimadd <- dim(exp_corr)
+    dimadd[postime] <- ceiling(numtexp/numtobs) * numtobs - numtexp
+    exp_corr <- abind::abind(exp_corr, array(NA, dimadd), along = postime)
+    names(dim(exp_corr)) <- ndimexp
+    exp_corr <- SplitDim(exp_corr, 'time', freq = numtobs, indices = NULL)
+    dimobs <- c(dim(obs), 1)
+    dim(obs) <- dimobs
+    names(dim(obs)) <- c(ndimobs, 'index')
+    res <- QuantileMapping(exp = exp_corr, obs = obs, memb_dim = 'index',
+                           sdate_dim = 'time', method = 'RQUANT', na.rm = TRUE)
+    res <- MergeDims(res, c('time', 'index'))
+    ## Remove the extra NA values added previously
+    res <- Subset(res, along = 'time', indices = 1:numtexp)
+  } else {
+    ## Apply QuantileMapping to exp_corr depending on weather type
+    exp_corr <- InsertDim(exp_corr, posdim = 1, lendim = 1, name = 'member')
+    res <- QuantileMapping(exp = exp_corr, obs = obs, sdate_dim = 'time',
+                           method = 'RQUANT', na.rm = TRUE)
+    dim(res) <- dim(res)[-which(names(dim(res)) == 'member')]
+  }
+  rm(exp_corr) # Save space in memory
+  ## Reshape exp_corr data onto time dimension before 'Split'
+  rep_pos <- array(NA, c(time = length(wt_exp2)))
+  pos_time <- which(names(dim(res)) == 'time')
+  pos_type <- which(names(dim(res)) == 'type')
+  for (x in unique(wt_exp2)) {
+    rep_pos[which(wt_exp2 == x)] <- 1:length(which(wt_exp2 == x))
+  }
+  exp_corr <- .unsplit_wtype(exp = res, wt_exp = wt_exp2, rep_pos = rep_pos,
+                             pos_time = pos_time)
+  # Now reshape exp_corr data onto original dimensions
+  dim(exp_corr) <- c(dim(wt_exp), dim(exp_corr)[-c(pos_time, pos_type)])
+
+  return(exp_corr)
 }
-.getNN <- function(exp,ilat,ilon,NN){
-  return(exp[NN$imin_lat[ilat,ilon],NN$imin_lon[ilat,ilon]])
+.getNN <- function(exp, ilat, ilon, NN) {
+  return(exp[NN$imin_lat[ilat, ilon], NN$imin_lon[ilat, ilon]])
 }
-.unsplit_wtype <- function(exp=exp,dim_wt='type',wt_exp=wt_exp,
-                           dim_time='time',rep_pos=rep_pos,pos_time=1){
-  # Initiate output
-  new <- Subset(Subset(exp, along=dim_wt, indices=wt_exp[1]), along=dim_time,
-                indices=rep_pos[1])
-  dimnames <- names(dim(new))
-  for (x in 2:length(wt_exp)){
-    dat <- Subset(Subset(exp, along=dim_wt, indices=wt_exp[x]),
-                  along=dim_time, indices=rep_pos[x])
-    new <- abind::abind(new,dat,along=pos_time)
-  }
-  names(dim(new)) <- dimnames
-  return(new)
+.unsplit_wtype <- function(exp = exp, dim_wt = 'type', wt_exp = wt_exp,
+                           dim_time = 'time', rep_pos = rep_pos, pos_time = 1) {
+  # Initiate output
+  new <- Subset(Subset(exp, along = dim_wt, indices = wt_exp[1]),
+                along = dim_time, indices = rep_pos[1])
+  dimnames <- names(dim(new))
+  for (x in 2:length(wt_exp)) {
+    dat <- Subset(Subset(exp, along = dim_wt, indices = wt_exp[x]),
+                  along = dim_time, indices = rep_pos[x])
+    new <- abind::abind(new, dat, along = pos_time)
+  }
+  names(dim(new)) <- dimnames
+  return(new)
 }
-#' ADAMONT Nearest Neighbors computes the distance between reference data grid centroid and SF data grid
+
+#'ADAMONT Nearest Neighbors computes the distance between reference data grid
+#'centroid and SF data grid
 #'
 #'@author Paola Marson, \email{paola.marson@meteo.fr} for PROSNOW version
-#'@author Lauriane Batté, \email{lauriane.batte@meteo.fr} for CSTools adaptation
-#'@description This function computes the nearest neighbor for each reference data (lon, lat) point in the experiment dataset by computing the distance between the reference dataset grid and the experiment data. This is the first step in the ADAMONT method adapted from Verfaillie et al. (2018).
+#'@author Lauriane Batté, \email{lauriane.batte@meteo.fr} for CSTools adaptation
+#'@description This function computes the nearest neighbor for each reference
+#'data (lon, lat) point in the experiment dataset by computing the distance
+#'between the reference dataset grid and the experiment data. This is the first
+#'step in the ADAMONT method adapted from Verfaillie et al. (2018).
 #'
-#'@param method a string among three options ('ADA': standard ADAMONT distance, 'simple': lon/lat straight euclidian distance, 'radius': distance on the sphere)
-#'@param exp an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal forecast experiment longitudes in \code{$lon} and latitudes in \code{$lat}
-#'@param obs an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the reference data on a different grid, with longitudes in \code{$lon} and latitudes in \code{$lat}.
+#'@param method A string among three options ('ADA': standard ADAMONT distance,
+#' 'simple': lon/lat straight Euclidean distance, 'radius': distance on the
+#' sphere).
+#'@param exp An object of class \code{s2dv_cube} as returned by \code{CST_Load}
+#' function, containing the seasonal forecast experiment longitudes in
+#' \code{$lon} and latitudes in \code{$lat}.
+#'@param obs An object of class \code{s2dv_cube} as returned by \code{CST_Load}
+#' function, containing the reference data on a different grid, with longitudes
+#' in \code{$lon} and latitudes in \code{$lat}.
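+#'@details The three 'method' options differ only in the distance computed
+#' between each obs grid point and every exp grid point: 'ADA' weights the
+#' squared longitude difference by the squared cosine of the obs latitude,
+#' 'simple' uses the plain sum of squared lon/lat differences, and 'radius'
+#' uses the haversine great-circle distance on a sphere of radius 6371 km.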
#' #'@return NN a list, containing the following: -#' min_lon: array of dimensions \code{obs$lon} giving the longitude of closest gridpoint in exp -#' min_lat: array of dimensions \code{obs$lat} giving the latitude of closest gridpoint in exp -#' imin_lon: array of dimensions \code{obs$lon} giving the longitude index of closest gridpoint in exp -#' imin_lat: array of dimensions \code{obs$lat} giving the latitude index of closest gridpoint in exp -#' +#'\itemize{ +#' \item{'min_lon': array of dimensions \code{obs$lon} giving the longitude of +#' closest gridpoint in exp.} +#' \item{'min_lat': array of dimensions \code{obs$lat} giving the latitude of +#' closest gridpoint in exp.} +#' \item{'imin_lon': array of dimensions \code{obs$lon} giving the longitude +#' index of closest gridpoint in exp.} +#' \item{'imin_lat': array of dimensions \code{obs$lat} giving the latitude +#' index of closest gridpoint in exp.} +#'} +#' #'@importFrom ClimProjDiags Subset #'@import ncdf4 #'@noRd -.NearestNeighbors <- function (exp, obs, method='ADA') { - - if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) { - stop("Inputs 'exp' and 'obs' must be of class 's2dv_cube', ", - "as output by CSTools::CST_Load.") - } - exp_lon <- exp$lon - exp_lat <- exp$lat - obs_lon <- obs$lon - obs_lat <- obs$lat - dim_exp_lon <- dim(exp_lon) - dim_exp_lat <- dim(exp_lat) - dim_obs_lon <- dim(obs_lon) - dim_obs_lat <- dim(obs_lat) - # Check if one of the grids is non-regular: - if ((length(dim_exp_lon)==2) || (length(dim_obs_lon)==2)){ - # Flatten longitudes and latitudes in case of 2-D longitudes and latitudes (Lambert grids, etc.) - if ((length(dim_exp_lon)==2) & (length(dim_exp_lat)==2)){ - dim(exp_lon) <- c(dim_exp_lon[1]*dim_exp_lon[2]) - dim(exp_lat) <- c(dim_exp_lat[1]*dim_exp_lat[2]) +.NearestNeighbors <- function (exp, obs, method = 'ADA') { + # Check 's2dv_cube' + if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) { + stop("Inputs 'exp' and 'obs' must be of class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + } + # Check 'exp' and 'obs' object structure + if (!all(c('data', 'coords') %in% names(exp))) { + stop("Parameter 'exp' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + } + + if (!any(names(exp$coords) %in% .KnownLonNames()) | + !any(names(exp$coords) %in% .KnownLatNames())) { + stop("Spatial coordinate names of parameter 'exp' do not match any ", + "of the names accepted by the package.") + } + if (!all(names(exp$coords) %in% names(obs$coords))) { + stop("Coordinates names must be equal in 'exp' and in 'obs'.") + } + + lon_name <- names(exp$coords)[[which(names(exp$coords) %in% .KnownLonNames())]] + lat_name <- names(exp$coords)[[which(names(exp$coords) %in% .KnownLatNames())]] + exp_lon <- exp$coords[[lon_name]] + exp_lat <- exp$coords[[lat_name]] + obs_lon <- obs$coords[[lon_name]] + obs_lat <- obs$coords[[lat_name]] + dim_exp_lon <- dim(exp_lon) + dim_exp_lat <- dim(exp_lat) + dim_obs_lon <- dim(obs_lon) + dim_obs_lat <- dim(obs_lat) + + # Check if one of the grids is non-regular: + if ((length(dim_exp_lon) == 2) || (length(dim_obs_lon) == 2)) { + # Flatten longitudes and latitudes in case of 2-D longitudes and latitudes (Lambert grids, etc.) 
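+    # (For example, 2-D lon/lat arrays of dimensions c(x = nx, y = ny) become
+    # plain vectors of length nx*ny, so that the cbind() calls below build one
+    # (lon, lat) row per grid point.)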
+    if ((length(dim_exp_lon) == 2) & (length(dim_exp_lat) == 2)) {
+      dim(exp_lon) <- c(dim_exp_lon[1] * dim_exp_lon[2])
+      dim(exp_lat) <- c(dim_exp_lat[1] * dim_exp_lat[2])
+    }
+    if ((length(dim_obs_lon) == 2) & (length(dim_obs_lat) == 2)) {
+      dim(obs_lon) <- c(dim_obs_lon[1] * dim_obs_lon[2])
+      dim(obs_lat) <- c(dim_obs_lat[1] * dim_obs_lat[2])
+    }
+    # Now lat and lon arrays have 1 dimension, length npt (= nlat*nlon)
+    OBS_grid <- cbind(obs_lon, obs_lat)
+    EXP_grid <- cbind(exp_lon, exp_lat)
+    dist_min <- min_lon <- min_lat <- imin_lon <- imin_lat <- array(dim = nrow(OBS_grid))
+    if (method == 'ADA') {
+      C <- cos(OBS_grid[, 2] * pi/180)^2
+      for (i in 1:nrow(OBS_grid)) {
+        dist <- (OBS_grid[i, 2] - EXP_grid[, 2])^2 +
+                C[i] * (OBS_grid[i, 1] - EXP_grid[, 1])^2
+        dist_min[i] <- min(dist)
+        min_lon[i] <- EXP_grid[which.min(dist), 1]
+        min_lat[i] <- EXP_grid[which.min(dist), 2]
+        imin_lon[i] <- which(exp_lon == min_lon[i])
+        imin_lat[i] <- which(exp_lat == min_lat[i])
      }
-    if ((length(dim_obs_lon)==2) & (length(dim_obs_lat)==2)){
-      dim(obs_lon) <- c(dim_obs_lon[1]*dim_obs_lon[2])
-      dim(obs_lat) <- c(dim_obs_lat[1]*dim_obs_lat[2])
+    } else if (method == 'simple') {
+      for (i in 1:nrow(OBS_grid)) {
+        dist <- (OBS_grid[i, 2] - EXP_grid[, 2])^2 + (OBS_grid[i, 1] - EXP_grid[, 1])^2
+        dist_min[i] <- min(dist)
+        min_lon[i] <- EXP_grid[which.min(dist), 1]
+        min_lat[i] <- EXP_grid[which.min(dist), 2]
+        imin_lon[i] <- which(exp_lon == min_lon[i])
+        imin_lat[i] <- which(exp_lat == min_lat[i])
      }
-    # Now lat and lon arrays have 1 dimension, length npt (= nlat*nlon)
-    OBS_grid <- cbind(obs_lon,obs_lat)
-    EXP_grid <- cbind(exp_lon,exp_lat)
-    dist_min<-min_lon<-min_lat<-imin_lon<-imin_lat<-array(dim=nrow(OBS_grid))
-    if (method == 'ADA'){
-      C<-cos(OBS_grid[,2]*pi/180)^2
-      for (i in 1:nrow(OBS_grid)){
-        dist<-(OBS_grid[i,2]-EXP_grid[,2])^2+C[i]*(OBS_grid[i,1]-EXP_grid[,1])^2
-        dist_min[i]<-min(dist)
-        min_lon[i]<-EXP_grid[which.min(dist),1]
-        min_lat[i]<-EXP_grid[which.min(dist),2]
-        imin_lon[i]<-which(exp_lon==min_lon[i])
-        imin_lat[i]<-which(exp_lat==min_lat[i])
-      }
-    } else if (method == 'simple'){
-      for (i in 1:nrow(OBS_grid)){
-        dist<-(OBS_grid[i,2]-EXP_grid[,2])^2+(OBS_grid[i,1]-EXP_grid[,1])^2
-        dist_min[i]<-min(dist)
-        min_lon[i]<-EXP_grid[which.min(dist),1]
-        min_lat[i]<-EXP_grid[which.min(dist),2]
-        imin_lon[i]<-which(exp_lon==min_lon[i])
-        imin_lat[i]<-which(exp_lat==min_lat[i])
-      }
-    } else if (method == 'radius'){
-      R <- 6371e3 # metres, Earth radius
-      EXP_gridr<-EXP_grid*pi/180
-      OBS_gridr<-OBS_grid*pi/180
-      for (i in 1:nrow(OBS_grid)){
-        a<-sin((OBS_gridr[i,2]-EXP_gridr[,2])/2)^2 + cos(OBS_gridr[i,2])*cos(EXP_gridr[,2])*sin((OBS_gridr[i,1]-EXP_gridr[,1])/2)^2
-        c<-2*atan2(sqrt(a),sqrt(1-a))
-        dist<-R*c
-        dist_min[i]<-min(dist)
-        min_lon[i]<-EXP_grid[which.min(dist),1]
-        min_lat[i]<-EXP_grid[which.min(dist),2]
-        imin_lon[i]<-which(exp_lon==min_lon[i])
-        imin_lat[i]<-which(exp_lat==min_lat[i])
-      }
-    } else {
-      stop("AdamontNearestNeighbors supports method = 'ADA', 'simple' or 'radius' only.")
+    } else if (method == 'radius') {
+      R <- 6371e3 # metres, Earth radius
+      EXP_gridr <- EXP_grid * pi/180
+      OBS_gridr <- OBS_grid * pi/180
+      for (i in 1:nrow(OBS_grid)) {
+        # Haversine formula: great-circle distance from obs point i to all exp points
+        a <- sin((OBS_gridr[i, 2] - EXP_gridr[, 2])/2)^2 + cos(OBS_gridr[i, 2]) *
+             cos(EXP_gridr[, 2]) * sin((OBS_gridr[i, 1] - EXP_gridr[, 1])/2)^2
+        c <- 2 * atan2(sqrt(a), sqrt(1 - a))
+        dist <- R * c
+        dist_min[i] <- min(dist)
+        min_lon[i] <- EXP_grid[which.min(dist), 1]
+        min_lat[i] <- EXP_grid[which.min(dist), 2]
+        imin_lon[i] <- which(exp_lon == min_lon[i])
+        imin_lat[i] <- which(exp_lat == min_lat[i])
      }
-
-    # Reshape outputs to original grid
-    dim(min_lon)=dim_obs_lon
-    dim(min_lat)=dim_obs_lat
-    dim(imin_lon)=dim_obs_lon
-    dim(imin_lat)=dim_obs_lat
+    } else {
+      stop("AdamontNearestNeighbors supports method = 'ADA', 'simple' or 'radius' only.")
+    }
+
+    # Reshape outputs to original grid
+    dim(min_lon) <- dim_obs_lon
+    dim(min_lat) <- dim_obs_lat
+    dim(imin_lon) <- dim_obs_lon
+    dim(imin_lat) <- dim_obs_lat
-  } else {
-    # Regular lon/lat grid case: has been handled by CST_Load()
-    stop("AdamontNearestNeighbors is meant for non-regular lat/lon grids; use e.g. CST_Load to interpolate exp onto obs grid")
-  }
+  } else {
+    # Regular lon/lat grid case: has been handled by CST_Load()
+    stop(paste0("AdamontNearestNeighbors is meant for non-regular lat/lon ",
+                "grids; use e.g. CST_Load to interpolate exp onto obs grid"))
+  }
-  NN=list(min_lon=min_lon, min_lat=min_lat, imin_lon=imin_lon, imin_lat=imin_lat)
+  NN <- list(min_lon = min_lon, min_lat = min_lat, imin_lon = imin_lon,
+             imin_lat = imin_lat)
-  return(NN)
+  return(NN)
 }
diff --git a/R/CST_Analogs.R b/R/CST_Analogs.R
index 6437868091a5b8c9e4f9731dc32953ffbcee159a..b6cbfa4e3fb5f4764985a6ae0ec13db84069274b 100644
--- a/R/CST_Analogs.R
+++ b/R/CST_Analogs.R
@@ -41,7 +41,9 @@
 #' criterias. If parameter 'expVar' is not provided, the function will return
 #' the expL analog. The element 'data' in the 's2dv_cube' object must have, at
 #' least, latitudinal and longitudinal dimensions. The object is expect to be
-#' already subset for the desired large scale region.
+#' already subset for the desired large scale region. Latitudinal dimension
+#' accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal
+#' dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'.
 #'@param obsL An 's2dv_cube' object containing the observational field on the
 #' large scale. The element 'data' in the 's2dv_cube' object must have the same
 #' latitudinal and longitudinal dimensions as parameter 'expL' and a temporal
@@ -72,10 +74,10 @@
 #' time_obsL), by default time_expL will be removed during the search of analogs.
 #'@param time_expL A character string indicating the date of the experiment
 #' in the same format than time_obsL (i.e. "yyyy-mm-dd"). By default it is NULL
-#' and dates are taken from element \code{$Dates$start} from expL.
+#' and dates are taken from element \code{$attrs$Dates} from expL.
 #'@param time_obsL A character string indicating the date of the observations
 #' in the date format (i.e. "yyyy-mm-dd"). By default it is NULL and dates are
-#' taken from element \code{$Dates$start} from obsL.
+#' taken from element \code{$attrs$Dates} from obsL.
 #'@param region A vector of length four indicating the minimum longitude,
 #' the maximum longitude, the minimum latitude and the maximum latitude.
 #'@param nAnalogs Number of Analogs to be selected to apply the criterias
@@ -99,7 +101,7 @@
 #' best analog, for instance for downscaling.
 #'@param ncores The number of cores to use in parallel computation
 #'
-#'@seealso code{\link{CST_Load}}, \code{\link[s2dv]{Load}} and
+#'@seealso \code{\link{CST_Load}}, \code{\link[s2dv]{Load}} and
 #'\code{\link[s2dv]{CDORemap}}
 #'
 #'@return An 's2dv_cube' object containing an array with the dowscaled values of
@@ -108,17 +110,22 @@
 #'elements 'analogs', 'metric' and 'dates'.
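+#'@details When 'time_expL'/'time_obsL' are not provided, dates are read from
+#' the \code{$attrs$Dates} element of the 's2dv_cube' inputs, and longitudes
+#' and latitudes are read from the \code{$coords} element, as illustrated in
+#' the example below.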
 #'@examples
 #'expL <- rnorm(1:200)
-#'dim(expL) <- c(member = 10,lat = 4, lon = 5)
-#'obsL <- c(rnorm(1:180),expL[1,,]*1.2)
-#'dim(obsL) <- c(time = 10,lat = 4, lon = 5)
-#'time_obsL <- paste(rep("01", 10), rep("01", 10), 1994:2003, sep = "-")
+#'dim(expL) <- c(member = 10, lat = 4, lon = 5)
+#'obsL <- c(rnorm(1:180), expL[1, , ]*1.2)
+#'dim(obsL) <- c(time = 10, lat = 4, lon = 5)
+#'time_obsL <- as.POSIXct(paste(rep("01", 10), rep("01", 10), 1994:2003, sep = "-"),
+#'                        format = "%d-%m-%Y")
+#'dim(time_obsL) <- c(time = 10)
 #'time_expL <- time_obsL[1]
-#'lon <- seq(-1,5,1.5)
-#'lat <- seq(30,35,1.5)
-#'expL <- s2dv_cube(data = expL, lat = lat, lon = lon,
-#'                  Dates = list(start = time_expL, end = time_expL))
-#'obsL <- s2dv_cube(data = obsL, lat = lat, lon = lon,
-#'                  Dates = list(start = time_obsL, end = time_obsL))
+#'lon <- seq(-1, 5, 1.5)
+#'lat <- seq(30, 35, 1.5)
+#'coords <- list(lon = lon, lat = lat)
+#'attrs_expL <- list(Dates = time_expL)
+#'attrs_obsL <- list(Dates = time_obsL)
+#'expL <- list(data = expL, coords = coords, attrs = attrs_expL)
+#'obsL <- list(data = obsL, coords = coords, attrs = attrs_obsL)
+#'class(expL) <- 's2dv_cube'
+#'class(obsL) <- 's2dv_cube'
 #'region <- c(min(lon), max(lon), min(lat), max(lat))
 #'downscaled_field <- CST_Analogs(expL = expL, obsL = obsL, region = region)
 #'
@@ -131,6 +138,8 @@ CST_Analogs <- function(expL, obsL, expVar = NULL, obsVar = NULL, region = NULL,
                         time_expL = NULL, time_obsL = NULL,
                         nAnalogs = NULL, AnalogsInfo = FALSE,
                         ncores = NULL) {
+
+  # Check 's2dv_cube'
   if (!inherits(expL, "s2dv_cube") || !inherits(obsL, "s2dv_cube")) {
     stop("Parameter 'expL' and 'obsL' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
@@ -140,15 +149,44 @@ CST_Analogs <- function(expL, obsL, expVar = NULL, obsVar = NULL, region = NULL,
         "as output by CSTools::CST_Load.")
   }
   if (!is.null(obsVar) && !inherits(obsVar, "s2dv_cube")) {
-    stop("Parameter 'expVar' must be of the class 's2dv_cube', ",
+    stop("Parameter 'obsVar' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
   }
-  if (any(is.na(expL))) {
-    warning("Parameter 'expL' contains NA values.")
+
+  # Check 'obsL' object structure
+  if (!all(c('data', 'coords', 'attrs') %in% names(obsL))) {
+    stop("Parameter 'obsL' must have 'data', 'coords' and 'attrs' elements ",
+         "within the 's2dv_cube' structure.")
   }
-  if (any(is.na(obsL))) {
-    warning("Parameter 'obsL' contains NA values.")
+
+  if (!any(names(obsL$coords) %in% .KnownLonNames()) |
+      !any(names(obsL$coords) %in% .KnownLatNames())) {
+    stop("Spatial coordinate names of parameter 'obsL' do not match any ",
+         "of the names accepted by the package.")
   }
+
+  lon_name <- names(obsL$coords)[[which(names(obsL$coords) %in% .KnownLonNames())]]
+  lat_name <- names(obsL$coords)[[which(names(obsL$coords) %in% .KnownLatNames())]]
+
+  # Check 'obsVar' object structure
+  if (!is.null(obsVar)) {
+    if (!all(c('data', 'coords', 'attrs') %in% names(obsVar))) {
+      stop("Parameter 'obsVar' must have 'data', 'coords' and 'attrs' elements ",
+           "within the 's2dv_cube' structure.")
+    }
+    if (!any(names(obsVar$coords) %in% .KnownLonNames()) |
+        !any(names(obsVar$coords) %in% .KnownLatNames())) {
+      stop("Spatial coordinate names of parameter 'obsVar' do not match any ",
+           "of the names accepted by the package.")
+    }
+    lonVar <- obsVar$coords[[which(names(obsVar$coords) %in% .KnownLonNames())]]
+    latVar <- obsVar$coords[[which(names(obsVar$coords) %in% .KnownLatNames())]]
+  } else {
+    lonVar <- NULL
+    latVar <- NULL
+  }
+
+  # Check temporal dimensions
   if (any(names(dim(obsL$data)) %in% 'sdate')) {
     if (any(names(dim(obsL$data)) %in% 'ftime')) {
       obsL <- CST_MergeDims(obsL, c('ftime', 'sdate'), rename_dim = 'time')
@@ -166,36 +204,47 @@ CST_Analogs <- function(expL, obsL, expVar = NULL, obsVar = NULL, region = NULL,
     }
   }
   if (is.null(time_expL)) {
-    time_expL <- expL$Dates$start
+    time_expL <- expL$attrs$Dates
   }
   if (is.null(time_obsL)) {
-    time_obsL <- obsL$Dates$start
+    time_obsL <- obsL$attrs$Dates
   }
+
   res <- Analogs(expL$data, obsL$data, time_obsL = time_obsL,
-                 time_expL = time_expL, lonL = expL$lon,
-                 latL = expL$lat, expVar = expVar$data,
+                 time_expL = time_expL,
+                 lonL = as.vector(obsL$coords[[lon_name]]),
+                 latL = as.vector(obsL$coords[[lat_name]]),
+                 expVar = expVar$data,
                  obsVar = obsVar$data, criteria = criteria,
                  excludeTime = excludeTime, region = region,
-                 lonVar = as.vector(obsVar$lon), latVar = as.vector(obsVar$lat),
+                 lonVar = as.vector(lonVar), latVar = as.vector(latVar),
                  nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo,
                  ncores = ncores)
+
   if (AnalogsInfo) {
     if (is.numeric(res$dates)) {
       res$dates <- as.POSIXct(res$dates, origin = '1970-01-01', tz = 'UTC')
     }
   }
+
   expL$data <- res
-  if (is.null(region)) {
-    expL$lon <- obsL$lon
-    expL$lat <- obsL$lat
-  } else {
-    expL$lon <- SelBox(obsL$data, lon = as.vector(obsL$lon),
-                       lat = as.vector(obsL$lat),
-                       region = region)$lon
-    expL$lat <- SelBox(obsL$data, lon = as.vector(obsL$lon),
-                       lat = as.vector(obsL$lat),
-                       region = region)$lat
+
+  if (!is.null(obsL$coords[[lon_name]]) | !is.null(obsL$coords[[lat_name]])) {
+    if (is.null(region)) {
+      expL$coords[[lon_name]] <- obsL$coords[[lon_name]]
+      expL$coords[[lat_name]] <- obsL$coords[[lat_name]]
+    } else {
+      expL$coords[[lon_name]] <- SelBox(obsL$data,
+                                        lon = as.vector(obsL$coords[[lon_name]]),
+                                        lat = as.vector(obsL$coords[[lat_name]]),
+                                        region = region)$lon
+      expL$coords[[lat_name]] <- SelBox(obsL$data,
+                                        lon = as.vector(obsL$coords[[lon_name]]),
+                                        lat = as.vector(obsL$coords[[lat_name]]),
+                                        region = region)$lat
+    }
   }
+
   return(expL)
 }
@@ -246,16 +295,19 @@
 #' all the criterias. If parameter 'expVar' is not provided, the function will
 #' return the expL analog. The element 'data' in the 's2dv_cube' object must
 #' have, at least, latitudinal and longitudinal dimensions. The object is
-#' expect to be already subset for the desired large scale region.
+#' expected to be already subset for the desired large scale region. Latitudinal
+#' dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'.
+#' Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i',
+#' 'nav_lon'.
 #'@param obsL An array of N named dimensions containing the observational field
 #' on the large scale. The element 'data' in the 's2dv_cube' object must have
 #' the same latitudinal and longitudinal dimensions as parameter 'expL' and a
 #' single temporal dimension with the maximum number of available observations.
 #'@param time_obsL A character string indicating the date of the observations
-#' in the format "dd/mm/yyyy". Reference time to search for analogs.
+#' in the format "dd-mm-yyyy". Reference time to search for analogs.
 #'@param time_expL An array of N named dimensions (coinciding with time
 #' dimensions in expL) of character string(s) indicating the date(s) of the
-#' experiment in the format "dd/mm/yyyy". Time(s) to find the analogs.
+#' experiment in the format "dd-mm-yyyy". Time(s) to find the analogs.
 #'@param lonL A vector containing the longitude of parameter 'expL'.
 #'@param latL A vector containing the latitude of parameter 'expL'.
 #'@param excludeTime An array of N named dimensions (coinciding with time
@@ -281,7 +333,7 @@
 #' correlation, while for Large_dist criteria the best analog will be the day
 #' with minimum Euclidean distance). Set to FALSE to get a single analog, the
 #' best analog, for instance for downscaling.
-#'@param criteria a character string indicating the criteria to be used for the
+#'@param criteria A character string indicating the criteria to be used for the
 #' selection of analogs:
 #' \itemize{\item{Large_dist} minimum Euclidean distance in the large scale pattern;
 #' \item{Local_dist} minimum Euclidean distance in the large scale pattern
@@ -289,11 +341,11 @@
 #' \item{Local_cor} minimum Euclidean distance in the large scale pattern,
 #' minimum Euclidean distance in the local scale pattern and highest
 #' correlation in the local variable to downscale.}
-#'@param lonVar a vector containing the longitude of parameter 'expVar'.
-#'@param latVar a vector containing the latitude of parameter 'expVar'.
-#'@param region a vector of length four indicating the minimum longitude,
+#'@param lonVar A vector containing the longitude of parameter 'expVar'.
+#'@param latVar A vector containing the latitude of parameter 'expVar'.
+#'@param region A vector of length four indicating the minimum longitude,
 #' the maximum longitude, the minimum latitude and the maximum latitude.
-#'@param nAnalogs number of Analogs to be selected to apply the criterias
+#'@param nAnalogs Number of Analogs to be selected to apply the criteria
 #' 'Local_dist' or 'Local_cor'. This is not the necessary the number of analogs
 #' that the user can get, but the number of events with minimum distance in
 #' which perform the search of the best Analog. The default value for the
@@ -302,10 +354,7 @@
 #' NULL for 'Local_dist' and 'Local_cor' the default value will be set at the
 #' length of 'time_obsL'. If AnalogsInfo is FALSE the function returns just
 #' the best analog.
-#'@param ncores the number of cores to use in parallel computation.
-#'@import multiApply
-#'@import abind
-#'@importFrom ClimProjDiags SelBox Subset
+#'@param ncores The number of cores to use in parallel computation.
 #'
 #'@return An array with the dowscaled values of the best analogs for the criteria
 #'selected. If 'AnalogsInfo' is set to TRUE it returns a list with an array
@@ -313,14 +362,14 @@
 #''metric' and 'dates'.
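+#'@details 'time_obsL', 'time_expL' (and 'excludeTime', when given) are
+#' expected as character vectors of "dd-mm-yyyy" dates; Date or POSIXct inputs
+#' are coerced internally with format(as.Date(x), '%d-%m-%Y') and a warning is
+#' raised.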
#' #'@examples -#'# Example 1:Downscaling using criteria 'Large_dist' and a single variable: +#'# Example 1: Downscaling using criteria 'Large_dist' and a single variable: #'expSLP <- rnorm(1:20) #'dim(expSLP) <- c(lat = 4, lon = 5) #'obsSLP <- c(rnorm(1:180), expSLP * 1.2) #'dim(obsSLP) <- c(time = 10, lat = 4, lon = 5) #'time_obsSLP <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-") #'downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, -#' time_obsL = time_obsSLP,time_expL = "01-01-1994") +#' time_obsL = time_obsSLP,time_expL = "01-01-1994") #' #'# Example 2: Downscaling using criteria 'Large_dist' and 2 variables: #'obs.pr <- c(rnorm(1:200) * 0.001) @@ -328,23 +377,7 @@ CST_Analogs <- function(expL, obsL, expVar = NULL, obsVar = NULL, region = NULL, #'downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, obsVar = obs.pr, #' time_obsL = time_obsSLP, time_expL = "01-01-1994") #' -#'# Example 3:List of best Analogs using criteria 'Large_dist' and a single -#'obsSLP <- c(rnorm(1:1980), expSLP * 1.5) -#'dim(obsSLP) <- c(lat = 4, lon = 5, time = 100) -#'time_obsSLP <- paste(rep("01", 100), rep("01", 100), 1920 : 2019, sep = "-") -#'downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, time_obsSLP, -#' nAnalogs = 5, time_expL = "01-01-2003", -#' AnalogsInfo = TRUE, excludeTime = "01-01-2003") -#' -#'# Example 4:List of best Analogs using criteria 'Large_dist' and 2 variables: -#'obsSLP <- c(rnorm(1:180), expSLP * 2) -#'dim(obsSLP) <- c(lat = 4, lon = 5, time = 10) -#'time_obsSLP <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-") -#'downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, obsVar = obs.pr, -#' time_obsL = time_obsSLP,nAnalogs=5, -#' time_expL = "01-10-2003", AnalogsInfo = TRUE) -#' -#'# Example 5: Downscaling using criteria 'Local_dist' and 2 variables: +#'# Example 3: Downscaling using criteria 'Local_dist' and 2 variables: #'# analogs of local scale using criteria 2 #'region = c(lonmin = -1 ,lonmax = 2, latmin = 30, latmax = 33) #'Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, @@ -353,65 +386,34 @@ CST_Analogs <- function(expL, obsL, expVar = NULL, obsVar = NULL, region = NULL, #' region = region,time_expL = "01-10-2000", #' nAnalogs = 10, AnalogsInfo = TRUE) #' -#'# Example 6: list of best analogs using criteria 'Local_dist' and 2 -#'Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, -#' criteria = "Local_dist", lonL = seq(-1, 5, 1.5), -#' latL = seq(30, 35, 1.5), region = region, -#' time_expL = "01-10-2000", nAnalogs = 5, -#' AnalogsInfo = TRUE) -#' -#'# Example 7: Downscaling using Local_dist criteria -#'Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, -#' criteria = "Local_dist", lonL = seq(-1, 5, 1.5), -#' latL = seq(30, 35, 1.5), region = region, -#' time_expL = "01-10-2000", -#' nAnalogs = 10, AnalogsInfo = FALSE) -#' -#'# Example 8: Downscaling using criteria 'Local_cor' and 2 variables: +#'# Example 4: Downscaling using criteria 'Local_cor' and 2 variables: #'exp.pr <- c(rnorm(1:20) * 0.001) #'dim(exp.pr) <- dim(expSLP) -#'Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP,time_obsL = time_obsSLP, +#'Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, #' obsVar = obs.pr, expVar = exp.pr, #' criteria = "Local_cor", lonL = seq(-1, 5, 1.5), #' time_expL = "01-10-2000", latL = seq(30, 35, 1.5), #' lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5), #' nAnalogs = 8, region = region, AnalogsInfo = FALSE) -#'# same but 
without imposing nAnalogs,so nAnalogs will be set by default as 10 -#'Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP,time_obsL = time_obsSLP, -#' obsVar = obs.pr, expVar = exp.pr, -#' lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5), -#' criteria = "Local_cor", lonL = seq(-1,5,1.5), -#' time_expL = "01-10-2000", latL =seq(30, 35, 1.5), -#' region = region, AnalogsInfo = TRUE) #' -#'#'Example 9: List of best analogs in the three criterias Large_dist, +#'# Example 5: List of best analogs in the three criterias Large_dist, #'Large_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, #' criteria = "Large_dist", time_expL = "01-10-2000", #' nAnalogs = 7, AnalogsInfo = TRUE) #'Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, #' time_expL = "01-10-2000", criteria = "Local_dist", #' lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5), -#' nAnalogs = 7,region = region, AnalogsInfo = TRUE) -#'Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP,time_obsL = time_obsSLP, +#' nAnalogs = 7, region = region, AnalogsInfo = TRUE) +#'Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, #' obsVar = obsSLP, expVar = expSLP, -#' time_expL = "01-10-2000",criteria = "Local_cor", +#' time_expL = "01-10-2000", criteria = "Local_cor", #' lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5), #' lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5), -#' nAnalogs = 7,region = region, +#' nAnalogs = 7, region = region, #' AnalogsInfo = TRUE) -#'#Example 10: Downscaling using criteria 'Large_dist' and a single variable, -#'# more than 1 sdate: -#'expSLP <- rnorm(1:40) -#'dim(expSLP) <- c(sdate = 2, lat = 4, lon = 5) -#'obsSLP <- c(rnorm(1:180), expSLP * 1.2) -#'dim(obsSLP) <- c(time = 11, lat = 4, lon = 5) -#'time_obsSLP <- paste(rep("01", 11), rep("01", 11), 1993 : 2003, sep = "-") -#'time_expSLP <- paste(rep("01", 2), rep("01", 2), 1994 : 1995, sep = "-") -#'excludeTime <- c("01-01-2003", "01-01-2003") -#'dim(excludeTime) <- c(sdate = 2) -#'downscale_field_exclude <- Analogs(expL = expSLP, obsL = obsSLP, -#' time_obsL = time_obsSLP, time_expL = time_expSLP, -#' excludeTime = excludeTime, AnalogsInfo = TRUE) +#'@import multiApply +#'@import abind +#'@importFrom ClimProjDiags SelBox Subset #'@export Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, lonL = NULL, latL = NULL, expVar = NULL, @@ -419,30 +421,100 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, excludeTime = NULL, lonVar = NULL, latVar = NULL, region = NULL, nAnalogs = NULL, AnalogsInfo = FALSE, ncores = NULL) { - if (!all(c('lon', 'lat') %in% names(dim(expL)))) { - stop("Parameter 'expL' must have the dimensions 'lat' and 'lon'.") + # Check inputs + # expL, obsL + if (!is.array(expL) || !is.numeric(expL)) { + stop("Parameter 'expL' must be a numeric array.") + } + if (!is.array(obsL) || !is.numeric(obsL)) { + stop("Parameter 'obsL' must be a numeric array.") + } + obsdims <- names(dim(obsL)) + expdims <- names(dim(expL)) + if (is.null(expdims)) { + stop("Parameter 'expL' must have dimension names.") } - if (!all(c('lat', 'lon') %in% names(dim(obsL)))) { - stop("Parameter 'obsL' must have the dimension 'lat' and 'lon'.") + if (is.null(obsdims)) { + stop("Parameter 'obsL' must have dimension names.") } if (any(is.na(expL))) { - warning("Parameter 'exp' contains NA values.") + warning("Parameter 'expL' contains NA values.") + } + if (any(is.na(obsL))) { + warning("Parameter 'obsL' contains NA values.") } - if (any(is.na(obsL))) { - warning("Parameter 'obs' 
contains NA values.") + if (!any(.KnownLonNames() %in% obsdims) | !any(.KnownLonNames() %in% expdims)) { + stop("Parameter 'expL' and 'obsL' must have longitudinal dimension.") } + if (!any(.KnownLatNames() %in% obsdims) | !any(.KnownLatNames() %in% expdims)) { + stop("Parameter 'expL' and 'obsL' must have latitudinal dimension.") + } + + # Know spatial coordinates names + if (!any(obsdims %in% .KnownLonNames()) | + !any(obsdims %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + } + + lon_name <- obsdims[[which(obsdims %in% .KnownLonNames())]] + lat_name <- obsdims[[which(obsdims %in% .KnownLatNames())]] + + # criteria + if (!criteria %in% c('Large_dist', 'Local_dist', 'Local_cor')) { + stop("Parameter 'criteria' can only be: 'Large_dist', 'Local_dist' or 'Local_cor'.") + } + if (length(criteria) > 1) { + warning("Only first element of 'criteria' parameter will be used.") + criteria <- criteria[1] + } + # lonL, latL, lonVar, latVar + if (criteria == "Local_dist" | criteria == "Local_cor") { + if (is.null(lonL) | is.null(latL)) { + stop("Parameters 'lonL' and 'latL' cannot be NULL.") + } + if (!is.numeric(lonL) | !is.numeric(latL)) { + stop("Parameters 'lonL' and 'latL' must be numeric.") + } + if (!is.null(dim(lonL)) | !is.null(dim(latL))) { + if (length(dim(lonL)) == 1 & length(dim(latL)) == 1) { + lonL <- as.vector(lonL) + latL <- as.vector(latL) + } else { + stop("Parameters 'lonL' and 'latL' need to be a vector.") + } + } + } + if (criteria == "Local_cor") { + if (is.null(lonVar) | is.null(latVar)) { + stop("Parameters 'lonVar' and 'latVar' cannot be NULL.") + } + if (!is.numeric(lonVar) | !is.numeric(latVar)) { + stop("Parameters 'lonVar' and 'latVar' must be numeric.") + } + if (!is.null(dim(lonVar)) | !is.null(dim(latVar))) { + if (length(dim(lonVar)) == 1 & length(dim(latVar)) == 1) { + lonVar <- as.vector(lonVar) + latVar <- as.vector(latVar) + } else { + stop("Parameters 'lonVar' and 'latVar' need to be a vector.") + } + } + } + # expVar and obsVar if (!is.null(expVar) & is.null(obsVar)) { expVar <- NULL warning("Parameter 'expVar' is set to NULL as parameter 'obsVar', - large scale field will be returned.") + large scale field will be returned.") } if (is.null(expVar) & is.null(obsVar)) { warning("Parameter 'expVar' and 'obsVar' are NULLs, downscaling/listing - same variable as obsL and expL'.") + same variable as obsL and expL'.") } if (!is.null(obsVar) & is.null(expVar) & criteria == "Local_cor") { stop("Parameter 'expVar' cannot be NULL.") } + # nAnalogs if (is.null(nAnalogs) & criteria != "Large_dist") { nAnalogs = length(time_obsL) warning("Parameter 'nAnalogs' is NULL and is set to the same length of", @@ -451,29 +523,32 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, if (is.null(nAnalogs) & criteria == "Large_dist") { nAnalogs <- 1 } + # time_obsL, time_expL + if (is.null(time_obsL)) { + stop("Parameter 'time_obsL' cannot be NULL.") + } if (is.null(time_expL)) { - stop("Parameter 'time_expL' cannot be NULL") + stop("Parameter 'time_expL' cannot be NULL.") } - if(any(class(time_obsL)!="character")){ + if (!inherits(time_obsL, "character")) { warning('imposing time_obsL to be a character') - time_obsL=format(as.Date(time_obsL),'%d-%m-%Y') + time_obsL <- format(as.Date(time_obsL), '%d-%m-%Y') } - if(any(class(time_expL)!="character")){ + if (!inherits(time_expL, "character")) { warning('imposing time_expL to be a character') - time_expL=format(as.Date(time_expL),'%d-%m-%Y') + time_expL 
<- format(as.Date(time_expL), '%d-%m-%Y') } - if(!is.null(excludeTime)){ - if(any(class(excludeTime)!="character")){ + # excludeTime + if (!is.null(excludeTime)) { + if (!inherits(excludeTime, "character")) { warning('imposing excludeTime to be a character') - excludeTime=format(as.Date(excludeTime),'%d-%m-%Y') + excludeTime <- format(as.Date(excludeTime),'%d-%m-%Y') } - } + } + # time_obsL if (is.null(time_obsL)) { stop("Parameter 'time_obsL' cannot be NULL") } - if (is.null(expL)) { - stop("Parameter 'expL' cannot be NULL") - } if (any(names(dim(obsL)) %in% 'ftime')) { if (any(names(dim(obsL)) %in% 'time')) { stop("Multiple temporal dimensions ('ftime' and 'time') found", @@ -526,8 +601,8 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, dims_obsL <- dim(obsL) pos_time <- which(names(dim(obsL)) == 'time') if(length(time_obsL) != dim(obsL)[pos_time]) { - stop(" 'time_obsL' and 'obsL' must have same length in the temporal - dimension.") + stop("'time_obsL' and 'obsL' must have same length in the temporal + dimension.") } pos <- 1 : length(dim(obsL)) pos <- c(pos_time, pos[-c(pos_time)]) @@ -589,8 +664,7 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, c('ftime', 'leadtime', 'ltime') == TRUE)) > 1) { stop("Parameter 'expL' cannot have multiple forecast time dimensions") } else { - names(dim(expL))[which(names(dim(expL)) %in% c('ftime','leadtime','ltime'))] <- - 'time' + names(dim(expL))[which(names(dim(expL)) %in% c('ftime','leadtime','ltime'))] <- 'time' } } # remove dimension length 1 to simplify outputs: @@ -611,14 +685,19 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, } } names(dim(expL)) <- replace_repeat_dimnames(names(dim(expL)), - names(dim(obsL))) + names(dim(obsL)), + lon_name = lon_name, + lat_name = lat_name) if (!is.null(expVar)) { names(dim(expVar)) <- replace_repeat_dimnames(names(dim(expVar)), - names(dim(obsVar))) + names(dim(obsVar)), + lon_name = lon_name, + lat_name = lat_name) } if (is.null(excludeTime)) { - excludeTime <- vector(mode="character", length=length(time_expL)) + excludeTime <- vector(mode = "character", length = length(time_expL)) + } if (length(time_expL) == length(excludeTime)) { if (any(names(dim(expL)) %in% c('sdate_exp'))) { @@ -645,68 +724,73 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, if (!AnalogsInfo) { if (is.null(obsVar)) { res <- Apply(list(expL, obsL), - target_dims = list(c('lat', 'lon'), c('time','lat','lon')), + target_dims = list(c(lat_name, lon_name), c('time', lat_name, lon_name)), fun = .analogs, time_obsL, expVar = expVar, time_expL = time_expL, excludeTime = excludeTime, obsVar = obsVar, criteria = criteria, lonL = lonL, latL = latL, lonVar = lonVar, latVar = latVar, region = region, - nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, - output_dims = c('nAnalogs', 'lat', 'lon'), + nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, + lon_name = lon_name, lat_name = lat_name, + output_dims = c('nAnalogs', lat_name, lon_name), ncores = ncores)$output1 } else if (!is.null(obsVar) && is.null(expVar)) { res <- Apply(list(expL, obsL, obsVar), - target_dims = list(c('lat', 'lon'), c('time','lat','lon'), - c('time', 'lat', 'lon')), + target_dims = list(c(lat_name, lon_name), c('time', lat_name, lon_name), + c('time', lat_name, lon_name)), fun = .analogs, time_obsL, time_expL = time_expL, excludeTime = excludeTime, expVar = expVar, criteria = criteria, lonL = lonL, latL = latL, lonVar = lonVar, latVar = latVar, region = region, nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, - 
output_dims = c('nAnalogs', 'lat', 'lon'), + lon_name = lon_name, lat_name = lat_name, + output_dims = c('nAnalogs', lat_name, lon_name), ncores = ncores)$output1 } else if (!is.null(obsVar) && !is.null(expVar)) { res <- Apply(list(expL, obsL, obsVar, expVar), - target_dims = list(c('lat', 'lon'), c('time','lat','lon'), - c('time','lat','lon'), c('lat','lon')), + target_dims = list(c(lat_name, lon_name), c('time', lat_name, lon_name), + c('time', lat_name, lon_name), c(lat_name, lon_name)), fun = .analogs, criteria = criteria, time_obsL, time_expL = time_expL, excludeTime = excludeTime, lonL = lonL, latL = latL, lonVar = lonVar, latVar = latVar, region = region, - nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, - output_dims = c('nAnalogs', 'lat', 'lon'), + nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, + lon_name = lon_name, lat_name = lat_name, + output_dims = c('nAnalogs', lat_name, lon_name), ncores = ncores)$output1 } } else { if (is.null(obsVar)) { res <- Apply(list(expL, obsL), - target_dims = list(c('lat', 'lon'), c('time','lat','lon')), + target_dims = list(c(lat_name, lon_name), c('time', lat_name, lon_name)), fun = .analogs, time_obsL, expVar = expVar, time_expL = time_expL, excludeTime = excludeTime, obsVar = obsVar, criteria = criteria, lonL = lonL, latL = latL, lonVar = lonVar, latVar = latVar, region = region, - nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, - output_dims = list(fields = c('nAnalogs', 'lat', 'lon'), + nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, + lon_name = lon_name, lat_name = lat_name, + output_dims = list(fields = c('nAnalogs', lat_name, lon_name), analogs = c('nAnalogs'), metric = c('nAnalogs', 'metric'), dates = c('nAnalogs')), ncores = ncores) } else if (!is.null(obsVar) && is.null(expVar)) { res <- Apply(list(expL, obsL, obsVar), - target_dims = list(c('lat', 'lon'), c('time','lat','lon'), - c('time', 'lat', 'lon')), - fun = .analogs,time_obsL, + target_dims = list(c(lat_name, lon_name), c('time', lat_name, lon_name), + c('time', lat_name, lon_name)), + fun = .analogs, time_obsL, time_expL = time_expL, excludeTime = excludeTime, expVar = expVar, criteria = criteria, lonL = lonL, latL = latL, lonVar = lonVar, latVar = latVar, region = region, - nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, - output_dims = list(fields = c('nAnalogs', 'lat', 'lon'), + nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, + lon_name = lon_name, lat_name = lat_name, + output_dims = list(fields = c('nAnalogs', lat_name, lon_name), analogs = c('nAnalogs'), metric = c('nAnalogs', 'metric'), dates = c('nAnalogs')), @@ -714,15 +798,16 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, } else if (!is.null(obsVar) && !is.null(expVar)) { res <- Apply(list(expL, obsL, obsVar, expVar), - target_dims = list(c('lat', 'lon'), c('time', 'lat', 'lon'), - c('time', 'lat', 'lon'), c('lat', 'lon')), + target_dims = list(c(lat_name, lon_name), c('time', lat_name, lon_name), + c('time', lat_name, lon_name), c(lat_name, lon_name)), fun = .analogs, time_obsL, criteria = criteria, time_expL = time_expL, excludeTime = excludeTime, lonL = lonL, latL = latL, lonVar = lonVar, latVar = latVar, region = region, - nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, - output_dims = list(fields = c('nAnalogs', 'lat', 'lon'), + nAnalogs = nAnalogs, AnalogsInfo = AnalogsInfo, + lon_name = lon_name, lat_name = lat_name, + output_dims = list(fields = c('nAnalogs', lat_name, lon_name), analogs = c('nAnalogs'), metric = c('nAnalogs', 'metric'), dates = c('nAnalogs')), @@ -736,7 +821,8 @@ Analogs <- 
function(expL, obsL, time_obsL, time_expL = NULL, time_obsL, criteria = "Large_dist", lonL = NULL, latL = NULL, lonVar = NULL, latVar = NULL, region = NULL, - nAnalogs = NULL, AnalogsInfo = FALSE) { + nAnalogs = NULL, AnalogsInfo = FALSE, lon_name = 'lon', + lat_name = 'lat') { if (all(excludeTime == "")) { excludeTime = NULL @@ -805,7 +891,7 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, } else { stop("parameter 'obsL' cannot be NULL") } - if (length(time_obsL)==0) { + if (length(time_obsL) == 0) { stop("Parameter 'time_obsL' can not be length 0") } Analog_result <- FindAnalog(expL = expL, obsL = obsL, time_obsL = time_obsL, @@ -814,7 +900,8 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, AnalogsInfo = AnalogsInfo, nAnalogs = nAnalogs, lonL = lonL, latL = latL, lonVar = lonVar, - latVar = latVar, region = region) + latVar = latVar, region = region, + lon_name = lon_name, lat_name = lat_name) if (AnalogsInfo == TRUE) { return(list(AnalogsFields = Analog_result$AnalogsFields, AnalogsInfo = Analog_result$Analog, @@ -827,15 +914,17 @@ Analogs <- function(expL, obsL, time_obsL, time_expL = NULL, FindAnalog <- function(expL, obsL, time_obsL, expVar, obsVar, criteria, lonL, latL, lonVar, latVar, region, nAnalogs = nAnalogs, - AnalogsInfo = AnalogsInfo) { + AnalogsInfo = AnalogsInfo, lon_name = 'lon', lat_name = 'lat') { position <- Select(expL = expL, obsL = obsL, expVar = expVar, obsVar = obsVar, criteria = criteria, lonL = lonL, latL = latL, lonVar = lonVar, - latVar = latVar, region = region)$position - metrics<- Select(expL = expL, obsL = obsL, expVar = expVar, + latVar = latVar, region = region, + lon_name = lon_name, lat_name = lat_name)$position + metrics <- Select(expL = expL, obsL = obsL, expVar = expVar, obsVar = obsVar, criteria = criteria, lonL = lonL, latL = latL, lonVar = lonVar, - latVar = latVar, region = region)$metric.original + latVar = latVar, region = region, + lon_name = lon_name, lat_name = lat_name)$metric.original best <- Apply(list(position), target_dims = c('time', 'pos'), fun = BestAnalog, criteria = criteria, AnalogsInfo = AnalogsInfo, nAnalogs = nAnalogs)$output1 @@ -845,7 +934,7 @@ FindAnalog <- function(expL, obsL, time_obsL, expVar, obsVar, criteria, if (all(!is.null(region), !is.null(lonVar), !is.null(latVar))) { if (is.null(obsVar)) { obsVar <- SelBox(obsL, lon = lonL, lat = latL, region = region)$data - expVar <- SelBox(expL, lon = lonL, lat = latL, region=region)$data + expVar <- SelBox(expL, lon = lonL, lat = latL, region = region)$data Analogs_fields <- Subset(obsVar, along = which(names(dim(obsVar)) == 'time'), indices = best) @@ -872,8 +961,8 @@ FindAnalog <- function(expL, obsL, time_obsL, expVar, obsVar, criteria, } } - lon_dim <- which(names(dim(Analogs_fields)) == 'lon') - lat_dim <- which(names(dim(Analogs_fields)) == 'lat') + lon_dim <- which(names(dim(Analogs_fields)) == lon_name) + lat_dim <- which(names(dim(Analogs_fields)) == lat_name) Analogs_metrics <- Subset(metrics, along = which(names(dim(metrics)) == 'time'), @@ -922,17 +1011,17 @@ BestAnalog <- function(position, nAnalogs = nAnalogs, AnalogsInfo = FALSE, } else { pos <- pos1[1 : nAnalogs] } - } else if (criteria== 'Local_dist') { + } else if (criteria == 'Local_dist') { pos1 <- pos1[1 : nAnalogs] pos2 <- pos2[1 : nAnalogs] best <- match(pos1, pos2) - if(length(best)==1) { + if (length(best) == 1) { warning("Just 1 best analog matching Large_dist and ", "Local_dist criteria") } - if(length(best)<1 | is.na(best[1])==TRUE){ + if (length(best) < 1 | 
is.na(best[1]) == TRUE) { stop("no best analogs matching Large_dist and Local_dist criterias, - please increase nAnalogs") + please increase nAnalogs") } pos <- pos2[as.logical(best)] pos <- pos[which(!is.na(pos))] @@ -944,25 +1033,25 @@ BestAnalog <- function(position, nAnalogs = nAnalogs, AnalogsInfo = FALSE, pos1 <- pos1[1 : nAnalogs] pos2 <- pos2[1 : nAnalogs] best <- match(pos1, pos2) - if (length(best)==1) { + if (length(best) == 1) { warning("Just 1 best analog matching Large_dist and ", "Local_dist criteria") } - if(length(best)<1 | is.na(best[1])==TRUE){ + if (length(best) < 1 | is.na(best[1]) == TRUE) { stop("no best analogs matching Large_dist and Local_dist criterias, - please increase nAnalogs") + please increase nAnalogs") } pos <- pos1[as.logical(best)] pos <- pos[which(!is.na(pos))] pos3 <- pos3[1 : nAnalogs] best <- match(pos, pos3) - if(length(best)==1){ + if (length(best) == 1) { warning("Just 1 best analog matching Large_dist, Local_dist and ", "Local_cor criteria") } - if(length(best)<1 | is.na(best[1])==TRUE){ + if (length(best) < 1 | is.na(best[1]) == TRUE) { stop("no best analogs matching Large_dist, Local_dist and Local_cor - criterias, please increase nAnalogs") + criterias, please increase nAnalogs") } pos <- pos[order(best, decreasing = F)] pos <- pos[which(!is.na(pos))] @@ -976,15 +1065,19 @@ BestAnalog <- function(position, nAnalogs = nAnalogs, AnalogsInfo = FALSE, } Select <- function(expL, obsL, expVar = NULL, obsVar = NULL, criteria = "Large_dist", lonL = NULL, latL = NULL, - lonVar = NULL, latVar = NULL, region = NULL) { + lonVar = NULL, latVar = NULL, region = NULL, + lon_name = 'lon', lat_name = 'lat') { names(dim(expL)) <- replace_repeat_dimnames(names(dim(expL)), - names(dim(obsL))) - metric1 <- Apply(list(obsL), target_dims = list(c('lat', 'lon')), - fun = .select, expL, metric = "dist")$output1 - metric1.original=metric1 + names(dim(obsL)), + lon_name = lon_name, + lat_name = lat_name) + metric1 <- Apply(list(obsL), target_dims = list(c(lat_name, lon_name)), + fun = .select, expL, metric = "dist", + lon_name = lon_name, lat_name = lat_name)$output1 + metric1.original = metric1 if (length(dim(metric1)) > 1) { dim_time_obs <- which(names(dim(metric1)) == 'time' | - names(dim(metric1)) == 'ftime') + names(dim(metric1)) == 'ftime') dim(metric1) <- c(dim(metric1), metric=1) margins <- c(1 : (length(dim(metric1))))[-dim_time_obs] pos1 <- apply(metric1, margins, order) @@ -992,27 +1085,28 @@ Select <- function(expL, obsL, expVar = NULL, obsVar = NULL, metric1.original = metric1 metric1 <- apply(metric1, margins, sort) names(dim(metric1))[1] <- 'time' - names(dim(metric1.original))=names(dim(metric1)) + names(dim(metric1.original)) = names(dim(metric1)) } else { pos1 <- order(metric1) dim(pos1) <- c(time = length(pos1)) metric1 <- sort(metric1) dim(metric1) <- c(time = length(metric1)) - dim(metric1.original)=dim(metric1) - dim_time_obs=1 + dim(metric1.original) = dim(metric1) + dim_time_obs = 1 } if (criteria == "Large_dist") { dim(metric1) <- c(dim(metric1), metric = 1) dim(pos1) <- c(dim(pos1), pos = 1) - dim(metric1.original)=dim(metric1) - return(list(metric = metric1, metric.original=metric1.original, + dim(metric1.original) = dim(metric1) + return(list(metric = metric1, metric.original = metric1.original, position = pos1)) } if (criteria == "Local_dist" | criteria == "Local_cor") { obs <- SelBox(obsL, lon = lonL, lat = latL, region = region)$data exp <- SelBox(expL, lon = lonL, lat = latL, region = region)$data - metric2 <- Apply(list(obs), 
target_dims = list(c('lat', 'lon')), - fun = .select, exp, metric = "dist")$output1 + metric2 <- Apply(list(obs), target_dims = list(c(lat_name, lon_name)), + fun = .select, exp, metric = "dist", + lon_name = lon_name, lat_name = lat_name)$output1 metric2.original = metric2 dim(metric2) <- c(dim(metric2), metric=1) margins <- c(1 : (length(dim(metric2))))[-dim_time_obs] @@ -1024,27 +1118,28 @@ Select <- function(expL, obsL, expVar = NULL, obsVar = NULL, if (criteria == "Local_dist") { metric <- abind(metric1, metric2, along = length(dim(metric1))+1) metric.original <- abind(metric1.original,metric2.original, - along=length(dim(metric1))+1) + along = length(dim(metric1))+1) position <- abind(pos1, pos2, along = length(dim(pos1))+1) names(dim(metric)) <- c(names(dim(pos1)), 'metric') names(dim(position)) <- c(names(dim(pos1)), 'pos') names(dim(metric.original)) = names(dim(metric)) - return(list(metric = metric, metric.original=metric.original, + return(list(metric = metric, metric.original = metric.original, position = position)) } } if (criteria == "Local_cor") { obs <- SelBox(obsVar, lon = lonVar, lat = latVar, region = region)$data exp <- SelBox(expVar, lon = lonVar, lat = latVar, region = region)$data - metric3 <- Apply(list(obs), target_dims = list(c('lat', 'lon')), - fun = .select, exp, metric = "cor")$output1 - metric3.original=metric3 + metric3 <- Apply(list(obs), target_dims = list(c(lat_name, lon_name)), + fun = .select, exp, metric = "cor", + lon_name = lon_name, lat_name = lat_name)$output1 + metric3.original = metric3 dim(metric3) <- c(dim(metric3), metric=1) margins <- c(1 : (length(dim(metric3))))[-dim_time_obs] pos3 <- apply(abs(metric3), margins, order, decreasing = TRUE) names(dim(pos3))[1] <- 'time' metricsort <- metric3[pos3] - dim(metricsort)=dim(metric3) + dim(metricsort) = dim(metric3) names(dim(metricsort))[1] <- 'time' metric <- abind(metric1, metric2, metricsort, along = length(dim(metric1)) + 1) @@ -1063,22 +1158,23 @@ Select <- function(expL, obsL, expVar = NULL, obsVar = NULL, "'Local_dist','Local_cor'.") } } -.select <- function(exp, obs, metric = "dist") { +.select <- function(exp, obs, metric = "dist", + lon_name = 'lon', lat_name = 'lat') { if (metric == "dist") { - result <- Apply(list(obs), target_dims = list(c('lat', 'lon')), + result <- Apply(list(obs), target_dims = list(c(lat_name, lon_name)), fun = function(x) {sqrt(sum((x - exp) ^ 2, na.rm = TRUE))})$output1 } else if (metric == "cor") { - result <- Apply(list(obs), target_dims = list(c('lat', 'lon')), + result <- Apply(list(obs), target_dims = list(c(lat_name, lon_name)), fun = function(x) {cor(as.vector(x), as.vector(exp), - method="spearman")})$output1 + method = "spearman")})$output1 } result } -.time_ref <- function(time_obsL,time_expL,excludeTime){ +.time_ref <- function(time_obsL,time_expL,excludeTime) { sameTime = which(time_obsL %in% time_expL) - result<- c(time_obsL[1:(sameTime-excludeTime-1)], - time_obsL[(sameTime+excludeTime+1):length(time_obsL)]) + result<- c(time_obsL[1:(sameTime - excludeTime - 1)], + time_obsL[(sameTime + excludeTime + 1):length(time_obsL)]) result } @@ -1103,8 +1199,8 @@ replace_repeat_dimnames <- function(names_exp, names_obs, lat_name = 'lat', } replace_time_dimnames <- function(dataL, time_name = 'time', - stdate_name='stdate', ftime_name='ftime') { - names_obs=names(dim(dataL)) + stdate_name = 'stdate', ftime_name='ftime') { + names_obs = names(dim(dataL)) if (!is.character(names_obs)) { stop("Parameter 'names_obs' must be a vector of characters.") } diff --git 
diff --git a/R/CST_AnalogsPredictors.R b/R/CST_AnalogsPredictors.R
index a15a4c0cab79b383293dcbd4c34de41b95a8a19c..f114e8cbdc89fa7547236319d937d101112a9433 100644
--- a/R/CST_AnalogsPredictors.R
+++ b/R/CST_AnalogsPredictors.R
@@ -5,840 +5,826 @@
 #'@author Marta Dominguez Alonso - AEMET, \email{mdomingueza@aemet.es}
 #'@author Nuria Perez-Zanon - BSC, \email{nuria.perez@bsc.es}
 #'
-#'@description This function downscales low resolution precipitation data (e.g. from
-#' Seasonal Forecast Models) through the association with an observational high
-#' resolution (HR) dataset (AEMET 5 km gridded data of daily precipitation (Peral et al., 2017))
-#' and a collection of predictors and past synoptic situations similar to estimated day.
-#' The method uses three domains:
-#' - peninsular Spain and Balearic Islands domain (5 km resolution): HR precipitation
-#' and the downscaling result domain.
-#' - synoptic domain (low resolution, e.g. 1.5º x 1.5º): it should be centered over Iberian Peninsula
-#' and cover enough extension to detect as much synoptic situations as possible.
-#' - extended domain (low resolution, e.g. 1.5º x 1.5º): it should have the same resolution
-#' as synoptic domain. It is used for SLP Seasonal Forecast Models.
+#'@description This function downscales low resolution precipitation data (e.g.
+#'from Seasonal Forecast Models) through the association with an observational
+#'high resolution (HR) dataset (AEMET 5 km gridded data of daily precipitation
+#'(Peral et al., 2017)) and a collection of predictors and past synoptic
+#'situations similar to the estimated day. The method uses three domains:
+#'\itemize{
+#'  \item{Peninsular Spain and Balearic Islands domain (5 km resolution): HR
+#'        precipitation and the downscaling result domain.}
+#'  \item{Synoptic domain (low resolution, e.g. 1.5º x 1.5º): it should be
+#'        centered over the Iberian Peninsula and cover enough extension to
+#'        detect as many synoptic situations as possible.}
+#'  \item{Extended domain (low resolution, e.g. 1.5º x 1.5º): it should have
+#'        the same resolution as the synoptic domain. It is used for the SLP of
+#'        Seasonal Forecast Models.}
+#' }
+#'
-#'@param exp List of arrays with downscaled period seasonal forecast data. The list
-#' has to contain model atmospheric variables (instantaneous 12h data) that must
-#' be indentify by parenthesis name.
-#' For precipitation:
-#' - u component of wind at 500 hPa (u500_mod) in m/s
-#' - v component of wind at 500 hPa (v500_mod) in m/s
-#' - temperature at 500 hPa (t500_mod) in K
-#' - temperature at 850 hPa (t850_mod) in K
-#' - specific humidity at 700 hPa (q700_mod) in g/kg
-#' For temperature:
-#' - u component of wind at 500 hPa (u500_mod) in m/s
-#' - v component of wind at 500 hPa (v500_mod) in m/s
-#' - temperature at 500 hPa (t500_mod) in K
-#' - temperature at 700 hPa (t700_mod) in K
-#' - temperature at 850 hPa (t850_mod) in K
-#' - specific humidity at 700 hPa (q700_mod) in g/kg
-#' - 2 meters temperature (tm2m_mod) in K
-#' The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'.
-#' (lon = gridpoints of longitude, lat = gridpoints of latitude, time = number of downscaling days)
-#' Seasonal forecast variables must have the same resolution and
-#' domain as reanalysis variables ('obs' parameter, below).
+#'@param exp List of arrays with downscaled period seasonal forecast data. The
+#'  list has to contain model atmospheric variables (instantaneous 12h data)
+#'  that must be identified by the names in parentheses. For precipitation:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500_mod) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500_mod) in m/s.}
+#'    \item{temperature at 500 hPa (t500_mod) in K.}
+#'    \item{temperature at 850 hPa (t850_mod) in K.}
+#'    \item{specific humidity at 700 hPa (q700_mod) in g/kg.}
+#'  }
+#'  For temperature:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500_mod) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500_mod) in m/s.}
+#'    \item{temperature at 500 hPa (t500_mod) in K.}
+#'    \item{temperature at 700 hPa (t700_mod) in K.}
+#'    \item{temperature at 850 hPa (t850_mod) in K.}
+#'    \item{specific humidity at 700 hPa (q700_mod) in g/kg.}
+#'    \item{2 meters temperature (tm2m_mod) in K.}
+#'  }
+#'  The arrays must have at least three dimensions with names 'lon', 'lat' and
+#'  'time' (lon = gridpoints of longitude, lat = gridpoints of latitude,
+#'  time = number of downscaling days). Seasonal forecast variables must have
+#'  the same resolution and domain as reanalysis variables ('obs' parameter,
+#'  below).
 #'@param slp Array with atmospheric seasonal forecast model sea level pressure
-#' (instantaneous 12h data) that must be indentify as 'slp' (hPa). It has the same
-#' resolution as 'exp' and 'obs' paremeters but with an extended domain.
-#' This domain contains extra degrees (most in the north and west part) compare to
-#' synoptic domain. The array must have at least three dimensions
-#' with names 'lon', 'lat' and 'time'.
+#'  (instantaneous 12h data) that must be identified as 'slp' (hPa). It has the
+#'  same resolution as the 'exp' and 'obs' parameters but with an extended
+#'  domain. This domain contains extra degrees (most in the north and west part)
+#'  compared to the synoptic domain. The array must have at least three
+#'  dimensions with names 'lon', 'lat' and 'time'.
 #'@param obs List of arrays with training period reanalysis data.
-#' The list has to contain reanalysis atmospheric variables (instantaneous
-#' 12h data) that must be indentify by parenthesis name.
-#' For precipitation:
-#' - u component of wind at 500 hPa (u500) in m/s
-#' - v component of wind at 500 hPa (v500) in m/s
-#' - temperature at 500 hPa (t500) in K
-#' - temperature at 850 hPa (t850) in K
-#' - sea level pressure (slp) in hPa
-#' - specific humidity at 700 hPa (q700) in g/kg
-#' For maximum and minimum temperature:
-#' - u component of wind at 500 hPa (u500) in m/s
-#' - v component of wind at 500 hPa (v500) in m/s
-#' - temperature at 500 hPa (t500) in K
-#' - temperature at 700 hPa (t700) in K
-#' - temperature at 850 hPa (t850) in K
-#' - sea level pressure (slp) in hPa
-#' - specific humidity at 700 hPa (q700) in g/kg
-#' - 2 meters temperature (tm2m) in K
-#' The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'.
+#'  The list has to contain reanalysis atmospheric variables (instantaneous
+#'  12h data) that must be identified by the names in parentheses. For
+#'  precipitation:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500) in m/s.}
+#'    \item{temperature at 500 hPa (t500) in K.}
+#'    \item{temperature at 850 hPa (t850) in K.}
+#'    \item{sea level pressure (slp) in hPa.}
+#'    \item{specific humidity at 700 hPa (q700) in g/kg.}
+#'  }
+#'  For maximum and minimum temperature:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500) in m/s.}
+#'    \item{temperature at 500 hPa (t500) in K.}
+#'    \item{temperature at 700 hPa (t700) in K.}
+#'    \item{temperature at 850 hPa (t850) in K.}
+#'    \item{sea level pressure (slp) in hPa.}
+#'    \item{specific humidity at 700 hPa (q700) in g/kg.}
+#'    \item{2 meters temperature (tm2m) in K.}
+#'  }
+#'  The arrays must have at least three dimensions with names 'lon', 'lat' and
+#'  'time'.
 #'@param lon Vector of the synoptic longitude (from (-180º) to 180º),
-#' The vector must go from west to east. The same as for the training function.
-#'@param lat Vector of the synoptic latitude. The vector must go from north to south.
-#' The same as for the training function.
+#'  The vector must go from west to east. The same as for the training function.
+#'@param lat Vector of the synoptic latitude. The vector must go from north to
+#'  south. The same as for the training function.
 #'@param slp_lon Vector of the extended longitude (from (-180º) to 180º),
-#' The vector must go from west to east. The same as for the training function.
-#'@param slp_lat Vector of the extended latitude. The vector must go from north to south.
-#' The same as for the training function.
+#'  The vector must go from west to east. The same as for the training function.
+#'@param slp_lat Vector of the extended latitude. The vector must go from north
+#'  to south. The same as for the training function.
 #'@param var_name Variable name to downscale. There are two options: 'prec' for
-#' precipitation and 'temp' for maximum and minimum temperature.
+#'  precipitation and 'temp' for maximum and minimum temperature.
 #'@param hr_obs Local path of HR observational files (maestro and pcp/tmx-tmn).
-#' For precipitation can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/v2/Serie_AEMET_v2_pcp_1951a202006_txt.tar.gz
-#' For maximum and minimum temperature can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmax/Serie_AEMET_v1_tmax_1951a202006_txt.tar.gz and http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmin/Serie_AEMET_v1_tmin_1951a202006_txt.tar.gz respetively.
-#' Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), longitude (lon), latitude (lat) and
-#' altitude (alt) in columns (vector structure).
-#' Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5km resolution spanish daily data
-#' (precipitation or maximum and minimum temperature from january 1951 to june 2020. See README
-#' file for more information.
-#' IMPORTANT!: HR observational period must be the same as for reanalysis variables.
-#' It is assumed that the training period is smaller than the HR original one (1951-2019), so it is
-#' needed to make a new ascii file with the new period and the same structure as original,
-#' specifying the training dates in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for
-#' '19810101-19961231' period).
+#'  Precipitation and temperature data can be downloaded from the following
+#'  link:
+#'  \url{https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2}
+#'  The maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos),
+#'  longitude (lon), latitude (lat) and altitude (alt) in columns (vector
+#'  structure). The data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes
+#'  5km resolution Spanish daily data (precipitation or maximum and minimum
+#'  temperature from January 1951 to June 2020). See the README file for more
+#'  information. IMPORTANT!: the HR observational period must be the same as
+#'  for the reanalysis variables. It is assumed that the training period is
+#'  smaller than the HR original one (1951-2019), so a new ASCII file must be
+#'  created with the new period and the same structure as the original,
+#'  specifying the training dates in the name
+#'  (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231'
+#'  period).
 #'@param tdates Training period dates in format YYYYMMDD(start)-YYYYMMDD(end)
-#' (e.g. 19810101-20181231).
-#'@param ddates Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 20191001-20200331).
-#'@param restrain Output (list of matrix) obtained from 'training_analogs' function.
-#' For precipitation, 'restrain' object must contains um, vm, nger, gu92, gv92,
-#' gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred variables.
-#' For maximum and minimum temperature, 'restrain' object must contains um, vm,
-#' insol, neni, vdmin y vref. See 'AnalogsPred_train.R' for more information.
-#'@param dim_name_longitude A character string indicating the name of the longitude
-#'dimension, by default 'longitude'.
-#'@param dim_name_latitude A character string indicating the name of the latitude
-#'dimension, by default 'latitude'.
+#'  (e.g. 19810101-20181231).
+#'@param ddates Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end)
+#'  (e.g. 20191001-20200331).
+#'@param restrain Output (list of matrix) obtained from the 'training_analogs'
+#'  function. For precipitation, the 'restrain' object must contain the um, vm,
+#'  nger, gu92, gv92, gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred
+#'  variables. For maximum and minimum temperature, the 'restrain' object must
+#'  contain um, vm, insol, neni, vdmin and vref. See 'AnalogsPred_train.R' for
+#'  more information.
+#'@param dim_name_longitude A character string indicating the name of the
+#'  longitude dimension, by default 'lon'.
+#'@param dim_name_latitude A character string indicating the name of the
+#'  latitude dimension, by default 'lat'.
 #'@param dim_name_time A character string indicating the name of the time
-#'dimension, by default 'time'.
-#'@return Matrix with seasonal forecast precipitation (mm) or
-#' maximum and minimum temperature (dozens of ºC) in a 5km x 5km regular grid
-#' over peninsular Spain and Balearic Islands. The resulted matrices have two
-#' dimensions ('ddates' x 'nptos').(ddates = number of downscaling days
-#' and nptos = number of 'hr_obs' gridpoints).
+#'  dimension, by default 'time'.
+#'@return Matrix with seasonal forecast precipitation (mm) or maximum and
+#'minimum temperature (tenths of ºC) in a 5km x 5km regular grid over peninsular
+#'Spain and Balearic Islands. The resulting matrices have two dimensions
+#'('ddates' x 'nptos') (ddates = number of downscaling days and nptos = number
+#'of 'hr_obs' gridpoints).
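A compact calling sketch may help here; it is illustrative only. The input arrays, the 'restrain' object and the local path below are placeholders assumed to follow the parameter descriptions above, not tested values:

# Hypothetical call for precipitation: 'exp' needs the five model variables and
# 'obs' the six reanalysis variables named as documented above; 'restrain' is
# assumed to be the output of training_analogs() for the same training period.
exp <- list(u500_mod = u500_mod, v500_mod = v500_mod, t500_mod = t500_mod,
            t850_mod = t850_mod, q700_mod = q700_mod)
obs <- list(u500 = u500, v500 = v500, t500 = t500, t850 = t850,
            slp = slp_rea, q700 = q700)
down <- CST_AnalogsPredictors(exp = exp, slp = slp_fcst, obs = obs,
                              lon = lon, lat = lat,
                              slp_lon = slp_lon, slp_lat = slp_lat,
                              var_name = "prec", hr_obs = "/path/to/hr_files/",
                              tdates = "19810101-19961231",
                              ddates = "20191001-20200331",
                              restrain = restrain)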
 #'
 #'@useDynLib CSTools
-#'
 #'@export
-#'
-CST_AnalogsPredictors <- function(exp,
-                                  slp,
-                                  obs,
-                                  lon,
-                                  lat,
-                                  slp_lon,
-                                  slp_lat,
-                                  var_name,
-                                  hr_obs,
-                                  tdates,
-                                  ddates,
-                                  restrain,
-                                  dim_name_longitude = "lon",
-                                  dim_name_latitude = "lat",
-                                  dim_name_time = "time") {
-
-if (!is.list(exp)) {
+CST_AnalogsPredictors <- function(exp, slp, obs, lon, lat, slp_lon, slp_lat,
+                                  var_name, hr_obs, tdates, ddates, restrain,
+                                  dim_name_longitude = "lon",
+                                  dim_name_latitude = "lat",
+                                  dim_name_time = "time") {
+  if (!is.list(exp)) {
     stop("Parameter 'exp' must be a list of 'array' objects")
   }
-if (!(all(sapply(exp, inherits, 'array')))) {
-  stop("Elements of the list in parameter 'exp' must be of the class ",
-       "'array'.")
+  if (!(all(sapply(exp, inherits, 'array')))) {
+    stop("Elements of the list in parameter 'exp' must be of the class ",
+         "'array'.")
   }
-if (!is.array(slp)) {
-  stop("Parameter 'slp' must be of the class 'array'.")
+  if (!is.array(slp)) {
+    stop("Parameter 'slp' must be of the class 'array'.")
   }
-if (!is.list(obs)) {
+  if (!is.list(obs)) {
     stop("Parameter 'obs' must be a list of 'array' objects")
   }
-if (!(all(sapply(obs, inherits, 'array')))) {
-  stop("Elements of the list in parameter 'obs' must be of the class ",
+  if (!(all(sapply(obs, inherits, 'array')))) {
+    stop("Elements of the list in parameter 'obs' must be of the class ",
          "'array'.")
   }
-if (var_name == "prec") {
-  if (length(exp) != 5) {
-    stop("Parameter 'exp' must be a length of 5.")
-  } else {
-    if (!(any(names(exp) %in% "u500_mod"))) {
-      stop("Variable 'u500_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "v500_mod"))) {
-      stop("Variable 'v500_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "t500_mod"))) {
-      stop("Variable 't500_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "t850_mod"))) {
-      stop("Variable 't850_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "q700_mod"))) {
-      stop("Variable 'q700_mod' in 'exp' parameter is missed.")
-    }
-  }
-  if (length(obs) != 6) {
-    stop("Parameter 'obs' must be a length of 6.")
-  } else {
-    if (!(any(names(obs) %in% "u500"))) {
-      stop("Variable 'u500' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "v500"))) {
-      stop("Variable 'v500' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "t500"))) {
-      stop("Variable 't500' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "t850"))) {
-      stop("Variable 't850' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "slp"))) {
-      stop("Variable 'slp' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "q700"))) {
-      stop("Variable 'q700' in 'obs' parameter is missed.")
-    }
-  }
-} else {
-  if (length(exp) != 7) {
-    stop("Parameter 'exp' must be a length of 7.")
-  } else {
-    if (!(any(names(exp) %in% "u500_mod"))) {
-      stop("Variable 'u500_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "v500_mod"))) {
-      stop("Variable 'v500_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "t500_mod"))) {
-      stop("Variable 't500_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "t700_mod"))) {
-      stop("Variable 't700_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "t850_mod"))) {
-      stop("Variable 't850_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "q700_mod"))) {
-      stop("Variable 'q700_mod' in 'exp' parameter is missed.")
-    } else if (!(any(names(exp) %in% "tm2m_mod"))) {
-      stop("Variable 'tm2m_mod' in 'exp' parameter is missed.")
-    }
-  }
-  if (length(obs) != 8) {
-    stop("Parameter 'obs' must be a length of 8.")
-  } else {
-    if (!(any(names(obs) %in% "u500"))) {
-      stop("Variable 'u500' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "v500"))) {
-      stop("Variable 'v500' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "t500"))) {
-      stop("Variable 't500' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "t700"))) {
-      stop("Variable 't700' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "t850"))) {
-      stop("Variable 't850' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "slp"))) {
-      stop("Variable 'slp' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "q700"))) {
-      stop("Variable 'q700' in 'obs' parameter is missed.")
-    } else if (!(any(names(obs) %in% "tm2m"))) {
-      stop("Variable 'tm2m' in 'obs' parameter is missed.")
-    }
-  }
-}
+  if (var_name == "prec") {
+    if (length(exp) != 5) {
+      stop("Parameter 'exp' must have a length of 5.")
+    } else {
+      if (!(any(names(exp) %in% "u500_mod"))) {
+        stop("Variable 'u500_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "v500_mod"))) {
+        stop("Variable 'v500_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "t500_mod"))) {
+        stop("Variable 't500_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "t850_mod"))) {
+        stop("Variable 't850_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "q700_mod"))) {
+        stop("Variable 'q700_mod' in 'exp' parameter is missing.")
+      }
+    }
+    if (length(obs) != 6) {
+      stop("Parameter 'obs' must have a length of 6.")
+    } else {
+      if (!(any(names(obs) %in% "u500"))) {
+        stop("Variable 'u500' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "v500"))) {
+        stop("Variable 'v500' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "t500"))) {
+        stop("Variable 't500' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "t850"))) {
+        stop("Variable 't850' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "slp"))) {
+        stop("Variable 'slp' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "q700"))) {
+        stop("Variable 'q700' in 'obs' parameter is missing.")
+      }
+    }
+  } else {
+    if (length(exp) != 7) {
+      stop("Parameter 'exp' must have a length of 7.")
+    } else {
+      if (!(any(names(exp) %in% "u500_mod"))) {
+        stop("Variable 'u500_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "v500_mod"))) {
+        stop("Variable 'v500_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "t500_mod"))) {
+        stop("Variable 't500_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "t700_mod"))) {
+        stop("Variable 't700_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "t850_mod"))) {
+        stop("Variable 't850_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "q700_mod"))) {
+        stop("Variable 'q700_mod' in 'exp' parameter is missing.")
+      } else if (!(any(names(exp) %in% "tm2m_mod"))) {
+        stop("Variable 'tm2m_mod' in 'exp' parameter is missing.")
+      }
+    }
+    if (length(obs) != 8) {
+      stop("Parameter 'obs' must have a length of 8.")
+    } else {
+      if (!(any(names(obs) %in% "u500"))) {
+        stop("Variable 'u500' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "v500"))) {
+        stop("Variable 'v500' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "t500"))) {
+        stop("Variable 't500' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "t700"))) {
+        stop("Variable 't700' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "t850"))) {
+        stop("Variable 't850' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "slp"))) {
+        stop("Variable 'slp' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "q700"))) {
+        stop("Variable 'q700' in 'obs' parameter is missing.")
+      } else if (!(any(names(obs) %in% "tm2m"))) {
+        stop("Variable 'tm2m' in 'obs' parameter is missing.")
+      }
+    }
+  }
-if (all((sapply(exp,dim))==dim(exp[[1]]))) {
+  if (all((sapply(exp, dim)) == dim(exp[[1]]))) {
     dim_exp <- dim(exp[[1]])
-  if (!(any(names(dim_exp) %in% dim_name_longitude))) {
-    stop("Dimension 'lon' in exp parameter is missed.")
-  }
-  if (!(any(names(dim_exp) %in% dim_name_latitude))) {
-    stop("Dimension 'lat' in exp parameter is missed.")
-  }
-  if (!(any(names(dim_exp) %in% dim_name_time))) {
-    stop("Dimension 'time' in exp parameter is missed.")
-  }
-} else {
-  stop("All 'exp' variables must have the same dimensions.")
-}
+    if (!(any(names(dim_exp) %in% dim_name_longitude))) {
+      stop("Dimension 'lon' in exp parameter is missing.")
+    }
+    if (!(any(names(dim_exp) %in% dim_name_latitude))) {
+      stop("Dimension 'lat' in exp parameter is missing.")
+    }
+    if (!(any(names(dim_exp) %in% dim_name_time))) {
+      stop("Dimension 'time' in exp parameter is missing.")
+    }
+  } else {
+    stop("All 'exp' variables must have the same dimensions.")
+  }
-dim_slp <- dim(slp)
-if (!(any(names(dim_slp) %in% dim_name_longitude))) {
+  dim_slp <- dim(slp)
+  if (!(any(names(dim_slp) %in% dim_name_longitude))) {
     stop("Dimension 'lon' in slp parameter is missed.")
-}
-if (!(any(names(dim_slp) %in% dim_name_latitude))) {
+  }
+  if (!(any(names(dim_slp) %in% dim_name_latitude))) {
     stop("Dimension 'lat' in slp parameter is missed.")
-}
-if (!(any(names(dim_slp) %in% dim_name_time))) {
+  }
+  if (!(any(names(dim_slp) %in% dim_name_time))) {
     stop("Dimension 'time' in slp parameter is missed.")
-}
+  }
-if (all((sapply(obs,dim))==dim(obs[[1]]))) {
-  dim_obs <- dim(obs[[1]])
-  if (!(any(names(dim_obs) %in% dim_name_longitude))) {
-    stop("Dimension 'lon' in obs parameter is missed.")
-  }
-  if (!(any(names(dim_obs) %in% dim_name_latitude))) {
-    stop("Dimension 'lat' in obs parameter is missed.")
-  }
-  if (!(any(names(dim_obs) %in% dim_name_time))) {
-    stop("Dimension 'time' in obs parameter is missed.")
-  }
-} else {
-  stop("All 'obs' variables must have the same dimensions.")
-}
+  if (all((sapply(obs, dim)) == dim(obs[[1]]))) {
+    dim_obs <- dim(obs[[1]])
+    if (!(any(names(dim_obs) %in% dim_name_longitude))) {
+      stop("Dimension 'lon' in obs parameter is missing.")
+    }
+    if (!(any(names(dim_obs) %in% dim_name_latitude))) {
+      stop("Dimension 'lat' in obs parameter is missing.")
+    }
+    if (!(any(names(dim_obs) %in% dim_name_time))) {
+      stop("Dimension 'time' in obs parameter is missing.")
+    }
+  } else {
+    stop("All 'obs' variables must have the same dimensions.")
+  }
-if (!is.vector(lon) || !is.numeric(lon)) {
+  if (!is.vector(lon) || !is.numeric(lon)) {
     stop("Parameter 'lon' must be a numeric vector")
-} else {
+  } else {
     if (is.unsorted(lon)) {
-    lon <- sort(lon)
-    warning("'lon' vector has been sorted in increasing order")
+      lon <- sort(lon)
+      warning("'lon' vector has been sorted in increasing order")
     }
-}
+  }
-if (!is.vector(lat) || !is.numeric(lat)) {
+  if (!is.vector(lat) || !is.numeric(lat)) {
     stop("Parameter 'lat' must be a numeric vector")
-} else {
+  } else {
     if (!is.unsorted(lat)) {
-    lat <- sort(lat, decreasing = TRUE)
-    warning("'lat' vector has been sorted in decreasing order")
+      lat <- sort(lat, decreasing = TRUE)
+      warning("'lat' vector has been sorted in decreasing order")
     }
-}
+  }
-if (!is.vector(slp_lon) || !is.numeric(slp_lon)) {
+  if (!is.vector(slp_lon) || !is.numeric(slp_lon)) {
     stop("Parameter 'slp_lon' must be a numeric vector")
-} else {
+  } else {
     if (is.unsorted(slp_lon)) {
-    lon <- sort(slp_lon)
-    warning("'slp_lon' vector has been sorted in increasing order")
+      slp_lon <- sort(slp_lon)
+      warning("'slp_lon' vector has been sorted in increasing order")
     }
-}
+  }
-if (!is.vector(slp_lat) || !is.numeric(slp_lat)) {
+  if (!is.vector(slp_lat) || !is.numeric(slp_lat)) {
     stop("Parameter 'slp_lat' must be a numeric vector")
-} else {
+  } else {
     if (!is.unsorted(slp_lat)) {
-    lat <- sort(slp_lat, decreasing = TRUE)
-    warning("'slp_lat' vector has been sorted in decreasing order")
+      slp_lat <- sort(slp_lat, decreasing = TRUE)
+      warning("'slp_lat' vector has been sorted in decreasing order")
     }
-}
+  }
-if (!is.character(hr_obs)){
+  if (!is.character(hr_obs)) {
     stop("Parameter 'hr_obs' must be a character.")
-} else {
-  if (!dir.exists(hr_obs)) {
-    stop("'hr_obs' directory does not exist")
-  }
-}
-
-if (!is.character(tdates)) {
-  stop("Parameter 'tdates' must be a character.")
-} else {
-  if (nchar(tdates) != "17") {
-    stop("Parameter 'tdates' must be a string with 17 charecters.")
-  } else {
-    dateini <- as.Date(substr(tdates,start=1,stop=8),format="%Y%m%d")
-    dateend <- as.Date(substr(tdates,start=10,stop=18),format="%Y%m%d")
-    if (dateend <= dateini) {
-      stop("Parameter 'tdates' must be at least of one day")
-    }
-  }
-}
+  } else {
+    if (!dir.exists(hr_obs)) {
+      stop("'hr_obs' directory does not exist")
+    }
+  }
-if (!is.character(ddates)) {
-  stop("Parameter 'ddates' must be a character.")
-} else {
-  if (nchar(ddates) != "17") {
-    stop("Parameter 'ddates' must be a string with 17 charecters.")
-  } else {
-    dateini <- as.Date(substr(ddates,start=1,stop=8),format="%Y%m%d")
-    dateend <- as.Date(substr(ddates,start=10,stop=18),format="%Y%m%d")
-    if (dateend <= dateini) {
-      stop("Parameter 'ddates' must be at least of one day")
-    }
-  }
-}
+  if (!is.character(tdates)) {
+    stop("Parameter 'tdates' must be a character.")
+  } else {
+    if (nchar(tdates) != 17) {
+      stop("Parameter 'tdates' must be a string with 17 characters.")
+    } else {
+      dateini <- as.Date(substr(tdates, start = 1, stop = 8), format = "%Y%m%d")
+      dateend <- as.Date(substr(tdates, start = 10, stop = 17), format = "%Y%m%d")
+      if (dateend <= dateini) {
+        stop("Parameter 'tdates' must cover at least one day.")
+      }
+    }
+  }
-#
+  if (!is.character(ddates)) {
+    stop("Parameter 'ddates' must be a character.")
+  } else {
+    if (nchar(ddates) != 17) {
+      stop("Parameter 'ddates' must be a string with 17 characters.")
+    } else {
+      dateini <- as.Date(substr(ddates, start = 1, stop = 8), format = "%Y%m%d")
+      dateend <- as.Date(substr(ddates, start = 10, stop = 17), format = "%Y%m%d")
+      if (dateend <= dateini) {
+        stop("Parameter 'ddates' must cover at least one day.")
+      }
+    }
+  }
-if (names(dim(exp[[1]]))[1] == "lon" & names(dim(exp[[1]]))[2] == "lat"
-    || names(dim(exp[[1]]))[2] == "lon" & names(dim(exp[[1]]))[3] == "lat") {
-  texp2D <- lapply(exp, MergeDims, merge_dims = c('lon', 'lat'),
-                   rename_dim = 'gridpoint')
-} else if (names(dim(exp[[1]]))[1] == "lat" & names(dim(exp[[1]]))[2] == "lon"
-           || names(dim(exp[[1]]))[2] == "lat" & names(dim(exp[[1]]))[3] == "lon") {
-  texp2D <- lapply(exp, MergeDims, merge_dims = c('lat', 'lon'),
-                   rename_dim = 'gridpoint')
-}
+  if (names(dim(exp[[1]]))[1] == "lon" & names(dim(exp[[1]]))[2] == "lat"
+      || names(dim(exp[[1]]))[2] == "lon" & names(dim(exp[[1]]))[3] == "lat") {
+    texp2D <- lapply(exp, MergeDims, merge_dims = c('lon', 'lat'),
+                     rename_dim = 'gridpoint')
+  } else if (names(dim(exp[[1]]))[1] == "lat" & names(dim(exp[[1]]))[2] == "lon"
+             || names(dim(exp[[1]]))[2] == "lat" & names(dim(exp[[1]]))[3] == "lon") {
+    texp2D <- lapply(exp, MergeDims, merge_dims = c('lat', 'lon'),
+                     rename_dim = 'gridpoint')
+  }
-if (names(dim(slp))[1] == "lon" & names(dim(slp))[2] == "lat"
-    || names(dim(slp))[2] == "lon" & names(dim(slp))[3] == "lat") {
-  tslp2D <- MergeDims(slp,merge_dims = c('lon', 'lat'),
-                      rename_dim = 'gridpoint')
-} else if (names(dim(slp))[1] == "lat" & names(dim(slp))[2] == "lon"
-           || names(dim(slp))[2] == "lat" & names(dim(slp))[3] == "lon") {
-  tslp2D <- MergeDims(slp,merge_dims = c('lat', 'lon'),
-                      rename_dim = 'gridpoint')
-}
+  if (names(dim(slp))[1] == "lon" & names(dim(slp))[2] == "lat"
+      || names(dim(slp))[2] == "lon" & names(dim(slp))[3] == "lat") {
+    tslp2D <- MergeDims(slp, merge_dims = c('lon', 'lat'),
+                        rename_dim = 'gridpoint')
+  } else if (names(dim(slp))[1] == "lat" & names(dim(slp))[2] == "lon"
+             || names(dim(slp))[2] == "lat" & names(dim(slp))[3] == "lon") {
+    tslp2D <- MergeDims(slp, merge_dims = c('lat', 'lon'),
+                        rename_dim = 'gridpoint')
+  }
-if (names(dim(obs[[1]]))[1] == "lon" & names(dim(obs[[1]]))[2] == "lat"
-    || names(dim(obs[[1]]))[2] == "lon" & names(dim(obs[[1]]))[3] == "lat") {
-  tobs2D <- lapply(obs, MergeDims, merge_dims = c('lon', 'lat'),
-                   rename_dim = 'gridpoint')
-} else if (names(dim(obs[[1]]))[1] == "lat" & names(dim(obs[[1]]))[2] == "lon"
-           || names(dim(obs[[1]]))[2] == "lat" & names(dim(obs[[1]]))[3] == "lon") {
-  tobs2D <- lapply(obs, MergeDims, merge_dims = c('lat', 'lon'),
-                   rename_dim = 'gridpoint')
-}
+  if (names(dim(obs[[1]]))[1] == "lon" & names(dim(obs[[1]]))[2] == "lat"
+      || names(dim(obs[[1]]))[2] == "lon" & names(dim(obs[[1]]))[3] == "lat") {
+    tobs2D <- lapply(obs, MergeDims, merge_dims = c('lon', 'lat'),
+                     rename_dim = 'gridpoint')
+  } else if (names(dim(obs[[1]]))[1] == "lat" & names(dim(obs[[1]]))[2] == "lon"
+             || names(dim(obs[[1]]))[2] == "lat" & names(dim(obs[[1]]))[3] == "lon") {
+    tobs2D <- lapply(obs, MergeDims, merge_dims = c('lat', 'lon'),
+                     rename_dim = 'gridpoint')
+  }
-if (names(dim(texp2D[[1]]))[1] == "gridpoint") {
-  exp2D <- lapply(texp2D,aperm)
-} else {
-  exp2D <- texp2D
-}
+  if (names(dim(texp2D[[1]]))[1] == "gridpoint") {
+    exp2D <- lapply(texp2D, aperm)
+  } else {
+    exp2D <- texp2D
+  }
-if (names(dim(tslp2D))[1] == "gridpoint") {
-  slp2D <- aperm(tslp2D)
-} else {
-  slp2D <- tslp2D
-}
+  if (names(dim(tslp2D))[1] == "gridpoint") {
+    slp2D <- aperm(tslp2D)
+  } else {
+    slp2D <- tslp2D
+  }
-if (names(dim(tobs2D[[1]]))[1] == "gridpoint") {
-  obs2D <- lapply(tobs2D,aperm)
-} else {
-  obs2D <- tobs2D
-}
+  if (names(dim(tobs2D[[1]]))[1] == "gridpoint") {
+    obs2D <- lapply(tobs2D, aperm)
+  } else {
+    obs2D <- tobs2D
+  }
-  downres <- .analogspred(exp2D,
-                          slp2D,
-                          obs2D,
-                          lon,
-                          lat,
-                          slp_lon,
-                          slp_lat,
-                          var_name,
-                          hr_obs,
-                          tdates,
-                          ddates,
-                          restrain)
+  downres <- .analogspred(exp2D, slp2D, obs2D, lon, lat, slp_lon, slp_lat,
+                          var_name, hr_obs, tdates, ddates, restrain)
 }
 #' Atomic .analogspred function
 #'
 #'@author Marta Dom\'inguez Alonso - AEMET, \email{mdomingueza@aemet.es}
-#'
-#' This function works with lists of matrix from reanalysis and seasonal
-#' forecast data and uses a Fortran interface (.Fortran) to run an
-#' analogs method developed in AEMET.
+#'This function works with lists of matrices from reanalysis and seasonal
+#'forecast data and uses a Fortran interface (.Fortran) to run an
+#'analogs method developed in AEMET.
 #'@param pred_mod List of matrix with downscaled period seasonal forecast data. The list
-#' has to contain model atmospheric variables (instantaneous 12h data) that must
-#' be indentify by parenthesis name.
-#' For precipitation:
-#' - u component of wind at 500 hPa (u500_mod) in m/s
-#' - v component of wind at 500 hPa (v500_mod) in m/s
-#' - temperature at 500 hPa (t500_mod) in K
-#' - temperature at 850 hPa (t850_mod) in K
-#' - specific humidity at 700 hPa (q700_mod) in g/kg
-#' For temperature:
-#' - u component of wind at 500 hPa (u500_mod) in m/s
-#' - v component of wind at 500 hPa (v500_mod) in m/s
-#' - temperature at 500 hPa (t500_mod) in K
-#' - temperature at 700 hPa (t500_mod) in K
-#' - temperature at 850 hPa (t850_mod) in K
-#' - specific humidity at 700 hPa (q700_mod) in g/kg
-#' - 2 meters temperature (tm2m_mod) in K
-#' Seasonal forecast variables must have the same resolution and
-#' domain as 'pred_rea' parameter.
-#' All matrices must have two dimensions with names 'time' and 'gridpoint'.
-#'@param pred_slp Matrix with atmospheric seasonal forecast model sea level pressure
-#' (instantaneous 12h data) that must be indentify as 'slp'. It has the same
-#' resolution as 'pred_mod' paremeter but with an extended domain. This domain contains
-#' extra degrees (most in the north and west part) compare to synoptic domain.
-#' The matrix must have two dimensions with names 'time' and 'gridpoint'.
-#'@param pred_rea List of matrix with training period reanalysis data.
-#' The list has to contain reanalysis atmospheric variables (instantaneous
-#' 12h data) that must be indentify by parenthesis name.
-#' For precipitation:
-#' - u component of wind at 500 hPa (u500) in m/s
-#' - v component of wind at 500 hPa (v500) in m/s
-#' - temperature at 500 hPa (t500) in K
-#' - temperature at 850 hPa (t850) in K
-#' - sea level pressure (slp) in hPa
-#' - specific humidity at 700 hPa (q700) in g/kg
-#' For maximum and minimum temperature:
-#' - u component of wind at 500 hPa (u500) in m/s
-#' - v component of wind at 500 hPa (v500) in m/s
-#' - temperature at 500 hPa (t500) in K
-#' - temperature at 700 hPa (t500) in K
-#' - temperature at 850 hPa (t850) in K
-#' - sea level pressure (slp) in hPa
-#' - specific humidity at 700 hPa (q700) in g/kg
-#' - 2 meters temperature (tm2m) in K
-#' All matrices must have two dimensions with names 'ddates' and 'gridpoint'.
+#'  has to contain model atmospheric variables (instantaneous 12h data) that
+#'  must be identified by the names in parentheses. For precipitation:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500_mod) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500_mod) in m/s.}
+#'    \item{temperature at 500 hPa (t500_mod) in K.}
+#'    \item{temperature at 850 hPa (t850_mod) in K.}
+#'    \item{specific humidity at 700 hPa (q700_mod) in g/kg.}
+#'  }
+#'  For temperature:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500_mod) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500_mod) in m/s.}
+#'    \item{temperature at 500 hPa (t500_mod) in K.}
+#'    \item{temperature at 700 hPa (t700_mod) in K.}
+#'    \item{temperature at 850 hPa (t850_mod) in K.}
+#'    \item{specific humidity at 700 hPa (q700_mod) in g/kg.}
+#'    \item{2 meters temperature (tm2m_mod) in K.}
+#'  }
+#'  Seasonal forecast variables must have the same resolution and
+#'  domain as the 'pred_rea' parameter. All matrices must have two dimensions
+#'  with names 'time' and 'gridpoint'.
+#'@param pred_slp Matrix with atmospheric seasonal forecast model sea level
+#'  pressure (instantaneous 12h data) that must be identified as 'slp'. It has
+#'  the same resolution as the 'pred_mod' parameter but with an extended domain.
+#'  This domain contains extra degrees (most in the north and west part)
+#'  compared to the synoptic domain. The matrix must have two dimensions with
+#'  names 'time' and 'gridpoint'.
+#'@param pred_rea List of matrix with training period reanalysis data. The
+#'  list has to contain reanalysis atmospheric variables (instantaneous 12h
+#'  data) that must be identified by the names in parentheses. For precipitation:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500) in m/s.}
+#'    \item{temperature at 500 hPa (t500) in K.}
+#'    \item{temperature at 850 hPa (t850) in K.}
+#'    \item{sea level pressure (slp) in hPa.}
+#'    \item{specific humidity at 700 hPa (q700) in g/kg.}
+#'  }
+#'  For maximum and minimum temperature:
+#'  \itemize{
+#'    \item{u component of wind at 500 hPa (u500) in m/s.}
+#'    \item{v component of wind at 500 hPa (v500) in m/s.}
+#'    \item{temperature at 500 hPa (t500) in K.}
+#'    \item{temperature at 700 hPa (t700) in K.}
+#'    \item{temperature at 850 hPa (t850) in K.}
+#'    \item{sea level pressure (slp) in hPa.}
+#'    \item{specific humidity at 700 hPa (q700) in g/kg.}
+#'    \item{2 meters temperature (tm2m) in K.}
+#'  }
+#'  All matrices must have two dimensions with names 'ddates' and 'gridpoint'.
 #'@param lon Vector of the synoptic longitude (from (-180º) to 180º),
-#' The vector must go from west to east.
-#'@param lat Vector of the synoptic latitude. The vector must go from north to south.
+#'  The vector must go from west to east.
+#'@param lat Vector of the synoptic latitude. The vector must go from north to
+#'  south.
 #'@param slp_lon Vector of the extended longitude (from (-180º) to 180º),
-#' The vector must go from west to east.
-#'@param slp_lat Vector of the extended latitude. The vector must go from north to south.
+#'  The vector must go from west to east.
+#'@param slp_lat Vector of the extended latitude. The vector must go from north
+#'  to south.
 #'@param var Variable name to downscale. There are two options: 'prec' for
-#' precipitation and 'temp' for maximum and minimum temperature.
+#'  precipitation and 'temp' for maximum and minimum temperature.
 #'@param HR_path Local path of HR observational files (maestro and pcp/tmx-tmn).
-#' For precipitation can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/v2/Serie_AEMET_v2_pcp_1951a201903_txt.tar.gz
-#' For maximum and minimum temperature can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmax/Serie_AEMET_v1_tmax_1951a202006_txt.tar.gz and http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmin/Serie_AEMET_v1_tmin_1951a202006_txt.tar.gz respetively.
-#' Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), longitude (lon), latitude (lat) and
-#' altitude (alt) in columns (vector structure).
-#' Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5km resolution spanish daily data
-#' (precipitation or maximum and minimum temperature from january 1951 to march 2019. See README
-#' file for more information.
-#' IMPORTANT!: HR observational period must be the same as for reanalysis variables
-#' ('pred_rea' parameter).
-#' It is assumed that the training period is smaller than the HR original one (1951-2019), so it is
-#' needed to make a new ascii file with the new period and the same structure as original,
-#' specifying the training dates in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for
-#' '19810101-19961231' period).
+#'  Precipitation and temperature data can be downloaded from the following
+#'  link:
+#'  \url{https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2}
+#'  The maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos),
+#'  longitude (lon), latitude (lat) and altitude (alt) in columns (vector
+#'  structure). The data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes
+#'  5km resolution Spanish daily data (precipitation or maximum and minimum
+#'  temperature from January 1951 to March 2019). See the README file for more
+#'  information. IMPORTANT!: the HR observational period must be the same as
+#'  for the reanalysis variables ('pred_rea' parameter). It is assumed that the
+#'  training period is smaller than the HR original one (1951-2019), so a new
+#'  ASCII file must be created with the new period and the same structure as
+#'  the original, specifying the training dates in the name (e.g.
+#'  'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period).
 #'@param tdates Training period dates in format YYYYMMDD(start)-YYYYMMDD(end)
-#' (e.g. 19810101-20181231). The same as for the training function.
-#'@param ddates Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 20191001-20200331).
-#'@param restrain Output (list of matrix) obtained from 'training_analogs' function.
-#' For precipitation, 'restrain' object must contains um, vm, nger, gu92, gv92,
-#' gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred variables.
-#' For maximum and minimum temperature, 'restrain' object must contains um, vm,
-#' insol, neni, vdmin y vref. See 'AnalogsPred_train.R' for more information.
-#'@return .analogspred returns seasonal forecast precipitation (mm) or
-#' maximum and minimum temperature (dozens of ºC) in a 5km x 5km regular grid over
-#' peninsular Spain and Balearic Islands. Each matrix of the list has two dimensions
-#' ('ddates' x 'nptos').
+#'  (e.g. 19810101-20181231). The same as for the training function.
+#'@param ddates Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end)
+#'  (e.g. 20191001-20200331).
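Since both 'tdates' and 'ddates' follow the same fixed-width convention, a small sketch of how such a string maps onto a daily date sequence may be useful; it mirrors the parsing done later in this function, and the string value is only an example:

ddates <- "20191001-20200331"                  # YYYYMMDD(start)-YYYYMMDD(end)
ini <- as.Date(substr(ddates, 1, 8), format = "%Y%m%d")
end <- as.Date(substr(ddates, 10, 17), format = "%Y%m%d")
seqdates <- seq(ini, end, by = "days")         # one entry per downscaling day
length(seqdates)                               # 183 days for this example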
+#'@param restrain Output (list of matrix) obtained from the 'training_analogs'
+#'  function. For precipitation, the 'restrain' object must contain the um, vm,
+#'  nger, gu92, gv92, gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred
+#'  variables. For maximum and minimum temperature, the 'restrain' object must
+#'  contain um, vm, insol, neni, vdmin and vref. See 'AnalogsPred_train.R' for
+#'  more information.
+#'@return .analogspred returns seasonal forecast precipitation (mm) or maximum
+#'and minimum temperature (tenths of ºC) in a 5km x 5km regular grid over
+#'peninsular Spain and Balearic Islands. Each matrix of the list has two
+#'dimensions ('ddates' x 'nptos').
 #'
 #'@importFrom utils read.table
-#'
 #'@useDynLib CSTools
 #'@noRd
-.analogspred <- function(pred_mod,
-                         pred_slp,
-                         pred_rea,
-                         lon,
-                         lat,
-                         slp_lon,
-                         slp_lat,
-                         var,
-                         HR_path,
-                         tdates,
-                         ddates,
-                         restrain) {
-
-if (!is.list(pred_mod)) {
-  stop("Parameter 'pred_mod' must be a list of 'matrix' objects")
-  }
-if (!(all(sapply(pred_mod, inherits, 'matrix')))) {
-  stop("Elements of the list in parameter 'pred_mod' must be of the class ",
-       "'matrix'.")
-  }
-if (!is.matrix(pred_slp)) {
-  stop("Parameter 'pred_slp' must be of the class 'matrix'.")
-  }
-if (!is.list(pred_rea)) {
-  stop("Parameter 'pred_rea' must be a list of 'matrix' objects")
-  }
-if (!(all(sapply(pred_rea, inherits, 'matrix')))) {
-  stop("Elements of the list in parameter 'pred_rea' must be of the class ",
-       "'matrix'.")
-  }
-if (var == "prec") {
-  if (length(pred_rea) != 6) {
-    stop("Parameter 'pred_rea' must be a length of 6.")
-  }
-  if (length(pred_mod) != 5) {
-    stop("Parameter 'pred_mod' must be a length of 5.")
-  }
-} else {
-  if (length(pred_rea) != 8) {
-    stop("Parameter 'pred_rea' must be a length of 8.")
-  }
-  if (length(pred_mod) != 7) {
-    stop("Parameter 'pred_mod' must be a length of 7.")
-  }
-}
-if (!is.vector(lon) || !is.numeric(lon)) {
-  stop("Parameter 'lon' must be a numeric vector")
-  }
-if (!is.vector(lat) || !is.numeric(lat)) {
-  stop("Parameter 'lat' must be a numeric vector")
-  }
-if (!is.vector(slp_lon) || !is.numeric(slp_lon)) {
-  stop("Parameter 'slp_lon' must be a numeric vector")
-  }
-if (!is.vector(slp_lat) || !is.numeric(slp_lat)) {
-  stop("Parameter 'slp_lat' must be a numeric vector")
-  }
-if (!is.character(HR_path)){
-  stop("Parameter 'HR_path' must be a character.")
-  }
-if (!is.character(tdates)) {
-  stop("Parameter 'tdates' must be a character.")
-  }
-if (!is.character(ddates)) {
-  stop("Parameter 'ddates' must be a character.")
-  }
-if (!is.list(restrain)) {
-  stop("Parameter 'restrain' must be a list of 'matrix' and 'parameter' objects")
-  }
+.analogspred <- function(pred_mod, pred_slp, pred_rea, lon, lat, slp_lon,
+                         slp_lat, var, HR_path, tdates, ddates, restrain) {
+  if (!is.list(pred_mod)) {
+    stop("Parameter 'pred_mod' must be a list of 'matrix' objects")
+  }
+  if (!(all(sapply(pred_mod, inherits, 'matrix')))) {
+    stop("Elements of the list in parameter 'pred_mod' must be of the class ",
+         "'matrix'.")
+  }
+  if (!is.matrix(pred_slp)) {
+    stop("Parameter 'pred_slp' must be of the class 'matrix'.")
+  }
+  if (!is.list(pred_rea)) {
+    stop("Parameter 'pred_rea' must be a list of 'matrix' objects")
+  }
+  if (!(all(sapply(pred_rea, inherits, 'matrix')))) {
+    stop("Elements of the list in parameter 'pred_rea' must be of the class ",
+         "'matrix'.")
+  }
+  if (var == "prec") {
+    if (length(pred_rea) != 6) {
+      stop("Parameter 'pred_rea' must have a length of 6.")
+    }
+    if (length(pred_mod) != 5) {
+      stop("Parameter 'pred_mod' must have a length of 5.")
+    }
+  } else {
+    if (length(pred_rea) != 8) {
+      stop("Parameter 'pred_rea' must have a length of 8.")
+    }
+    if (length(pred_mod) != 7) {
+      stop("Parameter 'pred_mod' must have a length of 7.")
+    }
+  }
+  if (!is.vector(lon) || !is.numeric(lon)) {
+    stop("Parameter 'lon' must be a numeric vector")
+  }
+  if (!is.vector(lat) || !is.numeric(lat)) {
+    stop("Parameter 'lat' must be a numeric vector")
+  }
+  if (!is.vector(slp_lon) || !is.numeric(slp_lon)) {
+    stop("Parameter 'slp_lon' must be a numeric vector")
+  }
+  if (!is.vector(slp_lat) || !is.numeric(slp_lat)) {
+    stop("Parameter 'slp_lat' must be a numeric vector")
+  }
+  if (!is.character(HR_path)) {
+    stop("Parameter 'HR_path' must be a character.")
+  }
+  if (!is.character(tdates)) {
+    stop("Parameter 'tdates' must be a character.")
+  }
+  if (!is.character(ddates)) {
+    stop("Parameter 'ddates' must be a character.")
+  }
+  if (!is.list(restrain)) {
+    stop("Parameter 'restrain' must be a list of 'matrix' and 'parameter' objects")
+  }
-#! REANALYSIS GRID PARAMETERS
-  rlon <- c(lon, NA) - c(NA, lon)
-  rlon <- rlon[!is.na(rlon)]
-  if (!all(rlon == rlon[1])) {
-    stop("Parameter 'lon' must be in regular grid.")
-  } else {
-    rlon <- rlon[1]
-  }
-  rlat <- c(lat, NA) - c(NA, lat)
-  rlat <- rlat[!is.na(rlat)]
-  if (!all(rlat == rlat[1])) {
-    stop("Parameter 'lat' must be in regular grid.")
-  } else {
-    rlat <- rlat[1]
-  }
-  if (rlon != (-rlat)) {
-    stop("Parameters 'lon' and 'lat' must have the same resolution.")
-  } else {
-    res <- rlon
-  }
-  nlat <- ((lat[length(lat)] - lat[1]) / rlat) + 1
-  nlon <- ((lon[length(lon)] - lon[1]) / rlon) + 1
-  ic <- nlat * nlon
-#
-  slp_rlon <- c(slp_lon, NA) - c(NA, slp_lon)
-  slp_rlon <- slp_rlon[!is.na(slp_rlon)]
-  if (!all(slp_rlon == slp_rlon[1])) {
-    stop("Parameter 'slp_lon' must be in regular grid.")
-  } else {
-    slp_rlon <- slp_rlon[1]
-  }
-  slp_rlat <- c(slp_lat, NA) - c(NA, slp_lat)
-  slp_rlat <- slp_rlat[!is.na(slp_rlat)]
-  if (!all(slp_rlat == slp_rlat[1])) {
-    stop("Parameter 'slp_lat' must be in regular grid.")
-  } else {
-    slp_rlat <- slp_rlat[1]
-  }
-  if (slp_rlon != (-slp_rlat)) {
-    stop("Parameters 'slp_lon' and 'slp_lat' must have the same resolution.")
-  } else {
-    slp_res <- slp_rlon
-  }
-  nlatt <- ((slp_lat[length(slp_lat)] - slp_lat[1]) / slp_rlat) + 1
-  nlont <- ((slp_lon[length(slp_lon)] - slp_lon[1]) / slp_rlon) + 1
-  id <- nlatt * nlont
-  slat <- max(lat)
-  slon <- min(c(lon[which(lon > 180)] - 360,
-                lon[which(lon <= 180)]))
-  slatt <- max(slp_lat)
-  slont <- min(c(slp_lon[which(slp_lon > 180)] - 360,
-                 slp_lon[which(slp_lon <= 180)]))
-  ngridd <- ((2*nlatt)-1)*((2*nlont)-1)
-  if (all((sapply(pred_rea,nrow))==nrow(pred_rea[[1]]))){
-    nd <- nrow(pred_rea[[1]])
-  } else {
-    stop("All 'pred_rea' variables must have the same period.")
-  }
-  if (all((sapply(pred_mod,nrow))==nrow(pred_mod[[1]]))){
-    nm <- nrow(pred_mod[[1]])
-  } else {
-    stop("All 'pred_mod' variables must have the same period.")
-  }
-  seqdates <- seq(as.Date(substr(ddates,start=1,stop=8),format="%Y%m%d"),as.Date(substr(ddates,start=10,stop=18),format="%Y%m%d"),by="days")
-  month <- format(seqdates,format="%m")
-  day <- format(seqdates,format="%d")
-
-#! TRAINING REANALYSIS VARIABLES
-u500 <- pred_rea[['u500']]
-v500 <- pred_rea[['v500']]
-t500 <- pred_rea[['t500']]
-t850 <- pred_rea[['t850']]
-msl_si <- pred_rea[['slp']]
-q700 <- pred_rea[['q700']]
-
-if (var == "temp") {
-t700 <- pred_rea[['t700']]
-tm2m <- pred_rea[['tm2m']]
-}
-
-#! SEASONAL FORECAST MODEL VARIABLES
-u500_mod <- pred_mod[['u500_mod']]
-v500_mod <- pred_mod[['v500_mod']]
-t500_mod <- pred_mod[['t500_mod']]
-t850_mod <- pred_mod[['t850_mod']]
-msl_lr_mod <- pred_slp
-q700_mod <- pred_mod[['q700_mod']]
-
-if (var == "temp") {
-t700_mod <- pred_mod[['t700_mod']]
-tm2m_mod <- pred_mod[['tm2m_mod']]
-}
+  #! REANALYSIS GRID PARAMETERS
+  rlon <- c(lon, NA) - c(NA, lon)
+  rlon <- rlon[!is.na(rlon)]
+  if (!all(rlon == rlon[1])) {
+    stop("Parameter 'lon' must be in a regular grid.")
+  } else {
+    rlon <- rlon[1]
+  }
+  rlat <- c(lat, NA) - c(NA, lat)
+  rlat <- rlat[!is.na(rlat)]
+  if (!all(rlat == rlat[1])) {
+    stop("Parameter 'lat' must be in a regular grid.")
+  } else {
+    rlat <- rlat[1]
+  }
+  if (rlon != (-rlat)) {
+    stop("Parameters 'lon' and 'lat' must have the same resolution.")
+  } else {
+    res <- rlon
+  }
+  nlat <- ((lat[length(lat)] - lat[1]) / rlat) + 1
+  nlon <- ((lon[length(lon)] - lon[1]) / rlon) + 1
+
+  ic <- nlat * nlon
+  #
+  slp_rlon <- c(slp_lon, NA) - c(NA, slp_lon)
+  slp_rlon <- slp_rlon[!is.na(slp_rlon)]
+  if (!all(slp_rlon == slp_rlon[1])) {
+    stop("Parameter 'slp_lon' must be in a regular grid.")
+  } else {
+    slp_rlon <- slp_rlon[1]
+  }
+  slp_rlat <- c(slp_lat, NA) - c(NA, slp_lat)
+  slp_rlat <- slp_rlat[!is.na(slp_rlat)]
+  if (!all(slp_rlat == slp_rlat[1])) {
+    stop("Parameter 'slp_lat' must be in a regular grid.")
+  } else {
+    slp_rlat <- slp_rlat[1]
+  }
+  if (slp_rlon != (-slp_rlat)) {
+    stop("Parameters 'slp_lon' and 'slp_lat' must have the same resolution.")
+  } else {
+    slp_res <- slp_rlon
+  }
+  nlatt <- ((slp_lat[length(slp_lat)] - slp_lat[1]) / slp_rlat) + 1
+  nlont <- ((slp_lon[length(slp_lon)] - slp_lon[1]) / slp_rlon) + 1
+
+  id <- nlatt * nlont
+
+  slat <- max(lat)
+  slon <- min(c(lon[which(lon > 180)] - 360,
+                lon[which(lon <= 180)]))
+
+  slatt <- max(slp_lat)
+  slont <- min(c(slp_lon[which(slp_lon > 180)] - 360,
+                 slp_lon[which(slp_lon <= 180)]))
+
+  ngridd <- ((2 * nlatt) - 1) * ((2 * nlont) - 1)
+
+  if (all((sapply(pred_rea, nrow)) == nrow(pred_rea[[1]]))) {
+    nd <- nrow(pred_rea[[1]])
+  } else {
+    stop("All 'pred_rea' variables must have the same period.")
+  }
+  if (all((sapply(pred_mod, nrow)) == nrow(pred_mod[[1]]))) {
+    nm <- nrow(pred_mod[[1]])
+  } else {
+    stop("All 'pred_mod' variables must have the same period.")
+  }
+  seqdates <- seq(as.Date(substr(ddates, start = 1, stop = 8), format = "%Y%m%d"),
+                  as.Date(substr(ddates, start = 10, stop = 17), format = "%Y%m%d"),
+                  by = "days")
+  month <- format(seqdates, format = "%m")
+  day <- format(seqdates, format = "%d")
+
+  #! TRAINING REANALYSIS VARIABLES
+  u500 <- pred_rea[['u500']]
+  v500 <- pred_rea[['v500']]
+  t500 <- pred_rea[['t500']]
+  t850 <- pred_rea[['t850']]
+  msl_si <- pred_rea[['slp']]
+  q700 <- pred_rea[['q700']]
+
+  if (var == "temp") {
+    t700 <- pred_rea[['t700']]
+    tm2m <- pred_rea[['tm2m']]
+  }
+
+  #! SEASONAL FORECAST MODEL VARIABLES
+  u500_mod <- pred_mod[['u500_mod']]
+  v500_mod <- pred_mod[['v500_mod']]
+  t500_mod <- pred_mod[['t500_mod']]
+  t850_mod <- pred_mod[['t850_mod']]
+  msl_lr_mod <- pred_slp
+  q700_mod <- pred_mod[['q700_mod']]
+
+  if (var == "temp") {
+    t700_mod <- pred_mod[['t700_mod']]
+    tm2m_mod <- pred_mod[['tm2m_mod']]
+  }
-#! HIGH-RESOLUTION (HR) OBSERVATIONAL DATASET
-maestro_hr_file <- paste(HR_path, "maestro_red_hr_SPAIN.txt",sep="")
-if (!file.exists(maestro_hr_file)) {
-  stop("'maestro_red_hr_SPAIN.txt' does not exist.")
-} else {
-  maestro <- read.table(maestro_hr_file)
-  lon_hr <- unlist(maestro[2])
-  lat_hr <- unlist(maestro[3])
-  nptos <- length(readLines(maestro_hr_file))
-}
-if (var == "prec") {
-  prec_hr_file <- paste(HR_path, "pcp_red_SPAIN_",tdates,".txt",sep="")
-  if (!file.exists(prec_hr_file)) {
-    stop(sprintf("precipitation HR file for %s does not exist.",tdates))
-  } else {
-    nd_hr <- length(readLines(prec_hr_file))
-    preprec_hr <- matrix(scan(prec_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE)
-    prec_hr <- preprec_hr[1:nd_hr,-c(1)]
-  }
-} else {
-  tmx_hr_file <- paste(HR_path, "tmx_red_SPAIN_",tdates,".txt",sep="")
-  tmn_hr_file <- paste(HR_path, "tmn_red_SPAIN_",tdates,".txt",sep="")
-  if (!file.exists(tmx_hr_file)) {
-    stop(sprintf("maximum temperature HR file for %s does not exist.",tdates))
-  } else if (!file.exists(tmn_hr_file)) {
-    stop(sprintf("minimum temperature HR file for %s does not exist.",tdates))
-  } else if (length(readLines(tmx_hr_file)) != length(readLines(tmn_hr_file))) {
-    stop("maximum and minimum temperature HR observation files must have the same period.")
-  } else {
-    nd_hr <- length(readLines(tmx_hr_file))
-    pretmx_hr <- matrix(scan(tmx_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE)
-    tmx_hr <- pretmx_hr[1:nd_hr,-c(1)]
-    pretmn_hr <- matrix(scan(tmn_hr_file), nrow=nd_hr ,ncol= nptos+1, byrow=TRUE)
-    tmn_hr <- pretmn_hr[1:nd_hr,-c(1)]
-  }
-}
-  if (nd_hr != nd) {
-    stop("Reanalysis variables and HR observations must have the same period.")
-  }
+  #! HIGH-RESOLUTION (HR) OBSERVATIONAL DATASET
+  maestro_hr_file <- paste(HR_path, "maestro_red_hr_SPAIN.txt", sep = "")
+  if (!file.exists(maestro_hr_file)) {
+    stop("'maestro_red_hr_SPAIN.txt' does not exist.")
+  } else {
+    maestro <- read.table(maestro_hr_file)
+    lon_hr <- unlist(maestro[2])
+    lat_hr <- unlist(maestro[3])
+    nptos <- length(readLines(maestro_hr_file))
+  }
+  if (var == "prec") {
+    prec_hr_file <- paste(HR_path, "pcp_red_SPAIN_", tdates, ".txt", sep = "")
+    if (!file.exists(prec_hr_file)) {
+      stop(sprintf("precipitation HR file for %s does not exist.", tdates))
+    } else {
+      nd_hr <- length(readLines(prec_hr_file))
+      preprec_hr <- matrix(scan(prec_hr_file), nrow = nd_hr, ncol = nptos + 1, byrow = TRUE)
+      prec_hr <- preprec_hr[1:nd_hr, -c(1)]
+    }
+  } else {
+    tmx_hr_file <- paste(HR_path, "tmx_red_SPAIN_", tdates, ".txt", sep = "")
+    tmn_hr_file <- paste(HR_path, "tmn_red_SPAIN_", tdates, ".txt", sep = "")
+    if (!file.exists(tmx_hr_file)) {
+      stop(sprintf("maximum temperature HR file for %s does not exist.", tdates))
+    } else if (!file.exists(tmn_hr_file)) {
+      stop(sprintf("minimum temperature HR file for %s does not exist.", tdates))
+    } else if (length(readLines(tmx_hr_file)) != length(readLines(tmn_hr_file))) {
+      stop("maximum and minimum temperature HR observation files must have the same period.")
+    } else {
+      nd_hr <- length(readLines(tmx_hr_file))
+      pretmx_hr <- matrix(scan(tmx_hr_file), nrow = nd_hr, ncol = nptos + 1, byrow = TRUE)
+      tmx_hr <- pretmx_hr[1:nd_hr, -c(1)]
+      pretmn_hr <- matrix(scan(tmn_hr_file), nrow = nd_hr, ncol = nptos + 1, byrow = TRUE)
+      tmn_hr <- pretmn_hr[1:nd_hr, -c(1)]
+    }
+  }
+  if (nd_hr != nd) {
+    stop("Reanalysis variables and HR observations must have the same period.")
+  }
-#! OTHER PARAMETERS that should not be changed
-#! Number of analog situations to consider
-nanx <- 155
-#! Number of temperature predictors
-nvar <- 7
-
-#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-if (var == "prec") {
-
-  downs <- .Fortran("down_prec",
-                    ic = as.integer(ic),
-                    id = as.integer(id),
-                    nd = as.integer(nd),
-                    nm = as.integer(nm),
-                    nlat = as.integer(nlat),
-                    nlon = as.integer(nlon),
-                    nlatt = as.integer(nlatt),
-                    nlont = as.integer(nlont),
-                    slat = as.numeric(slat),
-                    slon = as.numeric(slon),
-                    rlat = as.numeric(rlat),
-                    rlon = as.numeric(rlon),
-                    slatt = as.numeric(slatt),
-                    slont = as.numeric(slont),
-                    ngridd = as.integer(ngridd),
-                    u500 = as.numeric(u500),
-                    v500 = as.numeric(v500),
-                    t500 = as.numeric(t500),
-                    t850 = as.numeric(t850),
-                    msl_si = as.numeric(msl_si),
-                    q700 = as.numeric(q700),
-                    prec_hr = as.numeric(prec_hr),
-                    nanx = as.integer(nanx),
-                    restrain$um,
-                    restrain$vm,
-                    restrain$nger,
-                    restrain$gu92,
-                    restrain$gv92,
-                    restrain$gu52,
-                    restrain$gv52,
-                    restrain$neni,
-                    restrain$vdmin,
-                    restrain$vref,
-                    restrain$ccm,
-                    restrain$indices[,,,1],#lab_pred
-                    restrain$indices[,,,2],#cor_pred
-                    u500_mod = as.numeric(u500_mod),
-                    v500_mod = as.numeric(v500_mod),
-                    t500_mod = as.numeric(t500_mod),
-                    t850_mod = as.numeric(t850_mod),
-                    msl_lr_mod = as.numeric(msl_lr_mod),
-                    q700_mod = as.numeric(q700_mod),
-                    pp=matrix(as.double(seq(1,nm*nptos)),c(nm,nptos)),
-                    PACKAGE = 'CSTools')
-
-  output <- downs$pp
-
-} else {
-
-  downs <- .Fortran("down_temp",
-                    ic = as.integer(ic),
-                    id = as.integer(id),
-                    nd = as.integer(nd),
-                    nm = as.integer(nm),
-                    nlat = as.integer(nlat),
-                    nlon = as.integer(nlon),
-                    nlatt = as.integer(nlatt),
-                    nlont = as.integer(nlont),
-                    slat = as.numeric(slat),
-                    slon = as.numeric(slon),
-                    rlat = as.numeric(rlat),
-                    rlon = as.numeric(rlon),
-                    slatt = as.numeric(slatt),
-                    slont = as.numeric(slont),
-                    ngridd = as.integer(ngridd),
-                    u500 = as.numeric(u500),
-                    v500 = as.numeric(v500),
-                    t500 = as.numeric(t500),
-                    t850 = as.numeric(t850),
-                    msl_si = as.numeric(msl_si),
-                    q700 = as.numeric(q700),
-                    t700 = as.numeric(t700),
-                    tm2m = as.numeric(tm2m),
-                    tmx_hr = as.numeric(tmx_hr),
-                    tmn_hr = as.numeric(tmn_hr),
-                    nanx = as.integer(nanx),
-                    nvar = as.integer(nvar),
-                    day = as.integer(day),
-                    month = as.integer(month),
-                    restrain$um,
-                    restrain$vm,
-                    restrain$insol,
-                    restrain$neni,
-                    restrain$vdmin,
-                    restrain$vref,
-                    u500_mod = as.numeric(u500_mod),
-                    v500_mod = as.numeric(v500_mod),
-                    t500_mod = as.numeric(t500_mod),
-                    t850_mod = as.numeric(t850_mod),
-                    msl_lr_mod = as.numeric(msl_lr_mod),
-                    q700_mod = as.numeric(q700_mod),
-                    t700_mod = as.numeric(t700_mod),
-                    tm2m_mod = as.numeric(tm2m_mod),
-                    tmx=matrix(as.double(seq(1,nm*nptos)),c(nm,nptos)),
-                    tmn=matrix(as.double(seq(1,nm*nptos)),c(nm,nptos)),
-                    PACKAGE = 'CSTools')
-
-  output <- list("tmax" = downs$tmx,
-                 "tmin" = downs$tmn)
+  #! OTHER PARAMETERS that should not be changed
+  #! Number of analog situations to consider
+  nanx <- 155
+  #! Number of temperature predictors
+  nvar <- 7
+
+  #!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+
+  if (var == "prec") {
+
+    downs <- .Fortran("down_prec",
+                      ic = as.integer(ic),
+                      id = as.integer(id),
+                      nd = as.integer(nd),
+                      nm = as.integer(nm),
+                      nlat = as.integer(nlat),
+                      nlon = as.integer(nlon),
+                      nlatt = as.integer(nlatt),
+                      nlont = as.integer(nlont),
+                      slat = as.numeric(slat),
+                      slon = as.numeric(slon),
+                      rlat = as.numeric(rlat),
+                      rlon = as.numeric(rlon),
+                      slatt = as.numeric(slatt),
+                      slont = as.numeric(slont),
+                      ngridd = as.integer(ngridd),
+                      u500 = as.numeric(u500),
+                      v500 = as.numeric(v500),
+                      t500 = as.numeric(t500),
+                      t850 = as.numeric(t850),
+                      msl_si = as.numeric(msl_si),
+                      q700 = as.numeric(q700),
+                      prec_hr = as.numeric(prec_hr),
+                      nanx = as.integer(nanx),
+                      restrain$um,
+                      restrain$vm,
+                      restrain$nger,
+                      restrain$gu92,
+                      restrain$gv92,
+                      restrain$gu52,
+                      restrain$gv52,
+                      restrain$neni,
+                      restrain$vdmin,
+                      restrain$vref,
+                      restrain$ccm,
+                      restrain$indices[, , , 1], # lab_pred
+                      restrain$indices[, , , 2], # cor_pred
+                      u500_mod = as.numeric(u500_mod),
+                      v500_mod = as.numeric(v500_mod),
+                      t500_mod = as.numeric(t500_mod),
+                      t850_mod = as.numeric(t850_mod),
+                      msl_lr_mod = as.numeric(msl_lr_mod),
+                      q700_mod = as.numeric(q700_mod),
+                      pp = matrix(as.double(seq(1, nm * nptos)), c(nm, nptos)),
+                      PACKAGE = 'CSTools')
+
+    output <- downs$pp
+
+  } else {
+
+    downs <- .Fortran("down_temp",
+                      ic = as.integer(ic),
+                      id = as.integer(id),
+                      nd = as.integer(nd),
+                      nm = as.integer(nm),
+                      nlat = as.integer(nlat),
+                      nlon = as.integer(nlon),
+                      nlatt = as.integer(nlatt),
+                      nlont = as.integer(nlont),
+                      slat = as.numeric(slat),
+                      slon = as.numeric(slon),
+                      rlat = as.numeric(rlat),
+                      rlon = as.numeric(rlon),
+                      slatt = as.numeric(slatt),
+                      slont = as.numeric(slont),
+                      ngridd = as.integer(ngridd),
+                      u500 = as.numeric(u500),
+                      v500 = as.numeric(v500),
+                      t500 = as.numeric(t500),
+                      t850 = as.numeric(t850),
+                      msl_si = as.numeric(msl_si),
+                      q700 = as.numeric(q700),
+                      t700 = as.numeric(t700),
+                      tm2m = as.numeric(tm2m),
+                      tmx_hr = as.numeric(tmx_hr),
+                      tmn_hr = as.numeric(tmn_hr),
+                      nanx = as.integer(nanx),
+                      nvar = as.integer(nvar),
+                      day = as.integer(day),
+                      month = as.integer(month),
+                      restrain$um,
+                      restrain$vm,
+                      restrain$insol,
+                      restrain$neni,
+                      restrain$vdmin,
+                      restrain$vref,
+                      u500_mod = as.numeric(u500_mod),
+                      v500_mod = as.numeric(v500_mod),
+                      t500_mod = as.numeric(t500_mod),
+                      t850_mod = as.numeric(t850_mod),
+                      msl_lr_mod = as.numeric(msl_lr_mod),
+                      q700_mod = as.numeric(q700_mod),
+                      t700_mod = as.numeric(t700_mod),
+                      tm2m_mod = as.numeric(tm2m_mod),
+                      tmx = matrix(as.double(seq(1, nm * nptos)), c(nm, nptos)),
+                      tmn = matrix(as.double(seq(1, nm * nptos)), c(nm, nptos)),
+                      PACKAGE = 'CSTools')
+
+    output <- list("tmax" = downs$tmx,
+                   "tmin" = downs$tmn)
 }
 return(output)
diff --git a/R/CST_Anomaly.R b/R/CST_Anomaly.R
index a84b6fc8538b03f4113b96aa8b3126189a0bdee9..c93267413c41aa336280715f389d0c540ec1e82b 100644
--- a/R/CST_Anomaly.R
+++ b/R/CST_Anomaly.R
@@ -1,4 +1,5 @@
-#'Anomalies relative to a climatology along selected dimension with or without cross-validation
+#'Anomalies relative to a climatology along selected dimension with or without
+#'cross-validation
 #'
 #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es}
 #'@author Pena Jesus, \email{jesus.pena@bsc.es}
@@ -41,34 +42,31 @@
 #'in CSTools.
 #'
 #'@examples
-#'# Example 1:
 #'mod <- 1 : (2 * 3 * 4 * 5 * 6 * 7)
 #'dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
 #'obs <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
 #'dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
 #'lon <- seq(0, 30, 5)
 #'lat <- seq(0, 25, 5)
-#'exp <- list(data = mod, lat = lat, lon = lon)
-#'obs <- list(data = obs, lat = lat, lon = lon)
+#'coords <- list(lon = lon, lat = lat)
+#'exp <- list(data = mod, coords = coords)
+#'obs <- list(data = obs, coords = coords)
 #'attr(exp, 'class') <- 's2dv_cube'
 #'attr(obs, 'class') <- 's2dv_cube'
 #'
-#'anom1 <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = TRUE)
-#'anom2 <- CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = TRUE)
-#'anom3 <- CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = FALSE)
-#'anom4 <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = FALSE)
-#'anom5 <- CST_Anomaly(lonlat_temp$exp)
-#'anom6 <- CST_Anomaly(obs = lonlat_temp$obs)
+#'anom <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = TRUE)
 #'
-#'@seealso \code{\link[s2dv]{Ano_CrossValid}}, \code{\link[s2dv]{Clim}} and \code{\link{CST_Load}}
+#'@seealso \code{\link[s2dv]{Ano_CrossValid}}, \code{\link[s2dv]{Clim}} and
+#'\code{\link{CST_Load}}
 #'
 #'@import multiApply
 #'@importFrom s2dv InsertDim Clim Ano_CrossValid Reorder
 #'@export
-CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate', cross = FALSE,
-                        memb_dim = 'member', memb = TRUE, dat_dim = c('dataset', 'member'),
-                        filter_span = NULL, ftime_dim = 'ftime', ncores = NULL) {
-  # s2dv_cube
+CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate',
+                        cross = FALSE, memb_dim = 'member', memb = TRUE,
+                        dat_dim = c('dataset', 'member'), filter_span = NULL,
+                        ftime_dim = 'ftime', ncores = NULL) {
+  # Check 's2dv_cube'
   if (!inherits(exp, 's2dv_cube') & !is.null(exp) ||
       !inherits(obs, 's2dv_cube') & !is.null(obs)) {
     stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ",
@@ -89,23 +87,18 @@ CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate', cross = FALS
     case_exp = 1
     warning("Parameter 'obs' is not provided and 'exp' will be used instead.")
   }
-  if(any(is.null(names(dim(exp$data))))| any(nchar(names(dim(exp$data))) == 0) |
-     any(is.null(names(dim(obs$data))))| any(nchar(names(dim(obs$data))) == 0)) {
+  if (any(is.null(names(dim(exp$data)))) | any(nchar(names(dim(exp$data))) == 0) |
+      any(is.null(names(dim(obs$data)))) | any(nchar(names(dim(obs$data))) == 0)) {
     stop("Parameter 'exp' and 'obs' must have dimension names in element 'data'.")
   }
-  if(!all(names(dim(exp$data)) %in% names(dim(obs$data))) |
-     !all(names(dim(obs$data)) %in% names(dim(exp$data)))) {
+  if (!all(names(dim(exp$data)) %in% names(dim(obs$data))) |
+      !all(names(dim(obs$data)) %in% names(dim(exp$data)))) {
    stop("Parameter 'exp' and 'obs' must have same dimension names in element 'data'.")
  }
  dim_exp <- dim(exp$data)
  dim_obs <- dim(obs$data)
  dimnames_data <- names(dim_exp)
  # dim_anom
-  if (is.numeric(dim_anom) & length(dim_anom) == 1) {
-    warning("Parameter 'dim_anom' must be a character string and a numeric value will not be ",
-            "accepted in the next release. The corresponding dimension name is assigned.")
-    dim_anom <- dimnames_data[dim_anom]
-  }
  if (!is.character(dim_anom)) {
    stop("Parameter 'dim_anom' must be a character string.")
  }
@@ -117,18 +110,18 @@ CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate', cross = FALS
        "'exp' and 'obs' must be greater than 1.")
  }
  # cross
-  if (!is.logical(cross) | !is.logical(memb) ) {
+  if (!is.logical(cross) | !is.logical(memb)) {
    stop("Parameters 'cross' and 'memb' must be logical.")
  }
-  if (length(cross) > 1 | length(memb) > 1 ) {
+  if (length(cross) > 1 | length(memb) > 1) {
    cross <- cross[1]
-    warning("Parameter 'cross' has length greater than 1 and only the first element",
+    warning("Parameter 'cross' has length greater than 1 and only the first element ",
            "will be used.")
  }
  # memb
  if (length(memb) > 1) {
    memb <- memb[1]
-    warning("Parameter 'memb' has length greater than 1 and only the first element",
+    warning("Parameter 'memb' has length greater than 1 and only the first element ",
           "will be used.")
  }
  # memb_dim
@@ -146,15 +139,15 @@ CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate', cross = FALS
    stop("Parameter 'dat_dim' must be a character vector.")
  }
  if (!all(dat_dim %in% names(dim_exp)) | !all(dat_dim %in% names(dim_obs))) {
-    stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension in element 'data'.",
-         " Set it as NULL if there is no dataset dimension.")
+    stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension in element 'data'. 
", + "Set it as NULL if there is no dataset dimension.") } } # filter_span if (!is.null(filter_span)) { if (!is.numeric(filter_span)) { - warning("Paramater 'filter_span' is not numeric and any filter", - " is being applied.") + warning("Paramater 'filter_span' is not numeric and any filter ", + "is being applied.") filter_span <- NULL } # ncores @@ -178,11 +171,15 @@ CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate', cross = FALS # With cross-validation if (cross) { - ano <- Ano_CrossValid(exp = exp$data, obs = obs$data, time_dim = dim_anom, memb_dim = memb_dim, memb = memb, dat_dim = dat_dim) + ano <- Ano_CrossValid(exp = exp$data, obs = obs$data, time_dim = dim_anom, + memb_dim = memb_dim, memb = memb, dat_dim = dat_dim, + ncores = ncores) - # Without cross-validation + # Without cross-validation } else { - tmp <- Clim(exp = exp$data, obs = obs$data, time_dim = dim_anom, memb_dim = memb_dim, memb = memb, dat_dim = dat_dim) + tmp <- Clim(exp = exp$data, obs = obs$data, time_dim = dim_anom, + memb_dim = memb_dim, memb = memb, dat_dim = dat_dim, + ncores = ncores) if (!is.null(filter_span)) { tmp$clim_exp <- Apply(tmp$clim_exp, target_dims = c(ftime_dim), @@ -201,8 +198,8 @@ CST_Anomaly <- function(exp = NULL, obs = NULL, dim_anom = 'sdate', cross = FALS clim_exp <- tmp$clim_exp clim_obs <- tmp$clim_obs } else { - clim_exp <- InsertDim(tmp$clim_exp, 1, dim_exp[memb_dim]) - clim_obs <- InsertDim(tmp$clim_obs, 1, dim_obs[memb_dim]) + clim_exp <- InsertDim(tmp$clim_exp, 1, dim_exp[memb_dim]) + clim_obs <- InsertDim(tmp$clim_obs, 1, dim_obs[memb_dim]) } clim_exp <- InsertDim(clim_exp, 1, dim_exp[dim_anom]) clim_obs <- InsertDim(clim_obs, 1, dim_obs[dim_anom]) diff --git a/R/CST_BEI_Weighting.R b/R/CST_BEI_Weighting.R index de7470110c4f9d090bc1ad84b6c786448a5c63c4..adc268a2ad2e8b48b95579be87891d9819064616 100644 --- a/R/CST_BEI_Weighting.R +++ b/R/CST_BEI_Weighting.R @@ -1,78 +1,74 @@ #' Weighting SFSs of a CSTools object. #' -#' @author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +#'@author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} #' -#' @description Function to apply weights to a 's2dv_cube' object. -#' It could return a weighted ensemble mean (deterministic output) or -#' the terciles probabilities (probabilistic output) for Seasonal Forecast -#' Systems (SFSs). +#'@description Function to apply weights to a 's2dv_cube' object. +#'It could return a weighted ensemble mean (deterministic output) or +#'the terciles probabilities (probabilistic output) for Seasonal Forecast +#'Systems (SFSs). #' -#' @references Regionally improved seasonal forecast of precipitation through -#' Best estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#'@references Regionally improved seasonal forecast of precipitation through +#'Best estimation of winter NAO, Sanchez-Garcia, E. et al., +#'Adv. Sci. Res., 16, 165174, 2019, \doi{10.5194/asr-16-165-2019} #' -#' @param var_exp An object of the class 's2dv_cube' containing the variable -#' (e.g. precipitation, temperature, NAO index) array. -#' The var_exp object is expected to have an element named \code{$data} with -#' at least a temporal dimension and a dimension named 'member'. -#' @param aweights Normalized weights array with at least dimensions -#' (time, member), when 'time' is the temporal dimension as default. -#' When 'aweights' parameter has any other dimensions (as e.g. 
'lat') and -#' 'var_exp' parameter has also the same dimension, they must be equals. -#' @param terciles A numeric array with at least one dimension 'tercil' equal to -#' 2, the first element is the lower tercil for a hindcast period, and the second -#' element is the upper tercile. By default is NULL, the terciles are computed -#' from var_exp data. -#' @param type A character string indicating the type of output. -#' If 'type' = 'probs', the function returns, in the element data from -#' 'var_exp' parameter, an array with at least two -#' or four dimensions depending if the variable is spatially aggregated variable -#' (as e.g. NAO index), dimension (time, tercil) or it is spatial variable -#' (as e.g. precipitation or temperature), dimension (time, tercile, lat, lon), -#' containing the terciles probabilities computing with weighted members. -#' The first tercil is the lower tercile, the second is the normal tercile and -#' the third is the upper tercile. -#' If 'type' = 'ensembleMean', the function returns, in the element data from -#' 'var_exp' parameter, an array with at least one or three dimensions -#' depending if the variable is a spatially aggregated variable -#' (as e.g. NAO index)(time) or it is spatial variable (as e.g. precipitation -#' or temperature) (time, lat, lon), containing the ensemble means computing -#' with weighted members. -#' @param time_dim_name A character string indicating the name of the +#'@param var_exp An object of the class 's2dv_cube' containing the variable +#' (e.g. precipitation, temperature, NAO index) array. +#' The var_exp object is expected to have an element named \code{$data} with +#' at least a temporal dimension and a dimension named 'member'. +#'@param aweights Normalized weights array with at least dimensions +#' (time, member), when 'time' is the temporal dimension as default. +#' When 'aweights' parameter has any other dimensions (as e.g. 'lat') and +#' 'var_exp' parameter has also the same dimension, they must be equals. +#'@param terciles A numeric array with at least one dimension 'tercil' equal to +#' 2, the first element is the lower tercil for a hindcast period, and the second +#' element is the upper tercile. By default is NULL, the terciles are computed +#' from var_exp data. +#'@param type A character string indicating the type of output. +#' If 'type' = 'probs', the function returns, in the element data from +#' 'var_exp' parameter, an array with at least two +#' or four dimensions depending if the variable is spatially aggregated variable +#' (as e.g. NAO index), dimension (time, tercil) or it is spatial variable +#' (as e.g. precipitation or temperature), dimension (time, tercile, lat, lon), +#' containing the terciles probabilities computing with weighted members. +#' The first tercil is the lower tercile, the second is the normal tercile and +#' the third is the upper tercile. If 'type' = 'ensembleMean', the function +#' returns, in the element data from 'var_exp' parameter, an array with at +#' least one or three dimensions depending if the variable is a spatially +#' aggregated variable (as e.g. NAO index)(time) or it is spatial variable (as +#' e.g. precipitation or temperature) (time, lat, lon), containing the ensemble +#' means computing with weighted members. +#'@param time_dim_name A character string indicating the name of the #' temporal dimension, by default 'time'. +#'@param memb_dim A character string indicating the name of the +#' member dimension, by default 'member'. 
 #'
-#' @return CST_BEI_Weighting() returns a CSTools object (i.e., of the
-#' class 's2dv_cube').
-#' This object has at least an element named \code{$data}
-#' with at least a temporal dimension (and dimension 'tercil' when the output
-#' are tercile probabilities), containing the ensemble means computing with
-#' weighted members or probabilities of terciles.
+#'@return CST_BEI_Weighting() returns a CSTools object (i.e., of the
+#'class 's2dv_cube').
+#'This object has at least an element named \code{$data}
+#'with at least a temporal dimension (and dimension 'tercil' when the output
+#'is tercile probabilities), containing the ensemble means computed with
+#'weighted members or the tercile probabilities.
 #'
-#' @examples
-#' var_exp <- 1 : (2 * 4 * 3 * 2)
-#' dim(var_exp) <- c(time = 2, member = 4, lat = 3, lon = 2)
-#' aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, 0.1, 0.2, 0.4, 0.4, 0.1, 0.2, 0.4, 0.2)
-#' dim(aweights) <- c(time = 2, member = 4, dataset = 2)
-#' var_exp <- list(data = var_exp)
-#' class(var_exp) <- 's2dv_cube'
-#' res_CST <- CST_BEI_Weighting(var_exp, aweights)
-#' dim(res_CST$data)
-#' # time lat lon dataset
-#' # 2 3 2 2
-#' @export
-
+#'@examples
+#'var_exp <- 1 : (2 * 4 * 3 * 2)
+#'dim(var_exp) <- c(time = 2, member = 4, lat = 3, lon = 2)
+#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, 0.1, 0.2, 0.4, 0.4, 0.1,
+#'              0.2, 0.4, 0.2)
+#'dim(aweights) <- c(time = 2, member = 4, dataset = 2)
+#'var_exp <- list(data = var_exp)
+#'class(var_exp) <- 's2dv_cube'
+#'res_CST <- CST_BEI_Weighting(var_exp, aweights)
+#'@export
 CST_BEI_Weighting <- function(var_exp, aweights, terciles = NULL,
-                              type = 'ensembleMean', time_dim_name = 'time') {
+                              type = 'ensembleMean', time_dim_name = 'time',
+                              memb_dim = 'member') {
 
-  if (!is.character(time_dim_name)) {
-    stop("Parameter 'time_dim_name' must be a character string indicating",
-         " the name of the temporal dimension.")
-  }
-  if (length(time_dim_name) > 1) {
-    warning("Parameter 'time_dim_name' has length greater than 1 and ",
-            "only the first element will be used.")
-    time_dim_name <- time_dim_name[1]
+  # s2dv_cube
+  if (!inherits(var_exp, "s2dv_cube")) {
+    stop("Parameter 'var_exp' must be of the class 's2dv_cube', ",
+         "as output by CSTools::CST_Load.")
   }
+  # type
   if (!is.character(type)) {
     stop("Parameter 'type' must be a character string, 'probs' or ",
          "'ensembleMean', indicating the type of output.")
@@ -82,70 +78,19 @@ CST_BEI_Weighting <- function(var_exp, aweights, terciles = NULL,
             "only the first element will be used.")
     type <- type[1]
   }
-  if (!inherits(var_exp, 's2dv_cube')) {
-    stop("Parameter 'var_exp' must be of the class 's2dv_cube', ",
-         "as output by CSTools::CST_Load.")
-  }
-  if (!is.null(terciles)){
-    if(!is.array(terciles)){
-      stop("Parameter 'terciles' must be an array.")
-    }
-    if (is.null(names(dim(terciles)))) {
-      stop("Parameters 'terciles' should have dimmension names.")
-    }
-    if(!('tercil' %in% names(dim(terciles)))) {
-      stop("Parameter 'terciles' must have dimension 'tercil'.")
-    }
-    if (dim(terciles)['tercil'] != 2) {
-      stop("Length of dimension 'tercil' ",
-           "of parameter 'terciles' must be equal to 2.")
-    }
-    if(time_dim_name %in% names(dim(terciles))) {
-      stop("Parameter 'terciles' must not have temporal dimension.")
-    }
-    if('member' %in% names(dim(terciles))) {
-      stop("Parameter 'terciles' must not have dimension 'member'.")
-    }
-  }
-  if (!is.array(aweights)) {
-    stop("Parameter 'aweights' must be an array.")
-  }
-  if (is.null(names(dim(var_exp$data))) ||
is.null(names(dim(aweights)))) { - stop("Element 'data' from parameter 'var_exp' and parameter 'aweights'", - " should have dimmension names.") - } - if(!(time_dim_name %in% names(dim(var_exp$data)))) { - stop("Element 'data' from parameter 'var_exp' must have ", - "temporal dimension.") - } - if(!(time_dim_name %in% names(dim(aweights)))) { - stop("Parameter 'aweights' must have temporal dimension.") - } - if(!('member' %in% names(dim(var_exp$data)))) { - stop("Element 'data' from parameter 'var_exp' must have ", - "dimension 'member'.") - } - if(!('member' %in% names(dim(aweights)))) { - stop("Parameter 'aweights' must have dimension 'member'.") - } - if (dim(var_exp$data)[time_dim_name] != dim(aweights)[time_dim_name]) { - stop("Length of temporal dimensions ", - "of element 'data' from parameter 'var_exp' and parameter ", - "'aweights' must be equals.") - } - if (dim(var_exp$data)['member'] != dim(aweights)['member']) { - stop("Length of dimension 'member' of element 'data' from ", - "parameter 'var_exp' and parameter 'aweights' must be equals.") - } - if (type == 'ensembleMean'){ - em <- BEI_EMWeighting(var_exp$data, aweights, time_dim_name) + if (type == 'ensembleMean') { + em <- BEI_EMWeighting(var_exp$data, aweights, time_dim_name, memb_dim) var_exp$data <- em - } else if (type == 'probs'){ - if (is.null(terciles)){ - terciles <- BEI_TercilesWeighting(var_exp$data, aweights, time_dim_name) + } else if (type == 'probs') { + if (is.null(terciles)) { + terciles <- BEI_TercilesWeighting(var_exp$data, aweights, + time_dim_name = time_dim_name, + memb_dim = memb_dim) } - probs <- BEI_ProbsWeighting(var_exp$data, aweights, terciles, time_dim_name) + probs <- BEI_ProbsWeighting(var_exp$data, aweights, terciles, + time_dim_name = time_dim_name, + memb_dim = memb_dim) var_exp$data <- probs } else { stop("Parameter 'type' must be a character string ('probs' or ", @@ -154,54 +99,60 @@ CST_BEI_Weighting <- function(var_exp, aweights, terciles = NULL, return(var_exp) } - -#' @title Computing the weighted ensemble means for SFSs. -#' @author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} -#' @description This function implements the computation to obtain the weighted -#' ensemble means for SFSs using a normalized weights array, +#'@title Computing the weighted ensemble means for SFSs. +#'@author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +#'@description This function implements the computation to obtain the weighted +#'ensemble means for SFSs using a normalized weights array, #' -#' @references Regionally improved seasonal forecast of precipitation through Best -#' estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#'@references Regionally improved seasonal forecast of precipitation through Best +#'estimation of winter NAO, Sanchez-Garcia, E. et al., +#'Adv. Sci. Res., 16, 165174, 2019, \doi{10.5194/asr-16-165-2019} #' -#' @param var_exp Variable (e.g. precipitation, temperature, NAO index) -#' array from a SFS with at least dimensions (time, member) for a spatially -#' aggregated variable or dimensions (time, member, lat, lon) for a spatial -#' variable, as 'time' the spatial dimension by default. -#' @param aweights Normalized weights array with at least dimensions -#' (time, member), when 'time' is the temporal dimension as default. -#' @param time_dim_name A character string indicating the name of the -#' temporal dimension, by default 'time'. +#'@param var_exp Variable (e.g. 
precipitation, temperature, NAO index)
+#' array from a SFS with at least dimensions (time, member) for a spatially
+#' aggregated variable or dimensions (time, member, lat, lon) for a spatial
+#' variable, with 'time' as the temporal dimension by default.
+#'@param aweights Normalized weights array with at least dimensions
+#' (time, member), where 'time' is the temporal dimension by default.
+#'@param time_dim_name A character string indicating the name of the
+#' temporal dimension, by default 'time'.
+#'@param memb_dim A character string indicating the name of the
+#' member dimension, by default 'member'.
+#'
+#'@return BEI_EMWeighting() returns an array with at least one or three
+#'dimensions depending on whether the variable is spatially aggregated
+#'(as e.g. NAO index) (time) or spatial (as e.g. precipitation
+#'or temperature) (time, lat, lon), containing the ensemble means computed
+#'with weighted members.
 #'
-#' @return BEI_EMWeighting() returns an array with at least one or three
-#' dimensions depending if the variable is spatially aggregated variable
-#' (as e.g. NAO index)(time) or it is spatial variable (as e.g. precipitation
-#' or temperature) (time, lat, lon), containing the ensemble means computing
-#' with weighted members.
-#' @import multiApply
+#'@examples
+#'# Example 1
+#'var_exp <- 1 : (2 * 3 * 4)
+#'dim(var_exp) <- c(time = 2, dataset = 3, member = 4)
+#'aweights <- runif(24, min = 0.001, max = 0.999)
+#'dim(aweights) <- c(time = 2, dataset = 3, member = 4)
+#'res <- BEI_EMWeighting(var_exp, aweights)
 #'
-#' @examples
-#' # Example 1
-#' var_exp <- 1 : (2 * 3 * 4)
-#' dim(var_exp) <- c(time = 2, dataset = 3, member = 4)
-#' aweights<- runif(24, min=0.001, max=0.999)
-#' dim(aweights) <- c(time = 2, dataset = 3, member = 4)
-#' res <- BEI_EMWeighting(var_exp, aweights)
-#' dim(res)
-#' # time dataset
-#' # 2 3
-#' # Example 2
-#' var_exp <- 1 : (2 * 4 * 2 * 3)
-#' dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
-#' aweights<- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
-#' dim(aweights) <- c(time = 2, member = 4)
-#' res <- BEI_EMWeighting(var_exp, aweights)
-#' dim(res)
-#' # time lat lon
-#' # 2 2 3
+#'# Example 2
+#'var_exp <- 1 : (2 * 4 * 2 * 3)
+#'dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
+#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
+#'dim(aweights) <- c(time = 2, member = 4)
+#'res <- BEI_EMWeighting(var_exp, aweights)
 #'
-#' @noRd
-BEI_EMWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
+#'@import multiApply
+#'@export
+BEI_EMWeighting <- function(var_exp, aweights, time_dim_name = 'time',
+                            memb_dim = 'member') {
+  # var_exp
+  if (!is.array(var_exp)) {
+    stop("Parameter 'var_exp' must be an array.")
+  }
+  # aweights
+  if (!is.array(aweights)) {
+    stop("Parameter 'aweights' must be an array.")
+  }
+  # time_dim_name
   if (!is.character(time_dim_name)) {
     stop("Parameter 'time_dim_name' must be a character string indicating",
          " the name of the temporal dimension.")
@@ -211,65 +162,61 @@ BEI_EMWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
             "only the first element will be used.")
     time_dim_name <- time_dim_name[1]
   }
-  if (!is.array(var_exp)) {
-    stop("Parameter 'var_exp' must be an array.")
-  }
-  if (!is.array(aweights)) {
-    stop("Parameter 'aweights' must be an array.")
+  # memb_dim
+  if (!is.character(memb_dim)) {
+    stop("Parameter 'memb_dim' must be a character string indicating",
+         " the name of the member dimension.")
   }
+  # var_exp, aweights (2)
   if (is.null(names(dim(var_exp))) || is.null(names(dim(aweights)))) {
-    stop("Parameters 'var_exp' and 'aweights'",
-         " should have dimmension names.")
+    stop("Parameters 'var_exp' and 'aweights' should have dimension names.")
   }
-  if(!(time_dim_name %in% names(dim(var_exp)))) {
+  if (!(time_dim_name %in% names(dim(var_exp)))) {
     stop("Parameter 'var_exp' must have temporal dimension.")
   }
-  if(!(time_dim_name %in% names(dim(aweights)))) {
+  if (!(time_dim_name %in% names(dim(aweights)))) {
     stop("Parameter 'aweights' must have temporal dimension.")
   }
-  if(!('member' %in% names(dim(var_exp)))) {
-    stop("Parameter 'var_exp' must have temporal dimension.")
+  if (!(memb_dim %in% names(dim(var_exp)))) {
+    stop("Parameter 'var_exp' must have member dimension.")
   }
-  if(!('member' %in% names(dim(aweights)))) {
-    stop("Parameter 'aweights' must have temporal dimension.")
+  if (!(memb_dim %in% names(dim(aweights)))) {
+    stop("Parameter 'aweights' must have member dimension.")
   }
   if (dim(var_exp)[time_dim_name] != dim(aweights)[time_dim_name]) {
-    stop("Length of temporal dimensions ",
-         "of parameter 'var_exp' and 'aweights' must be equals.")
+    stop("Length of temporal dimension ",
+         "of parameter 'var_exp' and 'aweights' must be equal.")
   }
-  if (dim(var_exp)['member'] != dim(aweights)['member']) {
-    stop("Length of dimension 'member' ",
-         "of parameter 'var_exp' and 'aweights' must be equals.")
+  if (dim(var_exp)[memb_dim] != dim(aweights)[memb_dim]) {
+    stop("Length of member dimension ",
+         "of parameter 'var_exp' and 'aweights' must be equal.")
   }
   res <- Apply(list(var_exp, aweights),
-               target_dims = list(c(time_dim_name,'member'),
-                                  c(time_dim_name,'member')),
+               target_dims = list(c(time_dim_name, memb_dim),
+                                  c(time_dim_name, memb_dim)),
               fun = .BEI_EMWeighting, time_dim_name)$output1
  return(res)
}
-#' Atomic BEI_EMWeighting
-#' @param var_exp Variable (e.g. precipitation, temperature, NAO index)
-#' array from a SFS with a temporal dimension,
-#' by default 'time', and dimension 'member'.
-#' @param aweights Normalized weights array with a temporal dimension,
-#' by default 'time', and dimension 'member'
-#' @param time_dim_name A character string indicating the name of the
-#' temporal dimension, by default 'time'.
-#' @return .BEI_EMWeighting returns an array of with a temporal dimension,
-#' by default 'time', containing the weighted ensemble means.
-#' @examples
-#' # Example for the Atomic BEI_EMWeighting function
-#' var_exp <- 1 : 6
-#' dim(var_exp) <- c(time = 2, member = 3)
-#' aweights <- c(0.28, 0.15, 0.69, 0.64, 0.42, 0.17)
-#' dim(aweights) <- c(time = 2, member = 3)
-#' res <- .BEI_EMWeighting(var_exp, aweights)
-#' dim(res)
-#' # time
-#' # 2
-#' @noRd
+#'Atomic BEI_EMWeighting
+#'@param var_exp Variable (e.g. precipitation, temperature, NAO index)
+#' array from a SFS with a temporal dimension,
+#' by default 'time', and dimension 'member'.
+#'@param aweights Normalized weights array with a temporal dimension,
+#' by default 'time', and dimension 'member'.
+#'@param time_dim_name A character string indicating the name of the
+#' temporal dimension, by default 'time'.
+#'@return .BEI_EMWeighting returns an array with a temporal dimension,
+#'by default 'time', containing the weighted ensemble means.
+#'@examples
+#'# Example for the Atomic BEI_EMWeighting function
+#'var_exp <- 1 : 6
+#'dim(var_exp) <- c(time = 2, member = 3)
+#'aweights <- c(0.28, 0.15, 0.69, 0.64, 0.42, 0.17)
+#'dim(aweights) <- c(time = 2, member = 3)
+#'res <- .BEI_EMWeighting(var_exp, aweights)
+#'@noRd
 .BEI_EMWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
   posTime <- match(time_dim_name, names(dim(var_exp)))
@@ -280,171 +227,186 @@ BEI_EMWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
 #' Computing the weighted tercile probabilities for SFSs.
-#' @author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es}
+#'@author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es}
 #'
-#' @description This function implements the computation to obtain the tercile
-#' probabilities for a weighted variable for SFSs using a normalized weights array,
+#'@description This function implements the computation to obtain the tercile
+#'probabilities for a weighted variable for SFSs using a normalized weights
+#'array.
 #'
-#' @references Regionally improved seasonal forecast of precipitation through Best
-#' estimation of winter NAO, Sanchez-Garcia, E. et al.,
-#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019
+#'@references Regionally improved seasonal forecast of precipitation through Best
+#'estimation of winter NAO, Sanchez-Garcia, E. et al.,
+#'Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019}
 #'
-#' @param var_exp Variable (e.g. precipitation, temperature, NAO index)
-#' array from a SFS with at least dimensions (time, member) for a spatially
-#' aggregated variable or dimensions (time, member, lat, lon) for a spatial
-#' variable, as 'time' the spatial dimension by default.
-#' @param aweights Normalized weights array with at least dimensions
-#' (time, member), when 'time' is the temporal dimension as default.
-#' @param terciles A numeric array with at least one dimension 'tercil' equal to
-#' 2, the first element is the lower tercil for a hindcast period, and the second
-#' element is the upper tercile.
-#' @param time_dim_name A character string indicating the name of the
-#' temporal dimension, by default 'time'.
+#'@param var_exp Variable (e.g. precipitation, temperature, NAO index)
+#' array from a SFS with at least dimensions (time, member) for a spatially
+#' aggregated variable or dimensions (time, member, lat, lon) for a spatial
+#' variable, with 'time' as the temporal dimension by default.
+#'@param aweights Normalized weights array with at least dimensions
+#' (time, member), where 'time' is the temporal dimension by default.
+#'@param terciles A numeric array with at least one dimension 'tercil' equal
+#' to 2, the first element is the lower tercile for a hindcast period, and the
+#' second element is the upper tercile.
+#'@param time_dim_name A character string indicating the name of the
+#' temporal dimension, by default 'time'.
+#'@param memb_dim A character string indicating the name of the
+#' member dimension, by default 'member'.
+#'
+#'@return BEI_ProbsWeighting() returns an array with at least two or four
+#'dimensions depending on whether the variable is spatially aggregated
+#'(as e.g. NAO index) (time, tercil) or spatial (as e.g.
+#'precipitation or temperature) (time, tercil, lat, lon), containing the
+#'tercile probabilities computed with weighted members.
+#'The first tercile is the lower tercile, the second is the normal tercile and
+#'the third is the upper tercile.
#' -#' @return BEI_ProbsWeighting() returns an array with at least two or four -#' dimensions depending if the variable is a spatially aggregated variable -#' (as e.g. NAO index)(time, tercil) or it is spatial variable (as e.g. -#' precipitation or temperature)(time, tercile, lat, lon), containing the -#' terciles probabilities computing with weighted members. -#' The first tercil is the lower tercile, the second is the normal tercile and -#' the third is the upper tercile. -#' -#' @import multiApply +#'@examples +#'# Example 1 +#'var_exp <- 1 : (2 * 4) +#'dim(var_exp) <- c(time = 2, member = 4) +#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +#'dim(aweights) <- c(time = 2, member = 4) +#'terciles <- c(2.5,5) +#'dim(terciles) <- c(tercil = 2) +#'res <- BEI_ProbsWeighting(var_exp, aweights, terciles) #' -#' @examples -#' # Example 1 -#' var_exp <- 1 : (2 * 4) -#' dim(var_exp) <- c(time = 2, member = 4) -#' aweights<- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) -#' dim(aweights) <- c(time = 2, member = 4) -#' terciles <- c(2.5,5) -#' dim(terciles) <- c(tercil = 2) -#' res <- BEI_ProbsWeighting(var_exp, aweights, terciles) -#' dim(res) -#' # time tercil -#' # 2 3 -#' # Example 2 -#' var_exp <- rnorm(48, 50, 9) -#' dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3) -#' aweights<- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) -#' dim(aweights) <- c(time = 2, member = 4) -#' terciles <- rep(c(48,50), 2*3) -#' dim(terciles) <- c(tercil = 2, lat = 2, lon = 3) -#' res <- BEI_ProbsWeighting(var_exp, aweights, terciles) -#' dim(res) -#' # time tercil lat lon -#' # 2 3 2 3 -#' @noRd +#'# Example 2 +#'var_exp <- rnorm(48, 50, 9) +#'dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3) +#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +#'dim(aweights) <- c(time = 2, member = 4) +#'terciles <- rep(c(48,50), 2*3) +#'dim(terciles) <- c(tercil = 2, lat = 2, lon = 3) +#'res <- BEI_ProbsWeighting(var_exp, aweights, terciles) +#'@import multiApply +#'@export BEI_ProbsWeighting <- function(var_exp, aweights, terciles, - time_dim_name = 'time') { - - if (!is.character(time_dim_name)) { - stop("Parameter 'time_dim_name' must be a character string indicating", - " the name of the temporal dimension.") + time_dim_name = 'time', memb_dim = 'member') { + # var_exp + if (!is.array(var_exp)) { + stop("Parameter 'var_exp' must be an array.") } - if (length(time_dim_name) > 1) { - warning("Parameter 'time_dim_name' has length greater than 1 and ", - "only the first element will be used.") - time_dim_name <- time_dim_name[1] + # aweights + if (!is.array(aweights)) { + stop("Parameter 'aweights' must be an array.") } - if (is.null(terciles)){ - stop("Parameter 'terciles' is null") + # terciles + if (is.null(terciles)) { + stop("Parameter 'terciles' cannot be null.") } - if(!is.array(terciles)){ + if (!is.array(terciles)) { stop("Parameter 'terciles' must be an array.") } if (is.null(names(dim(terciles)))) { - stop("Parameters 'terciles' should have dimmension names.") + stop("Parameter 'terciles' should have dimension names.") } - if(!('tercil' %in% names(dim(terciles)))) { + if (!('tercil' %in% names(dim(terciles)))) { stop("Parameter 'terciles' must have dimension 'tercil'.") } if (dim(terciles)['tercil'] != 2) { stop("Length of dimension 'tercil' ", "of parameter 'terciles' must be equal to 2.") } - if(time_dim_name %in% names(dim(terciles))) { - stop("Parameter 'terciles' must not have temporal dimension.") + # time_dim_name + if (!is.character(time_dim_name)) { + stop("Parameter 'time_dim_name' 
must be a character string indicating", + " the name of the temporal dimension.") } - if('member' %in% names(dim(terciles))) { - stop("Parameter 'terciles' must not have dimension 'member'.") + if (length(time_dim_name) > 1) { + warning("Parameter 'time_dim_name' has length greater than 1 and ", + "only the first element will be used.") + time_dim_name <- time_dim_name[1] } - if (!is.array(var_exp)) { - stop("Parameter 'var_exp' must be an array.") + # memb_dim + if (!is.character(memb_dim)) { + stop("Parameter 'memb_dim' must be a character string indicating", + " the name of the member dimension.") } - if (!is.array(aweights)) { - stop("Parameter 'aweights' must be an array.") + # var_exp, terciles, aweights (2) + if (time_dim_name %in% names(dim(terciles))) { + stop("Parameter 'terciles' must not have temporal dimension.") + } + if (memb_dim %in% names(dim(terciles))) { + stop("Parameter 'terciles' must not have a member dimension.") } if (is.null(names(dim(var_exp))) || is.null(names(dim(aweights)))) { stop("Parameters 'var_exp' and 'aweights'", - " should have dimmension names.") + " should have dimension names.") } - if(!(time_dim_name %in% names(dim(var_exp)))) { + if (!(time_dim_name %in% names(dim(var_exp)))) { stop("Parameter 'var_exp' must have temporal dimension.") } - if(!(time_dim_name %in% names(dim(aweights)))) { + if (!(time_dim_name %in% names(dim(aweights)))) { stop("Parameter 'aweights' must have temporal dimension.") } - if(!('member' %in% names(dim(var_exp)))) { - stop("Parameter 'var_exp' must have dimension 'member'.") + if (!(memb_dim %in% names(dim(var_exp)))) { + stop("Parameter 'var_exp' must have member dimension.") } - if(!('member' %in% names(dim(aweights)))) { - stop("Parameter 'aweights' must have dimension 'member'.") + if (!(memb_dim %in% names(dim(aweights)))) { + stop("Parameter 'aweights' must have member dimension.") } if (dim(var_exp)[time_dim_name] != dim(aweights)[time_dim_name]) { - stop("Length of temporal dimensions ", - "of parameter 'var_exp' and 'aweights' must be equals.") + stop("Length of temporal dimension ", + "of parameter 'var_exp' and 'aweights' must be equal.") } - if (dim(var_exp)['member'] != dim(aweights)['member']) { - stop("Length of dimension 'member' ", - "of parameter 'var_exp' and 'aweights' must be equals.") + if (dim(var_exp)[memb_dim] != dim(aweights)[memb_dim]) { + stop("Length of member dimension ", + "of parameter 'var_exp' and 'aweights' must be equal.") + } + + names_exp <- sort(names(dim(var_exp))) + names_exp <- names_exp[-which(names_exp %in% c(time_dim_name, memb_dim))] + names_tercil <- sort(names(dim(terciles))) + names_tercil <- names_tercil[-which(names_tercil == 'tercil')] + + if (!all(dim(var_exp)[names_exp] == dim(terciles)[names_tercil])) { + stop("Length of common dimensions ", + "of parameter 'var_exp' and 'terciles' must be equal.") } - res <- Apply(list(var_exp, aweights, terciles), - target_dims = list(c(time_dim_name,'member'), - c(time_dim_name,'member'), + target_dims = list(c(time_dim_name, memb_dim), + c(time_dim_name, memb_dim), c('tercil')), fun = .BEI_ProbsWeighting, time_dim_name)$output1 return(res) } -#' Atomic BEI_ProbsWeighting -#' @param var_exp Variable (e.g. precipitation, temperature, NAO index) -#' array from a SFS with a temporal dimension, -#' by default 'time', and dimension 'member'. 
-#' @param aweights Normalized weights array with a temporal dimension,
-#' by default 'time', and dimension 'member'
-#' @param terciles A numeric array with one dimension 'tercil' equal to 2,
-#' the first element is the lower tercil for a hindcast period, and the second
-#' element is the upper tercile.
-#' @param time_dim_name A character string indicating the name of the
-#' temporal dimension, by default 'time'.
+#'Atomic BEI_ProbsWeighting
+#'@param var_exp Variable (e.g. precipitation, temperature, NAO index)
+#' array from a SFS with a temporal dimension,
+#' by default 'time', and dimension 'member'.
+#'@param aweights Normalized weights array with a temporal dimension,
+#' by default 'time', and dimension 'member'.
+#'@param terciles A numeric array with one dimension 'tercil' equal to 2,
+#' the first element is the lower tercile for a hindcast period, and the
+#' second element is the upper tercile.
+#'@param time_dim_name A character string indicating the name of the
+#' temporal dimension, by default 'time'.
+#'@param memb_dim A character string indicating the name of the
+#' member dimension, by default 'member'.
 #'
-#' @return .BEI_ProbsWeighting returns an array of with a temporal dimension,
-#' as default 'time', and 'tercil' dimension, containing the probabilities
-#' for each tercile computing with weighted members.
-#' The firt tercil is the lower tercile, the second is the normal tercile and
-#' the third is the upper tercile.
-#' @examples
-#' # Example
-#' var_exp <- 1 : 8
-#' dim(var_exp) <- c(stime = 2, member = 4)
-#' aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
-#' dim(aweights) <- c(stime = 2, member = 4)
-#' terciles <- quantile(1:8, probs = c(1/3, 2/3))
-#' dim(terciles) <- c(tercil = 2)
-#' res <- .BEI_ProbsWeighting(var_exp, aweights, terciles, time_dim_name = 'stime')
-#' dim(res)
-#' # stime tercil
-#' # 2 3
-#' @noRd
+#'@return .BEI_ProbsWeighting returns an array with a temporal dimension,
+#'by default 'time', and 'tercil' dimension, containing the probabilities
+#'for each tercile computed with weighted members.
+#'The first tercile is the lower tercile, the second is the normal tercile and
+#'the third is the upper tercile.
+#'@examples +#'# Example +#'var_exp <- 1 : 8 +#'dim(var_exp) <- c(stime = 2, member = 4) +#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +#'dim(aweights) <- c(stime = 2, member = 4) +#'terciles <- quantile(1:8, probs = c(1/3, 2/3)) +#'dim(terciles) <- c(tercil = 2) +#'res <- .BEI_ProbsWeighting(var_exp, aweights, terciles, time_dim_name = 'stime') +#'@noRd .BEI_ProbsWeighting <- function(var_exp, aweights, terciles, - time_dim_name = 'time') { - if(any(is.na(var_exp)) || any(is.na(aweights))){ + time_dim_name = 'time', memb_dim = 'member') { + if (any(is.na(var_exp)) || any(is.na(aweights))) { probTercile <- array(NA, dim = c(dim(var_exp)[time_dim_name], tercil = 3)) } else { - if(any(is.na(terciles))) stop("Terciles are NAs") + if (any(is.na(terciles))) { + stop("Terciles are NAs") + } terciles_exp <- list(lowerTercile = terciles[1], upperTercile = terciles[2]) @@ -452,73 +414,79 @@ BEI_ProbsWeighting <- function(var_exp, aweights, terciles, upperTercile <- terciles_exp$upperTercile # Probabilities - aTerciles <- Apply(list(var_exp), target_dims = list('member'), + aTerciles <- Apply(list(var_exp), target_dims = list(memb_dim), fun = Data2Tercil, lowerTercile, upperTercile)$output1 - pos <- match(names(dim(aTerciles)), c(time_dim_name,'member')) - aTerciles <- aperm(aTerciles,pos) - names(dim(aTerciles)) <- c(time_dim_name,'member') + pos <- match(names(dim(aTerciles)), c(time_dim_name, memb_dim)) + aTerciles <- aperm(aTerciles, pos) + names(dim(aTerciles)) <- c(time_dim_name, memb_dim) probTercile <- array(NA, dim = c(dim(var_exp)[time_dim_name], tercil = 3)) - for (idTercil in 1:3){ - probTercile[,idTercil] <- Apply(list(aTerciles, aweights), - target_dims = list('member','member'), + for (idTercil in 1:3) { + probTercile[ ,idTercil] <- Apply(list(aTerciles, aweights), + target_dims = list(memb_dim, memb_dim), fun = WeightTercil2Prob, idTercil)$output1 } } return(probTercile) } -#' Computing the weighted terciles for SFSs. -#' @author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +#'Computing the weighted terciles for SFSs. +#'@author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} #' -#' @description This function implements the computation to obtain the terciles -#' for a weighted variable for SFSs using a normalized weights array, +#'@description This function implements the computation to obtain the terciles +#'for a weighted variable for SFSs using a normalized weights array, #' -#' @references Regionally improved seasonal forecast of precipitation through Best -#' estimation of winter NAO, Sanchez-Garcia, E. et al., -#' Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +#'@references Regionally improved seasonal forecast of precipitation through Best +#'estimation of winter NAO, Sanchez-Garcia, E. et al., +#'Adv. Sci. Res., 16, 165174, 2019, \doi{10.5194/asr-16-165-2019} #' -#' @param var_exp Variable (e.g. precipitation, temperature, NAO index) -#' array from a SFS with at least dimensions (time, member) for a spatially -#' aggregated variable or dimensions (time, member, lat, lon) for a spatial -#' variable, as 'time' the spatial dimension by default. -#' @param aweights Normalized weights array with at least dimensions -#' (time, member), when 'time' is the temporal dimension as default. -#' @param time_dim_name A character string indicating the name of the -#' temporal dimension, by default 'time'. +#'@param var_exp Variable (e.g. 
precipitation, temperature, NAO index)
+#' array from a SFS with at least dimensions (time, member) for a spatially
+#' aggregated variable or dimensions (time, member, lat, lon) for a spatial
+#' variable, with 'time' as the temporal dimension by default.
+#'@param aweights Normalized weights array with at least dimensions
+#' (time, member), where 'time' is the temporal dimension by default.
+#'@param time_dim_name A character string indicating the name of the
+#' temporal dimension, by default 'time'.
+#'@param memb_dim A character string indicating the name of the
+#' member dimension, by default 'member'.
 #'
-#' @return BEI_TercilesWeighting() returns an array with at least one
-#' dimension depending if the variable is a spatially aggregated variable
-#' (as e.g. NAO index)(tercil) or it is spatial variable (as e.g.
-#' precipitation or temperature)(tercil, lat, lon), containing the
-#' terciles computing with weighted members.
-#' The first tercil is the lower tercile, the second is the upper tercile.
-#'
-#' @import multiApply
+#'@return BEI_TercilesWeighting() returns an array with at least one
+#'dimension depending on whether the variable is spatially aggregated
+#'(as e.g. NAO index) (tercil) or spatial (as e.g.
+#'precipitation or temperature) (tercil, lat, lon), containing the
+#'terciles computed with weighted members.
+#'The first tercile is the lower tercile, the second is the upper tercile.
 #'
-#' @examples
-#' # Example 1
-#' var_exp <- 1 : (2 * 4)
-#' dim(var_exp) <- c(time = 2, member = 4)
-#' aweights<- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
-#' dim(aweights) <- c(time = 2, member = 4)
-#' res <- BEI_TercilesWeighting(var_exp, aweights)
-#' dim(res)
-#' # tercil
-#' # 2
-#' # Example 2
-#' var_exp <- rnorm(48, 50, 9)
-#' dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
-#' aweights<- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
-#' dim(aweights) <- c(time = 2, member = 4)
-#' res <- BEI_TercilesWeighting(var_exp, aweights)
-#' dim(res)
-#' # tercil lat lon
-#' # 2 2 3
-#' @noRd
-BEI_TercilesWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
+#'@examples
+#'# Example 1
+#'var_exp <- 1 : (2 * 4)
+#'dim(var_exp) <- c(time = 2, member = 4)
+#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
+#'dim(aweights) <- c(time = 2, member = 4)
+#'res <- BEI_TercilesWeighting(var_exp, aweights)
+#'
+#'# Example 2
+#'var_exp <- rnorm(48, 50, 9)
+#'dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
+#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
+#'dim(aweights) <- c(time = 2, member = 4)
+#'res <- BEI_TercilesWeighting(var_exp, aweights)
+#'@import multiApply
+#'@export
+BEI_TercilesWeighting <- function(var_exp, aweights, time_dim_name = 'time',
+                                  memb_dim = 'member') {
+  # var_exp
+  if (!is.array(var_exp)) {
+    stop("Parameter 'var_exp' must be an array.")
+  }
+  # aweights
+  if (!is.array(aweights)) {
+    stop("Parameter 'aweights' must be an array.")
+  }
+  # time_dim_name
   if (!is.character(time_dim_name)) {
     stop("Parameter 'time_dim_name' must be a character string indicating",
          " the name of the temporal dimension.")
@@ -528,69 +496,68 @@ BEI_TercilesWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
             "only the first element will be used.")
     time_dim_name <- time_dim_name[1]
   }
-  if (!is.array(var_exp)) {
-    stop("Parameter 'var_exp' must be an array.")
-  }
-  if (!is.array(aweights)) {
-    stop("Parameter 'aweights' must be an array.")
+  # memb_dim
+  if (!is.character(memb_dim)) {
+    stop("Parameter 'memb_dim' must be a
character string indicating", + " the name of the member dimension.") } + # var_exp, aweights (2) if (is.null(names(dim(var_exp))) || is.null(names(dim(aweights)))) { stop("Parameters 'var_exp' and 'aweights'", - " should have dimmension names.") + " should have dimension names.") } - if(!(time_dim_name %in% names(dim(var_exp)))) { + if (!(time_dim_name %in% names(dim(var_exp)))) { stop("Parameter 'var_exp' must have temporal dimension.") } - if(!(time_dim_name %in% names(dim(aweights)))) { + if (!(time_dim_name %in% names(dim(aweights)))) { stop("Parameter 'aweights' must have temporal dimension.") } - if(!('member' %in% names(dim(var_exp)))) { - stop("Parameter 'var_exp' must have temporal dimension.") + if (!(memb_dim %in% names(dim(var_exp)))) { + stop("Parameter 'var_exp' must have member dimension.") } - if(!('member' %in% names(dim(aweights)))) { - stop("Parameter 'aweights' must have temporal dimension.") + if (!(memb_dim %in% names(dim(aweights)))) { + stop("Parameter 'aweights' must have member dimension.") } if (dim(var_exp)[time_dim_name] != dim(aweights)[time_dim_name]) { - stop("Length of temporal dimensions ", - "of parameter 'var_exp' and 'aweights' must be equals.") + stop("Length of temporal dimension ", + "of parameter 'var_exp' and 'aweights' must be equal.") } - if (dim(var_exp)['member'] != dim(aweights)['member']) { - stop("Length of dimension 'member' ", - "of parameter 'var_exp' and 'aweights' must be equals.") + if (dim(var_exp)[memb_dim] != dim(aweights)[memb_dim]) { + stop("Length of member dimension ", + "of parameter 'var_exp' and 'aweights' must be equal.") } res <- Apply(list(var_exp, aweights), - target_dims = list(c(time_dim_name,'member'), c(time_dim_name,'member')), + target_dims = list(c(time_dim_name, memb_dim), + c(time_dim_name, memb_dim)), fun = .BEI_TercilesWeighting, time_dim_name)$output1 return(res) } -#' Atomic BEI_TercilesWeighting -#' @param var_exp Variable (e.g. precipitation, temperature, NAO index) -#' array from a SFS with a temporal dimension, -#' by default 'time', and dimension 'member'. -#' @param aweights Normalized weights array with a temporal dimension, -#' by default 'time', and dimension 'member' -#' @param time_dim_name A character string indicating the name of the -#' temporal dimension, by default 'time'. -#' @return .BEI_TercilesWeighting returns a numeric array with dimension tercil -#' equal to 2, the first is the lower tercil and the second the upper tercile, -#' computing with weighted members considering all members and all period. -#' If any member value for any period is NA , the terciles are not computed, and -#' the function return NA value as tercile upper and lower. -#' @examples -#' # Example -#' var_exp <- 1 : 8 -#' dim(var_exp) <- c(stime = 2, member = 4) -#' aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) -#' dim(aweights) <- c(stime = 2, member = 4) -#' res <- .BEI_TercilesWeighting(var_exp, aweights, time_dim_name = 'stime') -#' dim(res) -#' # tercil -#' # 2 -#' @noRd +#'Atomic BEI_TercilesWeighting +#'@param var_exp Variable (e.g. precipitation, temperature, NAO index) +#' array from a SFS with a temporal dimension, +#' by default 'time', and dimension 'member'. +#'@param aweights Normalized weights array with a temporal dimension, +#' by default 'time', and dimension 'member' +#'@param time_dim_name A character string indicating the name of the +#' temporal dimension, by default 'time'. 
+#'@return .BEI_TercilesWeighting returns a numeric array with dimension tercil
+#'equal to 2, the first is the lower tercile and the second the upper tercile,
+#'computed with weighted members considering all members and the whole period.
+#'If any member value for any period is NA, the terciles are not computed and
+#'the function returns NA for the upper and lower terciles.
+#'@examples
+#'# Example
+#'var_exp <- 1 : 8
+#'dim(var_exp) <- c(stime = 2, member = 4)
+#'aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
+#'dim(aweights) <- c(stime = 2, member = 4)
+#'res <- .BEI_TercilesWeighting(var_exp, aweights, time_dim_name = 'stime')
+#'@noRd
 .BEI_TercilesWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
-  if(any(is.na(var_exp)) || any(is.na(aweights))){
+
+  if (any(is.na(var_exp)) || any(is.na(aweights))) {
     terciles_exp <- array(c(NA, NA), dim = c(tercil = 2))
   } else {
     l_terciles_exp <- WeightTerciles(var_exp, aweights, time_dim_name)
@@ -601,11 +568,11 @@ BEI_TercilesWeighting <- function(var_exp, aweights, time_dim_name = 'time') {
 }
 
 # Auxiliar function to compute in which tercile is a data value
-Data2Tercil_old <- function(x,lt,ut) {
-  if(is.na(lt) || is.na(ut)){
+Data2Tercil_old <- function(x, lt, ut) {
+  if (is.na(lt) || is.na(ut)) {
     y <- rep(NA, length(x))
   } else {
-    y <- rep(2,length(x))
+    y <- rep(2, length(x))
     y[x <= lt] <- 1
     y[x >= ut] <- 3
     if (lt == ut) {
@@ -616,11 +583,11 @@ Data2Tercil_old <- function(x,lt,ut) {
   return (y)
 }
 # Auxiliar function to compute in which tercile is a data value
-Data2Tercil <- function(x,lt,ut) {
-  if(is.na(lt) || is.na(ut)){
+Data2Tercil <- function(x, lt, ut) {
+  if (is.na(lt) || is.na(ut)) {
    y <- rep(NA, length(x))
  } else {
-    y <- rep(2,length(x))
+    y <- rep(2, length(x))
    y[x <= lt] <- 1
    y[x >= ut] <- 3
    if (lt == ut) {
@@ -655,27 +622,27 @@ WeightTerciles <- function(data, aweights, time_dim_name = 'time') {
   # is lower tercile and when 2/3 is reached, it is the upper tercile.
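   # Hypothetical illustration (values not from the package): with a single
   # time step, member values c(10, 20, 30, 40) and weights
   # c(0.1, 0.2, 0.3, 0.4), the cumulative sorted weights are 0.1, 0.3, 0.6,
   # 1.0; 1/3 is first reached at the third sorted value, so the lower tercile
   # is (20 + 30) / 2 = 25, and 2/3 at the fourth, so the upper tercile is
   # (30 + 40) / 2 = 35.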
sumWeights <- 0 ilowerTercile <- 0 - while ((sumWeights < 1/3) & (ilowerTercile < length(aweights))){ - ilowerTercile<- ilowerTercile +1 + while ((sumWeights < 1/3) & (ilowerTercile < length(aweights))) { + ilowerTercile <- ilowerTercile + 1 sumWeights <- sumWeights + vectorWeights[indSort[ilowerTercile]] } - if (ilowerTercile == 1){ + if (ilowerTercile == 1) { lowerTercile <- dataSort[ilowerTercile] } else { - lowerTercile <- (dataSort[ilowerTercile]+ - dataSort[ilowerTercile-1])/2 + lowerTercile <- (dataSort[ilowerTercile] + + dataSort[ilowerTercile - 1]) / 2 } sumWeights <- 0 iupperTercile <- 0 - while ((sumWeights < 2/3) & (iupperTercile < length(aweights))){ - iupperTercile<- iupperTercile +1 + while ((sumWeights < 2/3) & (iupperTercile < length(aweights))) { + iupperTercile <- iupperTercile + 1 sumWeights <- sumWeights + vectorWeights[indSort[iupperTercile]] } if (iupperTercile == 1) { upperTercile <- dataSort[iupperTercile] } else { upperTercile <- (dataSort[iupperTercile]+ - dataSort[iupperTercile-1])/2 + dataSort[iupperTercile - 1]) / 2 } return(list(lowerTercile = lowerTercile, upperTercile = upperTercile)) } diff --git a/R/CST_BiasCorrection.R b/R/CST_BiasCorrection.R index 20b51082ccc7a164213e26d4844b36ea05dafe48..772d23ef6e1e9f36fc5ca5d5eec616e4b8d6c6a6 100644 --- a/R/CST_BiasCorrection.R +++ b/R/CST_BiasCorrection.R @@ -1,4 +1,4 @@ -#' Bias Correction based on the mean and standard deviation adjustment +#'Bias Correction based on the mean and standard deviation adjustment #' #'@author Verónica Torralba, \email{veronica.torralba@bsc.es} #'@description This function applies the simple bias adjustment technique @@ -7,27 +7,39 @@ #' #'@param exp An object of class \code{s2dv_cube} as returned by \code{CST_Load} #' function, containing the seasonal forecast experiment data in the element -#' named \code{$data} +#' named \code{$data} with at least time and member dimensions. #'@param obs An object of class \code{s2dv_cube} as returned by \code{CST_Load} -#' function, containing the observed data in the element named \code{$data}. +#' function, containing the observed data in the element named \code{$data} +#' with at least time dimension. #'@param exp_cor An object of class \code{s2dv_cube} as returned by -#' \code{CST_Load} function, containing the seasonl forecast experiment to be -#' corrected. If it is NULL, the 'exp' forecast will be corrected. +#' \code{CST_Load} function, containing the seasonal forecast experiment to be +#' corrected with at least time dimension. If it is NULL, the 'exp' forecast +#' will be corrected. If there is only one corrected dataset, it should not +#' have dataset dimension. If there is a corresponding corrected dataset for +#' each 'exp' forecast, the dataset dimension must have the same length as in +#' 'exp'. The default value is NULL. #'@param na.rm A logical value indicating whether missing values should be #' stripped before the computation proceeds, by default it is set to FALSE. #'@param memb_dim A character string indicating the name of the member #' dimension. By default, it is set to 'member'. #'@param sdate_dim A character string indicating the name of the start date #' dimension. By default, it is set to 'sdate'. +#'@param dat_dim A character string indicating the name of dataset dimension. +#' The length of this dimension can be different between 'exp' and 'obs'. +#' The default value is NULL. #'@param ncores An integer that indicates the number of cores for parallel #' computations using multiApply function. The default value is NULL. 
#'@return An object of class \code{s2dv_cube} containing the bias corrected -#'forecasts with the same dimensions of the experimental data. -#' +#'forecasts with the dimensions nexp, nobs and same dimensions as in the 'exp' +#'object. nexp is the number of experiment (i.e., 'dat_dim' in exp), and nobs is +#'the number of observation (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp +#'and nobs are omitted. If 'exp_cor' is provided the returned array will be with +#'the same dimensions as 'exp_cor'. +#' #'@references Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. #'Davis (2017). Seasonal climate prediction: a new source of information for #'the management of wind energy resources. Journal of Applied Meteorology and -#'Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, +#'Climatology, 56, 1231-1247, \doi{10.1175/JAMC-D-16-0204.1}. (CLIM4ENERGY, #'EUPORIAS, NEWA, RESILIENCE, SPECS) #' #'@examples @@ -37,54 +49,48 @@ #'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) #'lon <- seq(0, 30, 5) #'lat <- seq(0, 25, 5) -#'exp <- list(data = mod1, lat = lat, lon = lon) -#'obs <- list(data = obs1, lat = lat, lon = lon) +#'coords <- list(lat = lat, lon = lon) +#'exp <- list(data = mod1, coords = coords) +#'obs <- list(data = obs1, coords = coords) #'attr(exp, 'class') <- 's2dv_cube' #'attr(obs, 'class') <- 's2dv_cube' #'a <- CST_BiasCorrection(exp = exp, obs = obs) #'@import multiApply #'@export CST_BiasCorrection <- function(exp, obs, exp_cor = NULL, na.rm = FALSE, - memb_dim = 'member', sdate_dim = 'sdate', - ncores = NULL) { + memb_dim = 'member', sdate_dim = 'sdate', + dat_dim = NULL, ncores = NULL) { + # Check 's2dv_cube' if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) { - stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") + stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube'.") } if (!is.null(exp_cor)) { if (!inherits(exp_cor, 's2dv_cube')) { - stop("Parameter 'exp_cor' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") + stop("Parameter 'exp_cor' must be of the class 's2dv_cube'.") } - dimnames <- names(dim(exp_cor$data)) - } else { - dimnames <- names(dim(exp$data)) } BiasCorrected <- BiasCorrection(exp = exp$data, obs = obs$data, exp_cor = exp_cor$data, - memb_dim = memb_dim, sdate_dim = sdate_dim, + memb_dim = memb_dim, sdate_dim = sdate_dim, dat_dim = dat_dim, na.rm = na.rm, ncores = ncores) - - pos <- match(dimnames, names(dim(BiasCorrected))) - BiasCorrected <- aperm(BiasCorrected, pos) if (is.null(exp_cor)) { exp$data <- BiasCorrected - exp$Datasets <- c(exp$Datasets, obs$Datasets) - exp$source_files <- c(exp$source_files, obs$source_files) + exp$attrs$Datasets <- c(exp$attrs$Datasets, obs$attrs$Datasets) + exp$attrs$source_files <- c(exp$attrs$source_files, obs$attrs$source_files) return(exp) } else { exp_cor$data <- BiasCorrected - exp_cor$Datasets <- c(exp_cor$Datasets, exp$Datasets, obs$Datasets) - exp_cor$source_files <- c(exp_cor$source_files, exp$source_files, obs$source_files) + exp_cor$attrs$Datasets <- c(exp_cor$attrs$Datasets, exp$attrs$Datasets, obs$attrs$Datasets) + exp_cor$attrs$source_files <- c(exp_cor$attrs$source_files, exp$attrs$source_files, obs$attrs$source_files) return(exp_cor) } } -#' Bias Correction based on the mean and standard deviation adjustment +#'Bias Correction based on the mean and standard deviation adjustment #' #'@author Verónica Torralba, \email{veronica.torralba@bsc.es} 
-#' Bias Correction based on the mean and standard deviation adjustment +#'Bias Correction based on the mean and standard deviation adjustment #' #'@author Verónica Torralba, \email{veronica.torralba@bsc.es} #'@description This function applies the simple bias adjustment technique @@ -92,29 +98,38 @@ CST_BiasCorrection <- function(exp, obs, exp_cor = NULL, na.rm = FALSE, #'standard deviation and mean to that of the reference dataset. #' #'@param exp A multidimensional array with named dimensions containing the -#' seasonal forecast experiment data with at least 'member' and 'sdate' -#' dimensions. +#' seasonal forecast experiment data with at least time and member dimensions. #'@param obs A multidimensional array with named dimensions containing the -#' observed data with at least 'sdate' dimension. +#' observed data with at least time dimension. #'@param exp_cor A multidimensional array with named dimensions containing the -#' seasonl forecast experiment to be corrected. If it is NULL, the 'exp' -#' forecast will be corrected. +#' seasonal forecast experiment to be corrected with at least time and member +#' dimensions. If it is NULL, the 'exp' forecast will be corrected. If there is +#' only one corrected dataset, it should not have dataset dimension. If there +#' is a corresponding corrected dataset for each 'exp' forecast, the dataset +#' dimension must have the same length as in 'exp'. The default value is NULL. #'@param na.rm A logical value indicating whether missing values should be #' stripped before the computation proceeds, by default it is set to FALSE. #'@param memb_dim A character string indicating the name of the member #' dimension. By default, it is set to 'member'. #'@param sdate_dim A character string indicating the name of the start date #' dimension. By default, it is set to 'sdate'. +#'@param dat_dim A character string indicating the name of dataset dimension. +#' The length of this dimension can be different between 'exp' and 'obs'. +#' The default value is NULL. #'@param ncores An integer that indicates the number of cores for parallel #' computations using multiApply function. The default value is NULL. #' -#'@return An array containing the bias corrected forecasts with the same -#'dimensions of the experimental data. +#'@return An array containing the bias corrected forecasts with the dimensions +#'nexp, nobs and same dimensions as in the 'exp' object. nexp is the number of +#'experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations +#'(i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If +#''exp_cor' is provided the returned array will have the same dimensions as +#''exp_cor'. #' #'@references Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. #'Davis (2017). Seasonal climate prediction: a new source of information for the #'management of wind energy resources. Journal of Applied Meteorology and -#'Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, +#'Climatology, 56, 1231-1247, \doi{10.1175/JAMC-D-16-0204.1}.
(CLIM4ENERGY, #'EUPORIAS, NEWA, RESILIENCE, SPECS) #' #'@examples @@ -127,7 +142,7 @@ CST_BiasCorrection <- function(exp, obs, exp_cor = NULL, na.rm = FALSE, #'@export BiasCorrection <- function(exp, obs, exp_cor = NULL, na.rm = FALSE, memb_dim = 'member', sdate_dim = 'sdate', - ncores = NULL) { + dat_dim = NULL, ncores = NULL) { # Check inputs ## exp, obs if (!is.array(exp) || !is.numeric(exp)) { @@ -174,6 +189,68 @@ BiasCorrection <- function(exp, obs, exp_cor = NULL, na.rm = FALSE, if (dim(obs)[memb_dim] != 1) { stop("If parameter 'obs' has dimension 'memb_dim' its length must be equal to 1.") } + } else { + obs <- InsertDim(obs, posdim = 1, lendim = 1, name = memb_dim) + } + if (!is.null(exp_cor)) { + if (!memb_dim %in% names(dim(exp_cor))) { + exp_cor <- InsertDim(exp_cor, posdim = 1, lendim = 1, name = memb_dim) + exp_cor_remove_memb <- TRUE + } else { + exp_cor_remove_memb <- FALSE + } + } else { + exp_cor_remove_memb <- FALSE + } + + ## dat_dim + if (!is.null(dat_dim)) { + if (!is.character(dat_dim) | length(dat_dim) > 1) { + stop("Parameter 'dat_dim' must be a character string.") + } + if (!dat_dim %in% names(dim(exp)) | !dat_dim %in% names(dim(obs))) { + stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + " Set it as NULL if there is no dataset dimension.") + } + } + ## exp, obs, and exp_cor (2) + name_exp <- sort(names(dim(exp))) + name_obs <- sort(names(dim(obs))) + name_exp <- name_exp[-which(name_exp == memb_dim)] + name_obs <- name_obs[-which(name_obs == memb_dim)] + if (!is.null(dat_dim)) { + name_exp <- name_exp[-which(name_exp == dat_dim)] + name_obs <- name_obs[-which(name_obs == dat_dim)] + } + if (!identical(length(name_exp), length(name_obs)) | + !identical(dim(exp)[name_exp], dim(obs)[name_obs])) { + stop("Parameter 'exp' and 'obs' must have the same length of all dimensions", + " except 'memb_dim' and 'dat_dim'.") + } + if (!is.null(exp_cor)) { + name_exp_cor <- sort(names(dim(exp_cor))) + name_exp <- sort(names(dim(exp))) + if (!is.null(dat_dim)) { + if (dat_dim %in% names(dim(exp_cor))) { + if (!identical(dim(exp)[dat_dim], dim(exp_cor)[dat_dim])) { + stop("If parameter 'exp_cor' has dataset dimension, it must be", + " equal to dataset dimension of 'exp'.") + } + name_exp_cor <- name_exp_cor[-which(name_exp_cor == dat_dim)] + target_dims_cor <- c(memb_dim, sdate_dim, dat_dim) + } else { + target_dims_cor <- c(memb_dim, sdate_dim) + } + } else { + target_dims_cor <- c(memb_dim, sdate_dim) + } + name_exp <- name_exp[-which(name_exp %in% c(memb_dim, sdate_dim, dat_dim))] + name_exp_cor <- name_exp_cor[-which(name_exp_cor %in% target_dims_cor)] + if (!identical(length(name_exp), length(name_exp_cor)) | + !identical(dim(exp)[name_exp], dim(exp_cor)[name_exp_cor])) { + stop("Parameter 'exp' and 'exp_cor' must have the same length of ", + "all common dimensions except 'dat_dim', 'sdate_dim' and 'memb_dim'.") + } } ## na.rm if (!is.logical(na.rm)) { @@ -191,57 +268,106 @@ BiasCorrection <- function(exp, obs, exp_cor = NULL, na.rm = FALSE, stop("Parameter 'ncores' must be either NULL or a positive integer.") } } - - target_dims_obs <- sdate_dim - if (memb_dim %in% names(dim(obs))) { - target_dims_obs <- c(memb_dim, target_dims_obs) - } if (is.null(exp_cor)) { BiasCorrected <- Apply(data = list(var_obs = obs, var_exp = exp), - target_dims = list(target_dims_obs, - c(memb_dim, sdate_dim)), - fun = .sbc, + target_dims = list(c(memb_dim, sdate_dim, dat_dim), + c(memb_dim, sdate_dim, dat_dim)), + fun = .sbc, dat_dim = dat_dim, na.rm = na.rm, ncores = ncores)$output1 }
else { BiasCorrected <- Apply(data = list(var_obs = obs, var_exp = exp, var_cor = exp_cor), - target_dims = list(target_dims_obs, - c(memb_dim, sdate_dim), - c(memb_dim, sdate_dim)), - fun = .sbc, - output_dims = c(memb_dim, sdate_dim), + target_dims = list(c(memb_dim, sdate_dim, dat_dim), + c(memb_dim, sdate_dim, dat_dim), + target_dims_cor), + fun = .sbc, dat_dim = dat_dim, na.rm = na.rm, ncores = ncores)$output1 } + if (!is.null(dat_dim)) { + pos <- match(c(names(dim(exp))[-which(names(dim(exp)) == dat_dim)], 'nexp', 'nobs'), + names(dim(BiasCorrected))) + BiasCorrected <- aperm(BiasCorrected, pos) + } else { + pos <- match(c(names(dim(exp))), names(dim(BiasCorrected))) + BiasCorrected <- aperm(BiasCorrected, pos) + } + + if (exp_cor_remove_memb) { + dim(BiasCorrected) <- dim(BiasCorrected)[-which(names(dim(BiasCorrected)) == memb_dim)] + } + return(BiasCorrected) } -.sbc <- function(var_obs, var_exp , var_cor = NULL, na.rm = FALSE) { +.sbc <- function(var_obs, var_exp, var_cor = NULL, dat_dim = NULL, na.rm = FALSE) { - ntime <- dim(var_exp)[2] - corrected <- NA * var_exp + # exp: [memb, sdate, (dat)] + # obs: [memb, sdate, (dat)] + # ref: [memb, sdate, (dat)] or NULL - if (is.null(var_cor)) { - for (t in 1:ntime) { - # parameters - sd_obs <- sd(var_obs[-t], na.rm = na.rm) - sd_exp <- sd(var_exp[, -t], na.rm = na.rm) - clim_exp <- mean(var_exp[, -t], na.rm = na.rm) - clim_obs <- mean(var_obs[-t], na.rm = na.rm) - - # bias corrected forecast - corrected[, t] <- ((var_exp[, t] - clim_exp) * (sd_obs / sd_exp)) + clim_obs + if (is.null(dat_dim)) { + nexp <- 1 + nobs <- 1 + var_exp <- InsertDim(var_exp, posdim = 3, lendim = 1, name = 'dataset') + var_obs <- InsertDim(var_obs, posdim = 3, lendim = 1, name = 'dataset') + if (!is.null(var_cor)) { + var_cor <- InsertDim(var_cor, posdim = 3, lendim = 1, name = 'dataset') } } else { - # parameters - sd_obs <- sd(var_obs, na.rm = na.rm) - sd_exp <- sd(var_exp, na.rm = na.rm) - clim_exp <- mean(var_exp, na.rm = na.rm) - clim_obs <- mean(var_obs, na.rm = na.rm) - - # bias corrected forecast - corrected <- ((var_cor - clim_exp) * (sd_obs / sd_exp)) + clim_obs + nexp <- as.numeric(dim(var_exp)[dat_dim]) + nobs <- as.numeric(dim(var_obs)[dat_dim]) + } + + if (!is.null(var_cor)) { + if (length(dim(var_cor)) == 2) { # ref: [memb, sdate] + cor_dat_dim <- FALSE + } else { # ref: [memb, sdate, dat] + cor_dat_dim <- TRUE + } + corrected <- array(dim = c(dim(var_cor)[1:2], nexp = nexp, nobs = nobs)) + } else { + ntime <- dim(var_exp)[2] + corrected <- array(dim = c(dim(var_exp)[1:2], nexp = nexp, nobs = nobs)) + } + + for (i in 1:nexp) { + for (j in 1:nobs) { + if (is.null(var_cor)) { + for (t in 1:ntime) { + # parameters + sd_obs <- sd(var_obs[, -t, j], na.rm = na.rm) + sd_exp <- sd(var_exp[, -t, i], na.rm = na.rm) + clim_exp <- mean(var_exp[, -t, i], na.rm = na.rm) + clim_obs <- mean(var_obs[, -t, j], na.rm = na.rm) + + # bias corrected forecast + corrected[, t, i, j] <- ((var_exp[, t, i] - clim_exp) * (sd_obs / sd_exp)) + clim_obs + } + } else { + # parameters + sd_obs <- sd(var_obs[, , j], na.rm = na.rm) + sd_exp <- sd(var_exp[, , i], na.rm = na.rm) + clim_exp <- mean(var_exp[, , i], na.rm = na.rm) + clim_obs <- mean(var_obs[, , j], na.rm = na.rm) + + # bias corrected forecast + if (cor_dat_dim) { + corrected[, , i, j] <- ((var_cor[, , i] - clim_exp) * (sd_obs / sd_exp)) + clim_obs + } else { + corrected[, , i, j] <- ((var_cor - clim_exp) * (sd_obs / sd_exp)) + clim_obs + } + } + } + } + + if (is.null(dat_dim)) { + if (!is.null(var_cor)) { + dim(corrected) <- dim(var_cor)[1:2] + } else { + dim(corrected) <- dim(var_exp)[1:2] + } } return(corrected)
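Before moving on to the calibration file, the centre-and-rescale step that .sbc() applies per member is easy to verify with plain numbers; this standalone sketch (toy values, not package code) mirrors the corrected-forecast line used in both branches above:

fc       <- c(24.1, 26.3, 25.2)  # ensemble members for one start date
clim_exp <- 25.0; sd_exp <- 1.8  # hindcast climatology and spread from the training data
clim_obs <- 22.0; sd_obs <- 0.9  # observed climatology and spread from the training data
(fc - clim_exp) * (sd_obs / sd_exp) + clim_obs
# [1] 21.55 22.65 22.10

In the leave-one-out branch these four parameters are recomputed for every start date t with that date excluded, which is exactly what the [-t] indexing in the loop achieves.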
diff --git a/R/CST_Calibration.R b/R/CST_Calibration.R index f03d5d0557268e4f9438e7704c34a979502db125..e973c4d84dae5973e4f02653ea92747d13c98c1d 100644 --- a/R/CST_Calibration.R +++ b/R/CST_Calibration.R @@ -2,28 +2,112 @@ #' #'@author Verónica Torralba, \email{veronica.torralba@bsc.es} #'@author Bert Van Schaeybroeck, \email{bertvs@meteo.be} -#'@description Equivalent to function \code{Calibration} but for objects of class \code{s2dv_cube}. +#'@description Five types of member-by-member bias correction can be performed. +#'The \code{"bias"} method corrects the bias only, the \code{"evmos"} method +#'applies a variance inflation technique to ensure the correction of the bias +#'and the correspondence of variance between forecast and observation (Van +#'Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods +#'\code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast +#'variance and the ensemble spread as described in Doblas-Reyes et al. (2005) +#'and Van Schaeybroeck and Vannitsem (2015), respectively. While the +#'\code{"mse_min"} method minimizes a constrained mean-squared error using three +#'parameters, the \code{"crps_min"} method features four parameters and +#'minimizes the Continuous Ranked Probability Score (CRPS). The +#'\code{"rpc-based"} method adjusts the forecast variance ensuring that the +#'ratio of predictable components (RPC) is equal to one, as in Eade et al. +#'(2014). It is equivalent to function \code{Calibration} but for objects +#'of class \code{s2dv_cube}. #' -#'@param exp an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal hindcast experiment data in the element named \code{$data}. The hindcast is used to calibrate the forecast in case the forecast is provided; if not, the same hindcast will be calibrated instead. -#'@param obs an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the observed data in the element named \code{$data}. -#'@param exp_cor an optional object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal forecast experiment data in the element named \code{$data}. If the forecast is provided, it will be calibrated using the hindcast and observations; if not, the hindcast will be calibrated instead. -#'@param cal.method is the calibration method used, can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or \code{rpc-based}. Default value is \code{mse_min}. -#'@param eval.method is the sampling method used, can be either \code{in-sample} or \code{leave-one-out}. Default value is the \code{leave-one-out} cross validation. In case the forecast is provided, any chosen eval.method is over-ruled and a third option is used. -#'@param multi.model is a boolean that is used only for the \code{mse_min} method. If multi-model ensembles or ensembles of different sizes are used, it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences between the two approaches are generally small but may become large when using small ensemble sizes. Using multi.model when the calibration method is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result. -#'@param na.fill is a boolean that indicates what happens in case calibration is not possible or will yield unreliable results.
This happens when three or less forecasts-observation pairs are available to perform the training phase of the calibration. By default \code{na.fill} is set to true such that NA values will be returned. If \code{na.fill} is set to false, the uncorrected data will be returned. -#'@param na.rm is a boolean that indicates whether to remove the NA values or not. The default value is \code{TRUE}. See Details section for further information about its use and compatibility with \code{na.fill}. -#'@param apply_to is a character string that indicates whether to apply the calibration to all the forecast (\code{"all"}) or only to those where the correlation between the ensemble mean and the observations is statistically significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}. -#'@param alpha is a numeric value indicating the significance level for the correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == "sign"}. -#'@param memb_dim is a character string indicating the name of the member dimension. By default, it is set to 'member'. -#'@param sdate_dim is a character string indicating the name of the start date dimension. By default, it is set to 'sdate'. -#'@param ncores is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one. -#'@return an object of class \code{s2dv_cube} containing the calibrated forecasts in the element \code{$data} with the same dimensions as the one in the exp object. -#' -#'@importFrom s2dv InsertDim -#'@import abind +#'@param exp An object of class \code{s2dv_cube} as returned by \code{CST_Load} +#' function with at least 'sdate' and 'member' dimensions, containing the +#' seasonal hindcast experiment data in the element named \code{data}. The +#' hindcast is used to calibrate the forecast in case the forecast is provided; +#' if not, the same hindcast will be calibrated instead. +#'@param obs An object of class \code{s2dv_cube} as returned by \code{CST_Load} +#' function with at least 'sdate' dimension, containing the observed data in +#' the element named \code{$data}. +#'@param exp_cor An optional object of class \code{s2dv_cube} as returned by +#' \code{CST_Load} function with at least 'sdate' and 'member' dimensions, +#' containing the seasonal forecast experiment data in the element named +#' \code{data}. If the forecast is provided, it will be calibrated using the +#' hindcast and observations; if not, the hindcast will be calibrated instead. +#' If there is only one corrected dataset, it should not have dataset dimension. +#' If there is a corresponding corrected dataset for each 'exp' forecast, the +#' dataset dimension must have the same length as in 'exp'. The default value +#' is NULL. +#'@param cal.method A character string indicating the calibration method used, +#' can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or +#' \code{rpc-based}. Default value is \code{mse_min}. +#'@param eval.method A character string indicating the sampling method used, it +#' can be either \code{in-sample} or \code{leave-one-out}. Default value is the +#' \code{leave-one-out} cross validation. In case the forecast is provided, any +#' chosen eval.method is over-ruled and a third option is used. +#'@param multi.model A boolean that is used only for the \code{mse_min} +#' method. If multi-model ensembles or ensembles of different sizes are used, +#' it must be set to \code{TRUE}. By default it is \code{FALSE}. 
Differences +#' between the two approaches are generally small but may become large when +#' using small ensemble sizes. Using multi.model when the calibration method is +#' \code{bias}, \code{evmos} or \code{crps_min} will not affect the result. +#'@param na.fill A boolean that indicates what happens in case calibration is +#' not possible or will yield unreliable results. This happens when three or +#' fewer forecast-observation pairs are available to perform the training phase +#' of the calibration. By default \code{na.fill} is set to true such that NA +#' values will be returned. If \code{na.fill} is set to false, the uncorrected +#' data will be returned. +#'@param na.rm A boolean that indicates whether to remove the NA values or not. +#' The default value is \code{TRUE}. See Details section for further +#' information about its use and compatibility with \code{na.fill}. +#'@param apply_to A character string that indicates whether to apply the +#' calibration to all the forecast (\code{"all"}) or only to those where the +#' correlation between the ensemble mean and the observations is statistically +#' significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}. +#'@param alpha A numeric value indicating the significance level for the +#' correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to +#' == "sign"}. +#'@param memb_dim A character string indicating the name of the member dimension. +#' By default, it is set to 'member'. +#'@param sdate_dim A character string indicating the name of the start date +#' dimension. By default, it is set to 'sdate'. +#'@param dat_dim A character string indicating the name of dataset dimension. +#' The length of this dimension can be different between 'exp' and 'obs'. +#' The default value is NULL. +#'@param ncores An integer that indicates the number of cores for parallel +#' computations using multiApply function. The default value is NULL (one core). +#' +#'@return An object of class \code{s2dv_cube} containing the calibrated +#'forecasts in the element \code{data} with the dimensions nexp, nobs and same +#'dimensions as in the 'exp' object. nexp is the number of experiments +#'(i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., +#''dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' +#'is provided the returned array will have the same dimensions as 'exp_cor'. #' +#'@details Both the \code{na.fill} and \code{na.rm} parameters can be used to +#'indicate how the function has to handle the NA values. The \code{na.fill} +#'parameter checks whether there are more than three forecast-observation pairs +#'to perform the computation. In case there are three or fewer pairs, the +#'computation is not carried out, and the value returned by the function depends +#'on the value of this parameter (either NA if \code{na.fill == TRUE} or the +#'uncorrected value if \code{na.fill == FALSE}). On the other hand, \code{na.rm} +#'is used to indicate the function whether to remove the missing values during +#'the computation of the parameters needed to perform the calibration. +#' +#'@references Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the +#'success of multi-model ensembles in seasonal forecasting-II calibration and +#'combination. Tellus A. 2005;57:234-252. \doi{10.1111/j.1600-0870.2005.00104.x} +#'@references Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., +#'Hermanson, L., & Robinson, N. (2014). 
Do seasonal-to-decadal climate +#'predictions underestimate the predictability of the real world? Geophysical +#'Research Letters, 41(15), 5620-5628. \doi{10.1002/2014GL061146} +#'@references Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing +#'through linear regression. Nonlinear Processes in Geophysics, 18(2), +#'147. \doi{10.5194/npg-18-147-2011} +#'@references Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble +#'post-processing using member-by-member approaches: theoretical aspects. +#'Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. +#'\doi{10.1002/qj.2397} +#' #'@seealso \code{\link{CST_Load}} -#' +#' #'@examples #'# Example 1: #'mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) @@ -32,12 +116,12 @@ #'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) #'lon <- seq(0, 30, 5) #'lat <- seq(0, 25, 5) -#'exp <- list(data = mod1, lat = lat, lon = lon) -#'obs <- list(data = obs1, lat = lat, lon = lon) +#'coords <- list(lat = lat, lon = lon) +#'exp <- list(data = mod1, coords = coords) +#'obs <- list(data = obs1, coords = coords) #'attr(exp, 'class') <- 's2dv_cube' #'attr(obs, 'class') <- 's2dv_cube' #'a <- CST_Calibration(exp = exp, obs = obs, cal.method = "mse_min", eval.method = "in-sample") -#'str(a) #' #'# Example 2: #'mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) @@ -48,371 +132,576 @@ #'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) #'lon <- seq(0, 30, 5) #'lat <- seq(0, 25, 5) -#'exp <- list(data = mod1, lat = lat, lon = lon) -#'obs <- list(data = obs1, lat = lat, lon = lon) +#'coords <- list(lat = lat, lon = lon) +#'exp <- list(data = mod1, coords = coords) +#'obs <- list(data = obs1, coords = coords) #'exp_cor <- list(data = mod2, lat = lat, lon = lon) #'attr(exp, 'class') <- 's2dv_cube' #'attr(obs, 'class') <- 's2dv_cube' #'attr(exp_cor, 'class') <- 's2dv_cube' #'a <- CST_Calibration(exp = exp, obs = obs, exp_cor = exp_cor, cal.method = "evmos") -#'str(a) +#' +#'@importFrom s2dv InsertDim Reorder +#'@import multiApply +#'@importFrom ClimProjDiags Subset #'@export - CST_Calibration <- function(exp, obs, exp_cor = NULL, cal.method = "mse_min", eval.method = "leave-one-out", multi.model = FALSE, - na.fill = TRUE, na.rm = TRUE, apply_to = NULL, alpha = NULL, - memb_dim = 'member', sdate_dim = 'sdate', ncores = 1) { - - if(!missing(multi.model) & !(cal.method == "mse_min")){ - warning(paste0("The multi.model parameter is ignored when using the calibration method ", cal.method)) + na.fill = TRUE, na.rm = TRUE, apply_to = NULL, + alpha = NULL, memb_dim = 'member', sdate_dim = 'sdate', + dat_dim = NULL, ncores = NULL) { + # Check 's2dv_cube' + if (!inherits(exp, "s2dv_cube") || !inherits(obs, "s2dv_cube")) { + stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube'.") } - - if(is.null(exp_cor)){ #exp will be used to calibrate and will also be calibrated: "calibrate hindcast" - if (!inherits(exp, "s2dv_cube") || !inherits(obs, "s2dv_cube")) { - stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") - } - exp$data <- Calibration(exp = exp$data, obs = obs$data, exp_cor = NULL, - cal.method = cal.method, - eval.method = eval.method, - multi.model = multi.model, - na.fill = na.fill, na.rm = na.rm, - apply_to = apply_to, alpha = alpha, - memb_dim = memb_dim, sdate_dim = sdate_dim, - ncores = ncores) - exp$Datasets <- c(exp$Datasets, obs$Datasets) - exp$source_files <- c(exp$source_files, obs$source_files) - - return(exp) - - }else{ #if exp_cor is provided, 
it will be calibrated: "calibrate forecast instead of hindcast" - eval.method = "hindcast-vs-forecast" #if exp_cor is provided, eval.method is overrruled (because if exp_cor is provided, the train data will be all data of "exp" and the evalutaion data will be all data of "exp_cor"; no need for "leave-one-out" or "in-sample") - if (!inherits(exp, "s2dv_cube") || !inherits(obs, "s2dv_cube") || !inherits(exp_cor, "s2dv_cube")) { - stop("Parameter 'exp', 'obs' and 'exp_cor' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") + if (!is.null(exp_cor)) { + if (!inherits(exp_cor, "s2dv_cube")) { + stop("Parameter 'exp_cor' must be of the class 's2dv_cube'.") } - exp_cor$data <- Calibration(exp = exp$data, obs = obs$data, exp_cor = exp_cor$data, - cal.method = cal.method, - eval.method = eval.method, - multi.model = multi.model, - na.fill = na.fill, na.rm = na.rm, - apply_to = apply_to, alpha = alpha, - memb_dim = memb_dim, sdate_dim = sdate_dim, - ncores = ncores) - exp_cor$Datasets <- c(exp_cor$Datasets, obs$Datasets) - exp_cor$source_files <- c(exp_cor$source_files, exp$source_files, obs$source_files) - - return(exp_cor) + } - } -} + Calibration <- Calibration(exp = exp$data, obs = obs$data, exp_cor = exp_cor$data, + cal.method = cal.method, eval.method = eval.method, + multi.model = multi.model, na.fill = na.fill, + na.rm = na.rm, apply_to = apply_to, alpha = alpha, + memb_dim = memb_dim, sdate_dim = sdate_dim, + dat_dim = dat_dim, ncores = ncores) + if (is.null(exp_cor)) { + exp$data <- Calibration + exp$attrs$Datasets <- c(exp$attrs$Datasets, obs$attrs$Datasets) + exp$attrs$source_files <- c(exp$attrs$source_files, obs$attrs$source_files) + + return(exp) + } else { + exp_cor$data <- Calibration + exp_cor$attrs$Datasets <- c(exp_cor$attrs$Datasets, exp$attrs$Datasets, obs$attrs$Datasets) + exp_cor$attrs$source_files <- c(exp_cor$attrs$source_files, exp$attrs$source_files, obs$attrs$source_files) + + return(exp_cor) + } +} #'Forecast Calibration #' #'@author Verónica Torralba, \email{veronica.torralba@bsc.es} #'@author Bert Van Schaeybroeck, \email{bertvs@meteo.be} -#'@description Five types of member-by-member bias correction can be performed. The \code{"bias"} method corrects the bias only, the \code{"evmos"} method applies a variance inflation technique to ensure the correction of the bias and the correspondence of variance between forecast and observation (Van Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods \code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast variance and the ensemble spread as described in Doblas-Reyes et al. (2005) and Van Schaeybroeck and Vannitsem (2015), respectively. While the \code{"mse_min"} method minimizes a constrained mean-squared error using three parameters, the \code{"crps_min"} method features four parameters and minimizes the Continuous Ranked Probability Score (CRPS). The \code{"rpc-based"} method adjusts the forecast variance ensuring that the ratio of predictable components (RPC) is equal to one, as in Eade et al. (2014). -#'@description Both in-sample or our out-of-sample (leave-one-out cross validation) calibration are possible. -#'@references Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting-II calibration and combination. Tellus A. 2005;57:234-252. doi:10.1111/j.1600-0870.2005.00104.x -#'@references Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). 
Do seasonal-to-decadal climate predictions underestimate the predictability of the read world? Geophysical Research Letters, 41(15), 5620-5628. doi: 10.1002/2014GL061146 -#'@references Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing through linear regression. Nonlinear Processes in Geophysics, 18(2), 147. doi:10.5194/npg-18-147-2011 -#'@references Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. doi:10.1002/qj.2397 +#'@description Five types of member-by-member bias correction can be performed. +#'The \code{"bias"} method corrects the bias only, the \code{"evmos"} method +#'applies a variance inflation technique to ensure the correction of the bias +#'and the correspondence of variance between forecast and observation (Van +#'Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods +#'\code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast +#'variance and the ensemble spread as described in Doblas-Reyes et al. (2005) +#'and Van Schaeybroeck and Vannitsem (2015), respectively. While the +#'\code{"mse_min"} method minimizes a constrained mean-squared error using three +#'parameters, the \code{"crps_min"} method features four parameters and +#'minimizes the Continuous Ranked Probability Score (CRPS). The +#'\code{"rpc-based"} method adjusts the forecast variance ensuring that the +#'ratio of predictable components (RPC) is equal to one, as in Eade et al. +#'(2014). Both in-sample and out-of-sample (leave-one-out cross +#'validation) calibration are possible. #' -#'@param exp a multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal hindcast experiment data. The hindcast is used to calibrate the forecast in case the forecast is provided; if not, the same hindcast will be calibrated instead. -#'@param obs a multidimensional array with named dimensions (at least 'sdate') containing the observed data. -#'@param exp_cor an optional multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal forecast experiment data. If the forecast is provided, it will be calibrated using the hindcast and observations; if not, the hindcast will be calibrated instead. -#'@param cal.method is the calibration method used, can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or \code{rpc-based}. Default value is \code{mse_min}. -#'@param eval.method is the sampling method used, can be either \code{in-sample} or \code{leave-one-out}. Default value is the \code{leave-one-out} cross validation. In case the forecast is provided, any chosen eval.method is over-ruled and a third option is used. -#'@param multi.model is a boolean that is used only for the \code{mse_min} method. If multi-model ensembles or ensembles of different sizes are used, it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences between the two approaches are generally small but may become large when using small ensemble sizes. Using multi.model when the calibration method is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result. -#'@param na.fill is a boolean that indicates what happens in case calibration is not possible or will yield unreliable results. This happens when three or less forecasts-observation pairs are available to perform the training phase of the calibration. 
By default \code{na.fill} is set to true such that NA values will be returned. If \code{na.fill} is set to false, the uncorrected data will be returned. -#'@param na.rm is a boolean that indicates whether to remove the NA values or not. The default value is \code{TRUE}. -#'@param apply_to is a character string that indicates whether to apply the calibration to all the forecast (\code{"all"}) or only to those where the correlation between the ensemble mean and the observations is statistically significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}. -#'@param alpha is a numeric value indicating the significance level for the correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == "sign"}. -#'@param memb_dim is a character string indicating the name of the member dimension. By default, it is set to 'member'. -#'@param sdate_dim is a character string indicating the name of the start date dimension. By default, it is set to 'sdate'. -#'@param ncores is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one. -#'@return an array containing the calibrated forecasts with the same dimensions as the \code{exp} array. -#' -#'@importFrom s2dv InsertDim MeanDims Reorder -#'@import abind -#'@import multiApply -#'@importFrom ClimProjDiags Subset +#'@param exp A multidimensional array with named dimensions (at least 'sdate' +#' and 'member') containing the seasonal hindcast experiment data. The hindcast +#' is used to calibrate the forecast in case the forecast is provided; if not, +#' the same hindcast will be calibrated instead. +#'@param obs A multidimensional array with named dimensions (at least 'sdate') +#' containing the observed data. +#'@param exp_cor An optional multidimensional array with named dimensions (at +#' least 'sdate' and 'member') containing the seasonal forecast experiment +#' data. If the forecast is provided, it will be calibrated using the hindcast +#' and observations; if not, the hindcast will be calibrated instead. If there +#' is only one corrected dataset, it should not have dataset dimension. If there +#' is a corresponding corrected dataset for each 'exp' forecast, the dataset +#' dimension must have the same length as in 'exp'. The default value is NULL. +#'@param cal.method A character string indicating the calibration method used, +#' can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} +#' or \code{rpc-based}. Default value is \code{mse_min}. +#'@param eval.method A character string indicating the sampling method used, +#' can be either \code{in-sample} or \code{leave-one-out}. Default value is +#' the \code{leave-one-out} cross validation. In case the forecast is +#' provided, any chosen eval.method is over-ruled and a third option is +#' used. +#'@param multi.model A boolean that is used only for the \code{mse_min} +#' method. If multi-model ensembles or ensembles of different sizes are used, +#' it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences +#' between the two approaches are generally small but may become large when +#' using small ensemble sizes. Using multi.model when the calibration method +#' is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result. +#'@param na.fill A boolean that indicates what happens in case calibration is +#' not possible or will yield unreliable results. 
This happens when three or +#' fewer forecast-observation pairs are available to perform the training phase +#' of the calibration. By default \code{na.fill} is set to true such that NA +#' values will be returned. If \code{na.fill} is set to false, the uncorrected +#' data will be returned. +#'@param na.rm A boolean that indicates whether to remove the NA values or +#' not. The default value is \code{TRUE}. +#'@param apply_to A character string that indicates whether to apply the +#' calibration to all the forecast (\code{"all"}) or only to those where the +#' correlation between the ensemble mean and the observations is statistically +#' significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}. +#'@param alpha A numeric value indicating the significance level for the +#' correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == +#' "sign"}. +#'@param memb_dim A character string indicating the name of the member +#' dimension. By default, it is set to 'member'. +#'@param sdate_dim A character string indicating the name of the start date +#' dimension. By default, it is set to 'sdate'. +#'@param dat_dim A character string indicating the name of dataset dimension. +#' The length of this dimension can be different between 'exp' and 'obs'. +#' The default value is NULL. +#'@param ncores An integer that indicates the number of cores for parallel +#' computation using multiApply function. The default value is NULL (one core). +#' +#'@return An array containing the calibrated forecasts with the dimensions +#'nexp, nobs and same dimensions as in the 'exp' array. nexp is the number of +#'experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations +#'(i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. +#'If 'exp_cor' is provided the returned array will have the same dimensions as +#''exp_cor'. +#' +#'@details Both the \code{na.fill} and \code{na.rm} parameters can be used to +#'indicate how the function has to handle the NA values. The \code{na.fill} +#'parameter checks whether there are more than three forecast-observation pairs +#'to perform the computation. In case there are three or fewer pairs, the +#'computation is not carried out, and the value returned by the function depends +#'on the value of this parameter (either NA if \code{na.fill == TRUE} or the +#'uncorrected value if \code{na.fill == FALSE}). On the other hand, \code{na.rm} +#'is used to indicate the function whether to remove the missing values during +#'the computation of the parameters needed to perform the calibration. #' +#'@references Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the +#'success of multi-model ensembles in seasonal forecasting-II calibration and +#'combination. Tellus A. 2005;57:234-252. \doi{10.1111/j.1600-0870.2005.00104.x} +#'@references Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., +#'Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate +#'predictions underestimate the predictability of the real world? Geophysical +#'Research Letters, 41(15), 5620-5628. \doi{10.1002/2014GL061146} +#'@references Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing +#'through linear regression. Nonlinear Processes in Geophysics, 18(2), +#'147. \doi{10.5194/npg-18-147-2011} +#'@references Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble +#'post-processing using member-by-member approaches: theoretical aspects. +#'Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. 
+#'\doi{10.1002/qj.2397} +#' #'@seealso \code{\link{CST_Load}} -#' +#' #'@examples #'mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) #'dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) #'obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) #'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) #'a <- Calibration(exp = mod1, obs = obs1) -#'str(a) +#' +#'@importFrom s2dv InsertDim Reorder +#'@import multiApply +#'@importFrom ClimProjDiags Subset #'@export -Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min", - eval.method = "leave-one-out", +Calibration <- function(exp, obs, exp_cor = NULL, + cal.method = "mse_min", eval.method = "leave-one-out", multi.model = FALSE, na.fill = TRUE, na.rm = TRUE, apply_to = NULL, alpha = NULL, - memb_dim = 'member', sdate_dim = 'sdate', ncores = 1) { - - dim.exp <- dim(exp) - amt.dims.exp <- length(dim.exp) - dim.obs <- dim(obs) - amt.dims.obs <- length(dim.obs) - dim.names.exp <- names(dim.exp) - dim.names.obs <- names(dim.obs) - if(!is.null(exp_cor)){ - dim.exp_cor <- dim(exp_cor) - amt.dims.exp_cor <- length(dim.exp_cor) - dim.names.exp_cor <- names(dim.exp_cor) - } - if (is.null(memb_dim) || !is.character(memb_dim)) { - stop("Parameter 'memb_dim' should be a character string indicating the", - "name of the dimension where members are stored in 'exp'.") + memb_dim = 'member', sdate_dim = 'sdate', dat_dim = NULL, + ncores = NULL) { + + # Check inputs + ## exp, obs + if (!is.array(exp) || !is.numeric(exp)) { + stop("Parameter 'exp' must be a numeric array.") } - if (length(memb_dim) > 1) { - memb_dim <- memb_dim[1] - warning("Parameter 'memb_dim' has length greater than 1 and only", - " the first element will be used.") - } - - if (is.null(sdate_dim) || !is.character(sdate_dim)) { + if (!is.array(obs) || !is.numeric(obs)) { + stop("Parameter 'obs' must be a numeric array.") + } + expdims <- names(dim(exp)) + obsdims <- names(dim(obs)) + if (is.null(expdims)) { + stop("Parameter 'exp' must have dimension names.") + } + if (is.null(obsdims)) { + stop("Parameter 'obs' must have dimension names.") + } + if (any(is.na(exp))) { + warning("Parameter 'exp' contains NA values.") + } + if (any(is.na(obs))) { + warning("Parameter 'obs' contains NA values.") + } + ## exp_cor + if (!is.null(exp_cor)) { + # if exp_cor is provided, it will be calibrated: "calibrate forecast instead of hindcast" + # if exp_cor is provided, eval.method is overruled (because if exp_cor is provided, the + # train data will be all data of "exp" and the evaluation data will be all data of "exp_cor"; + # no need for "leave-one-out" or "in-sample") + eval.method <- "hindcast-vs-forecast" + expcordims <- names(dim(exp_cor)) + if (is.null(expcordims)) { + stop("Parameter 'exp_cor' must have dimension names.") + } + if (any(is.na(exp_cor))) { + warning("Parameter 'exp_cor' 
contains NA values.") + } + } + ## dat_dim + if (!is.null(dat_dim)) { + if (!is.character(dat_dim) | length(dat_dim) > 1) { + stop("Parameter 'dat_dim' must be a character string.") + } + if (!dat_dim %in% names(dim(exp)) | !dat_dim %in% names(dim(obs))) { + stop("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + " Set it as NULL if there is no dataset dimension.") + } + } + ## sdate_dim and memb_dim + if (!is.character(sdate_dim)) { stop("Parameter 'sdate_dim' should be a character string indicating the", - "name of the dimension where start dates are stored in 'exp'.") + "name of the dimension where start dates are stored in 'exp'.") } if (length(sdate_dim) > 1) { sdate_dim <- sdate_dim[1] warning("Parameter 'sdate_dim' has length greater than 1 and only", " the first element will be used.") } - target.dim.names.exp <- c(memb_dim, sdate_dim) - target.dim.names.obs <- sdate_dim - - if (!all(target.dim.names.exp %in% dim.names.exp)) { - stop("Parameter 'exp' must have the dimensions defined in memb_dim ", - "and sdate_dim.") + if (!is.character(memb_dim)) { + stop("Parameter 'memb_dim' should be a character string indicating the", + "name of the dimension where members are stored in 'exp'.") } - - if(!is.null(exp_cor)){ - if (!all(target.dim.names.exp %in% dim.names.exp_cor)) { - stop("Parameter 'exp_cor' must have the dimensions defined in memb_dim ", - "and sdate_dim.") - } + if (length(memb_dim) > 1) { + memb_dim <- memb_dim[1] + warning("Parameter 'memb_dim' has length greater than 1 and only", + " the first element will be used.") } - if (!all(c(sdate_dim) %in% dim.names.obs)) { + target_dims_exp <- c(memb_dim, sdate_dim, dat_dim) + target_dims_obs <- c(sdate_dim, dat_dim) + + if (!all(target_dims_exp %in% expdims)) { + stop("Parameter 'exp' requires 'sdate_dim' and 'memb_dim' dimensions.") + } + if (!all(target_dims_obs %in% obsdims)) { stop("Parameter 'obs' must have the dimension defined in sdate_dim ", "parameter.") } - - if (any(is.na(exp))) { - warning("Parameter 'exp' contains NA values.") + if (memb_dim %in% obsdims) { + if (dim(obs)[memb_dim] != 1) { + warning("Parameter 'obs' has dimension 'memb_dim' with length larger", + " than 1. 
Only the first member dimension will be used.") + } + obs <- Subset(obs, along = memb_dim, indices = 1, drop = "selected") } - - if(!is.null(exp_cor)){ - if (any(is.na(exp_cor))) { - warning("Parameter 'exp_cor' contains NA values.") + if (!is.null(exp_cor)) { + if (!memb_dim %in% names(dim(exp_cor))) { + exp_cor <- InsertDim(exp_cor, posdim = 1, lendim = 1, name = memb_dim) + exp_cor_remove_memb <- TRUE + } else { + exp_cor_remove_memb <- FALSE } + } else { + exp_cor_remove_memb <- FALSE } - - if (any(is.na(obs))) { - warning("Parameter 'obs' contains NA values.") + ## exp, obs, and exp_cor (2) + name_exp <- sort(names(dim(exp))) + name_obs <- sort(names(dim(obs))) + name_exp <- name_exp[-which(name_exp == memb_dim)] + if (!is.null(dat_dim)) { + name_exp <- name_exp[-which(name_exp == dat_dim)] + name_obs <- name_obs[-which(name_obs == dat_dim)] } - - if (memb_dim %in% names(dim(obs))) { - obs <- Subset(obs, along = memb_dim, indices = 1, drop = "selected") + if (!identical(length(name_exp), length(name_obs)) | + !identical(dim(exp)[name_exp], dim(obs)[name_obs])) { + stop("Parameter 'exp' and 'obs' must have the same length of all ", + "dimensions except 'memb_dim' and 'dat_dim'.") } - data.set.sufficiently.large.out <- - Apply(data = list(exp = exp, obs = obs), - target_dims = list(exp = target.dim.names.exp, obs = target.dim.names.obs), - ncores = ncores, - fun = .data.set.sufficiently.large)$output1 - - if(!all(data.set.sufficiently.large.out)){ - if(na.fill){ - warning("Some forecast data could not be corrected due to data lack", - " and is replaced with NA values") - } else { - warning("Some forecast data could not be corrected due to data lack", - " and is replaced with uncorrected values") - } + if (!is.null(exp_cor)) { + name_exp_cor <- sort(names(dim(exp_cor))) + name_exp <- sort(names(dim(exp))) + if (!is.null(dat_dim)) { + if (dat_dim %in% expcordims) { + if (!identical(dim(exp)[dat_dim], dim(exp_cor)[dat_dim])) { + stop("If parameter 'exp_cor' has dataset dimension, it must be", + " equal to dataset dimension of 'exp'.") + } + name_exp_cor <- name_exp_cor[-which(name_exp_cor == dat_dim)] + target_dims_cor <- c(memb_dim, sdate_dim, dat_dim) + } else { + target_dims_cor <- c(memb_dim, sdate_dim) + } + } else { + target_dims_cor <- c(memb_dim, sdate_dim) + } + name_exp <- name_exp[-which(name_exp %in% target_dims_exp)] + name_exp_cor <- name_exp_cor[-which(name_exp_cor %in% target_dims_cor)] + if (!identical(length(name_exp), length(name_exp_cor)) | + !identical(dim(exp)[name_exp], dim(exp_cor)[name_exp_cor])) { + stop("Parameter 'exp' and 'exp_cor' must have the same length of ", + "all common dimensions except 'dat_dim', 'sdate_dim' and 'memb_dim'.") + } } - - if (!na.rm %in% c(TRUE,FALSE)) { - stop("Parameter 'na.rm' must be TRUE or FALSE.") + ## ncores + if (!is.null(ncores)) { + if (!is.numeric(ncores) | ncores %% 1 != 0 | ncores <= 0 | + length(ncores) > 1) { + stop("Parameter 'ncores' must be either NULL or a positive integer.") + } + } + ## na.rm + if (!inherits(na.rm, "logical")) { + stop("Parameter 'na.rm' must be a logical value.") + } + if (length(na.rm) > 1) { + na.rm <- na.rm[1] + warning("Parameter 'na.rm' has length greater than 1, and only the first element is used.") + } + ## cal.method, apply_to, alpha + if (!any(cal.method %in% c('bias', 'evmos', 'mse_min', 'crps_min', 'rpc-based'))) { + stop("Parameter 'cal.method' must be a character string indicating the calibration method used.") } if (cal.method == 'rpc-based') { if (is.null(apply_to)) { apply_to <- 'sign' - 
warning("'apply_to' cannot be NULL for 'rpc-based' method so it has been set to 'sign', as in Eade et al. (2014).") + warning("Parameter 'apply_to' cannot be NULL for 'rpc-based' method so it ", + "has been set to 'sign', as in Eade et al. (2014).") } else if (!apply_to %in% c('all','sign')) { - stop("'apply_to' must be either 'all' or 'sign' when 'rpc-based' method is used.") + stop("Parameter 'apply_to' must be either 'all' or 'sign' when 'rpc-based' ", + "method is used.") } if (apply_to == 'sign') { if (is.null(alpha)) { alpha <- 0.1 - warning("'alpha' cannot be NULL for 'rpc-based' method so it has been set to 0.1, as in Eade et al. (2014).") + warning("Parameter 'alpha' cannot be NULL for 'rpc-based' method so it ", + "has been set to 0.1, as in Eade et al. (2014).") } else if (!is.numeric(alpha) | alpha <= 0 | alpha >= 1) { - stop("'alpha' must be a number between 0 and 1.") + stop("Parameter 'alpha' must be a number between 0 and 1.") } } } - - if(is.null(exp_cor)){ - calibrated <- Apply(data = list(exp = exp, obs = obs), - cal.method = cal.method, - eval.method = eval.method, - multi.model = multi.model, - na.fill = na.fill, na.rm = na.rm, - apply_to = apply_to, alpha = alpha, - target_dims = list(exp = target.dim.names.exp, obs = target.dim.names.obs), - ncores = ncores, output_dims = target.dim.names.exp, - fun = .cal)$output1 - dexes <- match(names(dim(exp)), names(dim(calibrated))) - calibrated <- aperm(calibrated, dexes) - dimnames(calibrated) <- dimnames(exp)[dexes] - }else{ + ## eval.method + if (!any(eval.method %in% c('in-sample', 'leave-one-out', 'hindcast-vs-forecast'))) { + stop(paste0("Parameter 'eval.method' must be a character string indicating ", + "the sampling method used ('in-sample', 'leave-one-out' or ", + "'hindcast-vs-forecast').")) + } + ## multi.model + if (!inherits(multi.model, "logical")) { + stop("Parameter 'multi.model' must be a logical value.") + } + if (multi.model & !(cal.method == "mse_min")) { + warning(paste0("The 'multi.model' parameter is ignored when using the ", + "calibration method '", cal.method, "'.")) + } + + warning_shown <- FALSE + + if (is.null(exp_cor)) { + calibrated <- Apply(data = list(exp = exp, obs = obs), dat_dim = dat_dim, + cal.method = cal.method, eval.method = eval.method, multi.model = multi.model, + na.fill = na.fill, na.rm = na.rm, apply_to = apply_to, alpha = alpha, + target_dims = list(exp = target_dims_exp, obs = target_dims_obs), + ncores = ncores, fun = .cal)$output1 + } else { calibrated <- Apply(data = list(exp = exp, obs = obs, exp_cor = exp_cor), - cal.method = cal.method, - eval.method = eval.method, - multi.model = multi.model, - na.fill = na.fill, na.rm = na.rm, - apply_to = apply_to, alpha = alpha, - target_dims = list(exp = target.dim.names.exp, obs = target.dim.names.obs, exp_cor = target.dim.names.exp), - ncores = ncores, output_dims = target.dim.names.exp, - fun = .cal)$output1 - dexes <- match(names(dim(exp_cor)), names(dim(calibrated))) - calibrated <- aperm(calibrated, dexes) - dimnames(calibrated) <- dimnames(exp_cor)[dexes] - } + dat_dim = dat_dim, cal.method = cal.method, eval.method = eval.method, + multi.model = multi.model, na.fill = na.fill, na.rm = na.rm, + apply_to = apply_to, alpha = alpha, + target_dims = list(exp = target_dims_exp, obs = target_dims_obs, + exp_cor = target_dims_cor), + ncores = ncores, fun = .cal)$output1 + } + if (!is.null(dat_dim)) { + pos <- match(c(names(dim(exp))[-which(names(dim(exp)) == dat_dim)], 'nexp', 'nobs'), + names(dim(calibrated))) + calibrated <- 
aperm(calibrated, pos) + } else { + pos <- match(c(names(dim(exp))), names(dim(calibrated))) + calibrated <- aperm(calibrated, pos) + } + if (exp_cor_remove_memb) { + dim(calibrated) <- dim(calibrated)[-which(names(dim(calibrated)) == memb_dim)] + } return(calibrated) } -.data.set.sufficiently.large <- function(exp, obs){ +.data.set.sufficiently.large <- function(exp, obs) { amt.min.samples <- 3 amt.good.pts <- sum(!is.na(obs) & !apply(exp, c(2), function(x) all(is.na(x)))) return(amt.good.pts > amt.min.samples) } -.make.eval.train.dexes <- function(eval.method, amt.points, amt.points_cor){ - if(eval.method == "leave-one-out"){ +.make.eval.train.dexes <- function(eval.method, amt.points, amt.points_cor) { + if (eval.method == "leave-one-out") { dexes.lst <- lapply(seq(1, amt.points), function(x) return(list(eval.dexes = x, train.dexes = seq(1, amt.points)[-x]))) - } else if (eval.method == "in-sample"){ + } else if (eval.method == "in-sample") { dexes.lst <- list(list(eval.dexes = seq(1, amt.points), train.dexes = seq(1, amt.points))) - } else if (eval.method == "hindcast-vs-forecast"){ + } else if (eval.method == "hindcast-vs-forecast") { dexes.lst <- list(list(eval.dexes = seq(1,amt.points_cor), train.dexes = seq(1, amt.points))) } else { - stop(paste0("unknown sampling method: ",eval.method)) + stop(paste0("unknown sampling method: ", eval.method)) } return(dexes.lst) } -.cal <- function(exp, obs, exp_cor = NULL, cal.method, eval.method, multi.model, na.fill, na.rm, apply_to, alpha) { - if(is.null(exp_cor)){ - exp_cor <- exp ## generate a copy of exp so that the same function can run - ## when exp_cor is provided and when it's not - } - obs <- as.vector(obs) - dims.fc <- dim(exp) - dims.fc_cor <- dim(exp_cor) ## new line - amt.mbr <- dims.fc[1] - amt.sdate <- dims.fc[2] - amt.sdate_cor <- dims.fc_cor[2] ## new line - var.cor.fc <- NA * exp_cor ## modified line (exp_cor instead of exp); - ## in case of exp_cor not provided, at this point exp_cor - ## is already the same as exp so no change - names(dim(var.cor.fc)) <- dims.fc +.cal <- function(exp, obs, exp_cor = NULL, dat_dim = NULL, cal.method = "mse_min", + eval.method = "leave-one-out", multi.model = FALSE, na.fill = TRUE, + na.rm = TRUE, apply_to = NULL, alpha = NULL) { + + # exp: [memb, sdate, (dat)] + # obs: [sdate (dat)] + # exp_cor: [memb, sdate, (dat)] or NULL + + if (is.null(dat_dim)) { + nexp <- 1 + nobs <- 1 + exp <- InsertDim(exp, posdim = 3, lendim = 1, name = 'dataset') + obs <- InsertDim(obs, posdim = 2, lendim = 1, name = 'dataset') + } else { + nexp <- as.numeric(dim(exp)[dat_dim]) + nobs <- as.numeric(dim(obs)[dat_dim]) + } + + if (is.null(exp_cor)) { + # generate a copy of exp so that the same function can run for both cases + exp_cor <- exp + cor_dat_dim <- TRUE + } else { + if (length(dim(exp_cor)) == 2) { # exp_cor: [memb, sdate] + cor_dat_dim <- FALSE + } else { # exp_cor: [memb, sdate, dat] + cor_dat_dim <- TRUE + } + } + + expdims <- dim(exp) + expdims_cor <- dim(exp_cor) + memb <- expdims[1] # memb + sdate <- expdims[2] # sdate + sdate_cor <- expdims_cor[2] + + var.cor.fc <- array(dim = c(dim(exp_cor)[1:2], nexp = nexp, nobs = nobs)) - if(!.data.set.sufficiently.large(exp = exp, obs = obs)){ - if(na.fill){ - return(var.cor.fc) - } else { - var.cor.fc[] <- exp[] - return(var.cor.fc) - } - } - eval.train.dexeses <- .make.eval.train.dexes(eval.method, amt.points = amt.sdate, - amt.points_cor = amt.sdate_cor) - amt.resamples <- length(eval.train.dexeses) - for (i.sample in seq(1, amt.resamples)) { - # defining 
training (tr) and evaluation (ev) subsets - eval.dexes <- eval.train.dexeses[[i.sample]]$eval.dexes - train.dexes <- eval.train.dexeses[[i.sample]]$train.dexes - - fc.ev <- exp_cor[ , eval.dexes, drop = FALSE] ## modified line (exp_cor instead of exp) - ## fc.ev is used to evaluate (not train; train should be done with exp (hindcast)) - fc.tr <- exp[ , train.dexes] - obs.tr <- obs[train.dexes , drop = FALSE] - - if(cal.method == "bias"){ - var.cor.fc[ , eval.dexes] <- fc.ev + mean(obs.tr, na.rm = na.rm) - mean(fc.tr, na.rm = na.rm) - } else if(cal.method == "evmos"){ # forecast correction implemented - #calculate ensemble and observational characteristics - quant.obs.fc.tr <- .calc.obs.fc.quant(obs = obs.tr, fc = fc.tr, na.rm = na.rm) - #calculate value for regression parameters - init.par <- c(.calc.evmos.par(quant.obs.fc.tr, na.rm = na.rm)) - #correct evaluation subset - var.cor.fc[ , eval.dexes] <- .correct.evmos.fc(fc.ev , init.par, na.rm = na.rm) - } else if (cal.method == "mse_min"){ - #calculate ensemble and observational characteristics - quant.obs.fc.tr <- .calc.obs.fc.quant(obs = obs.tr, fc = fc.tr, na.rm = na.rm) - #calculate value for regression parameters - init.par <- .calc.mse.min.par(quant.obs.fc.tr, multi.model, na.rm = na.rm) - #correct evaluation subset - var.cor.fc[ , eval.dexes] <- .correct.mse.min.fc(fc.ev , init.par, na.rm = na.rm) - } else if (cal.method == "crps_min"){ - #calculate ensemble and observational characteristics - quant.obs.fc.tr <- .calc.obs.fc.quant.ext(obs = obs.tr, fc = fc.tr, na.rm = na.rm) - #calculate initial value for regression parameters - init.par <- c(.calc.mse.min.par(quant.obs.fc.tr, na.rm = na.rm), 0.001) - init.par[3] <- sqrt(init.par[3]) - #calculate regression parameters on training dataset - optim.tmp <- optim(par = init.par, - fn = .calc.crps.opt, - gr = .calc.crps.grad.opt, - quant.obs.fc = quant.obs.fc.tr, - na.rm = na.rm, - method = "BFGS") - - mbm.par <- optim.tmp$par - #correct evaluation subset - var.cor.fc[ , eval.dexes] <- .correct.crps.min.fc(fc.ev , mbm.par, na.rm = na.rm) - } else if (cal.method == 'rpc-based') { - ens_mean.ev <- multiApply::Apply(data = fc.ev, target_dims = names(amt.mbr), fun = mean, na.rm = na.rm)$output1 ## Ensemble mean - ens_mean.tr <- multiApply::Apply(data = fc.tr, target_dims = names(amt.mbr), fun = mean, na.rm = na.rm)$output1 ## Ensemble mean - ens_spread.tr <- multiApply::Apply(data = list(fc.tr, ens_mean.tr), target_dims = names(amt.sdate), fun = "-")$output1 ## Ensemble spread - exp_mean.tr <- mean(fc.tr, na.rm = na.rm) ## Mean (climatology) - var_signal.tr <- var(ens_mean.tr, na.rm = na.rm) ## Ensemble mean variance - var_noise.tr <- var(as.vector(ens_spread.tr), na.rm = na.rm) ## Variance of ensemble members about ensemble mean (= spread) - var_obs.tr <- var(obs.tr, na.rm = na.rm) ## Variance in the observations - r.tr <- cor(x = ens_mean.tr, y = obs.tr, method = 'pearson', use = ifelse(test = isTRUE(na.rm), yes = "pairwise.complete.obs", no = "everything")) ## Correlation between observations and the ensemble mean - if ((apply_to == 'all') || (apply_to == 'sign' && cor.test(ens_mean.tr, obs.tr, method = 'pearson', alternative = 'greater')$p.value < alpha)) { - ens_mean_cal <- (ens_mean.ev - exp_mean.tr) * r.tr * sqrt(var_obs.tr) / sqrt(var_signal.tr) + exp_mean.tr - var.cor.fc[ , eval.dexes] <- s2dv::Reorder(data = multiApply::Apply(data = list(exp = fc.ev, ens_mean = ens_mean.ev, ens_mean_cal = ens_mean_cal), target_dims = names(amt.sdate), fun = .CalibrationMembersRPC, var_obs = 
var_obs.tr, var_noise = var_noise.tr, r = r.tr)$output1,
-                                                  order = names(dims.fc))
-          dim(var.cor.fc) <- dims.fc
-        } else { ## no significant -> replacing with observed climatology
-          var.cor.fc[ , eval.dexes] <- array(data = mean(obs.tr, na.rm = na.rm), dim = dim(fc.ev))
+  for (i in 1:nexp) {
+    for (j in 1:nobs) {
+      if (!.data.set.sufficiently.large(exp = exp[, , i, drop = FALSE],
+                                        obs = obs[, j, drop = FALSE])) {
+        if (!na.fill) {
+          exp_subset <- exp[, , i]
+          var.cor.fc[, , i, j] <- exp_subset
+          if (!warning_shown) {
+            warning("Some forecast data could not be corrected due to lack of data",
+                    " and is replaced with uncorrected values.")
+            warning_shown <<- TRUE
+          }
+        } else if (!warning_shown) {
+          warning("Some forecast data could not be corrected due to lack of data",
+                  " and is replaced with NA values.")
+          warning_shown <<- TRUE
+        }
+      } else {
+        # Subset data for dataset dimension
+        obs_data <- as.vector(obs[, j])
+        exp_data <- exp[, , i]
+        dim(exp_data) <- dim(exp)[1:2]
+        if (cor_dat_dim) {
+          expcor_data <- exp_cor[, , i]
+          dim(expcor_data) <- dim(exp_cor)[1:2]
+        } else {
+          expcor_data <- exp_cor
+        }
+
+        eval.train.dexeses <- .make.eval.train.dexes(eval.method = eval.method,
+                                                     amt.points = sdate,
+                                                     amt.points_cor = sdate_cor)
+        amt.resamples <- length(eval.train.dexeses)
+        for (i.sample in seq(1, amt.resamples)) {
+          # defining training (tr) and evaluation (ev) subsets
+          # fc.ev is used to evaluate (not train; train should be done with exp (hindcast))
+          eval.dexes <- eval.train.dexeses[[i.sample]]$eval.dexes
+          train.dexes <- eval.train.dexeses[[i.sample]]$train.dexes
+          fc.ev <- expcor_data[, eval.dexes, drop = FALSE]
+          fc.tr <- exp_data[, train.dexes]
+          obs.tr <- obs_data[train.dexes, drop = FALSE]
+
+          if (cal.method == "bias") {
+            var.cor.fc[, eval.dexes, i, j] <- fc.ev + mean(obs.tr, na.rm = na.rm) - mean(fc.tr, na.rm = na.rm)
+          } else if (cal.method == "evmos") {
+            # forecast correction implemented
+            # ensemble and observational characteristics
+            quant.obs.fc.tr <- .calc.obs.fc.quant(obs = obs.tr, fc = fc.tr, na.rm = na.rm)
+            # calculate value for regression parameters
+            init.par <- c(.calc.evmos.par(quant.obs.fc.tr, na.rm = na.rm))
+            # correct evaluation subset
+            var.cor.fc[, eval.dexes, i, j] <- .correct.evmos.fc(fc.ev, init.par, na.rm = na.rm)
+          } else if (cal.method == "mse_min") {
+            quant.obs.fc.tr <- .calc.obs.fc.quant(obs = obs.tr, fc = fc.tr, na.rm = na.rm)
+            init.par <- .calc.mse.min.par(quant.obs.fc.tr, multi.model, na.rm = na.rm)
+            var.cor.fc[, eval.dexes, i, j] <- .correct.mse.min.fc(fc.ev, init.par, na.rm = na.rm)
+          } else if (cal.method == "crps_min") {
+            quant.obs.fc.tr <- .calc.obs.fc.quant.ext(obs = obs.tr, fc = fc.tr, na.rm = na.rm)
+            init.par <- c(.calc.mse.min.par(quant.obs.fc.tr, na.rm = na.rm), 0.001)
+            init.par[3] <- sqrt(init.par[3])
+            # calculate regression parameters on training dataset
+            optim.tmp <- optim(par = init.par, fn = .calc.crps.opt, gr = .calc.crps.grad.opt,
+                               quant.obs.fc = quant.obs.fc.tr, na.rm = na.rm, method = "BFGS")
+            mbm.par <- optim.tmp$par
+            var.cor.fc[, eval.dexes, i, j] <- .correct.crps.min.fc(fc.ev, mbm.par, na.rm = na.rm)
+          } else if (cal.method == 'rpc-based') {
+            # Ensemble mean
+            ens_mean.ev <- Apply(data = fc.ev, target_dims = names(memb), fun = mean, na.rm = na.rm)$output1
+            ens_mean.tr <- Apply(data = fc.tr, target_dims = names(memb), fun = mean, na.rm = na.rm)$output1
+            # Ensemble spread
+            ens_spread.tr <- Apply(data = list(fc.tr, ens_mean.tr), target_dims = names(sdate), fun = "-")$output1
+            # Mean (climatology)
+            exp_mean.tr <- mean(fc.tr, na.rm = na.rm)
+            # Ensemble mean variance
+            var_signal.tr <- var(ens_mean.tr, na.rm = na.rm)
+            # Variance of ensemble members about ensemble mean (= spread)
+            var_noise.tr <- var(as.vector(ens_spread.tr), na.rm = na.rm)
+            # Variance in the observations
+            var_obs.tr <- var(obs.tr, na.rm = na.rm)
+            # Correlation between observations and the ensemble mean
+            r.tr <- cor(x = ens_mean.tr, y = obs.tr, method = 'pearson',
+                        use = ifelse(test = isTRUE(na.rm), yes = "pairwise.complete.obs", no = "everything"))
+            if ((apply_to == 'all') || (apply_to == 'sign' &&
+                cor.test(ens_mean.tr, obs.tr, method = 'pearson', alternative = 'greater')$p.value < alpha)) {
+              ens_mean_cal <- (ens_mean.ev - exp_mean.tr) * r.tr * sqrt(var_obs.tr) / sqrt(var_signal.tr) + exp_mean.tr
+              var.cor.fc[, eval.dexes, i, j] <- Reorder(data = Apply(data = list(exp = fc.ev, ens_mean = ens_mean.ev,
+                                                                                 ens_mean_cal = ens_mean_cal),
+                                                                     target_dims = names(sdate), fun = .CalibrationMembersRPC,
+                                                                     var_obs = var_obs.tr, var_noise = var_noise.tr, r = r.tr)$output1,
+                                                        order = names(expdims)[1:2])
+            } else {
+              # not significant -> replacing with observed climatology
+              var.cor.fc[, eval.dexes, i, j] <- array(data = mean(obs.tr, na.rm = na.rm), dim = dim(fc.ev))
+            }
+          } else {
+            stop("unknown calibration method: ", cal.method)
+          }
+        }
       }
-    } else {
-      stop("unknown calibration method: ",cal.method)
     }
   }
-  return(var.cor.fc)
+
+  if (is.null(dat_dim)) {
+    dim(var.cor.fc) <- dim(exp_cor)[1:2]
+  }
+
+  return(var.cor.fc)
 }
-.calc.obs.fc.quant <- function(obs, fc, na.rm){ #function to calculate different quantities of a series of ensemble forecasts and corresponding observations
+# Function to calculate different quantities of a series of ensemble forecasts and corresponding observations
+.calc.obs.fc.quant <- function(obs, fc, na.rm) {
+  if (is.null(dim(fc))) {
+    dim(fc) <- c(length(fc), 1)
+  }
   amt.mbr <- dim(fc)[1]
-  obs.per.ens <- InsertDim(obs, posdim = 1, lendim = amt.mbr)
+  obs.per.ens <- InsertDim(obs, posdim = 1, lendim = amt.mbr, name = 'amt.mbr')
   fc.ens.av <- apply(fc, c(2), mean, na.rm = na.rm)
   cor.obs.fc <- cor(fc.ens.av, obs, use = "complete.obs")
   obs.av <- mean(obs, na.rm = na.rm)
@@ -430,9 +719,10 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min",
   )
 }
-.calc.obs.fc.quant.ext <- function(obs, fc, na.rm){ #extended function to calculate different quantities of a series of ensemble forecasts and corresponding observations
+# Extended function to calculate different quantities of a series of ensemble forecasts and corresponding observations
+.calc.obs.fc.quant.ext <- function(obs, fc, na.rm){
   amt.mbr <- dim(fc)[1]
-  obs.per.ens <- InsertDim(obs, posdim = 1, lendim = amt.mbr)
+  obs.per.ens <- InsertDim(obs, posdim = 1, lendim = amt.mbr, name = 'amt.mbr')
   fc.ens.av <- apply(fc, c(2), mean, na.rm = na.rm)
   cor.obs.fc <- cor(fc.ens.av, obs, use = "complete.obs")
   obs.av <- mean(obs, na.rm = na.rm)
@@ -451,13 +741,13 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min",
   )
 }
-
-.calc.fc.quant <- function(fc, na.rm){ #function to calculate different quantities of a series of ensemble forecasts
+# Function to calculate different quantities of a series of ensemble forecasts
+.calc.fc.quant <- function(fc, na.rm) {
   amt.mbr <- dim(fc)[1]
   fc.ens.av <- apply(fc, c(2), mean, na.rm = na.rm)
   fc.ens.av.av <- mean(fc.ens.av, na.rm = na.rm)
   fc.ens.av.sd <- sd(fc.ens.av, na.rm = na.rm)
-  fc.ens.av.per.ens <- InsertDim(fc.ens.av, posdim = 1, lendim = amt.mbr)
+  
fc.ens.av.per.ens <- InsertDim(fc.ens.av, posdim = 1, lendim = amt.mbr, name = 'amt.mbr') fc.ens.sd <- apply(fc, c(2), sd, na.rm = na.rm) fc.ens.var.av.sqrt <- sqrt(mean(fc.ens.sd^2, na.rm = na.rm)) fc.dev <- fc - fc.ens.av.per.ens @@ -480,13 +770,13 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min", ) } -.calc.fc.quant.ext <- function(fc, na.rm){ #extended function to calculate different quantities of a series of ensemble forecasts - +# Extended function to calculate different quantities of a series of ensemble forecasts +.calc.fc.quant.ext <- function(fc, na.rm) { amt.mbr <- dim(fc)[1] - repmat1.tmp <- InsertDim(fc, posdim = 1, lendim = amt.mbr) + repmat1.tmp <- InsertDim(fc, posdim = 1, lendim = amt.mbr, name = 'amt.mbr') repmat2.tmp <- aperm(repmat1.tmp, c(2, 1, 3)) spr.abs <- apply(abs(repmat1.tmp - repmat2.tmp), c(3), mean, na.rm = na.rm) - spr.abs.per.ens <- InsertDim(spr.abs, posdim = 1, lendim = amt.mbr) + spr.abs.per.ens <- InsertDim(spr.abs, posdim = 1, lendim = amt.mbr, name = 'amt.mbr') return( append(.calc.fc.quant(fc, na.rm = na.rm), @@ -494,11 +784,10 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min", ) } -#Below are the core or elementary functions to calculate the regression parameters for the different methods -.calc.mse.min.par <- function(quant.obs.fc, multi.model = F, na.rm){ +# Below are the core or elementary functions to calculate the regression parameters for the different methods +.calc.mse.min.par <- function(quant.obs.fc, multi.model = F, na.rm) { par.out <- rep(NA, 3) - - if(multi.model){ + if (multi.model) { par.out[3] <- with(quant.obs.fc, obs.sd * sqrt(1. - cor.obs.fc^2) / fc.ens.var.av.sqrt) } else { par.out[3] <- with(quant.obs.fc, obs.sd * sqrt(1. - cor.obs.fc^2) / fc.dev.sd) @@ -508,13 +797,15 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min", return(par.out) } -.calc.evmos.par <- function(quant.obs.fc, na.rm){ + +.calc.evmos.par <- function(quant.obs.fc, na.rm) { par.out <- rep(NA, 2) par.out[2] <- with(quant.obs.fc, obs.sd / fc.sd) par.out[1] <- with(quant.obs.fc, obs.av - par.out[2] * fc.ens.av.av, na.rm = na.rm) return(par.out) } -#Below are the core or elementary functions to calculate the functions necessary for the minimization of crps + +# Below are the core or elementary functions to calculate the functions necessary for the minimization of crps .calc.crps.opt <- function(par, quant.obs.fc, na.rm){ return( with(quant.obs.fc, @@ -525,10 +816,9 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min", ) } -.calc.crps.grad.opt <- function(par, quant.obs.fc, na.rm){ - sgn1 <- with(quant.obs.fc,sign(obs.per.ens - - (par[1] + par[2] * fc.ens.av.per.ens + - ((par[3])^2 + par[4] / spr.abs.per.ens) * fc.dev))) +.calc.crps.grad.opt <- function(par, quant.obs.fc, na.rm) { + sgn1 <- with(quant.obs.fc,sign(obs.per.ens - (par[1] + par[2] * fc.ens.av.per.ens + + ((par[3])^2 + par[4] / spr.abs.per.ens) * fc.dev))) sgn2 <- with(quant.obs.fc, sign((par[3])^2 + par[4] / spr.abs.per.ens)) sgn3 <- with(quant.obs.fc,sign((par[3])^2 * spr.abs + par[4])) deriv.par1 <- mean(sgn1, na.rm = na.rm) @@ -542,22 +832,22 @@ Calibration <- function(exp, obs, exp_cor=NULL, cal.method = "mse_min", return(c(deriv.par1, deriv.par2, deriv.par3, deriv.par4)) } -#Below are the core or elementary functions to correct the evaluation set based on the regression parameters -.correct.evmos.fc <- function(fc, par, na.rm){ +# Below are the core or elementary functions to correct the evaluation set based 
on the regression parameters
+.correct.evmos.fc <- function(fc, par, na.rm) {
   quant.fc.mp <- .calc.fc.quant(fc = fc, na.rm = na.rm)
   return(with(quant.fc.mp, par[1] + par[2] * fc))
 }
-.correct.mse.min.fc <- function(fc, par, na.rm){
+.correct.mse.min.fc <- function(fc, par, na.rm) {
   quant.fc.mp <- .calc.fc.quant(fc = fc, na.rm = na.rm)
   return(with(quant.fc.mp, par[1] + par[2] * fc.ens.av.per.ens + fc.dev * par[3]))
 }
-.correct.crps.min.fc <- function(fc, par, na.rm){
+.correct.crps.min.fc <- function(fc, par, na.rm) {
   quant.fc.mp <- .calc.fc.quant.ext(fc = fc, na.rm = na.rm)
   return(with(quant.fc.mp, par[1] + par[2] * fc.ens.av.per.ens + fc.dev * abs((par[3])^2 + par[4] / spr.abs)))
 }
 # Function to calibrate the individual members with the RPC-based method
-.CalibrationMembersRPC <- function(exp, ens_mean, ens_mean_cal, var_obs, var_noise, r){
+.CalibrationMembersRPC <- function(exp, ens_mean, ens_mean_cal, var_obs, var_noise, r) {
   member_cal <- (exp - ens_mean) * sqrt(var_obs) * sqrt(1 - r^2) / sqrt(var_noise) + ens_mean_cal
   return(member_cal)
 }
diff --git a/R/CST_CategoricalEnsCombination.R b/R/CST_CategoricalEnsCombination.R
index 3dc23579ccafaca5349ee7d7ea3a646fc371912e..86c40df1b4813dd653a8ea511ad1dd0582126bc0 100644
--- a/R/CST_CategoricalEnsCombination.R
+++ b/R/CST_CategoricalEnsCombination.R
@@ -1,82 +1,105 @@
-#' Make categorical forecast based on a multi-model forecast with potential for calibrate
+#'Make categorical forecast based on a multi-model forecast with potential for
+#'calibration
 #'
 #'@author Bert Van Schaeybroeck, \email{bertvs@meteo.be}
-#'@description This function converts a multi-model ensemble forecast
-#' into a categorical forecast by giving the probability
-#' for each category. Different methods are available to combine
-#' the different ensemble forecasting models into
-#' probabilistic categorical forecasts.
+#'@description This function converts a multi-model ensemble forecast into a
+#'categorical forecast by giving the probability for each category. Different
+#'methods are available to combine the different ensemble forecasting models
+#'into probabilistic categorical forecasts.
 #'
-#' Motivation:
-#' Beyond the short range, the unpredictable component of weather
-#' predictions becomes substantial due to the chaotic nature of the earth
-#' system. Therefore, predictions can mostly be skillful when used in a probabilistic sense.
-#' In practice this is done using ensemble forecasts. It is then common to
-#' convert the ensemble forecasts to occurence probabilities for different categories.
-#' These categories typically are taken as terciles from climatolgical distributions.
-#' For instance for temperature, there is a cold, normal and warm class.
-#' Commonly multiple ensemble forecasting systems
-#' are available but some models may be more competitive than others
-#' for the variable, region and user need under consideration. Therefore,
-#' when calculating the category probabilities, the ensemble members of
-#' the different forecasting system may be differently weighted.
-#' Such weighting is typically done by comparison of the ensemble forecasts
-#' with observations.
+#'Motivation: Beyond the short range, the unpredictable component of weather
+#'predictions becomes substantial due to the chaotic nature of the earth system.
+#'Therefore, predictions can mostly be skillful when used in a probabilistic
+#'sense. In practice this is done using ensemble forecasts. It is then common to
+#'convert the ensemble forecasts to occurrence probabilities for different
+#'categories. These categories typically are taken as terciles from
+#'climatological distributions. For instance for temperature, there is a cold,
+#'normal and warm class. Commonly multiple ensemble forecasting systems are
+#'available but some models may be more competitive than others for the
+#'variable, region and user need under consideration. Therefore, when
+#'calculating the category probabilities, the ensemble members of the different
+#'forecasting systems may be weighted differently. Such weighting is typically
+#'done by comparison of the ensemble forecasts with observations.
 #'
-#' Description of the tool:
-#' The tool considers all forecasts (all members from all forecasting systems)
-#' and converts them into occurrence probabilities of different categories.
-#' The amount of categories can be changed and are taken as the
-#' climatological quantiles (e.g. terciles), extracted
-#' from the observational data.
-#' The methods that are available to combine the ensemble forecasting models into
-#' probabilistic categorical forecasts are: 1) ensemble pooling where
-#' all ensemble members of all ensemble systems are weighted equally,
+#'Description of the tool: The tool considers all forecasts (all members from
+#'all forecasting systems) and converts them into occurrence probabilities of
+#'different categories. The number of categories can be changed and the
+#'categories are taken as the climatological quantiles (e.g. terciles),
+#'extracted from the observational data. The methods that are available to
+#'combine the ensemble forecasting models into probabilistic categorical
+#'forecasts are: 1) ensemble pooling where
+#'all ensemble members of all ensemble systems are weighted equally,
 #' 2) model combination where each model system is weighted equally, and,
 #' 3) model weighting.
-#' The model weighting method is described in Rajagopalan et al. (2002),
-#' Robertson et al. 2004 and Van Schaeybroeck and Vannitsem (2019).
-#' More specifically, this method uses different weights for the
-#' occurence probability predicted by the available models and by a climatological model
-#' and optimizes the weights by minimizing the ignorance score.
-#' Finally, the function can also be used to categorize the observations
-#' in the categorical quantiles.
-#'
-#'@param exp an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal forecast experiment data in the element named \code{$data}. The amount of forecasting models is equal to the size of the \code{dataset} dimension of the data array. The amount of members per model may be different. The size of the \code{member} dimension of the data array is equal to the maximum of the ensemble members among the models. Models with smaller ensemble sizes have residual indices of \code{member} dimension in the data array filled with NA values.
-#'@param obs an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the observed data in the element named \code{$data}.
-#'@param amt.cat is the amount of categories. Equally-sized quantiles will be calculated based on the amount of categories.
-#'@param cat.method method used to produce the categorical forecast, can be either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool assumes equal weight for all ensemble members while the method comb assumes equal weight for each model. The weighting method is descirbed in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). Finally, the \code{obs} method classifies the observations into the different categories and therefore contains only 0 and 1 values.
-#'@param eval.method is the sampling method used, can be either \code{"in-sample"} or \code{"leave-one-out"}. Default value is the \code{"leave-one-out"} cross validation.
+#'The model weighting method is described in Rajagopalan et al. (2002),
+#'Robertson et al. 2004 and Van Schaeybroeck and Vannitsem (2019). More
+#'specifically, this method uses different weights for the occurrence probability
+#'predicted by the available models and by a climatological model and optimizes
+#'the weights by minimizing the ignorance score. Finally, the function can also
+#'be used to categorize the observations in the categorical quantiles.
+#'
+#'@param exp An object of class \code{s2dv_cube} as returned by \code{CST_Load}
+#'  function, containing the seasonal forecast experiment data in the element
+#'  named \code{$data}. The amount of forecasting models is equal to the size of
+#'  the \code{dataset} dimension of the data array. The amount of members per
+#'  model may be different. The size of the \code{member} dimension of the data
+#'  array is equal to the maximum of the ensemble members among the models.
+#'  Models with smaller ensemble sizes have residual indices of \code{member}
+#'  dimension in the data array filled with NA values.
+#'@param obs An object of class \code{s2dv_cube} as returned by \code{CST_Load}
+#'  function, containing the observed data in the element named \code{$data}.
+#'@param amt.cat The number of categories. Equally-sized quantiles will be
+#'  calculated based on the number of categories.
+#'@param cat.method Method used to produce the categorical forecast, can be
+#'  either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool
+#'  assumes equal weight for all ensemble members while the method comb assumes
+#'  equal weight for each model. The weighting method is described in
+#'  Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and
+#'  Vannitsem (2019). Finally, the \code{obs} method classifies the observations
+#'  into the different categories and therefore contains only 0 and 1 values.
+#'@param eval.method The sampling method used; it can be either
+#'  \code{"in-sample"} or \code{"leave-one-out"}. Default value is the
+#'  \code{"leave-one-out"} cross validation.
 #'@param ... other parameters to be passed on to the calibration procedure.
 #'
-#'@return an object of class \code{s2dv_cube} containing the categorical forecasts in the element called \code{$data}. The first two dimensions of the returned object are named dataset and member and are both of size one. An additional dimension named category is introduced and is of size amt.cat.
+#'@return An object of class \code{s2dv_cube} containing the categorical
+#'forecasts in the element called \code{$data}. The first two dimensions of the
+#'returned object are named dataset and member and are both of size one. An
+#'additional dimension named category is introduced and is of size amt.cat.
 #'
-#'@references Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811.
-#'@references Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744.
-#'@references Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290).
+#'@references Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical
+#'climate forecasts through regularization and optimal combination of multiple
+#'GCM ensembles. Monthly Weather Review, 130(7), 1792-1811.
+#'@references Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004).
+#'Improved combination of multiple atmospheric GCM ensembles for seasonal
+#'prediction. Monthly Weather Review, 132(12), 2732-2744.
+#'@references Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of
+#'Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290).
 #'
-#'@importFrom s2dv InsertDim
-#'@import abind
 #'@examples
-#'
-#'mod1 <- 1 : (2 * 3 * 4 * 5 * 6 * 7)
-#'dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
-#'mod1[ 2, 3, , , , ] <- NA
-#'dimnames(mod1)[[1]] <- c("MF", "UKMO")
-#'obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
-#'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
+#'mod1 <- 1 : (2 * 2 * 4 * 5 * 2 * 2)
+#'dim(mod1) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2)
+#'mod1[2, 1, , , , ] <- NA
+#'datasets <- c("MF", "UKMO")
+#'obs1 <- 1 : (1 * 1 * 4 * 5 * 2 * 2)
+#'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2)
-#'lon <- seq(0, 30, 5)
-#'lat <- seq(0, 25, 5)
+#'lon <- seq(0, 5, 5)
+#'lat <- seq(0, 5, 5)
-#'exp <- list(data = mod1, lat = lat, lon = lon)
-#'obs <- list(data = obs1, lat = lat, lon = lon)
+#'coords <- list(lat = lat, lon = lon)
+#'attrs <- list(Datasets = datasets)
+#'exp <- list(data = mod1, coords = coords, attrs = attrs)
+#'obs <- list(data = obs1, coords = coords)
 #'attr(exp, 'class') <- 's2dv_cube'
 #'attr(obs, 'class') <- 's2dv_cube'
-#'\donttest{
-#'a <- CST_CategoricalEnsCombination(exp = exp, obs = obs, amt.cat = 3, cat.method = "mmw")
-#'}
+#'a <- CST_CategoricalEnsCombination(exp = exp, obs = obs, amt.cat = 3,
+#'                                   cat.method = "mmw")
+#'@importFrom s2dv InsertDim
+#'@import abind
 #'@export
-
-CST_CategoricalEnsCombination <- function(exp, obs, cat.method = "pool", eval.method = "leave-one-out", amt.cat = 3, ...) {
+CST_CategoricalEnsCombination <- function(exp, obs, cat.method = "pool",
                                          eval.method = "leave-one-out",
                                          amt.cat = 3,
                                          ...) {
+  # Check 's2dv_cube'
   if (!inherits(exp, "s2dv_cube") || !inherits(obs, "s2dv_cube")) {
     stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
@@ -85,45 +108,73 @@ CST_CategoricalEnsCombination <- function(exp, obs, cat.method = "pool", eval.me
     stop("The length of the dimension 'member' in the component 'data' ",
          "of the parameter 'obs' must be equal to 1.")
   }
+
  names.dim.tmp <- names(dim(exp$data))
-  exp$data <- CategoricalEnsCombination(fc = exp$data, obs = obs$data, cat.method = cat.method,
-                                        eval.method = eval.method, amt.cat = amt.cat, ...)
+  exp$data <- CategoricalEnsCombination(fc = exp$data, obs = obs$data,
+                                        cat.method = cat.method,
+                                        eval.method = eval.method,
+                                        amt.cat = amt.cat, ...)
+
  names.dim.tmp[which(names.dim.tmp == "member")] <- "category"
  names(dim(exp$data)) <- names.dim.tmp
-  exp$data <- suppressWarnings(InsertDim(exp$data, lendim = 1, posdim = 2))
-  names(dim(exp$data))[2] <- "member"
-  exp$Datasets <- c(exp$Datasets, obs$Datasets)
-  exp$source_files <- c(exp$source_files, obs$source_files)
+
+  exp$data <- InsertDim(exp$data, lendim = 1, posdim = 2, name = "member")
+  exp$attrs$Datasets <- c(exp$attrs$Datasets, obs$attrs$Datasets)
+  exp$attrs$source_files <- c(exp$attrs$source_files, obs$attrs$source_files)
  return(exp)
 }
-#' Make categorical forecast based on a multi-model forecast with potential for calibrate
+#'Make categorical forecast based on a multi-model forecast with potential for
+#'calibration
 #'
 #'@author Bert Van Schaeybroeck, \email{bertvs@meteo.be}
-#'@description This function converts a multi-model ensemble forecast
-#' into a categorical forecast by giving the probability
-#' for each category. Different methods are available to combine
-#' the different ensemble forecasting models into
-#' probabilistic categorical forecasts.
+#'@description This function converts a multi-model ensemble forecast into a
+#'categorical forecast by giving the probability for each category. Different
+#'methods are available to combine the different ensemble forecasting models
+#'into probabilistic categorical forecasts.
 #'
 #' See details in ?CST_CategoricalEnsCombination
-#'@param fc a multi-dimensional array with named dimensions containing the seasonal forecast experiment data in the element named \code{$data}. The amount of forecasting models is equal to the size of the \code{dataset} dimension of the data array. The amount of members per model may be different. The size of the \code{member} dimension of the data array is equal to the maximum of the ensemble members among the models. Models with smaller ensemble sizes have residual indices of \code{member} dimension in the data array filled with NA values.
-#'@param obs a multidimensional array with named dimensions containing the observed data in the element named \code{$data}.
-#'@param amt.cat is the amount of categories. Equally-sized quantiles will be calculated based on the amount of categories.
-#'@param cat.method method used to produce the categorical forecast, can be either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool assumes equal weight for all ensemble members while the method comb assumes equal weight for each model. The weighting method is descirbed in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). Finally, the \code{obs} method classifies the observations into the different categories and therefore contains only 0 and 1 values.
-#'@param eval.method is the sampling method used, can be either \code{"in-sample"} or \code{"leave-one-out"}. Default value is the \code{"leave-one-out"} cross validation.
-#'@param ... other parameters to be passed on to the calibration procedure.
+#'@param fc A multi-dimensional array with named dimensions containing the
+#'  seasonal forecast experiment data in the element named \code{$data}. The
+#'  amount of forecasting models is equal to the size of the \code{dataset}
+#'  dimension of the data array. The amount of members per model may be
+#'  different. The size of the \code{member} dimension of the data array is
+#'  equal to the maximum of the ensemble members among the models. Models with
Models with +#' smaller ensemble sizes have residual indices of \code{member} dimension in +#' the data array filled with NA values. +#'@param obs A multidimensional array with named dimensions containing the +#' observed data in the element named \code{$data}. +#'@param amt.cat Is the amount of categories. Equally-sized quantiles will be +#' calculated based on the amount of categories. +#'@param cat.method Method used to produce the categorical forecast, can be +#' either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool +#' assumes equal weight for all ensemble members while the method comb assumes +#' equal weight for each model. The weighting method is descirbed in +#' Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and +#' Vannitsem (2019). Finally, the \code{obs} method classifies the observations +#' into the different categories and therefore contains only 0 and 1 values. +#'@param eval.method Is the sampling method used, can be either +#' \code{"in-sample"} or \code{"leave-one-out"}. Default value is the +#' \code{"leave-one-out"} cross validation. +#'@param ... Other parameters to be passed on to the calibration procedure. #' -#'@return an array containing the categorical forecasts in the element called \code{$data}. The first two dimensions of the returned object are named dataset and member and are both of size one. An additional dimension named category is introduced and is of size amt.cat. +#'@return An array containing the categorical forecasts in the element called +#'\code{$data}. The first two dimensions of the returned object are named +#'dataset and member and are both of size one. An additional dimension named +#'category is introduced and is of size amt.cat. #' -#'@references Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811. -#'@references Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744. -#'@references Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290). +#'@references Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical +#'climate forecasts through regularization and optimal combination of multiple +#'GCM ensembles. Monthly Weather Review, 130(7), 1792-1811. +#'@references Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). +#'Improved combination of multiple atmospheric GCM ensembles for seasonal +#'prediction. Monthly Weather Review, 132(12), 2732-2744. +#'@references Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of +#'Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290). #' #'@importFrom s2dv InsertDim #'@import abind #'@export - CategoricalEnsCombination <- function (fc, obs, cat.method, eval.method, amt.cat, ...) 
{
  if (!all(c("member", "sdate") %in% names(dim(fc)))) {
@@ -423,4 +474,4 @@ comb.dims <- function(arr.in, dims.to.combine){
   freq.per.mdl[amt.coeff, , ] = 1 / amt.cat
   return(freq.per.mdl)
-}
+}
\ No newline at end of file
diff --git a/R/CST_DynBiasCorrection.R b/R/CST_DynBiasCorrection.R
index f83def23d7ef54bc32a6ce7a6f9191658095454a..ffbba9b846f13ce20bf80fc5d2fc09aab1b5f342 100644
--- a/R/CST_DynBiasCorrection.R
+++ b/R/CST_DynBiasCorrection.R
@@ -20,57 +20,58 @@
 #'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D.,
 #'and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large
 #'scale atmospheric predictability. Nature Communications, 10(1), 1316.
-#'DOI = https://doi.org/10.1038/s41467-019-09305-8 "
+#'\doi{10.1038/s41467-019-09305-8}
 #'@references Faranda, D., Gabriele Messori and Pascal Yiou. (2017).
 #' Dynamical proxies of North Atlantic predictability and extremes.
 #' Scientific Reports, 7-41278, 2017.
 #'
-#'@param exp an s2v_cube object with the experiment data
-#'@param obs an s2dv_cube object with the reference data
-#'@param method a character string indicating the method to apply bias
-#'correction among these ones: "PTF","RQUANT","QUANT","SSPLIN"
-#'@param wetday logical indicating whether to perform wet day correction
-#'or not OR a numeric threshold below which all values are set to zero (by
-#'default is set to 'FALSE').
-#'@param proxy a character string indicating the proxy for local dimension
-#' 'dim' or inverse of persistence 'theta' to apply the dynamical
-#' conditioned bias correction method.
-#'@param quanti a number lower than 1 indicating the quantile to perform
-#'the computation of local dimension and theta
-#'@param ncores The number of cores to use in parallel computation
+#'@param exp An s2dv_cube object with the experiment data.
+#'@param obs An s2dv_cube object with the reference data.
+#'@param method A character string indicating the method to apply bias
+#'  correction among the following ones: "PTF", "RQUANT", "QUANT", "SSPLIN".
+#'@param wetday Logical indicating whether to perform wet day correction
+#'  or not OR a numeric threshold below which all values are set to zero (by
+#'  default is set to 'FALSE').
+#'@param proxy A character string indicating the proxy for local dimension
+#'  'dim' or inverse of persistence 'theta' to apply the dynamical
+#'  conditioned bias correction method.
+#'@param quanti A number lower than 1 indicating the quantile to perform
+#'  the computation of local dimension and theta.
+#'@param ncores The number of cores to use in parallel computation.
 #'
-#'@return dynbias an s2dvcube object with a bias correction performed
-#'conditioned by local dimension 'dim' or inverse of persistence 'theta'
+#'@return dynbias An s2dv_cube object with a bias correction performed,
+#'conditioned by local dimension 'dim' or inverse of persistence 'theta'.
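To make the conditioning step concrete: the correction is not applied to the full time series at once, but separately to the time steps that fall in each class of the predictability proxy (the internal code loops over three such groups). The following is a minimal base-R sketch of that grouping step; the names proxy_series and tercile_idx are invented for illustration and are not part of the package.

# Hypothetical sketch: group time steps by terciles of a predictability proxy.
set.seed(1)
proxy_series <- rnorm(100)                  # stand-in for the 'dim' time series
brks <- quantile(proxy_series, probs = c(1/3, 2/3))
tercile <- cut(proxy_series, breaks = c(-Inf, brks, Inf),
               labels = c("low", "mid", "high"))
tercile_idx <- split(seq_along(proxy_series), tercile)
str(tercile_idx)                            # time indices per predictability class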
 #'
 #'@examples
-#'# example 1: simple data s2dvcube style
-#' set.seed(1)
-#' expL <- rnorm(1:2000)
-#' dim (expL) <- c(time =100,lat = 4, lon = 5)
-#' obsL <- c(rnorm(1:1980),expL[1,,]*1.2)
-#' dim (obsL) <- c(time = 100,lat = 4, lon = 5)
-#' time_obsL <- paste(rep("01", 100), rep("01", 100), 1920 : 2019, sep = "-")
-#' time_expL <- paste(rep("01", 100), rep("01", 100), 1929 : 2019, sep = "-")
-#' lon <- seq(-1,5,1.5)
-#' lat <- seq(30,35,1.5)
-#' # qm=0.98 # too high for this short dataset, it is possible that doesn't
-#' # get the requirement, in that case it would be necessary select a lower qm
-#' # for instance qm=0.60
-#' expL <- s2dv_cube(data = expL, lat = lat, lon = lon,
-#'                   Dates = list(start = time_expL, end = time_expL))
-#' obsL <- s2dv_cube(data = obsL, lat = lat, lon = lon,
-#'                   Dates = list(start = time_obsL, end = time_obsL))
-#' # to use DynBiasCorrection
-#' dynbias1 <- DynBiasCorrection(exp = expL$data, obs = obsL$data, proxy= "dim",
+#'expL <- rnorm(1:2000)
+#'dim(expL) <- c(time = 100, lat = 4, lon = 5)
+#'obsL <- c(rnorm(1:1980), expL[1, , ] * 1.2)
+#'dim(obsL) <- c(time = 100, lat = 4, lon = 5)
+#'time_obsL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1920:2019, sep = "-"),
+#'                        format = "%d-%m-%Y")
+#'time_expL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1929:2019, sep = "-"),
+#'                        format = "%d-%m-%Y")
+#'lon <- seq(-1, 5, 1.5)
+#'lat <- seq(30, 35, 1.5)
+#'# qm = 0.98 # too high for this short dataset; it is possible that it does not
+#'# meet the requirement, in that case it would be necessary to select a lower qm,
+#'# for instance qm = 0.60
+#'expL <- s2dv_cube(data = expL, coords = list(lon = lon, lat = lat),
+#'                  Dates = time_expL)
+#'obsL <- s2dv_cube(data = obsL, coords = list(lon = lon, lat = lat),
+#'                  Dates = time_obsL)
+#'# to use DynBiasCorrection
+#'dynbias1 <- DynBiasCorrection(exp = expL$data, obs = obsL$data, proxy = "dim",
+#'                              quanti = 0.6)
+#'# to use CST_DynBiasCorrection
+#'dynbias2 <- CST_DynBiasCorrection(exp = expL, obs = obsL, proxy = "dim",
 #'                                  quanti = 0.6)
-#' # to use CST_DynBiasCorrection
-#' dynbias2 <- CST_DynBiasCorrection(exp = expL, obs = obsL, proxy= "dim",
-#'                                   quanti = 0.6)
 #'
 #'@export
 CST_DynBiasCorrection<- function(exp, obs, method = 'QUANT', wetday=FALSE,
                                  proxy = "dim", quanti,
                                  ncores = NULL) {
+  # Check 's2dv_cube'
   if (!inherits(obs, 's2dv_cube')) {
     stop("Parameter 'obs' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
@@ -80,7 +81,7 @@ CST_DynBiasCorrection<- function(exp, obs, method = 'QUANT', wetday=FALSE,
          "as output by CSTools::CST_Load.")
   }
   exp$data <- DynBiasCorrection(exp = exp$data, obs = obs$data, method = method,
-                                wetday=wetday,
+                                wetday = wetday,
                                 proxy = proxy, quanti = quanti, ncores = ncores)
   return(exp)
 }
@@ -106,30 +107,30 @@ CST_DynBiasCorrection<- function(exp, obs, method = 'QUANT', wetday=FALSE,
 #'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D.,
 #'and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large
 #'scale atmospheric predictability. Nature Communications, 10(1), 1316.
-#'DOI = https://doi.org/10.1038/s41467-019-09305-8 "
+#'\doi{10.1038/s41467-019-09305-8}
 #'@references Faranda, D., Gabriele Messori and Pascal Yiou. (2017).
 #' Dynamical proxies of North Atlantic predictability and extremes.
 #' Scientific Reports, 7-41278, 2017.
 #'
-#'@param exp a multidimensional array with named dimensions with the
-#'experiment data
-#'@param obs a multidimensional array with named dimensions with the
-#'observation data
-#'@param method a character string indicating the method to apply bias
-#'correction among these ones:
-#'"PTF","RQUANT","QUANT","SSPLIN"
-#'@param wetday logical indicating whether to perform wet day correction
-#'or not OR a numeric threshold below which all values are set to zero (by
-#'default is set to 'FALSE').
-#'@param proxy a character string indicating the proxy for local dimension
-#''dim' or inverse of persistence 'theta' to apply the dynamical conditioned
-#'bias correction method.
-#'@param quanti a number lower than 1 indicating the quantile to perform the
-#'computation of local dimension and theta
-#'@param ncores The number of cores to use in parallel computation
+#'@param exp A multidimensional array with named dimensions containing the
+#'  experiment data.
+#'@param obs A multidimensional array with named dimensions containing the
+#'  observation data.
+#'@param method A character string indicating the method to apply bias
+#'  correction among the following ones:
+#'  "PTF", "RQUANT", "QUANT", "SSPLIN".
+#'@param wetday Logical indicating whether to perform wet day correction
+#'  or not OR a numeric threshold below which all values are set to zero (by
+#'  default is set to 'FALSE').
+#'@param proxy A character string indicating the proxy for local dimension
+#'  'dim' or inverse of persistence 'theta' to apply the dynamical conditioned
+#'  bias correction method.
+#'@param quanti A number lower than 1 indicating the quantile to perform the
+#'  computation of local dimension and theta.
+#'@param ncores The number of cores to use in parallel computation.
 #'
-#'@return a multidimensional array with named dimensions with a bias correction
-#'performed conditioned by local dimension 'dim' or inverse of persistence 'theta'
+#'@return A multidimensional array with named dimensions with a bias correction
+#'performed, conditioned by local dimension 'dim' or inverse of persistence 'theta'.
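Within each predictability class, the correction itself is a standard quantile mapping: the internal .qbiascorrection helper below delegates to fitQmap() and doQmap() from the 'qmap' package. A minimal sketch with synthetic data (the vectors obs_sub and exp_sub are invented here to stand in for the values of one class):

# Illustrative only: per-class quantile mapping with the 'qmap' package.
library(qmap)
set.seed(2)
obs_sub <- rnorm(200, mean = 10, sd = 2)   # observations in one class
exp_sub <- rnorm(200, mean = 12, sd = 3)   # forecasts in the same class
qm.fit <- fitQmap(obs_sub, exp_sub, method = "QUANT", wet.day = FALSE)
exp_adj <- doQmap(exp_sub, qm.fit)
round(c(raw = mean(exp_sub), adjusted = mean(exp_adj), obs = mean(obs_sub)), 2)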
 #'
 #'@import multiApply
 #'@importFrom ClimProjDiags Subset
@@ -140,7 +141,7 @@ CST_DynBiasCorrection<- function(exp, obs, method = 'QUANT', wetday=FALSE,
 #'obsL <- c(rnorm(1:1980),expL[1,,]*1.2)
 #'dim (obsL) <- c(time = 100,lat = 4, lon = 5)
 #'dynbias <- DynBiasCorrection(exp = expL, obs = obsL, method='QUANT',
-#'                            proxy= "dim", quanti = 0.6)
+#'                             proxy = "dim", quanti = 0.6)
 #'@export
 DynBiasCorrection<- function(exp, obs, method = 'QUANT',wetday=FALSE,
                              proxy = "dim", quanti, ncores = NULL){
@@ -168,15 +169,15 @@ DynBiasCorrection<- function(exp, obs, method = 'QUANT',wetday=FALSE,
   predyn.exp <- Predictability(dim = attractor.exp$dim,
                                theta = attractor.exp$theta)
-  if (!(any(names(dim(exp)) %in% 'time'))){
-    if (any(names(dim(exp)) %in% 'sdate')) {
-      if (any(names(dim(exp)) %in% 'ftime')) {
-        exp <- MergeDims(exp, merge_dims = c('ftime', 'sdate'),
-                         rename_dim = 'time')
+  if (!(any(names(dim(exp)) %in% 'time'))) {
+    if (any(names(dim(exp)) %in% 'sdate')) {
+      if (any(names(dim(exp)) %in% 'ftime')) {
+        exp <- MergeDims(exp, merge_dims = c('ftime', 'sdate'),
+                         rename_dim = 'time')
+      }
+    }
-      }
-    }
   }
-  if (!(any(names(dim(obs)) %in% 'time'))){
+  if (!(any(names(dim(obs)) %in% 'time'))) {
     if (any(names(dim(obs)) %in% 'sdate')) {
       if (any(names(dim(obs)) %in% 'ftime')) {
         obs <- MergeDims(obs, merge_dims = c('ftime', 'sdate'),
@@ -221,25 +222,25 @@ DynBiasCorrection<- function(exp, obs, method = 'QUANT',wetday=FALSE,
     stop ("Parameter 'proxy' must be set as 'dim' or 'theta'.")
   }
-  if(any(names(dim(adjusted)) %in% 'memberObs')){
-    if(dim(adjusted)['memberObs'] == 1){
-      adjusted <- Subset(adjusted,along='memberObs',indices=1,drop = 'selected')
-    }else{
-      print('Dimension member in obs changed to memberObs')
+  if (any(names(dim(adjusted)) %in% 'memberObs')) {
+    if (dim(adjusted)['memberObs'] == 1) {
+      adjusted <- Subset(adjusted, along = 'memberObs', indices = 1, drop = 'selected')
+    } else {
+      print('Dimension member in obs changed to memberObs')
     }
   }
-  if(any(names(dim(adjusted)) %in% 'datasetObs')){
-    if(dim(adjusted)['datasetObs'] == 1){
-      adjusted <- Subset(adjusted,along='datasetObs',indices=1,drop = 'selected')
-    }else{
+  if (any(names(dim(adjusted)) %in% 'datasetObs')) {
+    if (dim(adjusted)['datasetObs'] == 1) {
+      adjusted <- Subset(adjusted, along = 'datasetObs', indices = 1, drop = 'selected')
+    } else {
       print('Dimension dataset in obs changed to datasetObs')
     }
   }
   return(adjusted)
 }
-.dynbias <- function(exp, obs, method, wetday,predyn.exp, predyn.obs) {
+.dynbias <- function(exp, obs, method, wetday, predyn.exp, predyn.obs) {
   result <- array(rep(NA, length(exp)))
   res <- lapply(1:3, function(x) {
     exp_sub <- exp[predyn.exp[[x]]]
@@ -250,7 +251,7 @@ DynBiasCorrection<- function(exp, obs, method = 'QUANT',wetday=FALSE,
   })
   return(result)
 }
-.qbiascorrection <- function(expX, obsX, method,wetday) {
+.qbiascorrection <- function(expX, obsX, method, wetday) {
   ## functions fitQmap and doQmap
   if (method == "PTF") {
     qm.fit <- fitQmap(obsX, expX, method = "PTF", transfun = "expasympt",
diff --git a/R/CST_EnsClustering.R b/R/CST_EnsClustering.R
index 6b1f335e96df89b77c6e0ff8371fff2bc93a53ea..ebebc1fdaf7a65b3a47cfe9046c3abdecf8bb682 100644
--- a/R/CST_EnsClustering.R
+++ b/R/CST_EnsClustering.R
@@ -1,177 +1,211 @@
-#' @rdname CST_EnsClustering
-#' @title Ensemble clustering
+#'@rdname CST_EnsClustering
+#'@title Ensemble clustering
 #'
-#' @author Federico Fabiano - ISAC-CNR, \email{f.fabiano@isac.cnr.it}
-#' @author Ignazio Giuntoli - ISAC-CNR, \email{i.giuntoli@isac.cnr.it}
-#' @author Danila Volpi - ISAC-CNR, \email{d.volpi@isac.cnr.it}
-#' @author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it}
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@author Federico Fabiano - ISAC-CNR, \email{f.fabiano@isac.cnr.it}
+#'@author Ignazio Giuntoli - ISAC-CNR, \email{i.giuntoli@isac.cnr.it}
+#'@author Danila Volpi - ISAC-CNR, \email{d.volpi@isac.cnr.it}
+#'@author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it}
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
 #'
-#' @description This function performs a clustering on members/starting dates
-#' and returns a number of scenarios, with representative members for each of them.
-#' The clustering is performed in a reduced EOF space.
+#'@description This function performs a clustering on members/starting dates
+#'and returns a number of scenarios, with representative members for each of
+#'them. The clustering is performed in a reduced EOF space.
 #'
-#' Motivation:
-#' Ensemble forecasts give a probabilistic insight of average weather conditions
-#' on extended timescales, i.e. from sub-seasonal to seasonal and beyond.
-#' With large ensembles, it is often an advantage to be able to group members
-#' according to similar characteristics and to select the most representative member for each cluster.
-#' This can be useful to characterize the most probable forecast scenarios in a multi-model
-#' (or single model) ensemble prediction. This approach, applied at a regional level,
-#' can also be used to identify the subset of ensemble members that best represent the
-#' full range of possible solutions for downscaling applications.
-#' The choice of the ensemble members is made flexible in order to meet the requirements
-#' of specific (regional) climate information products, to be tailored for different regions and user needs.
+#'Motivation:
+#'Ensemble forecasts give a probabilistic insight into average weather conditions
+#'on extended timescales, i.e. from sub-seasonal to seasonal and beyond.
+#'With large ensembles, it is often an advantage to be able to group members
+#'according to similar characteristics and to select the most representative
+#'member for each cluster. This can be useful to characterize the most probable
+#'forecast scenarios in a multi-model (or single model) ensemble prediction.
+#'This approach, applied at a regional level, can also be used to identify the
+#'subset of ensemble members that best represent the full range of possible
+#'solutions for downscaling applications. The choice of the ensemble members is
+#'made flexible in order to meet the requirements of specific (regional) climate
+#'information products, to be tailored for different regions and user needs.
 #'
-#' Description of the tool:
-#' EnsClustering is a cluster analysis tool, based on the k-means algorithm, for ensemble predictions.
-#' The aim is to group ensemble members according to similar characteristics and
-#' to select the most representative member for each cluster.
-#' The user chooses which feature of the data is used to group the ensemble members by clustering:
-#' time mean, maximum, a certain percentile (e.g., 75% as in the examples below),
-#' standard deviation and trend over the time period. For each ensemble member this value
-#' is computed at each grid point, obtaining N lat-lon maps, where N is the number of ensemble members.
-#' The anomaly is computed subtracting the ensemble mean of these maps to each of the single maps.
-#' The anomaly is therefore computed with respect to the ensemble members (and not with respect to the time)
-#' and the Empirical Orthogonal Function (EOF) analysis is applied to these anomaly maps.
-#' Regarding the EOF analysis, the user can choose either how many Principal Components (PCs)
-#' to retain or the percentage of explained variance to keep. After reducing dimensionality via
-#' EOF analysis, k-means analysis is applied using the desired subset of PCs.
+#'Description of the tool:
+#'EnsClustering is a cluster analysis tool, based on the k-means algorithm, for
+#'ensemble predictions. The aim is to group ensemble members according to
+#'similar characteristics and to select the most representative member for each
+#'cluster. The user chooses which feature of the data is used to group the
+#'ensemble members by clustering: time mean, maximum, a certain percentile
+#'(e.g., 75% as in the examples below), standard deviation and trend over the
+#'time period. For each ensemble member this value is computed at each grid
+#'point, obtaining N lat-lon maps, where N is the number of ensemble members.
+#'The anomaly is computed by subtracting the ensemble mean of these maps from
+#'each of the single maps. The anomaly is therefore computed with respect to the
+#'ensemble members (and not with respect to the time) and the Empirical
+#'Orthogonal Function (EOF) analysis is applied to these anomaly maps. Regarding
+#'the EOF analysis, the user can choose either how many Principal Components
+#'(PCs) to retain or the percentage of explained variance to keep. After
+#'reducing dimensionality via EOF analysis, k-means analysis is applied using
+#'the desired subset of PCs.
 #'
-#' The major final outputs are the classification in clusters, i.e. which member belongs
-#' to which cluster (in k-means analysis the number k of clusters needs to be defined
-#' prior to the analysis) and the most representative member for each cluster,
-#' which is the closest member to the cluster centroid.
-#' Other outputs refer to the statistics of clustering: in the PC space, the minimum and
-#' the maximum distance between a member in a cluster and the cluster centroid
-#' (i.e. the closest and the furthest member), the intra-cluster standard
-#' deviation for each cluster (i.e. how much the cluster is compact).
+#'The major final outputs are the classification in clusters, i.e. which member
+#'belongs to which cluster (in k-means analysis the number k of clusters needs
+#'to be defined prior to the analysis) and the most representative member for
+#'each cluster, which is the closest member to the cluster centroid. Other
+#'outputs refer to the statistics of clustering: in the PC space, the minimum
+#'and the maximum distance between a member in a cluster and the cluster
+#'centroid (i.e. the closest and the furthest member), the intra-cluster
+#'standard deviation for each cluster (i.e. how compact the cluster is).
 #'
-#' @param exp An object of the class 's2dv_cube', containing the variables to be analysed.
-#' Each data object in the list is expected to have an element named \code{$data} with at least two
-#' spatial dimensions named "lon" and "lat", and dimensions "dataset", "member", "ftime", "sdate".
-#' @param time_moment Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean),
-#' 'sd' (standard deviation along time) or 'perc' (a selected percentile on time).
-#' If 'perc' the keyword 'time_percentile' is also used.
-#' @param time_percentile Set the percentile in time you want to analyse (used for `time_moment = "perc").
-#' @param numclus Number of clusters (scenarios) to be calculated.
-#' If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8.
-#' @param lon_lim List with the two longitude margins in `c(-180,180)` format.
-#' @param lat_lim List with the two latitude margins.
-#' @param variance_explained variance (percentage) to be explained by the set of EOFs.
-#' Defaults to 80. Not used if numpcs is specified.
-#' @param numpcs Number of EOFs retained in the analysis (optional).
-#' @param cluster_dim Dimension along which to cluster. Typically "member" or "sdate".
-#' This can also be a list like c("member", "sdate").
-#' @param time_dim String or character array with name(s) of dimension(s) over which to compute statistics.
-#' If omitted c("ftime", "sdate", "time") are searched in this order.
-#' @param verbose Logical for verbose output
-#' @return A list with elements \code{$cluster} (cluster assigned for each member),
-#' \code{$freq} (relative frequency of each cluster), \code{$closest_member}
-#' (representative member for each cluster), \code{$repr_field} (list of fields
-#' for each representative member), \code{composites} (list of mean fields for each cluster),
-#' \code{$lon} (selected longitudes of output fields),
-#' \code{$lat} (selected longitudes of output fields).
-#' @examples
-#'\donttest{
-#' exp <- lonlat_temp$exp
-#' # Example 1: Cluster on all start dates, members and models
-#' res <- CST_EnsClustering(exp, numclus = 3,
-#'                          cluster_dim = c("member", "dataset", "sdate"))
-#' iclus <- res$cluster[2, 1, 3]
-#'
-#' #print(paste("Cluster of 2. member, 1. dataset, 3. sdate:", iclus))
-#' #print(paste("Frequency (numerosity) of cluster (", iclus, ") :", res$freq[iclus]))
-#' s2dv::PlotEquiMap(res$repr_field[iclus, , ], exp$lon, exp$lat,
-#'                   filled.continents = FALSE,
-#'                   toptitle = paste("Representative field of cluster", iclus))
-#'
-#' # Example 2: Cluster on members retaining 4 EOFs during
-#' # preliminary dimensional reduction
-#' res <- CST_EnsClustering(exp, numclus = 3, numpcs = 4, cluster_dim = "member")
-#'
-#' # Example 3: Cluster on members, retain 80% of variance during
-#' # preliminary dimensional reduction
-#' res <- CST_EnsClustering(exp, numclus = 3, variance_explained = 80,
-#'                          cluster_dim = "member")
-#'
-#' # Example 4: Compute percentile in time
-#' res <- CST_EnsClustering(exp, numclus = 3, time_percentile = 90,
-#'                          time_moment = "perc", cluster_dim = "member")
-#'}
+#'@param exp An object of the class 's2dv_cube', containing the variables to be
+#'  analysed. The element 'data' in the 's2dv_cube' object must have, at
+#'  least, spatial and temporal dimensions. Latitudinal dimension accepted
+#'  names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal
+#'  dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'.
+#'@param time_moment Decides the moment to be applied to the time dimension. Can
+#'  be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc'
+#'  (a selected percentile on time). If 'perc' the keyword 'time_percentile' is
+#'  also used.
+#'@param time_percentile Set the percentile in time you want to analyse (used
+#'  for `time_moment = "perc"`).
+#'@param numclus Number of clusters (scenarios) to be calculated. If set to NULL
+#'  the number of ensemble members divided by 10 is used, with a minimum of 2
+#'  and a maximum of 8.
+#'@param lon_lim List with the two longitude margins in `c(-180,180)` format.
+#'@param lat_lim List with the two latitude margins.
+#'@param variance_explained Variance (percentage) to be explained by the set of
+#'  EOFs. Defaults to 80. Not used if numpcs is specified.
+#'@param numpcs Number of EOFs retained in the analysis (optional).
+#'@param cluster_dim Dimension along which to cluster. Typically "member" or
+#'  "sdate". This can also be a list like c("member", "sdate").
+#'@param time_dim String or character array with name(s) of dimension(s) over
+#'  which to compute statistics. If omitted c("ftime", "sdate", "time") are
+#'  searched in this order.
+#'@param verbose Logical for verbose output.
+#'@return A list with elements \code{$cluster} (cluster assigned for each
+#'member), \code{$freq} (relative frequency of each cluster),
+#'\code{$closest_member} (representative member for each cluster),
+#'\code{$repr_field} (list of fields for each representative member),
+#'\code{composites} (list of mean fields for each cluster), \code{$lon}
+#'(selected longitudes of output fields), \code{$lat} (selected latitudes of
+#'output fields).
+#'@examples
+#'dat_exp <- array(abs(rnorm(1152))*275, dim = c(dataset = 1, member = 4,
+#'                                               sdate = 6, ftime = 3,
+#'                                               lat = 4, lon = 4))
+#'lon <- seq(0, 3)
+#'lat <- seq(48, 45)
+#'coords <- list(lon = lon, lat = lat)
+#'exp <- list(data = dat_exp, coords = coords)
+#'attr(exp, 'class') <- 's2dv_cube'
+#'res <- CST_EnsClustering(exp = exp, numclus = 3,
+#'                         cluster_dim = c("sdate"))
 #'
 #'@export
 CST_EnsClustering <- function(exp, time_moment = "mean", numclus = NULL,
-                              lon_lim = NULL, lat_lim = NULL,
-                              variance_explained = 80, numpcs = NULL, time_dim = NULL,
-                              time_percentile = 90, cluster_dim = "member",
-                              verbose = F) {
+                              lon_lim = NULL, lat_lim = NULL,
+                              variance_explained = 80, numpcs = NULL,
+                              time_dim = NULL, time_percentile = 90,
+                              cluster_dim = "member", verbose = F) {
+
+  # Check 's2dv_cube'
   if (!inherits(exp, "s2dv_cube")) {
     stop("Parameter 'exp' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
   }
+  # Check 'exp' object structure
+  if (!all(c('data', 'coords') %in% names(exp))) {
+    stop("Parameter 'exp' must have 'data' and 'coords' elements ",
+         "within the 's2dv_cube' structure.")
+  }
+  # Check coordinates
+  if (!any(names(exp$coords) %in% .KnownLonNames()) |
+      !any(names(exp$coords) %in% .KnownLatNames())) {
+    stop("Spatial coordinate names do not match any of the names accepted by ",
+         "the package. Latitudes accepted names: 'lat', 'lats', 'latitude',",
+         " 'y', 'j', 'nav_lat'. 
Longitudes accepted names: 'lon', 'lons',", + " 'longitude', 'x', 'i', 'nav_lon'.") + } - result <- EnsClustering(exp$data, exp$lat, exp$lon, - time_moment = time_moment, numclus = numclus, - lon_lim = lon_lim, lat_lim = lat_lim, - variance_explained = variance_explained, numpcs = numpcs, - time_percentile = time_percentile, time_dim = time_dim, - cluster_dim = cluster_dim, verbose = verbose) + lon_name <- names(exp$coords)[[which(names(exp$coords) %in% .KnownLonNames())]] + lat_name <- names(exp$coords)[[which(names(exp$coords) %in% .KnownLatNames())]] + + result <- EnsClustering(exp$data, + lat = as.vector(exp$coords[[lat_name]]), + lon = as.vector(exp$coords[[lon_name]]), + time_moment = time_moment, numclus = numclus, + lon_lim = lon_lim, lat_lim = lat_lim, + variance_explained = variance_explained, + numpcs = numpcs, time_percentile = time_percentile, + time_dim = time_dim, cluster_dim = cluster_dim, + verbose = verbose) return(result) } - -#' @rdname EnsClustering -#' @title Ensemble clustering +#'@rdname EnsClustering +#'@title Ensemble clustering #' -#' @author Federico Fabiano - ISAC-CNR, \email{f.fabiano@isac.cnr.it} -#' @author Ignazio Giuntoli - ISAC-CNR, \email{i.giuntoli@isac.cnr.it} -#' @author Danila Volpi - ISAC-CNR, \email{d.volpi@isac.cnr.it} -#' @author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it} -#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} +#'@author Federico Fabiano - ISAC-CNR, \email{f.fabiano@isac.cnr.it} +#'@author Ignazio Giuntoli - ISAC-CNR, \email{i.giuntoli@isac.cnr.it} +#'@author Danila Volpi - ISAC-CNR, \email{d.volpi@isac.cnr.it} +#'@author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it} +#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} #' -#' @description This function performs a clustering on members/starting dates -#' and returns a number of scenarios, with representative members for each of them. -#' The clustering is performed in a reduced EOF space. +#'@description This function performs a clustering on members/starting dates +#'and returns a number of scenarios, with representative members for each of +#'them. The clustering is performed in a reduced EOF space. #' -#' @param data A matrix of dimensions 'dataset member sdate ftime lat lon' containing the variables to be analysed. -#' @param lat Vector of latitudes. -#' @param lon Vector of longitudes. -#' @param time_moment Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean), -#' 'sd' (standard deviation along time) or 'perc' (a selected percentile on time). -#' If 'perc' the keyword 'time_percentile' is also used. -#' @param time_percentile Set the percentile in time you want to analyse (used for `time_moment = "perc"). -#' @param numclus Number of clusters (scenarios) to be calculated. -#' If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8. -#' @param lon_lim List with the two longitude margins in `c(-180,180)` format. -#' @param lat_lim List with the two latitude margins. -#' @param variance_explained variance (percentage) to be explained by the set of EOFs. -#' Defaults to 80. Not used if numpcs is specified. -#' @param numpcs Number of EOFs retained in the analysis (optional). -#' @param cluster_dim Dimension along which to cluster. Typically "member" or "sdate". -#' This can also be a list like c("member", "sdate"). -#' @param time_dim String or character array with name(s) of dimension(s) over which to compute statistics. 
-#' If omitted c("ftime", "sdate", "time") are searched in this order. -#' @param verbose Logical for verbose output -#' @return A list with elements \code{$cluster} (cluster assigned for each member), -#' \code{$freq} (relative frequency of each cluster), \code{$closest_member} -#' (representative member for each cluster), \code{$repr_field} (list of fields -#' for each representative member), \code{composites} (list of mean fields for each cluster), -#' \code{$lon} (selected longitudes of output fields), -#' \code{$lat} (selected longitudes of output fields). +#'@param data A matrix of dimensions 'dataset member sdate ftime lat lon' +#' containing the variables to be analysed. Latitudinal dimension accepted +#' names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal +#' dimension accepted names: 'lon', 'lons','longitude', 'x', 'i', 'nav_lon'. +#'@param lat Vector of latitudes. +#'@param lon Vector of longitudes. +#'@param time_moment Decides the moment to be applied to the time dimension. Can +#' be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' +#' (a selected percentile on time). If 'perc' the keyword 'time_percentile' is +#' also used. +#'@param time_percentile Set the percentile in time you want to analyse (used +#' for `time_moment = "perc"). +#'@param numclus Number of clusters (scenarios) to be calculated. If set to NULL +#' the number of ensemble members divided by 10 is used, with a minimum of 2 +#' and a maximum of 8. +#'@param lon_lim List with the two longitude margins in `c(-180,180)` format. +#'@param lat_lim List with the two latitude margins. +#'@param variance_explained variance (percentage) to be explained by the set of +#' EOFs. Defaults to 80. Not used if numpcs is specified. +#'@param numpcs Number of EOFs retained in the analysis (optional). +#'@param cluster_dim Dimension along which to cluster. Typically "member" or +#' "sdate". This can also be a list like c("member", "sdate"). +#'@param time_dim String or character array with name(s) of dimension(s) over +#' which to compute statistics. If omitted c("ftime", "sdate", "time") are +#' searched in this order. +#'@param verbose Logical for verbose output +#'@return A list with elements \code{$cluster} (cluster assigned for each member), +#'\code{$freq} (relative frequency of each cluster), \code{$closest_member} +#'(representative member for each cluster), \code{$repr_field} (list of fields for +#'each representative member), \code{composites} (list of mean fields for each +#'cluster), \code{$lon} (selected longitudes of output fields), \code{$lat} +#'(selected longitudes of output fields). 
+#' +#'@examples +#'exp <- array(abs(rnorm(1152))*275, dim = c(dataset = 1, member = 4, +#' sdate = 6, ftime = 3, +#' lat = 4, lon = 4)) +#'lon <- seq(0, 3) +#'lat <- seq(48, 45) +#'res <- EnsClustering(exp, lat = lat, lon = lon, numclus = 2, +#' cluster_dim = c("member", "dataset", "sdate")) #' -#' @examples -#'\donttest{ -#' exp <- lonlat_temp$exp -#' res <- EnsClustering(exp$data, exp$lat, exp$lon, numclus = 3, -#' cluster_dim = c("member", "dataset", "sdate")) -#'} #'@export - EnsClustering <- function(data, lat, lon, time_moment = "mean", numclus = NULL, - lon_lim = NULL, lat_lim = NULL, variance_explained = 80, - numpcs = NULL, time_percentile = 90, time_dim = NULL, - cluster_dim = "member", verbose = T) { + lon_lim = NULL, lat_lim = NULL, variance_explained = 80, + numpcs = NULL, time_percentile = 90, time_dim = NULL, + cluster_dim = "member", verbose = T) { + + # Know spatial coordinates names + if (!any(names(dim(data)) %in% .KnownLonNames()) | + !any(names(dim(data)) %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + } + + lon_name <- names(dim(data))[[which(names(dim(data)) %in% .KnownLonNames())]] + lat_name <- names(dim(data))[[which(names(dim(data)) %in% .KnownLatNames())]] # Check/detect time_dim if (is.null(time_dim)) { @@ -210,14 +244,14 @@ EnsClustering <- function(data, lat, lon, time_moment = "mean", numclus = NULL, } # Repeatedly apply .ensclus - result <- Apply(exp, target_dims = c(cluster_dim, "lat", "lon"), .ensclus, + result <- Apply(exp, target_dims = c(cluster_dim, lat_name, lon_name), .ensclus, lat, lon, numclus = numclus, lon_lim = lon_lim, lat_lim = lat_lim, variance_explained = variance_explained, numpcs = numpcs, verbose = verbose) # Expand result$closest_member into indices in cluster_dim dimensions - cm=result$closest_member + cm = result$closest_member cml <- vector(mode = "list", length = length(cluster_dim)) cum <- cm * 0 dim_cd <- dim(exp)[cluster_dim] @@ -229,7 +263,10 @@ EnsClustering <- function(data, lat, lon, time_moment = "mean", numclus = NULL, names(cml) <- cluster_dim result$closest_member <- cml - return(append(result, list(lat = lat, lon = lon))) + result[[lon_name]] <- lon + result[[lat_name]] <- lat + + return(result) } # Atomic ensclus function diff --git a/R/CST_InsertDim.R b/R/CST_InsertDim.R new file mode 100644 index 0000000000000000000000000000000000000000..f765b15d64743c023d697afa9878b699fc4f2141 --- /dev/null +++ b/R/CST_InsertDim.R @@ -0,0 +1,71 @@ +#'Add a named dimension to an object of class s2dv_cube +#' +#'Insert an extra dimension into an array at position 'posdim' with length +#''lendim'. The array in \code{data} repeats along the new dimension. +#'The dimensions, coordinates and attributes are modified accordingly. +#' +#'@author Agudetse Roures Victoria, \email{victoria.agudetse@bsc.es} +#' +#'@param data An object of class \code{s2dv_cube} to which the additional +#' dimension should be added. +#'@param posdim An integer indicating the position of the new dimension. +#'@param lendim An integer indicating the length of the new dimension. +#'@param name A character string indicating the name for the new dimension. +#'@param values A vector containing the values of the new dimension and any +#' relevant attributes. If NULL, a sequence of integers from 1 to lendim will +#' be added. +#' +#'@return An object of class \code{s2dv_cube} with similar data, coordinates and +#'attributes as the \code{data} input, but with an additional dimension. 
+#'
+#'@examples
+#'# Example with sample data:
+#'# Check original dimensions and coordinates
+#'lonlat_temp$exp$dims
+#'names(lonlat_temp$exp$coords)
+#'# Add 'variable' dimension
+#'exp <- CST_InsertDim(lonlat_temp$exp,
+#'                     posdim = 2,
+#'                     lendim = 1,
+#'                     name = "variable",
+#'                     values = c("tas"))
+#'# Check new dimensions and coordinates
+#'exp$dims
+#'exp$coords$variable
+#'
+#'@seealso \link[s2dv]{InsertDim}
+#'
+#'@importFrom s2dv InsertDim
+#'@export
+CST_InsertDim <- function(data, posdim, lendim, name, values = NULL) {
+  # Check inputs
+  # Check 's2dv_cube'
+  if (!inherits(data, 's2dv_cube')) {
+    stop("Parameter 'data' must be of the class 's2dv_cube'.")
+  }
+  # Check name
+  if (!is.character(name) || length(name) > 1) {
+    stop("Parameter 'name' must be a character string.")
+  }
+  # Check values
+  if (is.null(values)) {
+    warning(paste0("Parameter 'values' is not provided. Adding a sequence of ",
+                   "integers from 1 to 'lendim' as the values for the new dimension."))
+    values <- 1:lendim
+  } else {
+    if (!(length(values) == lendim)) {
+      stop(paste0("The length of the parameter 'values' must be consistent ",
+                  "with the parameter 'lendim'."))
+    }
+  }
+
+  # Insert dim in data
+  data$data <- s2dv::InsertDim(data$data, posdim = posdim, lendim = lendim,
+                               name = name)
+  # Adjust dimensions
+  data$dims <- dim(data$data)
+  # Adjust coordinates
+  data$coords[[name]] <- values
+  data$coords <- data$coords[names(data$dims)]
+  return(data)
+}
diff --git a/R/CST_MergeDims.R b/R/CST_MergeDims.R
index 3bd38a81f97c2a7f59b344e9b9f4a0773d6c1995..a1ecbd156b4b72f6d72efbc64c8fa4bd2e389f3c 100644
--- a/R/CST_MergeDims.R
+++ b/R/CST_MergeDims.R
@@ -2,126 +2,133 @@
 #'
 #'@author Nuria Perez-Zanon, \email{nuria.perez@bsc.es}
 #'
-#'@description This function merges two dimensions of the array \code{data} in a 's2dv_cube' object into one. The user can select the dimensions to merge and provide the final name of the dimension. The user can select to remove NA values or keep them.
+#'@description This function merges two dimensions of the array \code{data} in a
+#''s2dv_cube' object into one. The user can select the dimensions to merge and
+#'provide the final name of the dimension. The user can select to remove NA
+#'values or keep them.
 #'
-#'@param data a 's2dv_cube' object
-#'@param merge_dims a character vector indicating the names of the dimensions to merge
-#'@param rename_dim a character string indicating the name of the output dimension. If left at NULL, the first dimension name provided in parameter \code{merge_dims} will be used.
-#'@param na.rm a logical indicating if the NA values should be removed or not.
+#'@param data An 's2dv_cube' object.
+#'@param merge_dims A character vector indicating the names of the dimensions to
+#'  merge.
+#'@param rename_dim A character string indicating the name of the output
+#'  dimension. If left at NULL, the first dimension name provided in parameter
+#'  \code{merge_dims} will be used.
+#'@param na.rm A logical indicating if the NA values should be removed or not.
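# Editorial sketch (not part of the diff): minimal use of the new CST_InsertDim
# on a hand-built cube, mirroring the toy-cube pattern used in the examples
# above; assumes CSTools (with this patch) and s2dv are installed.
library(CSTools)
dat <- array(rnorm(4), dim = c(lat = 2, lon = 2))
cube <- list(data = dat, dims = dim(dat),
             coords = list(lat = c(40, 41), lon = c(0, 1)))
class(cube) <- 's2dv_cube'
cube <- CST_InsertDim(cube, posdim = 1, lendim = 1,
                      name = "var", values = "tas")
cube$dims        # var = 1, lat = 2, lon = 2
cube$coords$var  # "tas"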
#' #'@import abind #'@importFrom ClimProjDiags Subset #'@examples -#' #'data <- 1 : c(2 * 3 * 4 * 5 * 6 * 7) #'dim(data) <- c(time = 7, lat = 2, lon = 3, monthly = 4, member = 6, #' dataset = 5, var = 1) #'data[2,,,,,,] <- NA #'data[c(3,27)] <- NA -#'data <-list(data = data) +#'data <- list(data = data) #'class(data) <- 's2dv_cube' #'new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly')) -#'dim(new_data$data) #'new_data <- CST_MergeDims(data, merge_dims = c('lon', 'lat'), rename_dim = 'grid') -#'dim(new_data$data) #'new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly'), na.rm = TRUE) -#'dim(new_data$data) #'@export -CST_MergeDims <- function(data, merge_dims = c('ftime', 'monthly'), rename_dim = NULL, - na.rm = FALSE) { - if (!inherits(data, 's2dv_cube')) { +CST_MergeDims <- function(data, merge_dims = c('ftime', 'monthly'), + rename_dim = NULL, na.rm = FALSE) { + # Check 's2dv_cube' + if (!inherits(data, 's2dv_cube')) { stop("Parameter 'data' must be of the class 's2dv_cube', ", "as output by CSTools::CST_Load.") - } - data$data <- MergeDims(data$data, merge_dims = merge_dims, - rename_dim = rename_dim, na.rm = na.rm) - return(data) + } + data$data <- MergeDims(data$data, merge_dims = merge_dims, + rename_dim = rename_dim, na.rm = na.rm) + return(data) } #'Function to Split Dimension #' #'@author Nuria Perez-Zanon, \email{nuria.perez@bsc.es} #' -#'@description This function merges two dimensions of an array into one. The user can select the dimensions to merge and provide the final name of the dimension. The user can select to remove NA values or keep them. +#'@description This function merges two dimensions of an array into one. The +#'user can select the dimensions to merge and provide the final name of the +#'dimension. The user can select to remove NA values or keep them. #' -#'@param data an n-dimensional array with named dimensions -#'@param merge_dims a character vector indicating the names of the dimensions to merge -#'@param rename_dim a character string indicating the name of the output dimension. If left at NULL, the first dimension name provided in parameter \code{merge_dims} will be used. -#'@param na.rm a logical indicating if the NA values should be removed or not. +#'@param data An n-dimensional array with named dimensions +#'@param merge_dims A character vector indicating the names of the dimensions to +#' merge. +#'@param rename_dim A character string indicating the name of the output +#' dimension. If left at NULL, the first dimension name provided in parameter +#' \code{merge_dims} will be used. +#'@param na.rm A logical indicating if the NA values should be removed or not. 
 #'
 #'@import abind
 #'@importFrom ClimProjDiags Subset
 #'@examples
-#'
 #'data <- 1 : 20
 #'dim(data) <- c(time = 10, lat = 2)
 #'new_data <- MergeDims(data, merge_dims = c('time', 'lat'))
 #'@export
-MergeDims <- function(data, merge_dims = c('time', 'monthly'), rename_dim = NULL,
-                      na.rm = FALSE) {
-  # check data
-  if (is.null(data)) {
-    stop("Parameter 'data' cannot be NULL.")
-  }
-  if (is.null(dim(data))) {
-    stop("Parameter 'data' must have dimensions.")
-  }
-  if (is.null(names(dim(data)))) {
-    stop("Parameter 'data' must have dimension names.")
-  }
-  dims <- dim(data)
-  # check merge_dims
-  if (is.null(merge_dims)) {
-    stop("Parameter 'merge_dims' cannot be NULL.")
-  }
-  if (!is.character(merge_dims)) {
-    stop("Parameter 'merge_dims' must be a character vector ",
-         "indicating the names of the dimensions to be merged.")
-  }
-  if (length(merge_dims) > 2) {
-    warning("Only two dimensions can be merge, only the first two ",
-            "dimension will be used. To merge further dimensions ",
-            "consider to use this function multiple times.")
-    merge_dims <- merge_dims[1 : 2]
-  } else if (length(merge_dims) < 2) {
-    stop("Parameter 'merge_dims' must be of length two.")
-  }
-  if (is.null(rename_dim)) {
-    rename_dim <- merge_dims[1]
-  }
-  if (length(rename_dim) > 1) {
-    warning("Parameter 'rename_dim' has length greater than 1 ",
-            "and only the first element will be used.")
-    rename_dim <- as.character(rename_dim[1])
-  }
-  if (!any(names(dims) %in% merge_dims)) {
-    stop("Parameter 'merge_dims' must match with dimension ",
-         "names in parameter 'data'.")
-  }
-  pos1 <- which(names(dims) == merge_dims[1])
-  pos2 <- which(names(dims) == merge_dims[2])
-  if (length(pos1) == 0 | length(pos2) == 0) {
-    stop("Parameter 'merge_dims' must match with dimension ",
-         "names in parameter 'data'.")
-  }
-  if (pos1 > pos2) {
-    pos1 <- pos1 - 1
-  }
-  data <- lapply(1 : dims[pos2], function(x) {Subset(data, along = pos2,
-                 indices = x, drop = 'selected')})
-  data <- abind(data, along = pos1)
-  names(dim(data)) <- names(dims)[-pos2]
-  if (!is.null(rename_dim)) {
-    names(dim(data))[pos1] <- rename_dim
-  }
-  if (na.rm) {
-    nas <- which(is.na(Subset(data, along = -pos1, indices = 1)))
-    if (length(nas) != 0) {
-      nas <- unlist(lapply(nas, function(x) {
-                    if(all(is.na(Subset(data, along = pos1,
-                                        indices = x)))) {
-                    return(x)}}))
-      data <- Subset(data, along = pos1, indices = -nas)
-    }
+MergeDims <- function(data, merge_dims = c('time', 'monthly'),
+                      rename_dim = NULL, na.rm = FALSE) {
+  # check data
+  if (is.null(data)) {
+    stop("Parameter 'data' cannot be NULL.")
+  }
+  if (is.null(dim(data))) {
+    stop("Parameter 'data' must have dimensions.")
+  }
+  if (is.null(names(dim(data)))) {
+    stop("Parameter 'data' must have dimension names.")
+  }
+  dims <- dim(data)
+  # check merge_dims
+  if (is.null(merge_dims)) {
+    stop("Parameter 'merge_dims' cannot be NULL.")
+  }
+  if (!is.character(merge_dims)) {
+    stop("Parameter 'merge_dims' must be a character vector ",
+         "indicating the names of the dimensions to be merged.")
+  }
+  if (length(merge_dims) > 2) {
+    warning("Only two dimensions can be merged; only the first two ",
+            "dimensions will be used. To merge further dimensions, ",
+            "consider using this function multiple times.")
+    merge_dims <- merge_dims[1 : 2]
+  } else if (length(merge_dims) < 2) {
+    stop("Parameter 'merge_dims' must be of length two.")
+  }
+  if (is.null(rename_dim)) {
+    rename_dim <- merge_dims[1]
+  }
+  if (length(rename_dim) > 1) {
+    warning("Parameter 'rename_dim' has length greater than 1 ",
+            "and only the first element will be used.")
+    rename_dim <- as.character(rename_dim[1])
+  }
+  if (!any(names(dims) %in% merge_dims)) {
+    stop("Parameter 'merge_dims' must match with dimension ",
+         "names in parameter 'data'.")
+  }
+  pos1 <- which(names(dims) == merge_dims[1])
+  pos2 <- which(names(dims) == merge_dims[2])
+  if (length(pos1) == 0 | length(pos2) == 0) {
+    stop("Parameter 'merge_dims' must match with dimension ",
+         "names in parameter 'data'.")
+  }
+  if (pos1 > pos2) {
+    pos1 <- pos1 - 1
+  }
+  data <- lapply(1:dims[pos2], function(x) {Subset(data, along = pos2,
+                 indices = x, drop = 'selected')})
+  data <- abind(data, along = pos1)
+  names(dim(data)) <- names(dims)[-pos2]
+  if (!is.null(rename_dim)) {
+    names(dim(data))[pos1] <- rename_dim
+  }
+  if (na.rm) {
+    nas <- which(is.na(Subset(data, along = -pos1, indices = 1)))
+    if (length(nas) != 0) {
+      nas <- unlist(lapply(nas, function(x) {
+                    if(all(is.na(Subset(data, along = pos1,
+                                        indices = x)))) {
+                    return(x)}}))
+      data <- Subset(data, along = pos1, indices = -nas)
     }
-return(data)
+  }
+  return(data)
 }
diff --git a/R/CST_MultiEOF.R b/R/CST_MultiEOF.R
index 7bdf3698ef65cc569382d5d14061a44517b31e77..bd218423f314806e36a796b34c5b7e2ff6850869 100644
--- a/R/CST_MultiEOF.R
+++ b/R/CST_MultiEOF.R
@@ -1,126 +1,180 @@
-#' @rdname CST_MultiEOF
-#' @title EOF analysis of multiple variables
+#'@rdname CST_MultiEOF
+#'@title EOF analysis of multiple variables
 #'
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
-#' @author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it}
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it}
 #'
-#' @description This function performs EOF analysis over multiple variables,
-#' accepting in input a list of CSTools objects. Based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data for the latitude cosine square root.
-
 #'
-#' @param datalist A list of objects of the class 's2dv_cube', containing the variables to be analysed.
-#' Each data object in the list is expected to have an element named \code{$data} with at least two
-#' spatial dimensions named "lon" and "lat", a dimension "ftime" and a dimension "sdate".
-#' @param neof_composed Number of composed eofs to return in output -#' @param minvar Minimum variance fraction to be explained in first decomposition -#' @param neof_max Maximum number of single eofs considered in the first decomposition -#' @param lon_lim Vector with longitudinal range limits for the EOF calculation for all input variables -#' @param lat_lim Vector with latitudinal range limits for the EOF calculation for all input variables -#' @return A list with elements \code{$coeff} (an array of time-varying principal component coefficients), -#' \code{$variance} (a matrix of explained variances), -#' \code{eof_pattern} (a matrix of EOF patterns obtained by regression for each variable). -#' @import abind -#' @examples -#' \donttest{ -#' library(zeallot) -#' library(ClimProjDiags) -#' c(exp, obs) %<-% lonlat_temp -#' # Create three datasets (from the members) -#' exp1 <- exp -#' exp2 <- exp -#' exp3 <- exp -#' exp1$data <- Subset(exp$data, along = 2, indices = 1 : 5) -#' exp2$data <- Subset(exp$data, along = 2, indices = 6 : 10) -#' exp3$data <- Subset(exp$data, along = 2, indices = 11 : 15) -#' -#' cal <- CST_MultiEOF(list(exp1, exp2, exp3), neof_max=5, neof_composed=2) -#' str(cal) -#' # List of 3 -#' # $ coeff : num [1:3, 1:6, 1:2, 1:5] -0.312 -0.588 0.724 1.202 1.181 ... -#' # $ variance : num [1:2, 1:5] 0.413 0.239 0.352 0.27 0.389 ... -#' # $ eof_pattern: num [1:3, 1:53, 1:22, 1:2, 1:5] -1.47 -0.446 -0.656 -1.534 -0.464 ... -#' dim(cal$coeff) -#' # ftime sdate eof member -#' # 3 6 2 3 +#'@description This function performs EOF analysis over multiple variables, +#'accepting in input a list of CSTools objects. Based on Singular Value +#'Decomposition. For each field the EOFs are computed and the corresponding PCs +#'are standardized (unit variance, zero mean); the minimum number of principal +#'components needed to reach the user-defined variance is retained. The function +#'weights the input data for the latitude cosine square root. #' -#' cal <- CST_MultiEOF(list(exp1, exp2, exp3) , minvar=0.9) -#' str(cal) -#' # $ coeff : num [1:3, 1:6, 1:5, 1:5] 0.338 0.603 -0.736 -1.191 -1.198 ... -#' # $ variance : num [1:5, 1:5] 0.3903 0.2264 0.1861 0.1032 0.0379 ... -#' # $ eof_pattern: num [1:3, 1:53, 1:22, 1:5, 1:5] 1.477 0.454 0.651 1.541 0.47 ... +#'@param datalist A list of objects of the class 's2dv_cube', containing the +#' variables to be analysed. Each data object in the list is expected to have +#' an element named \code{$data} with at least two spatial dimensions named +#' "lon" and "lat", a dimension "ftime" and a dimension "sdate". Latitudinal +#' dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. +#' Longitudinal dimension accepted names: 'lon', 'lons','longitude', 'x', 'i', +#' 'nav_lon'. +#'@param neof_composed Number of composed eofs to return in output. +#'@param minvar Minimum variance fraction to be explained in first decomposition. +#'@param neof_max Maximum number of single eofs considered in the first +#' decomposition. +#'@param lon_lim Vector with longitudinal range limits for the EOF calculation +#' for all input variables. +#'@param lat_lim Vector with latitudinal range limits for the EOF calculation +#' for all input variables. +#'@return A list with elements \code{$coeff} (an array of time-varying principal +#'component coefficients), \code{$variance} (a matrix of explained variances), +#'\code{eof_pattern} (a matrix of EOF patterns obtained by regression for each +#'variable). 
+#'@import abind +#'@examples +#'seq <- 1 : (2 * 3 * 4 * 5 * 6 * 8) +#'mod1 <- sin( 0.7 + seq )^2 + cos( seq ^ 2 * 1.22 ) +#'dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, +#' lon = 8) +#'mod2 <- sin( seq * 2 ) ^ 3 + cos( seq ^ 2 ) +#'dim(mod2) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, +#' lon = 8) +#'lon <- seq(0, 35, 5) +#'lat <- seq(0, 25, 5) +#'exp1 <- list(data = mod1, coords = list(lat = lat, lon = lon)) +#'exp2 <- list(data = mod2, coords = list(lat = lat, lon = lon)) +#'attr(exp1, 'class') <- 's2dv_cube' +#'attr(exp2, 'class') <- 's2dv_cube' +#'d = as.POSIXct(c("2017/01/01", "2017/01/02", "2017/01/03", "2017/01/04", +#' "2017/01/05", "2018/01/01", "2018/01/02", "2018/01/03", +#' "2018/01/04", "2018/01/05", "2019/01/01", "2019/01/02", +#' "2019/01/03", "2019/01/04", "2019/01/05", "2020/01/01", +#' "2020/01/02", "2020/01/03", "2020/01/04", "2020/01/05")) +#'exp1$attrs$Dates = d +#'exp2$attrs$Dates = d #' -#' cal <- CST_MultiEOF(list(exp1, exp2)) -#' cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim=c(5, 30), lat_lim=c(35, 50), neof_composed=3) -#' } -#' @export - -CST_MultiEOF <- function(datalist, - neof_max = 40, neof_composed = 5, minvar = 0.6, - lon_lim = NULL, lat_lim = NULL) { - +#'cal <- CST_MultiEOF(datalist = list(exp1, exp2), neof_composed = 2) +#'@export +CST_MultiEOF <- function(datalist, neof_max = 40, neof_composed = 5, + minvar = 0.6, lon_lim = NULL, lat_lim = NULL) { + # Check s2dv_cube if (!(all(sapply(datalist, inherits, 's2dv_cube')))) { stop("Elements of the list in parameter 'datalist' must be of the class ", "'s2dv_cube', as output by CSTools::CST_Load.") } # Check if all dims equal - adims=lapply(lapply(datalist, function(x) x$data), dim) - if( !all(apply(apply(abind(adims, along = 0), 2, duplicated), 2, sum) == - (length(adims)-1))) { + adims = lapply(lapply(datalist, function(x) x$data), dim) + if(!all(apply(apply(abind(adims, along = 0), 2, duplicated), 2, sum) == + (length(adims)-1))) { stop("Input data fields must all have the same dimensions.") } - #print("Pasting data...") exp <- abind(lapply(datalist, '[[', 'data'), along = 0) dim(exp) <- c(var = length(datalist), dim(datalist[[1]]$data)) - #print("...done") if (any(is.na(exp))) { stop("Input data contain NA values.") } - result <- MultiEOF(exp, datalist[[1]]$lon, datalist[[1]]$lat, - datalist[[1]]$Dates$start, minvar = minvar, - neof_max = neof_max, neof_composed = neof_composed, - lon_lim = lon_lim, lat_lim = lat_lim) + # Check coordinates + if (!all(c('data', 'coords', 'attrs') %in% names(datalist[[1]]))) { + stop("Parameter 'datalist' must have 'data', 'coords' and 'attrs' elements ", + "within the 's2dv_cube' structure.") + } + if (!any(names(datalist[[1]]$coords) %in% .KnownLonNames()) | + !any(names(datalist[[1]]$coords) %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted by the ", + "package. Latitudes accepted names: 'lat', 'lats', 'latitude', 'y', 'j', ", + "'nav_lat'. 
Longitudes accepted names: 'lon', 'lons', 'longitude', 'x',", + " 'i', 'nav_lon'.") + } + # Check dimensions + if (!any(names(dim(datalist[[1]]$data)) %in% .KnownLonNames()) | + !any(names(dim(datalist[[1]]$data)) %in% .KnownLatNames())) { + stop("Spatial dimension names do not match any of the names accepted by ", + "the package.") + } + + lon <- names(datalist[[1]]$coords)[[which(names(datalist[[1]]$coords) %in% .KnownLonNames())]] + lat <- names(datalist[[1]]$coords)[[which(names(datalist[[1]]$coords) %in% .KnownLatNames())]] + + lon_name <- names(dim(datalist[[1]]$data))[[which(names(dim(datalist[[1]]$data)) %in% .KnownLonNames())]] + lat_name <- names(dim(datalist[[1]]$data))[[which(names(dim(datalist[[1]]$data)) %in% .KnownLatNames())]] + + result <- MultiEOF(exp, + lon = as.vector(datalist[[1]]$coords[[lon]]), + lat = as.vector(datalist[[1]]$coords[[lat]]), + lon_dim = lon_name, lat_dim = lat_name, + time = datalist[[1]]$attrs$Dates, minvar = minvar, + neof_max = neof_max, neof_composed = neof_composed, + lon_lim = lon_lim, lat_lim = lat_lim) return(result) } - -#' @rdname MultiEOF -#' @title EOF analysis of multiple variables starting from an array (reduced version) +#'@rdname MultiEOF +#'@title EOF analysis of multiple variables starting from an array (reduced +#'version) #' -#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} -#' @author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it} +#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} +#'@author Paolo Davini - ISAC-CNR, \email{p.davini@isac.cnr.it} #' -#' @description This function performs EOF analysis over multiple variables, accepting in input an array with a dimension \code{"var"} for each variable to analyse. Based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data for the latitude cosine square root. +#'@description This function performs EOF analysis over multiple variables, +#'accepting in input an array with a dimension \code{"var"} for each variable to +#'analyse. Based on Singular Value Decomposition. For each field the EOFs are +#'computed and the corresponding PCs are standardized (unit variance, zero mean); +#'the minimum number of principal components needed to reach the user-defined +#'variance is retained. The function weights the input data for the latitude +#'cosine square root. #' -#' @param data A multidimensional array with dimension \code{"var"}, -#' containing the variables to be analysed. The other diemnsions follow the same structure as the -#' \code{"exp"} element of a 's2dv_cube' object. -#' @param lon Vector of longitudes. -#' @param lat Vector of latitudes. -#' @param time Vector or matrix of dates in POSIXct format. 
-#' @param lon_dim String with dimension name of longitudinal coordinate
-#' @param lat_dim String with dimension name of latitudinal coordinate
-#' @param neof_composed Number of composed eofs to return in output
-#' @param minvar Minimum variance fraction to be explained in first decomposition
-#' @param neof_max Maximum number of single eofs considered in the first decomposition
-#' @param lon_lim Vector with longitudinal range limits for the calculation for all input variables
-#' @param lat_lim Vector with latitudinal range limits for the calculation for all input variables
-#' @return A list with elements \code{$coeff} (an array of time-varying principal component coefficients),
-#' \code{$variance} (a matrix of explained variances),
-#' \code{eof_pattern} (a matrix of EOF patterns obtained by regression for each variable).
-#' @import multiApply
-#' @export
-
+#'@param data A multidimensional array with dimension \code{"var"}, containing
+#'  the variables to be analysed. The other dimensions follow the same structure
+#'  as the \code{"exp"} element of a 's2dv_cube' object. Latitudinal
+#'  dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'.
+#'  Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i',
+#'  'nav_lon'.
+#'@param lon Vector of longitudes.
+#'@param lat Vector of latitudes.
+#'@param time Vector or matrix of dates in POSIXct format.
+#'@param lon_dim String with dimension name of longitudinal coordinate.
+#'@param lat_dim String with dimension name of latitudinal coordinate.
+#'@param neof_composed Number of composed eofs to return in output.
+#'@param minvar Minimum variance fraction to be explained in first decomposition.
+#'@param neof_max Maximum number of single eofs considered in the first
+#'  decomposition.
+#'@param lon_lim Vector with longitudinal range limits for the calculation for
+#'  all input variables.
+#'@param lat_lim Vector with latitudinal range limits for the calculation for
+#'  all input variables.
+#'@return A list with elements \code{$coeff} (an array of time-varying principal
+#'component coefficients), \code{$variance} (a matrix of explained variances),
+#'\code{eof_pattern} (a matrix of EOF patterns obtained by regression for each
+#'variable).
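# Editorial sketch (not part of the diff): the 'minvar' truncation applied
# inside .multi.eofs, shown in isolation -- PCs are kept up to the first one
# whose cumulative explained variance exceeds the threshold:
variance <- c(0.45, 0.25, 0.15, 0.10, 0.05)
minvar <- 0.6
reqPC <- which(cumsum(variance) > minvar)[1]
reqPC  # 2: the first two PCs already explain 70% of the variance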
+#'@examples +#'exp <- array(runif(1280)*280, dim = c(dataset = 2, member = 2, sdate = 3, +#' ftime = 3, lat = 4, lon = 4, var = 1)) +#'lon <- seq(0, 3) +#'lat <- seq(47, 44) +#'dates <- c("2000-11-01", "2000-12-01", "2001-01-01", "2001-11-01", +#' "2001-12-01", "2002-01-01", "2002-11-01", "2002-12-01", "2003-01-01") +#'Dates <- as.POSIXct(dates, format = "%Y-%m-%d") +#'dim(Dates) <- c(ftime = 3, sdate = 3) +#'cal <- MultiEOF(data = exp, lon = lon, lat = lat, time = Dates) +#'@import multiApply +#'@export MultiEOF <- function(data, lon, lat, time, - lon_dim = "lon", lat_dim = "lat", - neof_max = 40, neof_composed = 5, minvar = 0.6, - lon_lim = NULL, lat_lim = NULL) { - # Check/detect time_dim + lon_dim = "lon", lat_dim = "lat", + neof_max = 40, neof_composed = 5, minvar = 0.6, + lon_lim = NULL, lat_lim = NULL) { + + # Know spatial coordinates names + if (!any(lon_dim %in% .KnownLonNames()) | + !any(lat_dim %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + } - # reorder and group ftime and sdate together at the end in that order + # Reorder and group ftime and sdate together at the end in that order cdim0 <- dim(data) imaskt <- names(cdim0) %in% "ftime" imasks <- names(cdim0) %in% "sdate" @@ -135,37 +189,42 @@ MultiEOF <- function(data, lon, lat, time, result <- Apply(data, c("var", lon_dim, lat_dim, "samples"), .multi.eofs, lon, lat, time, neof_max = neof_max, neof_composed = neof_composed, minvar = minvar, - xlim = lon_lim, ylim = lat_lim) - + xlim = lon_lim, ylim = lat_lim, + lon_dim = lon_dim, lat_dim = lat_dim) + # Expand back samples to compacted dims dim(result$coeff) <- c(cdim[-ind], dim(result$coeff)[-1]) # Recover first lon and first lat list - dd=dim(result$lon)[1]; m=matrix(1, nrow=dd, ncol=length(dim(result$lon))); - m[1:dd]=1:dd; result$lon = result$lon[m] - dd=dim(result$lat)[1]; m=matrix(1, nrow=dd, ncol=length(dim(result$lat))); - m[1:dd]=1:dd; result$lat = result$lat[m] + dd = dim(result[[lon_dim]])[1]; m = matrix(1, nrow = dd, ncol = length(dim(result[[lon_dim]]))); + m[1:dd] = 1:dd; result[[lon_dim]] = result[[lon_dim]][m] + dd = dim(result[[lat_dim]])[1]; m = matrix(1, nrow = dd, ncol = length(dim(result[[lat_dim]]))); + m[1:dd] = 1:dd; result[[lat_dim]] = result[[lat_dim]][m] return(result) } - -#' Atomic MultiEOF -#' @param field_arr_raw an array of dimension: (n_field, lon, lat, time). -#' where n_field is the number of variables over which to calculate -#' the multi_eofs. -#' @param neof_composed Number of composed eofs to return in output -#' @param minvar Minimum variance fraction to be explained in first decomposition -#' @param neof_max Maximum number of single eofs considered in the first decomposition -#' @param xlim Vector with longitudinal range limits for the calculation -#' @param ylim Vector with latitudinal range limits for the calculation -#' @return A list with elements \code{$coeff} (an array of time-varying principal component coefficients), -#' \code{$variance} (a matrix of explained variances), -#' \code{eof_pattern} (a matrix of EOF patterns obtained by regression for each variable). -#' @noRd - -.multi.eofs <- function(field_arr_raw, lon, lat, time, - neof_max = 40, neof_composed = 5, minvar = 0.6, - xlim = NULL, ylim = NULL) { +#'Atomic MultiEOF +#'@param field_arr_raw An array of dimension: (n_field, lon, lat, time). +#' where n_field is the number of variables over which to calculate the +#' multi_eofs. +#'@param neof_composed Number of composed eofs to return in output. 
+#'@param minvar Minimum variance fraction to be explained in first decomposition. +#'@param neof_max Maximum number of single eofs considered in the first +#' decomposition. +#'@param xlim Vector with longitudinal range limits for the calculation. +#'@param ylim Vector with latitudinal range limits for the calculation. +#'@param lon_dim String with dimension name of longitudinal coordinate. +#'@param lat_dim String with dimension name of latitudinal coordinate. +#' +#'@return A list with elements \code{$coeff} (an array of time-varying principal +#'component coefficients), \code{$variance} (a matrix of explained variances), +#'\code{eof_pattern} (a matrix of EOF patterns obtained by regression for each +#'variable). +#'@noRd + +.multi.eofs <- function(field_arr_raw, lon, lat, time, neof_max = 40, + neof_composed = 5, minvar = 0.6, xlim = NULL, + ylim = NULL, lon_dim = "lon", lat_dim = "lat") { if (exists(".lm.fit")) { lin.fit <- .lm.fit @@ -174,25 +233,20 @@ MultiEOF <- function(data, lon, lat, time, } n_field <- dim(field_arr_raw)[1] - etime <- .power.date(time) - #print("Calculating anomalies...") + field_arr <- array(dim = dim(field_arr_raw)) for (k in seq(1, n_field, 1)) { - field_arr[k, , , ] <- .daily.anom.mean( - lon, lat, field_arr_raw[k, , , ], etime - ) + field_arr[k, , , ] <- .daily.anom.mean(lon, lat, field_arr_raw[k, , , ], etime) } # area weighting, based on the root of cosine - #print("Area Weighting...") ww <- .area.weight(lon, lat, root = T) for (k in seq(1, n_field, 1)) { field_orig <- field_arr[k, , , ] # calculate the area weight field <- sweep(field_orig, c(1, 2), ww, "*") - idx <- .selbox(lon, lat, xlim, ylim) slon <- lon[idx$ilon] slat <- lat[idx$ilat] @@ -209,10 +263,7 @@ MultiEOF <- function(data, lon, lat, time, pattern <- array(SVD$u, dim = c(dim(field)[1], dim(field)[2], neof_max)) coefficient <- SVD$v variance <- (SVD$d[1:neof_max]) ^ 2 / sum((SVD$d) ^ 2) - #print("Accumulated variance:") - #print(cumsum(variance)) reqPC <- which(cumsum(variance) > minvar)[1] - #print("Number of EOFs needed for var:") variance <- variance[1:reqPC] coefficient <- coefficient[, 1:reqPC] if (reqPC == 1) { @@ -222,16 +273,12 @@ MultiEOF <- function(data, lon, lat, time, regression <- array(NA, dim = c(length(lon), length(lat), neof_max)) for (i in 1:reqPC) { regression[, , i] <- apply(field_orig, c(1, 2), - function(x) lin.fit(as.matrix(coefficient[, i], - ncol = 1), x)$coefficients - ) + function(x) lin.fit(as.matrix(coefficient[, i], + ncol = 1), x)$coefficients) } - - assign( - paste0("pc", k), list(coeff = coefficient, variance = variance, - wcoeff = sweep(coefficient, c(2), variance, "*"), - regression = regression) - ) + assign(paste0("pc", k), list(coeff = coefficient, variance = variance, + wcoeff = sweep(coefficient, c(2), variance, "*"), + regression = regression)) } newpc <- NULL @@ -240,7 +287,6 @@ MultiEOF <- function(data, lon, lat, time, } newpc <- t(newpc) - #print("Calculating composed EOFs") SVD <- svd(newpc, nu = neof_composed, nv = neof_composed) # extracting EOFs, expansions coefficient and variance explained coefficient <- SVD$v @@ -251,30 +297,26 @@ MultiEOF <- function(data, lon, lat, time, regression <- array(dim = c(n_field, length(lon), length(lat), neof_composed)) for (k in seq(1, n_field, 1)) { - #print("Linear Regressions (it can take a while)... 
") for (i in 1:neof_composed) { - regression[k, , , i] <- apply( - field_arr[k, , , ], c(1, 2), - function(x) lin.fit( - as.matrix(coefficient[, i], - ncol = 1), - x)$coefficients - ) + regression[k, , , i] <- apply(field_arr[k, , , ], c(1, 2), + function(x) lin.fit(as.matrix(coefficient[, i], + ncol = 1), x)$coefficients) } } - #print("Finalize...") names(dim(coefficient)) <- c("time", "eof") variance <- array(variance) names(dim(variance)) <- "eof" names(dim(regression)) <- c("var", "lon", "lat", "eof") - out <- list(coeff = coefficient, variance = variance, eof_pattern = regression, lon = slon, lat = slat) + out <- list(coeff = coefficient, variance = variance, eof_pattern = regression) + + out[[lon_dim]] <- slon + out[[lat_dim]] <- slat return(out) } - # new function to create simple list with date values - Oct-18 # it needs a date or PCICt object, and returns also the season subdivision .power.date <- function(datas, verbose = FALSE) { @@ -297,7 +339,6 @@ MultiEOF <- function(data, lon, lat, time, return(etime) } - # function for daily anomalies, use array predeclaration and rowMeans (40 times faster!) .daily.anom.mean <- function(ics, ipsilon, field, etime) { condition <- paste(etime$day, etime$month) diff --git a/R/CST_MultiMetric.R b/R/CST_MultiMetric.R index 93b7392df34e82eba0cff2090cd4b3ca1675588a..7f847a4298107c211bec991a1065bf86ac0d239c 100644 --- a/R/CST_MultiMetric.R +++ b/R/CST_MultiMetric.R @@ -2,20 +2,40 @@ #' #'@author Mishra Niti, \email{niti.mishra@bsc.es} #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es} -#'@description This function calculates correlation (Anomaly Correlation Coefficient; ACC), root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and multi-models mean (if desired) with the observations. +#'@description This function calculates correlation (Anomaly Correlation +#'Coefficient; ACC), root mean square error (RMS) and the root mean square error +#'skill score (RMSSS) of individual anomaly models and multi-models mean (if +#'desired) with the observations. #' -#'@param exp an object of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the anomaly of the seasonal forecast experiments data in the element named \code{$data}. -#'@param obs an object of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the anomaly of observed data in the element named \code{$data}. -#'@param metric a character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms', 'rmsss' and 'rpss'. If 'rpss' is chossen the terciles probabilities are evaluated. -#'@param multimodel a logical value indicating whether a Multi-Model Mean should be computed. -#' -#'@param time_dim name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'. -#'@param memb_dim name of the member dimension. It can be NULL, the default value is 'member'. -#'@param sdate_dim name of the start date dimension or a dimension name identifiying the different forecast. It can be NULL, the default value is 'sdate'. -#'@return an object of class \code{s2dv_cube} containing the statistics of the selected metric in the element \code{$data} which is a list of arrays: for the metric requested and others for statistics about its signeificance. The arrays have two dataset dimensions equal to the 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. 
If \code{multimodel} is TRUE, the first position in the first 'nexp' dimension correspons to the Multi-Model Mean.
-#'@seealso \code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}}, \code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}}
-#'@references
-#'Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31.\url{https://link.springer.com/article/10.1007/s00382-018-4404-z}
+#'@param exp An object of class \code{s2dv_cube} as returned by
+#'  \code{CST_Anomaly} function, containing the anomaly of the seasonal forecast
+#'  experiments data in the element named \code{$data}.
+#'@param obs An object of class \code{s2dv_cube} as returned by
+#'  \code{CST_Anomaly} function, containing the anomaly of observed data in the
+#'  element named \code{$data}.
+#'@param metric A character string giving the metric for computing the maximum
+#'  skill. This must be one of the strings 'correlation', 'rms', 'rmsss' and
+#'  'rpss'. If 'rpss' is chosen the tercile probabilities are evaluated.
+#'@param multimodel A logical value indicating whether a Multi-Model Mean should
+#'  be computed.
+#'@param time_dim Name of the temporal dimension where a mean will be applied.
+#'  It can be NULL, the default value is 'ftime'.
+#'@param memb_dim Name of the member dimension. It can be NULL, the default
+#'  value is 'member'.
+#'@param sdate_dim Name of the start date dimension or a dimension name
+#'  identifying the different forecast. It can be NULL, the default value is
+#'  'sdate'.
+#'@return An object of class \code{s2dv_cube} containing the statistics of the
+#'selected metric in the element \code{$data} which is a list of arrays: for the
+#'metric requested and others for statistics about its significance. The arrays
+#'have two dataset dimensions equal to the 'dataset' dimension in the
+#'\code{exp$data} and \code{obs$data} inputs. If \code{multimodel} is TRUE, the
+#'first position in the first 'nexp' dimension corresponds to the Multi-Model Mean.
+#'@seealso \code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}},
+#'\code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}}
+#'@references Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill
+#'Assessment of Seasonal Temperature and Precipitation Forecasts over Europe,
+#'29-31.
\doi{10.1007/s00382-018-4404-z} #' #'@importFrom s2dv MeanDims Reorder Corr RMS RMSSS InsertDim #'@import abind @@ -23,32 +43,23 @@ #'@import stats #'@import multiApply #'@examples -#'library(zeallot) -#'mod <- 1 : (2 * 3 * 4 * 5 * 6 * 7) -#'dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) -#'obs <- 1 : (1 * 1 * 4 * 5 * 6 * 7) -#'dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) +#'mod <- rnorm(2*2*4*5*2*2) +#'dim(mod) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2) +#'obs <- rnorm(1*1*4*5*2*2) +#'dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2) #'lon <- seq(0, 30, 5) #'lat <- seq(0, 25, 5) -#'exp <- list(data = mod, lat = lat, lon = lon) -#'obs <- list(data = obs, lat = lat, lon = lon) +#'coords <- list(lat = lat, lon = lon) +#'exp <- list(data = mod, coords = coords) +#'obs <- list(data = obs, coords = coords) #'attr(exp, 'class') <- 's2dv_cube' #'attr(obs, 'class') <- 's2dv_cube' -#'c(ano_exp, ano_obs) %<-% CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = TRUE) -#'a <- CST_MultiMetric(exp = ano_exp, obs = ano_obs) -#'str(a) -#'\donttest{ -#'exp <- lonlat_temp$exp -#'obs <- lonlat_temp$obs -#'a <- CST_MultiMetric(exp, obs, metric = 'rpss', multimodel = FALSE) -#'a <- CST_MultiMetric(exp, obs, metric = 'correlation') -#'a <- CST_MultiMetric(exp, obs, metric = 'rms') -#'a <- CST_MultiMetric(exp, obs, metric = 'rmsss') -#'} +#'a <- CST_MultiMetric(exp = exp, obs = obs) #'@export CST_MultiMetric <- function(exp, obs, metric = "correlation", multimodel = TRUE, time_dim = 'ftime', memb_dim = 'member', sdate_dim = 'sdate') { + # Check 's2dv_cube' if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) { stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ", "as output by CSTools::CST_Load.") @@ -56,26 +67,45 @@ CST_MultiMetric <- function(exp, obs, metric = "correlation", multimodel = TRUE, result <- MultiMetric(exp$data, obs$data, metric = metric, multimodel = multimodel, time_dim = time_dim, memb_dim = memb_dim, sdate_dim = sdate_dim) exp$data <- result + exp$attrs$Datasets <- c(exp$attrs$Datasets, obs$attrs$Datasets) + exp$attrs$source_files <- c(exp$attrs$source_files, obs$attrs$source_files) + return(exp) } + #'Multiple Metrics applied in Multiple Model Anomalies #' #'@author Mishra Niti, \email{niti.mishra@bsc.es} #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es} -#'@description This function calculates correlation (Anomaly Correlation Coefficient; ACC), root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and multi-models mean (if desired) with the observations on arrays with named dimensions. +#'@description This function calculates correlation (Anomaly Correlation +#'Coefficient; ACC), root mean square error (RMS) and the root mean square error +#'skill score (RMSSS) of individual anomaly models and multi-models mean (if +#'desired) with the observations on arrays with named dimensions. #' -#'@param exp a multidimensional array with named dimensions. -#'@param obs a multidimensional array with named dimensions. -#'@param metric a character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms' or 'rmsss. -#'@param multimodel a logical value indicating whether a Multi-Model Mean should be computed. -#' -#'@param time_dim name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'. 
-#'@param memb_dim name of the member dimension. It can be NULL, the default value is 'member'.
-#'@param sdate_dim name of the start date dimension or a dimension name identifiying the different forecast. It can be NULL, the default value is 'sdate'.
-#'@return a list of arrays containing the statistics of the selected metric in the element \code{$data} which is a list of arrays: for the metric requested and others for statistics about its signeificance. The arrays have two dataset dimensions equal to the 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. If \code{multimodel} is TRUE, the greatest position in the first dimension correspons to the Multi-Model Mean.
-#'@seealso \code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}}, \code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}}
-#'@references
-#'Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31.\url{https://link.springer.com/article/10.1007/s00382-018-4404-z}
+#'@param exp A multidimensional array with named dimensions.
+#'@param obs A multidimensional array with named dimensions.
+#'@param metric A character string giving the metric for computing the maximum
+#'  skill. This must be one of the strings 'correlation', 'rms' or 'rmsss'.
+#'@param multimodel A logical value indicating whether a Multi-Model Mean should
+#'  be computed.
+#'@param time_dim Name of the temporal dimension where a mean will be applied.
+#'  It can be NULL, the default value is 'ftime'.
+#'@param memb_dim Name of the member dimension. It can be NULL, the default
+#'  value is 'member'.
+#'@param sdate_dim Name of the start date dimension or a dimension name
+#'  identifying the different forecast. It can be NULL, the default value is
+#'  'sdate'.
+#'@return A list of arrays containing the statistics of the selected metric in
+#'the element \code{$data} which is a list of arrays: for the metric requested
+#'and others for statistics about its significance. The arrays have two dataset
+#'dimensions equal to the 'dataset' dimension in the \code{exp$data} and
+#'\code{obs$data} inputs. If \code{multimodel} is TRUE, the greatest position in
+#'the first dimension corresponds to the Multi-Model Mean.
+#'@seealso \code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}},
+#'\code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}}
+#'@references Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill
+#'Assessment of Seasonal Temperature and Precipitation Forecasts over Europe,
+#'29-31.
\doi{10.1007/s00382-018-4404-z} #' #'@importFrom s2dv MeanDims Reorder Corr RMS RMSSS InsertDim #'@import abind @@ -83,10 +113,18 @@ CST_MultiMetric <- function(exp, obs, metric = "correlation", multimodel = TRUE, #'@import stats #'@import multiApply #'@examples -#'res <- MultiMetric(lonlat_temp$exp$data, lonlat_temp$obs$data) +#'exp <- array(rnorm(2*2*4*5*2*2), +#' dim = c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, +#' lon = 2)) +#'obs <- array(rnorm(1*1*4*5*2*2), +#' dim = c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, +#' lon = 2)) +#'res <- MultiMetric(exp = exp, obs = obs) #'@export MultiMetric <- function(exp, obs, metric = "correlation", multimodel = TRUE, - time_dim = 'ftime', memb_dim = 'member', sdate_dim = 'sdate') { + time_dim = 'ftime', memb_dim = 'member', + sdate_dim = 'sdate') { + if (!is.null(names(dim(exp))) & !is.null(names(dim(obs)))) { if (all(names(dim(exp)) %in% names(dim(obs)))) { dimnames <- names(dim(exp)) @@ -96,7 +134,7 @@ MultiMetric <- function(exp, obs, metric = "correlation", multimodel = TRUE, } } else { stop("Element 'data' from parameters 'exp' and 'obs'", - " should have dimmension names.") + " should have dimension names.") } if (!is.logical(multimodel)) { stop("Parameter 'multimodel' must be a logical value.") diff --git a/R/CST_MultivarRMSE.R b/R/CST_MultivarRMSE.R index 7841a19aa4d2a76093c31fc1c682dacc8fd1c340..bfa9bc01ca7d7df192efce9c93c59bf2f743d3d3 100644 --- a/R/CST_MultivarRMSE.R +++ b/R/CST_MultivarRMSE.R @@ -1,117 +1,185 @@ #'Multivariate Root Mean Square Error (RMSE) #' #'@author Deborah Verfaillie, \email{deborah.verfaillie@bsc.es} -#'@description This function calculates the RMSE from multiple variables, as the mean of each variable's RMSE scaled by its observed standard deviation. Variables can be weighted based on their relative importance (defined by the user). +#'@description This function calculates the RMSE from multiple variables, as the +#'mean of each variable's RMSE scaled by its observed standard deviation. +#'Variables can be weighted based on their relative importance (defined by the +#'user). #' -#'@param exp a list of objects, one for each variable, of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the anomaly of the seasonal forecast experiment data in the element named \code{$data}. -#'@param obs a list of objects, one for each variable (in the same order than the input in 'exp') of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the observed anomaly data in the element named \code{$data}. +#'@param exp A list of objects, one for each variable, of class \code{s2dv_cube} +#' as returned by \code{CST_Anomaly} function, containing the anomaly of the +#' seasonal forecast experiment data in the element named \code{$data}. +#'@param obs A list of objects, one for each variable (in the same order than +#' the input in 'exp') of class \code{s2dv_cube} as returned by +#' \code{CST_Anomaly} function, containing the observed anomaly data in the +#' element named \code{$data}. +#'@param weight (optional) A vector of weight values to assign to each variable. +#' If no weights are defined, a value of 1 is assigned to every variable. +#'@param memb_dim A character string indicating the name of the member +#' dimension. It must be one dimension in 'exp' and 'obs'. The default value is +#' 'member'. +#'@param dat_dim A character string indicating the name of the dataset +#' dimension. It must be one dimension in 'exp' and 'obs'. 
If there is no
+#'  dataset dimension, it can be NULL. The default value is 'dataset'.
+#'@param sdate_dim A character string indicating the name of the start date
+#'  dimension. It must be one dimension in 'exp' and 'obs'. The default value is
+#'  'sdate'.
+#'@param ftime_dim A character string indicating the name of the forecast time
+#'  dimension. It must be one dimension in 'exp' and 'obs'. The default value is
+#'  'ftime'.
 #'
-#'@param weight (optional) a vector of weight values to assign to each variable. If no weights are defined, a value of 1 is assigned to every variable.
-#'
-#'@return an object of class \code{s2dv_cube} containing the RMSE in the element \code{$data} which is an array with two datset dimensions equal to the 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. An array with dimensions: c(number of exp, number of obs, 1 (the multivariate RMSE value), number of lat, number of lon)
+#'@return An object of class \code{s2dv_cube} containing the RMSE in the element
+#'  \code{$data} which is an array with two dataset dimensions equal to the
+#'  'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. An
+#'  array with dimensions: c(number of exp, number of obs, 1 (the multivariate
+#'  RMSE value), number of lat, number of lon)
 #'
 #'@seealso \code{\link[s2dv]{RMS}} and \code{\link{CST_Load}}
-#'@importFrom s2dv RMS MeanDims
 #'@examples
-#'# Creation of sample s2dv objects. These are not complete s2dv objects
-#'# though. The Load function returns complete objects.
-#'# using package zeallot is optional:
-#' library(zeallot)
 #'# Example with 2 variables
-#'mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
-#'mod2 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
+#'mod1 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7))
+#'mod2 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7))
 #'dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
 #'dim(mod2) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
-#'obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
-#'obs2 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
+#'obs1 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7))
+#'obs2 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7))
 #'dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
 #'dim(obs2) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
 #'lon <- seq(0, 30, 5)
 #'lat <- seq(0, 25, 5)
-#'exp1 <- list(data = mod1, lat = lat, lon = lon, Datasets = "EXP1",
-#'             source_files = "file1", Variable = list('pre'))
+#'coords <- list(lat = lat, lon = lon)
+#'exp1 <- list(data = mod1, coords = coords,
+#'             attrs = list(Datasets = "EXP1", source_files = "file1",
+#'                          Variable = list(varName = 'pre')))
+#'exp2 <- list(data = mod2, coords = coords,
+#'             attrs = list(Datasets = "EXP2", source_files = "file2",
+#'                          Variable = list(varName = 'tas')))
+#'obs1 <- list(data = obs1, coords = coords,
+#'             attrs = list(Datasets = "OBS1", source_files = "file1",
+#'                          Variable = list(varName = 'pre')))
+#'obs2 <- list(data = obs2, coords = coords,
+#'             attrs = list(Datasets = "OBS2", source_files = "file2",
+#'                          Variable = list(varName = 'tas')))
 #'attr(exp1, 'class') <- 's2dv_cube'
-#'exp2 <- list(data = mod2, lat = lat, lon = lon, Datasets = "EXP2",
-#'             source_files = "file2", Variable = list('tas'))
 #'attr(exp2, 'class') <- 's2dv_cube'
-#'obs1 <- list(data = obs1, lat = lat, lon = lon, Datasets = "OBS1",
-#'             source_files = "file1", Variable = list('pre'))
 #'attr(obs1, 'class') <- 's2dv_cube'
-#'obs2 <- list(data = obs2, lat = lat, lon = lon, Datasets = "OBS2",
-#'             source_files = "file2", Variable = list('tas'))
 #'attr(obs2, 'class') <- 's2dv_cube'
-#'
-#'c(ano_exp1, ano_obs1) %<-% CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE)
-#'c(ano_exp2, ano_obs2) %<-% CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE)
-#'ano_exp <- list(exp1, exp2)
-#'ano_obs <- list(ano_obs1, ano_obs2)
-#'weight <- c(1, 2)
-#'a <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = weight)
-#'str(a)
+#'anom1 <- CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE)
+#'anom2 <- CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE)
+#'ano_exp <- list(anom1$exp, anom2$exp)
+#'ano_obs <- list(anom1$obs, anom2$obs)
+#'a <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = c(1, 2))
+#'@importFrom s2dv RMS MeanDims
 #'@export
-CST_MultivarRMSE <- function(exp, obs, weight = NULL) {
+CST_MultivarRMSE <- function(exp, obs, weight = NULL, memb_dim = 'member',
+                             dat_dim = 'dataset', sdate_dim = 'sdate',
+                             ftime_dim = 'ftime') {
+
+  # s2dv_cube
   if (!is.list(exp) | !is.list(obs)) {
     stop("Parameters 'exp' and 'obs' must be lists of 's2dv_cube' objects")
   }
-
   if (!(all(sapply(exp, inherits, 's2dv_cube')))) {
-    stop("Elements of the list in parameter 'exp' must be of the class ",
-         "'s2dv_cube', as output by CSTools::CST_Load.")
+    stop("Elements of the list in parameter 'exp' must be of the class ",
+         "'s2dv_cube', as output by CSTools::CST_Load.")
   }
-
   if (!(all(sapply(obs, inherits, 's2dv_cube')))) {
-    stop("Elements of the list in parameter 'obs' must be of the class ",
-         "'s2dv_cube', as output by CSTools::CST_Load.")
+    stop("Elements of the list in parameter 'obs' must be of the class ",
+         "'s2dv_cube', as output by CSTools::CST_Load.")
   }
-
   if (length(exp) != length(obs)) {
     stop("Parameters 'exp' and 'obs' must be of the same length.")
   }
-
+
   nvar <- length(exp)
-
   if (nvar < 2) {
     stop("Parameters 'exp' and 'obs' must contain at least two",
          " s2dv objects for two different variables.")
   }
-
   for (j in 1 : nvar) {
     if (!is.null(names(dim(exp[[j]]$data))) & !is.null(names(dim(obs[[j]]$data)))) {
       if (all(names(dim(exp[[j]]$data)) %in% names(dim(obs[[j]]$data)))) {
         dimnames <- names(dim(exp[[j]]$data))
       } else {
         stop("Dimension names of element 'data' from parameters 'exp'",
-             " and 'obs' should have the same name dimmension.")
+             " and 'obs' should be equal.")
       }
     } else {
       stop("Element 'data' from parameters 'exp' and 'obs'",
-           " should have dimmension names.")
+           " should have dimension names.")
     }
   }
-
+  # weight
   if (is.null(weight)) {
     weight <- c(rep(1, nvar))
-  } else if (length(weight) != nvar) {
+  } else if (!is.numeric(weight)) {
+    stop("Parameter 'weight' must be numeric.")
+  } else if (length(weight) != nvar) {
    stop("Parameter 'weight' must have a length equal to the number ",
         "of variables.")
   }
-  obs_var <- unlist(lapply(obs, function(x) {
-    x[[which(names(x) == 'Variable')]]}))
+  # memb_dim
+  if (!is.null(memb_dim)) {
+    if (!is.character(memb_dim)) {
+      stop("Parameter 'memb_dim' must be a character string.")
+    }
+    if (!memb_dim %in% names(dim(exp[[1]]$data)) | !memb_dim %in% names(dim(obs[[1]]$data))) {
+      stop("Parameter 'memb_dim' is not found in 'exp' or in 'obs' dimension.")
+    }
+  } else {
+    stop("Parameter 'memb_dim' cannot be NULL.")
+  }
+  # dat_dim
+  if (!is.null(dat_dim)) {
+    if (!is.character(dat_dim)) {
+      stop("Parameter 'dat_dim' must be a character string.")
+    }
+    if (!dat_dim %in% names(dim(exp[[1]]$data)) | !dat_dim %in% names(dim(obs[[1]]$data))) {
+      stop("Parameter 'dat_dim' is not found in 'exp' or in 'obs' dimension.")
+    }
+  }
+  # ftime_dim
+  if (!is.null(ftime_dim)) {
+    if (!is.character(ftime_dim)) {
stop("Parameter 'ftime_dim' must be a character string.") + } + if (!ftime_dim %in% names(dim(exp[[1]]$data)) | !ftime_dim %in% names(dim(obs[[1]]$data))) { + stop("Parameter 'ftime_dim' is not found in 'exp' or in 'obs' dimension.") + } + } else { + stop("Parameter 'ftime_dim' cannot be NULL.") + } + # sdate_dim + if (!is.null(sdate_dim)) { + if (!is.character(sdate_dim)) { + stop("Parameter 'sdate_dim' must be a character string.") + } + if (!sdate_dim %in% names(dim(exp[[1]]$data)) | !sdate_dim %in% names(dim(obs[[1]]$data))) { + stop("Parameter 'sdate_dim' is not found in 'exp' or in 'obs' dimension.") + } + } else { + stop("Parameter 'sdate_dim' cannot be NULL.") + } + # Variables + obs_var <- unlist(lapply(exp, function(x) { + x$attrs$Variable$varName})) exp_var <- unlist(lapply(exp, function(x) { - x[[which(names(x) == 'Variable')]]})) + x$attrs$Variable$varName})) if (all(exp_var != obs_var)) { stop("Variables in parameters 'exp' and 'obs' must be in the same order.") } + mvrmse <- 0 sumweights <- 0 + for (j in 1 : nvar) { # seasonal average of anomalies - AvgExp <- MeanDims(exp[[j]]$data, c('member', 'ftime'), na.rm = TRUE) - AvgObs <- MeanDims(obs[[j]]$data, c('member', 'ftime'), na.rm = TRUE) + AvgExp <- MeanDims(exp[[j]]$data, c(memb_dim, ftime_dim), na.rm = TRUE) + AvgObs <- MeanDims(obs[[j]]$data, c(memb_dim, ftime_dim), na.rm = TRUE) # multivariate RMSE (weighted) - rmse <- s2dv::RMS(AvgExp, AvgObs, dat_dim = 'dataset', time_dim = 'sdate', + rmse <- RMS(AvgExp, AvgObs, dat_dim = dat_dim, time_dim = sdate_dim, conf = FALSE)$rms stdev <- sd(AvgObs) mvrmse <- mvrmse + (rmse / stdev * as.numeric(weight[j])) @@ -119,20 +187,21 @@ CST_MultivarRMSE <- function(exp, obs, weight = NULL) { } mvrmse <- mvrmse / sumweights - # names(dim(mvrmse)) <- c(dimnames[1], dimnames[1], 'statistics', dimnames[5 : 6]) + # names(dim(mvrmse)) <- c(dimnames[1], dimnames[1], 'statistics', dimnames[5 : 6]) exp_Datasets <- unlist(lapply(exp, function(x) { - x[[which(names(x) == 'Datasets')]]})) + x$attrs[[which(names(x$attrs) == 'Datasets')]]})) exp_source_files <- unlist(lapply(exp, function(x) { - x[[which(names(x) == 'source_files')]]})) + x$attrs[[which(names(x$attrs) == 'source_files')]]})) obs_Datasets <- unlist(lapply(obs, function(x) { - x[[which(names(x) == 'Datasets')]]})) + x$attrs[[which(names(x$attrs) == 'Datasets')]]})) obs_source_files <- unlist(lapply(obs, function(x) { - x[[which(names(x) == 'source_files')]]})) + x$attrs[[which(names(x$attrs) == 'source_files')]]})) - exp <- exp[[1]] - exp$data <- mvrmse - exp$Datasets <- c(exp_Datasets, obs_Datasets) - exp$source_files <- c(exp_source_files, obs_source_files) - exp$Variable <- c(exp_var) - return(exp) + exp1 <- exp[[1]] + exp1$data <- mvrmse + exp1$attrs$Datasets <- c(exp_Datasets, obs_Datasets) + exp1$attrs$source_files <- c(exp_source_files, obs_source_files) + exp1$attrs$Variable$varName <- as.character(exp_var) + exp1$attrs$Variable$metadata <- c(exp1$attrs$Variable$metadata, exp[[2]]$attrs$Variable$metadata) + return(exp1) } diff --git a/R/CST_ProxiesAttractor.R b/R/CST_ProxiesAttractor.R index 3839a8961ca8acbec97f435a6749a9bd204a5876..e99677176490c53be8b881e190f1168ab532f39e 100644 --- a/R/CST_ProxiesAttractor.R +++ b/R/CST_ProxiesAttractor.R @@ -9,41 +9,49 @@ #'@description This function computes two dinamical proxies of the attractor: #'The local dimension (d) and the inverse of the persistence (theta) for an #''s2dv_cube' object. 
-#'These two parameters will be used as a condition for the computation of dynamical
-#'scores to measure predictability and to compute bias correction conditioned by
-#'the dynamics with the function DynBiasCorrection
-#'Funtion based on the matlab code (davide.faranda@lsce.ipsl.fr) used in
-#'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019).
-#' The hammam effect or how a warm ocean enhances large scale atmospheric predictability.
-#' Nature Communications, 10(1), 1316. DOI = https://doi.org/10.1038/s41467-019-09305-8 "
+#'These two parameters will be used as a condition for the computation of
+#'dynamical scores to measure predictability and to compute bias correction
+#'conditioned by the dynamics with the function DynBiasCorrection. Function
+#'based on the matlab code (davide.faranda@lsce.ipsl.fr) used in
+#'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D.,
+#'and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large
+#'scale atmospheric predictability. Nature Communications, 10(1), 1316.
+#'\doi{10.1038/s41467-019-09305-8}.
 #'@references Faranda, D., Gabriele Messori and Pascal Yiou. (2017).
-#' Dynamical proxies of North Atlantic predictability and extremes.
-#' Scientific Reports, 7-41278, 2017.
+#'Dynamical proxies of North Atlantic predictability and extremes.
+#'Scientific Reports, 7-41278, 2017.
 #'
-#'@param data a s2dv_cube object with the data to create the attractor. Must be a matrix with the timesteps in nrow
-#'and the grids in ncol(dat(time,grids)
-#
-#'@param quanti a number lower than 1 indicating the quantile to perform the computation of local dimension and theta
-#'
-#'@param ncores The number of cores to use in parallel computation
-#'
+#'@param data An s2dv_cube object with the data to create the attractor. Must be
+#' a matrix with the timesteps in nrow and the grids in ncol (dat(time, grids)).
+#'@param quanti A number lower than 1 indicating the quantile to perform the
+#' computation of local dimension and theta.
+#'@param ncores The number of cores to use in parallel computation.
 #'@return dim and theta
-#'
 #'@examples
 #'# Example 1: Computing the attractor using simple s2dv data
-#'attractor <- CST_ProxiesAttractor(data = lonlat_temp$obs, quanti = 0.6)
-#'
+#'obs <- rnorm(2 * 3 * 4 * 8 * 8)
+#'dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+#'lon <- seq(10, 13.5, 0.5)
+#'lat <- seq(40, 43.5, 0.5)
+#'coords <- list(lon = lon, lat = lat)
+#'data <- list(data = obs, coords = coords)
+#'class(data) <- "s2dv_cube"
+#'attractor <- CST_ProxiesAttractor(data = data, quanti = 0.6)
+#'@import multiApply
 #'@export
-CST_ProxiesAttractor <- function(data, quanti, ncores = NULL){
+CST_ProxiesAttractor <- function(data, quanti, ncores = NULL) {
+  # Check 's2dv_cube'
   if (!inherits(data, 's2dv_cube')) {
     stop("Parameter 'data' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
   }
+  # Check quanti
   if (is.null(quanti)) {
     stop("Parameter 'quanti' cannot be NULL.")
   }
-  data$data <- ProxiesAttractor(data = data$data, quanti = quanti, ncores = ncores)
+  data$data <- ProxiesAttractor(data = data$data, quanti = quanti,
+                                ncores = ncores)
   return(data)
 }
@@ -58,24 +66,26 @@ CST_ProxiesAttractor <- function(data, quanti, ncores = NULL){
 #'@description This function computes two dinamical proxies of the attractor:
 #'The local dimension (d) and the inverse of the persistence (theta).
 #'These two parameters will be used as a condition for the computation of dynamical
-#'scores to measure predictability and to compute bias correction conditioned by
+#'scores to measure predictability and to compute bias correction conditioned by
 #'the dynamics with the function DynBiasCorrection.
 #'Funtion based on the matlab code (davide.faranda@lsce.ipsl.fr) used in:
-#'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019).
-#' The hammam effect or how a warm ocean enhances large scale atmospheric predictability.
-#' Nature Communications, 10(1), 1316. DOI = https://doi.org/10.1038/s41467-019-09305-8 "
+#'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and
+#'Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale
+#'atmospheric predictability. Nature Communications, 10(1), 1316.
+#'\doi{10.1038/s41467-019-09305-8}.
 #'@references Faranda, D., Gabriele Messori and Pascal Yiou. (2017).
 #' Dynamical proxies of North Atlantic predictability and extremes.
 #' Scientific Reports, 7-41278, 2017.
 #'
-#'@param data a multidimensional array with named dimensions to create the attractor. It requires a temporal dimension named 'time' and spatial dimensions called 'lat' and 'lon', or 'latitude' and 'longitude' or 'grid'.
-#'@param quanti a number lower than 1 indicating the quantile to perform the computation of local dimension and theta
+#'@param data A multidimensional array with named dimensions to create the
+#' attractor. It requires a temporal dimension named 'time' and spatial
+#' dimensions called 'lat' and 'lon', or 'latitude' and 'longitude' or 'grid'.
+#'@param quanti A number lower than 1 indicating the quantile to perform the
+#' computation of local dimension and theta.
 #'@param ncores The number of cores to use in parallel computation.
 #'
 #'@return dim and theta
-#'
-#'@import multiApply
-#'
+#'
 #'@examples
 #'# Example 1: Computing the attractor using simple data
 #'# Creating an example of matrix data(time,grids):
@@ -84,16 +94,10 @@ CST_ProxiesAttractor <- function(data, quanti, ncores = NULL){
 #'Attractor <- ProxiesAttractor(data = mat, quanti = qm)
 #'# to plot the result
 #'time = c(1:length(Attractor$theta))
-#'layout(matrix(c(1, 3, 2, 3), 2, 2))
 #'plot(time, Attractor$dim, xlab = 'time', ylab = 'd',
 #'    main = 'local dimension', type = 'l')
-#'plot(time, Attractor$theta, xlab = 'time', ylab = 'theta', main = 'theta')
-#'plot(Attractor$dim, Attractor$theta, col = 'blue',
-#'     main = "Proxies of the Attractor",
-#'     xlab = "local dimension", ylab = "theta", lwd = 8, 'p')
-#'
+#'@import multiApply
 #'@export
-
 ProxiesAttractor <- function(data, quanti, ncores = NULL){
   if (is.null(data)) {
     stop("Parameter 'data' cannot be NULL.")
@@ -104,7 +108,7 @@ ProxiesAttractor <- function(data, quanti, ncores = NULL){
   if (any(names(dim(data)) %in% 'sdate')) {
     if (any(names(dim(data)) %in% 'ftime')) {
       data <- MergeDims(data, merge_dims = c('ftime', 'sdate'),
-                      rename_dim = 'time')
+                        rename_dim = 'time')
     }
   }
   if (!(any(names(dim(data)) %in% 'time'))){
diff --git a/R/CST_QuantileMapping.R b/R/CST_QuantileMapping.R
index edb2fb895b4c32e754349a979403ec1e522c067e..92b7ac038161d037ab3b9409bf42c97dd3376aa1 100644
--- a/R/CST_QuantileMapping.R
+++ b/R/CST_QuantileMapping.R
@@ -14,14 +14,15 @@
 #'@param sdate_dim A character string indicating the dimension name in which
 #' cross-validation would be applied when exp_cor is not provided. 'sdate' by
 #' default.
-#'@param memb_dim A character string indicating the dimension name where
-#' ensemble members are stored in the experimental arrays. 'member' by default.
+#'@param memb_dim A character string indicating the dimension name where
+#' ensemble members are stored in the experimental arrays. It can be NULL if
+#' there is no ensemble member dimension. It is set as 'member' by default.
 #'@param window_dim A character string indicating the dimension name where
 #' samples have been stored. It can be NULL (default) in case all samples are
 #' used.
-#'@param method A character string indicating the method to be used:'PTF',
-#' 'DIST','RQUANT','QUANT','SSPLIN'. By default, the empirical quantile mapping
-#' 'QUANT' is used.
+#'@param method A character string indicating the method to be used: 'PTF',
+#' 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile
+#' mapping 'QUANT' is used.
 #'@param na.rm A logical value indicating if missing values should be removed
 #' (FALSE by default).
 #'@param ncores An integer indicating the number of cores for parallel
@@ -40,29 +41,13 @@
 #'dim(exp$data) <- c(dataset = 1, member = 3, sdate = 5, ftime = 4,
 #'                   lat = 3, lon = 2)
 #'class(exp) <- 's2dv_cube'
+#'obs <- NULL
 #'obs$data <- 101 : c(100 + 1 * 1 * 5 * 4 * 3 * 2)
 #'dim(obs$data) <- c(dataset = 1, member = 1, sdate = 5, ftime = 4,
 #'                   lat = 3, lon = 2)
 #'class(obs) <- 's2dv_cube'
 #'res <- CST_QuantileMapping(exp, obs)
 #'
-#'# Use data in package
-#'\donttest{
-#'exp <- lonlat_temp$exp
-#'exp$data <- exp$data[, , 1:4, , 1:2, 1:3]
-#'dim(exp$data) <- c(dataset = 1, member = 15, sdate = 4, ftime = 3,
-#'                   lat = 2, lon = 3)
-#'obs <- lonlat_temp$obs
-#'obs$data <- obs$data[, , 1:4, , 1:2, 1:3]
-#'dim(obs$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
-#'                   lat = 2, lon = 3)
-#'exp_cor <- lonlat_temp$exp
-#'exp_cor$data <- exp_cor$data[, 1, 5:6, , 1:2, 1:3]
-#'dim(exp_cor$data) <- c(dataset = 1, member = 1, sdate = 2, ftime = 3,
-#'                       lat = 2, lon = 3)
-#'res <- CST_QuantileMapping(exp, obs, exp_cor, window_dim = 'ftime')
-#'}
-#'
 #'@import qmap
 #'@import multiApply
 #'@import s2dv
@@ -71,36 +56,36 @@
 CST_QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate',
                                 memb_dim = 'member', window_dim = NULL,
                                 method = 'QUANT', na.rm = FALSE,
                                 ncores = NULL, ...) {
-  if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) {
-    stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ",
-         "as output by CSTools::CST_Load.")
-  }
-  if (!is.null(exp_cor)) {
-    if (!inherits(exp_cor, 's2dv_cube')) {
-      stop("Parameter 'exp_cor' must be of the class 's2dv_cube', ",
-           "as output by CSTools::CST_Load.")
-    }
+  # Check 's2dv_cube'
+  if (!inherits(exp, 's2dv_cube') || !inherits(obs, 's2dv_cube')) {
+    stop("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ",
+         "as output by CSTools::CST_Load.")
+  }
+  if (!is.null(exp_cor)) {
+    if (!inherits(exp_cor, 's2dv_cube')) {
+      stop("Parameter 'exp_cor' must be of the class 's2dv_cube', ",
+           "as output by CSTools::CST_Load.")
+    }
   }

-  QMapped <- QuantileMapping(exp = exp$data, obs = obs$data,
-                             exp_cor = exp_cor$data,
-                             sdate_dim = sdate_dim, memb_dim = memb_dim,
-                             window_dim = window_dim, method = method,
-                             na.rm = na.rm, ncores = ncores, ...)
- if (is.null(exp_cor)) { - exp$data <- QMapped - exp$Datasets <- c(exp$Datasets, obs$Datasets) - exp$source_files <- c(exp$source_files, obs$source_files) - return(exp) - - } else { - exp_cor$data <- QMapped - exp_cor$Datasets <- c(exp_cor$Datasets, exp$Datasets, obs$Datasets) - exp_cor$source_files <- c(exp_cor$source_files, exp$source_files, obs$source_files) - return(exp_cor) - } - - + QMapped <- QuantileMapping(exp = exp$data, obs = obs$data, + exp_cor = exp_cor$data, + sdate_dim = sdate_dim, memb_dim = memb_dim, + window_dim = window_dim, method = method, + na.rm = na.rm, ncores = ncores, ...) + if (is.null(exp_cor)) { + exp$data <- QMapped + exp$attrs$Datasets <- c(exp$attrs$Datasets, obs$attrs$Datasets) + exp$attrs$source_files <- c(exp$attrs$source_files, obs$attrs$source_files) + return(exp) + } else { + exp_cor$data <- QMapped + exp_cor$attrs$Datasets <- c(exp_cor$attrs$Datasets, exp$attrs$Datasets, + obs$attrs$Datasets) + exp_cor$attrs$source_files <- c(exp_cor$attrs$source_files, exp$attrs$source_files, + obs$attrs$source_files) + return(exp_cor) + } } #'Quantile Mapping for seasonal or decadal forecast data @@ -122,14 +107,14 @@ CST_QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', #' cross-validation would be applied when exp_cor is not provided. 'sdate' by #' default. #'@param memb_dim A character string indicating the dimension name where -#' ensemble members are stored in the experimental arrays. 'member' by -#' default. +#' ensemble members are stored in the experimental arrays. It can be NULL if +#' there is no ensemble member dimension. It is set as 'member' by default. #'@param window_dim A character string indicating the dimension name where #' samples have been stored. It can be NULL (default) in case all samples are #' used. #'@param method A character string indicating the method to be used: 'PTF', -#' 'DIST','RQUANT','QUANT','SSPLIN'. By default, the empirical quantile mapping -#' 'QUANT' is used. +#' 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile +#' mapping 'QUANT' is used. #'@param na.rm A logical value indicating if missing values should be removed #' (FALSE by default). #'@param ncores An integer indicating the number of cores for parallel @@ -151,28 +136,14 @@ CST_QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', #' lat = 3, lon = 2) #'res <- QuantileMapping(exp, obs) #' -#'# Use data in package -#'\donttest{ -#'exp <- lonlat_temp$exp$data[, , 1:4, , 1:2, 1:3] -#'dim(exp) <- c(dataset = 1, member = 15, sdate = 4, ftime = 3, -#' lat = 2, lon = 3) -#'obs <- lonlat_temp$obs$data[, , 1:4, , 1:2, 1:3] -#'dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, -#' lat = 2, lon = 3) -#'exp_cor <- lonlat_temp$exp$data[, 1, 5:6, , 1:2, 1:3] -#'dim(exp_cor) <- c(dataset = 1, member = 1, sdate = 2, ftime = 3, -#' lat = 2, lon = 3) -#'res <- QuantileMapping(exp, obs, exp_cor, window_dim = 'ftime') -#'} -#' #'@import qmap #'@import multiApply #'@import s2dv #'@export QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', memb_dim = 'member', window_dim = NULL, - method = 'QUANT', - na.rm = FALSE, ncores = NULL, ...) { + method = 'QUANT', na.rm = FALSE, + ncores = NULL, ...) 
{ # exp and obs obsdims <- names(dim(obs)) expdims <- names(dim(exp)) @@ -213,12 +184,27 @@ QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', "'PTF', 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'.") } # memb_dim - if (!all(memb_dim %in% obsdims)) { - obs <- InsertDim(obs, posdim = 1, lendim = 1, - name = memb_dim[!(memb_dim %in% obsdims)]) - } - if (any(!memb_dim %in% expdims)) { - stop("Parameter 'memb_dim' is not found in 'exp' dimensions.") + if (is.null(memb_dim)) { + remove_member <- TRUE + memb_dim <- "temp_memb_dim" + exp <- InsertDim(exp, posdim = 1, lendim = 1, name = "temp_memb_dim") + obs <- InsertDim(obs, posdim = 1, lendim = 1, name = "temp_memb_dim") + obsdims <- names(dim(obs)) + expdims <- names(dim(exp)) + if (!is.null(exp_cor)) { + exp_cor <- InsertDim(exp_cor, posdim = 1, lendim = 1, name = "temp_memb_dim") + } + } else { + remove_member <- FALSE + if (!all(memb_dim %in% obsdims)) { + obs <- InsertDim(obs, posdim = 1, lendim = 1, + name = memb_dim[!(memb_dim %in% obsdims)]) + obsdims <- names(dim(obs)) + } + if (any(!memb_dim %in% expdims)) { + stop(paste0("Parameter 'memb_dim' is not found in 'exp' dimensions. ", + "Set it as NULL if there is no member dimension.")) + } } sample_dims <- c(memb_dim, sdate_dim) # window_dim @@ -245,7 +231,6 @@ QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', } ############################### - if (!is.null(exp_cor)) { qmaped <- Apply(list(exp, obs, exp_cor), target_dims = sample_dims, fun = .qmapcor, method = method, sdate_dim = sdate_dim, @@ -256,12 +241,17 @@ QuantileMapping <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', fun = .qmapcor, exp_cor = NULL, method = method, sdate_dim = sdate_dim, na.rm = na.rm, ..., ncores = ncores)$output1 - } + } + # remove added 'temp_memb_dim' + if (remove_member) { + dim(qmaped) <- dim(qmaped)[-which(names(dim(qmaped)) == "temp_memb_dim")] + } + return(qmaped) } -.qmapcor <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', method = 'QUANT', - na.rm = FALSE, ...) { +.qmapcor <- function(exp, obs, exp_cor = NULL, sdate_dim = 'sdate', + method = 'QUANT', na.rm = FALSE, ...) { # exp: [memb (+ window), sdate] # obs: [memb (+ window), sdate] diff --git a/R/CST_RFSlope.R b/R/CST_RFSlope.R index a8457742cbb16bd66d53830f4fdeb1b9f12d7169..66647845883536a2de8ff7a0b1a53e063a46ec22 100644 --- a/R/CST_RFSlope.R +++ b/R/CST_RFSlope.R @@ -1,105 +1,117 @@ -#' @rdname CST_RFSlope -#' @title RainFARM spectral slopes from a CSTools object +#'@rdname CST_RFSlope +#'@title RainFARM spectral slopes from a CSTools object #' -#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} +#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} #' -#' @description This function computes spatial spectral slopes from a CSTools object -#' to be used for RainFARM stochastic precipitation downscaling method and accepts a CSTools object (of the class 's2dv_cube') as input. +#'@description This function computes spatial spectral slopes from a CSTools +#'object to be used for RainFARM stochastic precipitation downscaling method and +#'accepts a CSTools object (of the class 's2dv_cube') as input. #' -#' @param data An object of the class 's2dv_cube', containing the spatial precipitation fields to downscale. 
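The memb_dim = NULL branch added above can be exercised directly on plain arrays; a minimal sketch, assuming CSTools is attached, in which the temporary member dimension is inserted and then dropped again exactly as the code above does:

# Illustrative only: toy arrays without an ensemble member dimension
exp <- array(rnorm(20, mean = 2), dim = c(sdate = 5, ftime = 4))
obs <- array(rnorm(20), dim = c(sdate = 5, ftime = 4))
res <- QuantileMapping(exp, obs, memb_dim = NULL)
dim(res)  # c(sdate = 5, ftime = 4): 'temp_memb_dim' has been removed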
-#' The data object is expected to have an element named \code{$data} with at least two
-#' spatial dimensions named "lon" and "lat" and one or more dimensions over which
-#' to average these slopes, which can be specified by parameter \code{time_dim}.
-#' @param kmin First wavenumber for spectral slope (default \code{kmin=1}).
-#' @param time_dim String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...)
-#' over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes
-#' will be computed as an average over all elements belonging to those dimensions.
-#' If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen.
-#' @param ncores is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.
-#' @return CST_RFSlope() returns spectral slopes using the RainFARM convention
-#' (the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
-#' The returned array has the same dimensions as the \code{exp} element of the input object,
-#' minus the dimensions specified by \code{lon_dim}, \code{lat_dim} and \code{time_dim}.
-#' @import rainfarmr
-#' @examples
-#' #Example using CST_RFSlope for a CSTools object
-#' exp <- 1 : (2 * 3 * 4 * 8 * 8)
-#' dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
-#' lon <- seq(10, 13.5, 0.5)
-#' dim(lon) <- c(lon = length(lon))
-#' lat <- seq(40, 43.5, 0.5)
-#' dim(lat) <- c(lat = length(lat))
-#' data <- list(data = exp, lon = lon, lat = lat)
-#' slopes <- CST_RFSlope(data)
-#' dim(slopes)
-#' # dataset member sdate
-#' #       1      2     3
-#' slopes
-#' #          [,1]     [,2]     [,3]
-#' #[1,] 1.893503 1.893503 1.893503
-#' #[2,] 1.893503 1.893503 1.893503
-#' @export
-CST_RFSlope <- function(data, kmin = 1, time_dim = NULL, ncores = 1) {
+#'@param data An object of the class 's2dv_cube', containing the spatial
+#' precipitation fields to downscale. The data object is expected to have an
+#' element named \code{$data} with at least two spatial dimensions named "lon"
+#' and "lat" and one or more dimensions over which to average these slopes,
+#' which can be specified by parameter \code{time_dim}.
+#'@param kmin First wavenumber for spectral slope (default \code{kmin=1}).
+#'@param time_dim String or character array with name(s) of dimension(s) (e.g.
+#' "ftime", "sdate", "member" ...) over which to compute spectral slopes. If a
+#' character array of dimension names is provided, the spectral slopes will be
+#' computed as an average over all elements belonging to those dimensions. If
+#' omitted one of c("ftime", "sdate", "time") is searched and the first one
+#' with more than one element is chosen.
+#'@param ncores An integer that indicates the number of cores for parallel
+#' computations using the multiApply function. The default value is NULL.
+#'@return CST_RFSlope() returns spectral slopes using the RainFARM convention
+#' (the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
+#' The returned array has the same dimensions as the \code{exp} element of the
+#' input object, minus the dimensions specified by \code{lon_dim},
+#' \code{lat_dim} and \code{time_dim}.
+#'@examples
+#'exp <- 1 : (2 * 3 * 4 * 8 * 8)
+#'dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+#'lon <- seq(10, 13.5, 0.5)
+#'lat <- seq(40, 43.5, 0.5)
+#'coords <- list(lon = lon, lat = lat)
+#'data <- list(data = exp, coords = coords)
+#'class(data) <- 's2dv_cube'
+#'slopes <- CST_RFSlope(data)
+#'@import multiApply
+#'@import rainfarmr
+#'@importFrom ClimProjDiags Subset
+#'@export
+CST_RFSlope <- function(data, kmin = 1, time_dim = NULL, ncores = NULL) {
+
+  # Check 's2dv_cube'
+  if (!inherits(data, "s2dv_cube")) {
+    stop("Parameter 'data' must be of the class 's2dv_cube', ",
+         "as output by CSTools::CST_Load.")
+  }
+
+  # Check dimensions
+  if (!any(names(dim(data$data)) %in% .KnownLonNames()) |
+      !any(names(dim(data$data)) %in% .KnownLatNames())) {
+    stop("Spatial dimension names do not match any of the names accepted by ",
+         "the package.")
+  }
+
+  lon_name <- names(dim(data$data))[[which(names(dim(data$data)) %in% .KnownLonNames())]]
+  lat_name <- names(dim(data$data))[[which(names(dim(data$data)) %in% .KnownLatNames())]]
   slopes <- RFSlope(data$data, kmin, time_dim,
-                    lon_dim = "lon", lat_dim = "lat")
+                    lon_dim = lon_name, lat_dim = lat_name, ncores = ncores)
   return(slopes)
 }
-#' @rdname RFSlope
-#' @title RainFARM spectral slopes from an array (reduced version)
+#'@rdname RFSlope
+#'@title RainFARM spectral slopes from an array (reduced version)
 #'
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
 #'
-#' @description This function computes spatial spectral slopes from an array,
-#' to be used for RainFARM stochastic precipitation downscaling method.
+#'@description This function computes spatial spectral slopes from an array,
+#'to be used for RainFARM stochastic precipitation downscaling method.
 #'
-#' @param data Array containing the spatial precipitation fields to downscale.
-#' The input array is expected to have at least two dimensions named "lon" and "lat" by default
-#' (these default names can be changed with the \code{lon_dim} and \code{lat_dim} parameters)
-#' and one or more dimensions over which to average the slopes,
-#' which can be specified by parameter \code{time_dim}.
-#' @param kmin First wavenumber for spectral slope (default \code{kmin=1}).
+#'@param data Array containing the spatial precipitation fields to downscale.
+#' The input array is expected to have at least two dimensions named "lon" and
+#' "lat" by default (these default names can be changed with the \code{lon_dim}
+#' and \code{lat_dim} parameters) and one or more dimensions over which to
+#' average the slopes, which can be specified by parameter \code{time_dim}.
+#'@param kmin First wavenumber for spectral slope (default \code{kmin=1}).
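For intuition, the slope convention documented above can be reproduced on a toy array with the rainfarmr primitives; fft2d() is the same helper used by .RFSlope() further down, while the fitslope() call is this sketch's assumption about the matching rainfarmr fitting helper.

# Illustrative only, assuming rainfarmr::fft2d() and rainfarmr::fitslope()
library(rainfarmr)
pr <- array(rexp(32 * 32 * 10), dim = c(32, 32, 10))  # toy (lon, lat, time) field
fx <- fft2d(pr)         # spatial power spectrum, averaged over the time steps
fitslope(fx, kmin = 1)  # logarithmic slope of k*|A(k)|^2 (RainFARM convention)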
+#'@param time_dim String or character array with name(s) of dimension(s)
+#' (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes.
+#' If a character array of dimension names is provided, the spectral slopes
+#' will be computed as an average over all elements belonging to those dimensions.
+#' If omitted one of c("ftime", "sdate", "time") is searched and the first one
+#' with more than one element is chosen.
+#'@param lon_dim Name of lon dimension ("lon" by default).
+#'@param lat_dim Name of lat dimension ("lat" by default).
+#'@param ncores An integer that indicates the number of cores for parallel
+#' computations using the multiApply function. The default value is NULL.
 #'
-#' @return RFSlope() returns spectral slopes using the RainFARM convention
-#' (the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
-#' The returned array has the same dimensions as the input array,
-#' minus the dimensions specified by \code{lon_dim}, \code{lat_dim} and \code{time_dim}.
-#' @import multiApply
-#' @import rainfarmr
-#' @importFrom ClimProjDiags Subset
-#' @export
-#' @examples
-#' # Example for the 'reduced' RFSlope function
-#' # Create a test array with dimension 8x8 and 20 timesteps,
-#' # 3 starting dates and 20 ensemble members.
-#' pr <- 1:(4*3*8*8*20)
-#' dim(pr) <- c(ensemble = 4, sdate = 3, lon = 8, lat = 8, ftime = 20)
-#'
-#' # Compute the spectral slopes ignoring the wavenumber
-#' # corresponding to the largest scale (the box)
-#' slopes <- RFSlope(pr, kmin=2)
-#' dim(slopes)
-#' # ensemble sdate
-#' #        4     3
-#' slopes
-#' #          [,1]     [,2]     [,3]
-#' #[1,] 1.893503 1.893503 1.893503
-#' #[2,] 1.893503 1.893503 1.893503
-#' #[3,] 1.893503 1.893503 1.893503
-#' #[4,] 1.893503 1.893503 1.893503
+#'@return RFSlope() returns spectral slopes using the RainFARM convention
+#'(the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
+#'The returned array has the same dimensions as the input array,
+#'minus the dimensions specified by \code{lon_dim}, \code{lat_dim} and \code{time_dim}.
+#'@examples
+#'# Example for the 'reduced' RFSlope function
+#'# Create a test array with dimension 8x8 and 20 timesteps,
+#'# 3 starting dates and 20 ensemble members.
+#'pr <- 1:(4*3*8*8*20)
+#'dim(pr) <- c(ensemble = 4, sdate = 3, lon = 8, lat = 8, ftime = 20)
+#'# Compute the spectral slopes ignoring the wavenumber
+#'# corresponding to the largest scale (the box)
+#'slopes <- RFSlope(pr, kmin = 2, time_dim = 'ftime')
+#'@import multiApply
+#'@import rainfarmr
+#'@importFrom ClimProjDiags Subset
+#'@export
 RFSlope <- function(data, kmin = 1, time_dim = NULL,
-                    lon_dim = "lon", lat_dim = "lat", ncores = 1) {
+                    lon_dim = "lon", lat_dim = "lat", ncores = NULL) {
+  # Know spatial coordinates names
+  if (!all(c(lon_dim, lat_dim) %in% names(dim(data)))) {
+    stop("Spatial coordinate names do not match data dimension names.")
+  }
+
   if (length(ncores) > 1) {
     ncores = ncores[1]
     warning("Parameter 'ncores' has length > 1 and only the first element will be used.")
@@ -120,7 +132,7 @@ RFSlope <- function(data, kmin = 1, time_dim = NULL,
   data <- .subset(data, lat_dim, 1:nmin)
   data <- .subset(data, lon_dim, 1:nmin)
   warning(paste("The input data have been cut to a square of",
-                nmin, "pixels on each side."))
+                nmin, "pixels on each side."))
 }

 # Check/detect time_dim
@@ -158,17 +170,16 @@ RFSlope <- function(data, kmin = 1, time_dim = NULL,
   return(slopes = result)
 }
-#' Atomic RFSlope
-#' @param pr precipitation array to downscale with dims (lon, lat, time).
-#' @param kmin first wavenumber for spectral slope (default kmin=1). -#' @return .RFSlope returns a scalar spectral slope using the RainFARM convention -#' (the logarithmic slope of k*|A(k)|^2 where A(k) is the spectral amplitude). -#' @noRd - +#'Atomic RFSlope +#'@param pr precipitation array to downscale with dims (lon, lat, time). +#'@param kmin first wavenumber for spectral slope (default kmin=1). +#'@return .RFSlope returns a scalar spectral slope using the RainFARM convention +#'(the logarithmic slope of k*|A(k)|^2 where A(k) is the spectral amplitude). +#'@noRd .RFSlope <- function(pr, kmin) { if (any(is.na(pr))) { posna <- unlist(lapply(1:dim(pr)['rainfarm_samples'], - function(x){!is.na(pr[1, 1, x])})) + function(x){!is.na(pr[1, 1, x])})) pr <- Subset(pr, 'rainfarm_samples', posna) } fxp <- fft2d(pr) diff --git a/R/CST_RFTemp.R b/R/CST_RFTemp.R index cebac85aa555252b19ba156d7d2d4e2a60f946a3..1bf7ecdeedab5eed1078f395e120d17d47b31997 100644 --- a/R/CST_RFTemp.R +++ b/R/CST_RFTemp.R @@ -1,75 +1,79 @@ -#' @rdname CST_RFTemp -#' @title Temperature downscaling of a CSTools object using lapse rate -#' correction or a reference field -#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} -#' @description This function implements a simple lapse rate correction of a -#' temperature field (an object of class 's2dv_cube' as provided by -#' `CST_Load`) as input. -#' The input lon grid must be increasing (but can be modulo 360). -#' The input lat grid can be irregularly spaced (e.g. a Gaussian grid) -#' The output grid can be irregularly spaced in lon and/or lat. -#' @references Method described in ERA4CS MEDSCOPE milestone M3.2: -#' High-quality climate prediction data available to WP4 -#' [https://www.medscope-project.eu/the-project/deliverables-reports/]([https://www.medscope-project.eu/the-project/deliverables-reports/) -#' and in H2020 ECOPOTENTIAL Deliverable No. 8.1: -#' High resolution (1-10 km) climate, land use and ocean change scenarios -#' [https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf](https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf) -#' @param data An object of the class 's2dv_cube' as returned by `CST_Load`, -#' containing the temperature fields to downscale. -#' The data object is expected to have an element named \code{$data} -#' with at least two spatial dimensions named "lon" and "lat". -#' (these default names can be changed with the \code{lon_dim} and -#' \code{lat_dim} parameters) -#' @param oro An object of the class 's2dv_cube' as returned by `CST_Load`, -#' containing fine scale orography (in meters). -#' The destination downscaling area must be contained in the orography field. -#' @param xlim vector with longitude bounds for downscaling; -#' the full input field is downscaled if `xlim` and `ylim` are not specified. -#' @param ylim vector with latitude bounds for downscaling -#' @param lapse float with environmental lapse rate -#' @param lon_dim string with name of longitude dimension -#' @param lat_dim string with name of latitude dimension -#' @param time_dim a vector of character string indicating the name of temporal dimension. By default, it is set to NULL and it considers "ftime", "sdate" and "time" as temporal dimensions. 
-#' @param verbose logical if to print diagnostic output
-#' @param nolapse logical, if true `oro` is interpreted as a fine-scale
-#' climatology and used directly for bias correction
-#' @param compute_delta logical if true returns only a delta to be used for
-#' out-of-sample forecasts. Returns an object of the class 's2dv_cube',
-#' containing a delta. Activates `nolapse = TRUE`.
-#' @param delta An object of the class 's2dv_cube', containing a delta
-#' to be applied to the downscaled input data. Activates `nolapse = TRUE`.
-#' The grid of this object must coincide with that of the required output.
-#' @param method string indicating the method used for interpolation:
-#' "nearest" (nearest neighbours followed by smoothing with a circular
-#' uniform weights kernel), "bilinear" (bilinear interpolation)
-#' The two methods provide similar results, but nearest is slightly better
-#' provided that the fine-scale grid is correctly centered as a subdivision
-#' of the large-scale grid
-#' @return CST_RFTemp() returns a downscaled CSTools object
-#' (i.e., of the class 's2dv_cube').
-#' @export
-#' @import multiApply
-#' @examples
-#' # Generate simple synthetic data and downscale by factor 4
-#' t <- rnorm(7 * 6 * 2 * 3 * 4)*10 + 273.15 + 10
-#' dim(t) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 7)
-#' lon <- seq(3, 9, 1)
-#' lat <- seq(42, 47, 1)
-#' exp <- list(data = t, lat = lat, lon = lon)
-#' attr(exp, 'class') <- 's2dv_cube'
-#' o <- runif(29*29)*3000
-#' dim(o) <- c(lat = 29, lon = 29)
-#' lon <- seq(3, 10, 0.25)
-#' lat <- seq(41, 48, 0.25)
-#' oro <- list(data = o, lat = lat, lon = lon)
-#' attr(oro, 'class') <- 's2dv_cube'
-#' res <- CST_RFTemp(exp, oro, xlim=c(4,8), ylim=c(43, 46), lapse=6.5)
-
+#'@rdname CST_RFTemp
+#'@title Temperature downscaling of a CSTools object using lapse rate
+#'correction or a reference field
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@description This function implements a simple lapse rate correction of a
+#'temperature field (an object of class 's2dv_cube' as provided by
+#'`CST_Load`) as input.
+#'The input lon grid must be increasing (but can be modulo 360).
+#'The input lat grid can be irregularly spaced (e.g. a Gaussian grid)
+#'The output grid can be irregularly spaced in lon and/or lat.
+#'@references Method described in ERA4CS MEDSCOPE milestone M3.2:
+#'High-quality climate prediction data available to WP4 here:
+#'\url{https://www.medscope-project.eu/the-project/deliverables-reports/}
+#'and in H2020 ECOPOTENTIAL Deliverable No. 8.1:
+#'High resolution (1-10 km) climate, land use and ocean change scenarios available
+#'here: \url{https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS}
+#'@param data An object of the class 's2dv_cube' as returned by `CST_Load`,
+#' containing the temperature fields to downscale. The data object is expected
+#' to have an element named \code{$data} with at least two spatial dimensions
+#' named "lon" and "lat" (these default names can be changed with the
+#' \code{lon_dim} and \code{lat_dim} parameters).
+#'@param oro An object of the class 's2dv_cube' as returned by `CST_Load`,
+#' containing fine scale orography (in meters). The destination downscaling
+#' area must be contained in the orography field.
+#'@param xlim Vector with longitude bounds for downscaling; the full input
+#' field is downscaled if `xlim` and `ylim` are not specified.
+#'@param ylim Vector with latitude bounds for downscaling.
+#'@param lapse Float with environmental lapse rate.
+#'@param lon_dim String with name of longitude dimension.
+#'@param lat_dim String with name of latitude dimension.
+#'@param time_dim A vector of character string indicating the name of temporal
+#' dimension. By default, it is set to NULL and it considers "ftime", "sdate"
+#' and "time" as temporal dimensions.
+#'@param verbose Logical if to print diagnostic output.
+#'@param nolapse Logical, if true `oro` is interpreted as a fine-scale
+#' climatology and used directly for bias correction.
+#'@param compute_delta Logical if true returns only a delta to be used for
+#' out-of-sample forecasts. Returns an object of the class 's2dv_cube',
+#' containing a delta. Activates `nolapse = TRUE`.
+#'@param delta An object of the class 's2dv_cube', containing a delta
+#' to be applied to the downscaled input data. Activates `nolapse = TRUE`.
+#' The grid of this object must coincide with that of the required output.
+#'@param method String indicating the method used for interpolation:
+#' "nearest" (nearest neighbours followed by smoothing with a circular
+#' uniform weights kernel), "bilinear" (bilinear interpolation).
+#' The two methods provide similar results, but nearest is slightly better
+#' provided that the fine-scale grid is correctly centered as a subdivision
+#' of the large-scale grid.
+#'@return CST_RFTemp() returns a downscaled CSTools object (i.e., of the class
+#''s2dv_cube').
+#'@examples
+#'# Generate simple synthetic data and downscale by factor 4
+#'t <- rnorm(7 * 6 * 2 * 3 * 4)*10 + 273.15 + 10
+#'dim(t) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 7)
+#'lon <- seq(3, 9, 1)
+#'lat <- seq(42, 47, 1)
+#'coords <- list(lat = lat, lon = lon)
+#'exp <- list(data = t, coords = coords)
+#'attr(exp, 'class') <- 's2dv_cube'
+#'o <- runif(29*29)*3000
+#'dim(o) <- c(lats = 29, lons = 29)
+#'lon <- seq(3, 10, 0.25)
+#'lat <- seq(41, 48, 0.25)
+#'coords <- list(lat = lat, lon = lon)
+#'oro <- list(data = o, coords = coords)
+#'attr(oro, 'class') <- 's2dv_cube'
+#'res <- CST_RFTemp(data = exp, oro = oro, xlim = c(4,8), ylim = c(43, 46),
+#'                  lapse = 6.5, time_dim = 'ftime',
+#'                  lon_dim = 'lon', lat_dim = 'lat')
+#'@import multiApply
+#'@export
 CST_RFTemp <- function(data, oro, xlim = NULL, ylim = NULL, lapse = 6.5,
                        lon_dim = "lon", lat_dim = "lat", time_dim = NULL,
                        nolapse = FALSE, verbose = FALSE, compute_delta = FALSE,
                        method = "bilinear", delta = NULL) {
-
+  # Check 's2dv_cube'
   if (!inherits(data, "s2dv_cube")) {
     stop("Parameter 'data' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
@@ -79,95 +83,130 @@ CST_RFTemp <- function(data, oro, xlim = NULL, ylim = NULL, lapse = 6.5,
          "as output by CSTools::CST_Load.")
   }
   if (!is.null(delta)) {
-    if (!inherits(delta, "s2dv_cube")) {
-      stop("Parameter 'delta' must be of the class 's2dv_cube', ",
-           "as output by CSTools::CST_Load.")
-    }
+    if (!inherits(delta, "s2dv_cube")) {
+      stop("Parameter 'delta' must be of the class 's2dv_cube', ",
+           "as output by CSTools::CST_Load.")
+    }
+  }
+  # Check 's2dv_cube' structure
+  if (!all(c('data', 'coords') %in% names(data))) {
+    stop("Parameter 'data' must have 'data' and 'coords' elements ",
+         "within the 's2dv_cube' structure.")
+  }
+  if (!all(c('data', 'coords') %in% names(oro))) {
+    stop("Parameter 'oro' must have 'data' and 'coords' elements ",
+         "within the 's2dv_cube' structure.")
+  }
+  # Check coordinates
+  if (!any(names(data$coords) %in% .KnownLonNames()) |
+      !any(names(data$coords) %in% .KnownLatNames())) {
+    stop("Spatial coordinate names of 'data' do not match any of the names ",
+         "accepted by the package.")
   }
+  if (!any(names(oro$coords) %in% .KnownLonNames()) |
+      !any(names(oro$coords) %in% .KnownLatNames())) {
+    stop("Spatial coordinate names of 'oro' do not match any of the names ",
+         "accepted by the package.")
+  }
+
+  lon_data <- names(data$coords)[[which(names(data$coords) %in% .KnownLonNames())]]
+  lat_data <- names(data$coords)[[which(names(data$coords) %in% .KnownLatNames())]]

-  res <- RFTemp(data$data, data$lon, data$lat,
-                oro$data, oro$lon, oro$lat, xlim, ylim, lapse,
+  lon_oro <- names(oro$coords)[[which(names(oro$coords) %in% .KnownLonNames())]]
+  lat_oro <- names(oro$coords)[[which(names(oro$coords) %in% .KnownLatNames())]]
+
+  res <- RFTemp(data = data$data,
+                lon = as.vector(data$coords[[lon_data]]),
+                lat = as.vector(data$coords[[lat_data]]),
+                oro = oro$data,
+                lonoro = as.vector(oro$coords[[lon_oro]]),
+                latoro = as.vector(oro$coords[[lat_oro]]),
+                xlim = xlim, ylim = ylim, lapse = lapse,
                 lon_dim = lon_dim, lat_dim = lat_dim, time_dim = time_dim,
                 nolapse = nolapse, verbose = verbose, method = method,
                 compute_delta = compute_delta, delta = delta$data)

   data$data <- res$data
-  data$lon <- res$lon
-  data$lat <- res$lat
+  data$coords[[lon_data]] <- res[[lon_dim]]
+  data$coords[[lat_data]] <- res[[lat_dim]]

   return(data)
 }
-#' @rdname RFTemp
-#' @title Temperature downscaling of a CSTools object using lapse rate
-#' correction (reduced version)
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
-#' @description This function implements a simple lapse rate correction of a
-#' temperature field (a multidimensional array) as input.
-#' The input lon grid must be increasing (but can be modulo 360).
-#' The input lat grid can be irregularly spaced (e.g. a Gaussian grid)
-#' The output grid can be irregularly spaced in lon and/or lat.
-#' @references Method described in ERA4CS MEDSCOPE milestone M3.2:
-#' High-quality climate prediction data available to WP4
-#' [https://www.medscope-project.eu/the-project/deliverables-reports/]([https://www.medscope-project.eu/the-project/deliverables-reports/)
-#' and in H2020 ECOPOTENTIAL Deliverable No. 8.1:
-#' High resolution (1-10 km) climate, land use and ocean change scenarios
-#' [https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf](https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf)
-#' @param data Temperature array to downscale.
-#' The input array is expected to have at least two dimensions named
-#' "lon" and "lat" by default
-#' (these default names can be changed with the \code{lon_dim} and
-#' \code{lat_dim} parameters)
-#' @param lon Vector or array of longitudes.
-#' @param lat Vector or array of latitudes.
-#' @param lonoro Vector or array of longitudes corresponding to the fine orography.
-#' @param latoro Vector or array of latitudes corresponding to the fine orography.
-#' @param oro Array containing fine-scale orography (in m)
-#' The destination downscaling area must be contained in the orography field.
-#' @param xlim vector with longitude bounds for downscaling;
-#' the full input field is downscaled if `xlim` and `ylim` are not specified.
-#' @param ylim vector with latitude bounds for downscaling
-#' @param lapse float with environmental lapse rate
-#' @param lon_dim string with name of longitude dimension
-#' @param lat_dim string with name of latitude dimension
-#' @param time_dim a vector of character string indicating the name of temporal dimension. By default, it is set to NULL and it considers "ftime", "sdate" and "time" as temporal dimensions.
-#' @param verbose logical if to print diagnostic output
-#' @param nolapse logical, if true `oro` is interpreted as a
-#' fine-scale climatology and used directly for bias correction
-#' @param compute_delta logical if true returns only a delta to be used for
-#' out-of-sample forecasts.
-#' @param delta matrix containing a delta to be applied to the downscaled
-#' input data. The grid of this matrix is supposed to be same as that of
-#' the required output field
-#' @param method string indicating the method used for interpolation:
-#' "nearest" (nearest neighbours followed by smoothing with a circular
-#' uniform weights kernel), "bilinear" (bilinear interpolation)
-#' The two methods provide similar results, but nearest is slightly better
-#' provided that the fine-scale grid is correctly centered as a subdivision
-#' of the large-scale grid
-#' @return CST_RFTemp() returns a downscaled CSTools object
-#' @return RFTemp() returns a list containing the fine-scale
-#' longitudes, latitudes and the downscaled fields.
-#' @export
-#' @import multiApply
-#' @examples
-#' # Generate simple synthetic data and downscale by factor 4
-#' t <- rnorm(7 * 6 * 4 * 3) * 10 + 273.15 + 10
-#' dim(t) <- c(sdate = 3, ftime = 4, lat = 6, lon = 7)
-#' lon <- seq(3, 9, 1)
-#' lat <- seq(42, 47, 1)
-#' o <- runif(29 * 29) * 3000
-#' dim(o) <- c(lat = 29, lon = 29)
-#' lono <- seq(3, 10, 0.25)
-#' lato <- seq(41, 48, 0.25)
-#' res <- RFTemp(t, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
-#'               lapse = 6.5)
-
+#'@rdname RFTemp
+#'@title Temperature downscaling of a CSTools object using lapse rate
+#'correction (reduced version)
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@description This function implements a simple lapse rate correction of a
+#'temperature field (a multidimensional array) as input.
+#'The input lon grid must be increasing (but can be modulo 360).
+#'The input lat grid can be irregularly spaced (e.g. a Gaussian grid)
+#'The output grid can be irregularly spaced in lon and/or lat.
+#'@references Method described in ERA4CS MEDSCOPE milestone M3.2:
+#'High-quality climate prediction data available to WP4 here:
+#'\url{https://www.medscope-project.eu/the-project/deliverables-reports/}
+#'and in H2020 ECOPOTENTIAL Deliverable No. 8.1:
+#'High resolution (1-10 km) climate, land use and ocean change scenarios here:
+#'\url{https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS}.
+#'@param data Temperature array to downscale. The input array is expected to
+#' have at least two dimensions named "lon" and "lat" by default (these default
+#' names can be changed with the \code{lon_dim} and \code{lat_dim} parameters).
+#'@param lon Vector or array of longitudes.
+#'@param lat Vector or array of latitudes.
+#'@param lonoro Vector or array of longitudes corresponding to the fine orography.
+#'@param latoro Vector or array of latitudes corresponding to the fine orography.
+#'@param oro Array containing fine-scale orography (in m). The destination
+#' downscaling area must be contained in the orography field.
+#'@param xlim Vector with longitude bounds for downscaling; the full input field
+#' is downscaled if `xlim` and `ylim` are not specified.
+#'@param ylim Vector with latitude bounds for downscaling.
+#'@param lapse Float with environmental lapse rate.
+#'@param lon_dim String with name of longitude dimension.
+#'@param lat_dim String with name of latitude dimension.
+#'@param time_dim A vector of character string indicating the name of temporal
+#' dimension. By default, it is set to NULL and it considers "ftime", "sdate"
+#' and "time" as temporal dimensions.
+#'@param verbose Logical if to print diagnostic output.
+#'@param nolapse Logical, if true `oro` is interpreted as a fine-scale
+#' climatology and used directly for bias correction.
+#'@param compute_delta Logical if true returns only a delta to be used for
+#' out-of-sample forecasts.
+#'@param delta Matrix containing a delta to be applied to the downscaled
+#' input data. The grid of this matrix is supposed to be same as that of
+#' the required output field.
+#'@param method String indicating the method used for interpolation:
+#' "nearest" (nearest neighbours followed by smoothing with a circular
+#' uniform weights kernel), "bilinear" (bilinear interpolation).
+#' The two methods provide similar results, but nearest is slightly better
+#' provided that the fine-scale grid is correctly centered as a subdivision
+#' of the large-scale grid.
+#'@return CST_RFTemp() returns a downscaled CSTools object.
+#'@return RFTemp() returns a list containing the fine-scale
+#'longitudes, latitudes and the downscaled fields.
+#'@examples
+#'# Generate simple synthetic data and downscale by factor 4
+#'t <- rnorm(7 * 6 * 4 * 3) * 10 + 273.15 + 10
+#'dim(t) <- c(sdate = 3, ftime = 4, lat = 6, lon = 7)
+#'lon <- seq(3, 9, 1)
+#'lat <- seq(42, 47, 1)
+#'o <- runif(29 * 29) * 3000
+#'dim(o) <- c(lat = 29, lon = 29)
+#'lono <- seq(3, 10, 0.25)
+#'lato <- seq(41, 48, 0.25)
+#'res <- RFTemp(t, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
+#'              lapse = 6.5, time_dim = 'ftime')
+#'@import multiApply
+#'@export
 RFTemp <- function(data, lon, lat, oro, lonoro, latoro,
                    xlim = NULL, ylim = NULL, lapse = 6.5,
                    lon_dim = "lon", lat_dim = "lat", time_dim = NULL,
                    nolapse = FALSE, verbose = FALSE, compute_delta = FALSE,
                    method = "bilinear", delta = NULL) {
+  # Check 'lon_dim' and 'lat_dim' parameters
+  if (!all(c(lon_dim, lat_dim) %in% names(dim(data)))) {
+    stop("Parameters 'lon_dim' and 'lat_dim' do not match with 'data' ",
+         "dimension names.")
+  }

   # Check/detect time_dim
   if (is.null(time_dim)) {
@@ -211,45 +250,49 @@ RFTemp <- function(data, lon, lat, oro, lonoro, latoro,
   result$lat <- array(result$lat[1:dim(result$lat)[1]])
   names(dim(result$lon)) <- lon_dim
   names(dim(result$lat)) <- lat_dim
+
+  names(result) <- c('data', lon_dim, lat_dim)
+
   return(result)
 }
-#' Lapse-rate temperature correction downscaling
+#'Lapse-rate temperature correction downscaling
 #'
-#' @description Downscales a temperature field using a lapse-rate
-#' correction based on a reference orography. Time-averaging is done on all
-#' dimensions after the first two.
-#' @author Jost von Hardenberg, \email{j.vonhardenberg@isac.cnr.it} -#' @param lon vector of input longitudes -#' @param lat vector of input latitudes -#' @param t matrix of input temperature data -#' @param lono vector of orography longitudes -#' @param lato vector of orography latitudes -#' @param oro matrix of topographical elevations (in meters) -#' The destination downscaling area must be contained in the orography field. -#' @param xlim vector of longitude bounds; the full input field is downscaled if `xlim` and `ylim` are not specified. -#' @param ylim vector of latitude bounds -#' @param radius smoothing radius expressed in longitude units -#' (default is half a large-scale pixel) -#' @param lapse environmental lapse rate (in K/Km) -#' @param nolapse logical, if true `oro` is interpreted as a fine-scale -#' climatology and used directly for bias correction -#' @param compute_delta logical if true returns only a delta to be used for -#' out-of-sample forecasts. -#' @param delta matrix containing a delta to be applied to the input data. -#' The grid of this matrix is supposed to be same as -#' that of the required output field -#' @param verbose logical if to print diagnostic output -#' @return A downscaled temperature matrix -#' @examples -#' lon = 5:20 -#' lat = 35:40 -#' t = runif(16 * 6); dim(t) = c(16, 6) -#' lono = seq(5, 20, 0.1) -#' lato = seq(35, 40, 0.1) -#' o = runif(151 * 51) * 2000; dim(o) = c(151, 51) -#' td = .downscalet(t, lon, lat, o, lono, lato, c(8, 12), c(36, 38)) -#' @noRd +#'@description Downscales a temperature field using a lapse-rate +#'correction based on a reference orography. Time-averaging is done on all +#'dimensions after the first two. +#'@author Jost von Hardenberg, \email{j.vonhardenberg@isac.cnr.it} +#'@param lon Vector of input longitudes. +#'@param lat Vector of input latitudes. +#'@param t Matrix of input temperature data. +#'@param lono Vector of orography longitudes. +#'@param lato Vector of orography latitudes. +#'@param oro Matrix of topographical elevations (in meters). The destination +#' downscaling area must be contained in the orography field. +#'@param xlim Vector of longitude bounds; the full input field is downscaled if +#' `xlim` and `ylim` are not specified. +#'@param ylim Vector of latitude bounds. +#'@param radius Smoothing radius expressed in longitude units (default is half a +#' large-scale pixel). +#'@param lapse Environmental lapse rate (in K/Km). +#'@param nolapse Logical, if true `oro` is interpreted as a fine-scale +#' climatology and used directly for bias correction. +#'@param compute_delta Logical if true returns only a delta to be used for +#' out-of-sample forecasts. +#'@param delta Matrix containing a delta to be applied to the input data. +#' The grid of this matrix is supposed to be same as that of the required +#' output field. +#'@param verbose Logical if to print diagnostic output. +#'@return A downscaled temperature matrix. 
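Numerically, the default (non-climatology) correction performed by .downscalet() amounts to the following arithmetic; the values below are made up for illustration.

# Illustrative only: lapse-rate correction at one fine-scale pixel
lapse <- 6.5                    # environmental lapse rate, K/km
t_coarse <- 285.0               # K, large-scale temperature interpolated to the pixel
dz <- 1300 - 500                # m, fine-scale minus large-scale elevation
t_fine <- t_coarse - lapse / 1000 * dz
t_fine                          # 279.8 K: higher terrain, colder air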
+#'@examples +#'lon = 5:20 +#'lat = 35:40 +#'t = runif(16 * 6); dim(t) = c(16, 6) +#'lono = seq(5, 20, 0.1) +#'lato = seq(35, 40, 0.1) +#'o = runif(151 * 51) * 2000; dim(o) = c(151, 51) +#'td = .downscalet(t, lon, lat, o, lono, lato, c(8, 12), c(36, 38)) +#'@noRd .downscalet <- function(t, lon, lat, oro, lono, lato, xlim = NULL, ylim = NULL, radius = 0, lapse = 6.5, nolapse = FALSE, @@ -403,31 +446,31 @@ RFTemp <- function(data, lon, lat, oro, lonoro, latoro, method = method) } -#' Nearest neighbour interpolation +#'Nearest neighbour interpolation #' -#' @description The input field is interpolated onto the output -#' coordinate grid using nearest neighbours or bilinear interpolation. -#' The input lon grid must be monotone increasing. -#' The input lat grid can be irregularly spaced (e.g. a Gaussian grid) -#' The output grid can be irregularly spaced in lon and/or lat. -#' @author Jost von Hardenberg, \email{j.vonhardenberg@isac.cnr.it} -#' @param z matrix with the input field to interpolate (assumed to -#' include also a third time dimension) -#' @param lon vector of input longitudes -#' @param lat vector of input latitudes -#' @param lonp vector of output longitudes -#' @param latp vector of output latitudes -#' @param method string indicating the interpolation method -#' ("nearest" or "bilinear" (default)) -#' @return The interpolated field. -#' @examples -#' lon = 5:11 -#' lat = 35:40 -#' z = runif(7 * 6 * 2); dim(z) = c(7, 6, 2) -#' lonp = seq(5, 10, 0.2) -#' latp = seq(35, 40, 0.2) -#' zo <- .interp2d(z, lon, lat, lonp, latp, method = "nearest") -#' @noRd +#'@description The input field is interpolated onto the output +#'coordinate grid using nearest neighbours or bilinear interpolation. +#'The input lon grid must be monotone increasing. +#'The input lat grid can be irregularly spaced (e.g. a Gaussian grid) +#'The output grid can be irregularly spaced in lon and/or lat. +#'@author Jost von Hardenberg, \email{j.vonhardenberg@isac.cnr.it} +#'@param z Matrix with the input field to interpolate (assumed to +#' include also a third time dimension) +#'@param lon Vector of input longitudes. +#'@param lat Vector of input latitudes. +#'@param lonp Vector of output longitudes. +#'@param latp Vector of output latitudes. +#'@param method String indicating the interpolation method ("nearest" or +#' "bilinear" (default)). +#'@return The interpolated field. +#'@examples +#'lon = 5:11 +#'lat = 35:40 +#'z = runif(7 * 6 * 2); dim(z) = c(7, 6, 2) +#'lonp = seq(5, 10, 0.2) +#'latp = seq(35, 40, 0.2) +#'zo <- .interp2d(z, lon, lat, lonp, latp, method = "nearest") +#'@noRd .interp2d <- function(z, lon, lat, lonp, latp, method="bilinear") { nx <- length(lonp) @@ -502,21 +545,21 @@ RFTemp <- function(data, lon, lat, oro, lonoro, latoro, return(zo) } -#' Smoothening using convolution with a circular kernel +#'Smoothening using convolution with a circular kernel #' -#' @description The input field is convolved with a circular kernel with equal -#' weights. Takes into account missing values. -#' @author Jost von Hardenberg, \email{j.vonhardenberg@isac.cnr.it} -#' @param z matrix with the input field to smoothen, with dimensions `c(ns, ns)` -#' @param sdim the smoothing kernel radius in pixel -#' @return The smoothened field. -#' @examples -#' z <- rnorm(64 * 64) -#' dim(z) <- c(64, 64) -#' zs <- smooth(z, 8) -#' sd(zs) -#' # [1] 0.1334648 -#' @noRd +#'@description The input field is convolved with a circular kernel with equal +#'weights. Takes into account missing values. 
+#'@author Jost von Hardenberg, \email{j.vonhardenberg@isac.cnr.it}
+#'@param z Matrix with the input field to smoothen, with dimensions `c(ns, ns)`.
+#'@param sdim The smoothing kernel radius in pixels.
+#'@return The smoothened field.
+#'@examples
+#'z <- rnorm(64 * 64)
+#'dim(z) <- c(64, 64)
+#'zs <- .smooth(z, 8)
+#'sd(zs)
+#'# [1] 0.1334648
+#'@noRd
 .smooth <- function(z, sdim) {
   nsx <- dim(z)[1]
   nsy <- dim(z)[2]
diff --git a/R/CST_RFWeights.R b/R/CST_RFWeights.R
index fa7eec75045717fdd08918dcc1b1d0b19cbd77f6..e93931d7f96659b19ae6b02a3f055dd544382804 100644
--- a/R/CST_RFWeights.R
+++ b/R/CST_RFWeights.R
@@ -1,55 +1,63 @@
-#' Compute climatological weights for RainFARM stochastic precipitation downscaling
+#'Compute climatological weights for RainFARM stochastic precipitation downscaling
 #'
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
 #'
-#' @description Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
-#' @references Terzago, S., Palazzi, E., & von Hardenberg, J. (2018).
-#' Stochastic downscaling of precipitation in complex orography:
-#' A simple method to reproduce a realistic fine-scale climatology.
-#' Natural Hazards and Earth System Sciences, 18(11),
-#' 2825-2840. http://doi.org/10.5194/nhess-18-2825-2018 .
-#' @param climfile Filename of a fine-scale precipitation climatology.
-#' The file is expected to be in NetCDF format and should contain
-#' at least one precipitation field. If several fields at different times are provided,
-#' a climatology is derived by time averaging.
-#' Suitable climatology files could be for example a fine-scale precipitation climatology
-#' from a high-resolution regional climate model (see e.g. Terzago et al. 2018), a local
-#' high-resolution gridded climatology from observations, or a reconstruction such as those which
-#' can be downloaded from the WORLDCLIM (http://www.worldclim.org) or CHELSA (http://chelsa-climate.org)
-#' websites. The latter data will need to be converted to NetCDF format before being used (see for example the GDAL tools (https://www.gdal.org).
-#' It could also be a 's2dv_cube' object.
-#' @param nf Refinement factor for downscaling (the output resolution is increased by this factor).
-#' @param lon Vector of longitudes.
-#' @param lat Vector of latitudes.
-#' The number of longitudes and latitudes is expected to be even and the same. If not
-#' the function will perform a subsetting to ensure this condition.
-#' @param varname Name of the variable to be read from \code{climfile}.
-#' @param fsmooth Logical to use smooth conservation (default) or large-scale box-average conservation.
-#' @param lonname a character string indicating the name of the longitudinal dimension set as 'lon' by default.
-#' @param latname a character string indicating the name of the latitudinal dimension set as 'lat' by default.
-#' @param ncores an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.
+#'@description Compute climatological ("orographic") weights from a fine-scale
+#'precipitation climatology file.
+#'@references Terzago, S., Palazzi, E., & von Hardenberg, J. (2018).
+#'Stochastic downscaling of precipitation in complex orography:
+#'A simple method to reproduce a realistic fine-scale climatology.
+#'Natural Hazards and Earth System Sciences, 18(11),
+#'2825-2840. \doi{10.5194/nhess-18-2825-2018}.
+#'@param climfile Filename of a fine-scale precipitation climatology. The file
+#' is expected to be in NetCDF format and should contain at least one
+#' precipitation field. If several fields at different times are provided,
+#' a climatology is derived by time averaging. Suitable climatology files could
+#' be for example a fine-scale precipitation climatology from a high-resolution
+#' regional climate model (see e.g. Terzago et al. 2018), a local
+#' high-resolution gridded climatology from observations, or a reconstruction
+#' such as those which can be downloaded from the WORLDCLIM
+#' (\url{https://www.worldclim.org}) or CHELSA (\url{https://chelsa-climate.org/})
+#' websites. The latter data will need to be converted to NetCDF format before
+#' being used (see for example the GDAL tools, \url{https://gdal.org/}). It
+#' could also be an 's2dv_cube' object.
+#'@param nf Refinement factor for downscaling (the output resolution is
+#' increased by this factor).
+#'@param lon Vector of longitudes.
+#'@param lat Vector of latitudes. The number of longitudes and latitudes is
+#' expected to be even and the same. If not the function will perform a
+#' subsetting to ensure this condition.
+#'@param varname Name of the variable to be read from \code{climfile}.
+#'@param fsmooth Logical to use smooth conservation (default) or large-scale
+#' box-average conservation.
+#'@param lonname A character string indicating the name of the longitudinal
+#' dimension set as 'lon' by default.
+#'@param latname A character string indicating the name of the latitudinal
+#' dimension set as 'lat' by default.
+#'@param ncores An integer that indicates the number of cores for parallel
+#' computations using the multiApply function. The default value is one.
 #'
-#' @return An object of class 's2dv_cube' containing in matrix \code{data} the weights with dimensions (lon, lat).
-#' @import ncdf4
-#' @import rainfarmr
-#' @import multiApply
-#' @importFrom utils tail
-#' @importFrom utils head
-#' @examples
-#' # Create weights to be used with the CST_RainFARM() or RainFARM() functions
-#' # using an external fine-scale climatology file.
-#'
-#' \dontrun{
-#' # Specify lon and lat of the input
-#' lon <- seq(10,13.5,0.5)
-#' lat <- seq(40,43.5,0.5)
-#' nf <- 8
-#' ww <- CST_RFWeights("./worldclim.nc", nf, lon, lat, fsmooth = TRUE)
-#' }
-#' @export
-
-CST_RFWeights <- function(climfile, nf, lon, lat, varname = NULL,
-                          fsmooth = TRUE,
+#'@return An object of class 's2dv_cube' containing in matrix \code{data} the
+#'weights with dimensions (lon, lat).
+#'@examples
+#'# Create weights to be used with the CST_RainFARM() or RainFARM() functions
+#'# using random data in the form of an 's2dv_cube' object.
+#'obs <- rnorm(2 * 3 * 4 * 8 * 8) +#'dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8) +#'lon <- seq(10, 13.5, 0.5) +#'lat <- seq(40, 43.5, 0.5) +#'coords <- list(lon = lon, lat = lat) +#'data <- list(data = obs, coords = coords) +#'class(data) <- "s2dv_cube" +#'res <- CST_RFWeights(climfile = data, nf = 3, lon, lat, lonname = 'lon', +#' latname = 'lat', fsmooth = TRUE) +#'@import ncdf4 +#'@import rainfarmr +#'@import multiApply +#'@importFrom utils tail +#'@importFrom utils head +#'@export +CST_RFWeights <- function(climfile, nf, lon, lat, varname = NULL, fsmooth = TRUE, lonname = 'lon', latname = 'lat', ncores = NULL) { if (!inherits(climfile, "s2dv_cube")) { if (!is.null(varname) & !is.character(varname)) { @@ -83,9 +91,23 @@ CST_RFWeights <- function(climfile, nf, lon, lat, varname = NULL, zclim <- ncvar_get(ncin, varname) nc_close(ncin) } else if (inherits(climfile, "s2dv_cube")) { + # Check object structure + if (!all(c('data', 'coords') %in% names(climfile))) { + stop("Parameter 'climfile' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + } + # Check coordinates + if (!any(names(climfile$coords) %in% .KnownLonNames()) | + !any(names(climfile$coords) %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + } + loncoordname <- names(climfile$coords)[[which(names(climfile$coords) %in% .KnownLonNames())]] + latcoordname <- names(climfile$coords)[[which(names(climfile$coords) %in% .KnownLatNames())]] + zclim <- climfile$data - latin <- climfile$lat - lonin <- climfile$lon + latin <- as.vector(climfile$coords[[latcoordname]]) + lonin <- as.vector(climfile$coords[[loncoordname]]) } else { stop("Parameter 'climfile' is expected to be a character string indicating", " the path to the files or an object of class 's2dv_cube'.") @@ -99,57 +121,76 @@ CST_RFWeights <- function(climfile, nf, lon, lat, varname = NULL, lonname = lonname, latname = latname, ncores = ncores) if (inherits(climfile, "s2dv_cube")) { climfile$data <- result$data - climfile$lon <- result$lon - climfile$lat <- result$lat - } else { - climfile <- s2dv_cube(data = result, lon = result$lon, lat = result$lat) + climfile$coords[[loncoordname]] <- result[[lonname]] + climfile$coords[[latcoordname]] <- result[[latname]] + } else { + climfile <- NULL + climfile$data <- result + climfile$coords[[lonname]] <- result[[lonname]] + climfile$coords[[latname]] <- result[[latname]] } return(climfile) } -#' Compute climatological weights for RainFARM stochastic precipitation downscaling +#'Compute climatological weights for RainFARM stochastic precipitation downscaling #' -#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} +#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} #' -#' @description Compute climatological ("orographic") weights from a fine-scale precipitation climatology file. -#' @references Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). -#' Stochastic downscaling of precipitation in complex orography: -#' A simple method to reproduce a realistic fine-scale climatology. -#' Natural Hazards and Earth System Sciences, 18(11), -#' 2825-2840. http://doi.org/10.5194/nhess-18-2825-2018 . -#' @param zclim a multi-dimensional array with named dimension containing at least one precipiation field with spatial dimensions. -#' @param lonin a vector indicating the longitudinal coordinates corresponding to the \code{zclim} parameter. 
-#' @param latin a vector indicating the latitudinal coordinates corresponding to the \code{zclim} parameter.
-#' @param nf Refinement factor for downscaling (the output resolution is increased by this factor).
-#' @param lon Vector of longitudes.
-#' @param lat Vector of latitudes.
-#' The number of longitudes and latitudes is expected to be even and the same. If not
-#' the function will perform a subsetting to ensure this condition.
-#' @param fsmooth Logical to use smooth conservation (default) or large-scale box-average conservation.
-#' @param lonname a character string indicating the name of the longitudinal dimension set as 'lon' by default.
-#' @param latname a character string indicating the name of the latitudinal dimension set as 'lat' by default.
-#' @param ncores an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.
+#'@description Compute climatological ("orographic") weights from a fine-scale
+#'precipitation climatology file.
+#'@references Terzago, S., Palazzi, E., & von Hardenberg, J. (2018).
+#'Stochastic downscaling of precipitation in complex orography:
+#'A simple method to reproduce a realistic fine-scale climatology.
+#'Natural Hazards and Earth System Sciences, 18(11),
+#'2825-2840. \doi{10.5194/nhess-18-2825-2018}.
+#'@param zclim A multi-dimensional array with named dimensions containing at
+#' least one precipitation field with spatial dimensions.
+#'@param lonin A vector indicating the longitudinal coordinates corresponding to
+#' the \code{zclim} parameter.
+#'@param latin A vector indicating the latitudinal coordinates corresponding to
+#' the \code{zclim} parameter.
+#'@param nf Refinement factor for downscaling (the output resolution is
+#' increased by this factor).
+#'@param lon Vector of longitudes.
+#'@param lat Vector of latitudes. The number of longitudes and latitudes is
+#' expected to be even and the same. If not the function will perform a
+#' subsetting to ensure this condition.
+#'@param fsmooth Logical to use smooth conservation (default) or large-scale
+#' box-average conservation.
+#'@param lonname A character string indicating the name of the longitudinal
+#' dimension set as 'lon' by default.
+#'@param latname A character string indicating the name of the latitudinal
+#' dimension set as 'lat' by default.
+#'@param ncores An integer that indicates the number of cores for parallel
+#' computations using the multiApply function. The default value is one.
 #'
-#' @return An object of class 's2dv_cube' containing in matrix \code{data} the weights with dimensions (lon, lat).
-#' @import ncdf4
-#' @import rainfarmr
-#' @import multiApply
-#' @importFrom utils tail
-#' @importFrom utils head
-#' @examples
-#' a <- array(1:2500, c(lat = 50, lon = 50))
-#' res <- RF_Weights(a, seq(0.1 ,5, 0.1), seq(0.1 ,5, 0.1),
-#'                   nf = 5, lat = 1:5, lon = 1:5)
-#' @export
+#'@return An object of class 's2dv_cube' containing in matrix \code{data} the
+#'weights with dimensions (lon, lat).
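+#'@details Any dimension of \code{zclim} other than the spatial ones is looped
+#'over by multiApply::Apply, so one set of weights is computed per element of
+#'such extra dimensions. A sketch with a hypothetical 'sdate' dimension
+#'(seasonally varying weights):
+#'\preformatted{
+#'zclim <- array(1:7500, c(lat = 50, lon = 50, sdate = 3))
+#'ww <- RF_Weights(zclim, seq(0.1, 5, 0.1), seq(0.1, 5, 0.1),
+#'                 nf = 5, lat = 1:5, lon = 1:5)
+#'# ww$data should keep the 'sdate' dimension next to the fine-scale lon and lat
+#'}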
+#'@examples
+#'a <- array(1:2500, c(lat = 50, lon = 50))
+#'res <- RF_Weights(a, seq(0.1, 5, 0.1), seq(0.1, 5, 0.1),
+#'                  nf = 5, lat = 1:5, lon = 1:5)
+#'@import ncdf4
+#'@import rainfarmr
+#'@import multiApply
+#'@importFrom utils tail
+#'@importFrom utils head
+#'@export
 RF_Weights <- function(zclim, latin, lonin, nf, lat, lon, fsmooth = TRUE,
                        lonname = 'lon', latname = 'lat', ncores = NULL) {
   x <- Apply(list(zclim), target_dims = c(lonname, latname), fun = rf_weights,
              latin = latin, lonin = lonin, nf = nf, lat = lat, lon = lon,
+             lonname = lonname, latname = latname,
             fsmooth = fsmooth, ncores = ncores)$output1
-  grid <- lon_lat_fine(lon, lat, nf)
-  return(list(data = x, lon = grid$lon, lat = grid$lat))
+  grid <- lon_lat_fine(lon, lat, nf)
+  res <- NULL
+  res$data <- x
+  res[[lonname]] <- grid$lon
+  res[[latname]] <- grid$lat
+  return(res)
 }
 
-rf_weights <- function(zclim, latin, lonin, nf, lat, lon, fsmooth = TRUE) {
+rf_weights <- function(zclim, latin, lonin, nf, lat, lon, lonname = 'lon',
+                       latname = 'lat', fsmooth = TRUE) {
   # Check if lon and lat need to be reversed
   if (lat[1] > lat[2]) {
     lat <- rev(lat)
@@ -195,6 +236,6 @@ rf_weights <- function(zclim, latin, lonin, nf, lat, lon, fsmooth = TRUE) {
   if (frev) {
     ww <- ww[, seq(dim(ww)[2], 1)]
   }
-  attributes(dim(ww))$names <- c("lon", "lat")
+  attributes(dim(ww))$names <- c(lonname, latname)
   return(ww)
 }
diff --git a/R/CST_RainFARM.R b/R/CST_RainFARM.R
index 62e9339e913c1fbc3613b8c032efd8265dc43730..d01d9a1bc9584723737f3412b27125c8e3573a70 100644
--- a/R/CST_RainFARM.R
+++ b/R/CST_RainFARM.R
@@ -1,215 +1,247 @@
-#' @rdname CST_RainFARM
-#' @title RainFARM stochastic precipitation downscaling of a CSTools object
+#'@rdname CST_RainFARM
+#'@title RainFARM stochastic precipitation downscaling of a CSTools object
 #'
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
 #'
-#' @description This function implements the RainFARM stochastic precipitation
-#' downscaling method and accepts a CSTools object (an object of the class
-#' 's2dv_cube' as provided by `CST_Load`) as input.
-#' Adapted for climate downscaling and including orographic correction
-#' as described in Terzago et al. 2018.
-#' @references Terzago, S. et al. (2018). NHESS 18(11), 2825-2840.
-#' http://doi.org/10.5194/nhess-18-2825-2018 ;
-#' D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al. (2006), JHM 7, 724.
-#' @param data An object of the class 's2dv_cube' as returned by `CST_Load`,
-#' containing the spatial precipitation fields to downscale.
-#' The data object is expected to have an element named \code{$data} with at least two
-#' spatial dimensions named "lon" and "lat" and one or more dimensions over which
-#' to compute average spectral slopes (unless specified with parameter \code{slope}),
-#' which can be specified by parameter \code{time_dim}.
-#' The number of longitudes and latitudes in the input data is expected to be even and the same. If not
-#' the function will perform a subsetting to ensure this condition.
-#' @param weights Matrix with climatological weights which can be obtained using
-#' the \code{CST_RFWeights} function. If \code{weights=1.} (default) no weights are used.
-#' The names of these dimensions must be at least 'lon' and 'lat'.
-#' @param nf Refinement factor for downscaling (the output resolution is increased by this factor).
-#' @param slope Prescribed spectral slope.
The default is \code{slope=0.} -#' meaning that the slope is determined automatically over the dimensions specified by \code{time_dim}. A 1D array with named dimension can be provided (see details and examples) -#' @param kmin First wavenumber for spectral slope (default: \code{kmin=1}). -#' @param nens Number of ensemble members to produce (default: \code{nens=1}). -#' @param fglob Logical to conserve global precipitation over the domain (default: FALSE). -#' @param fsmooth Logical to conserve precipitation with a smoothing kernel (default: TRUE). -#' @param time_dim String or character array with name(s) of dimension(s) -#' (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes. -#' If a character array of dimension names is provided, the spectral slopes -#' will be computed as an average over all elements belonging to those dimensions. -#' If omitted one of c("ftime", "sdate", "time") is searched and the first one with more -#' than one element is chosen. -#' @param verbose Logical for verbose output (default: FALSE). -#' @param drop_realization_dim Logical to remove the "realization" stochastic ensemble dimension, -#' needed for saving data through function CST_SaveData (default: FALSE) -#' with the following behaviour if set to TRUE: +#'@description This function implements the RainFARM stochastic precipitation +#'downscaling method and accepts a CSTools object (an object of the class +#''s2dv_cube' as provided by `CST_Load`) as input. +#'Adapted for climate downscaling and including orographic correction +#'as described in Terzago et al. 2018. +#'@references Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. +#'\doi{10.5194/nhess-18-2825-2018}; +#'D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al. +#'(2006), JHM 7, 724. +#'@param data An object of the class 's2dv_cube' as returned by `CST_Load`, +#' containing the spatial precipitation fields to downscale. +#' The data object is expected to have an element named \code{$data} with at +#' least two spatial dimensions named "lon" and "lat" and one or more +#' dimensions over which to compute average spectral slopes (unless specified +#' with parameter \code{slope}), which can be specified by parameter +#' \code{time_dim}. The number of longitudes and latitudes in the input data is +#' expected to be even and the same. If not the function will perform a +#' subsetting to ensure this condition. +#'@param weights Matrix with climatological weights which can be obtained using +#' the \code{CST_RFWeights} function. If \code{weights = 1.} (default) no +#' weights are used. The names of these dimensions must be at least 'lon' and +#' 'lat'. +#'@param nf Refinement factor for downscaling (the output resolution is +#' increased by this factor). +#'@param slope Prescribed spectral slope. The default is \code{slope = 0.} +#' meaning that the slope is determined automatically over the dimensions +#' specified by \code{time_dim}. A 1D array with named dimension can be +#' provided (see details and examples). +#'@param kmin First wavenumber for spectral slope (default: \code{kmin = 1}). +#'@param nens Number of ensemble members to produce (default: \code{nens = 1}). +#'@param fglob Logical to conserve global precipitation over the domain +#' (default: FALSE). +#'@param fsmooth Logical to conserve precipitation with a smoothing kernel +#' (default: TRUE). +#'@param time_dim String or character array with name(s) of dimension(s) +#' (e.g. "ftime", "sdate", "member" ...) 
over which to compute spectral slopes.
+#' If a character array of dimension names is provided, the spectral slopes
+#' will be computed as an average over all elements belonging to those
+#' dimensions. If omitted one of c("ftime", "sdate", "time") is searched and
+#' the first one with more than one element is chosen.
+#'@param verbose Logical for verbose output (default: FALSE).
+#'@param drop_realization_dim Logical to remove the "realization" stochastic
+#' ensemble dimension, needed for saving data through function CST_SaveData
+#' (default: FALSE) with the following behaviour if set to TRUE:
+#' \enumerate{
+#'   \item{if \code{nens == 1}: the dimension is dropped;}
+#'   \item{if \code{nens > 1} and a "member" dimension exists: the "realization"
+#'   and "member" dimensions are compacted (multiplied) and the resulting
+#'   dimension is named "member";}
+#'   \item{if \code{nens > 1} and a "member" dimension does not exist: the
+#'   "realization" dimension is renamed to "member".}
+#' }
+#'@param nprocs The number of parallel processes to spawn for parallel
+#' computation on multiple cores (default: 1).
 #'
-#' 1) if \code{nens==1}: the dimension is dropped;
-#'
-#' 2) if \code{nens>1} and a "member" dimension exists:
-#' the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member";
-#'
-#' 3) if \code{nens>1} and a "member" dimension does not exist: the "realization" dimension is renamed to "member".
-#' @param nprocs The number of parallel processes to spawn for the use for parallel computation in multiple cores. (default: 1)
-#'
-#' @return CST_RainFARM() returns a downscaled CSTools object (i.e., of the
-#' class 's2dv_cube').
-#' If \code{nens>1} an additional dimension named "realizatio"n is added to the
-#' \code{$data} array after the "member" dimension (unless
-#' \code{drop_realization_dim=TRUE} is specified).
-#' The ordering of the remaining dimensions in the \code{$data} element of the input object is maintained.
-#' @details Wether parameter 'slope' and 'weights' presents seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See example 2 below where weights and slope vary with 'sdate' dimension.
-#' @import multiApply
-#' @import rainfarmr
-#' @examples
-#' #Example 1: using CST_RainFARM for a CSTools object
-#' nf <- 8 # Choose a downscaling by factor 8
-#' exp <- 1 : (2 * 3 * 4 * 8 * 8)
-#' dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
-#' lon <- seq(10, 13.5, 0.5)
-#' dim(lon) <- c(lon = length(lon))
-#' lat <- seq(40, 43.5, 0.5)
-#' dim(lat) <- c(lat = length(lat))
-#' data <- list(data = exp, lon = lon, lat = lat)
-#' # Create a test array of weights
-#' ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
-#' res <- CST_RainFARM(data, nf = nf, weights = ww, nens=3)
-#' str(res)
-#' #List of 3
-#' # $ data: num [1, 1:2, 1:3, 1:3, 1:4, 1:64, 1:64] 260 553 281 278 143 ...
-#' # $ lon : num [1:64] 9.78 9.84 9.91 9.97 10.03 ...
-#' # $ lat : num [1:64] 39.8 39.8 39.9 40 40 ...
-#' dim(res$data)
-#' # dataset member realization sdate ftime lat lon
-#' # 1 2 3 3 4 64 64
-#'
-#' # Example 2:
-#' slo <- array(c(0.1, 0.5, 0.7), c(sdate= 3))
-#' wei <- array(rnorm(8 * 8 * 3), c(lon = 8, lat = 8, sdate = 3))
-#' res <- CST_RainFARM(lonlat_prec,
-#'                     weights = wei, slope = slo, nf = 2)
-#' @export
+#'@return CST_RainFARM() returns a downscaled CSTools object (i.e., of the
+#'class 's2dv_cube'). If \code{nens > 1} an additional dimension named
+#'"realization" is added to the \code{$data} array after the "member" dimension
+#'(unless \code{drop_realization_dim = TRUE} is specified). The ordering of the
+#'remaining dimensions in the \code{$data} element of the input object is
+#'maintained.
+#'@details If the parameters 'slope' and 'weights' present a seasonality
+#'dependency, a dimension name should match between these parameters and the
+#'input data in parameter 'data'. See Example 2 below, where weights and slope
+#'vary with the 'sdate' dimension.
+#'@examples
+#'# Example 1: using CST_RainFARM for a CSTools object
+#'nf <- 8 # Choose a downscaling by factor 8
+#'exp <- 1 : (2 * 3 * 4 * 8 * 8)
+#'dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+#'lon <- seq(10, 13.5, 0.5)
+#'lat <- seq(40, 43.5, 0.5)
+#'coords <- list(lon = lon, lat = lat)
+#'data <- list(data = exp, coords = coords)
+#'class(data) <- 's2dv_cube'
+#'# Create a test array of weights
+#'ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
+#'res <- CST_RainFARM(data, nf = nf, weights = ww, nens = 3, time_dim = 'ftime')
+#'# Example 2: 'slope' and 'weights' varying along the 'sdate' dimension
+#'slo <- array(c(0.1, 0.5, 0.7), c(sdate = 3))
+#'wei <- array(rnorm(16 * 16 * 3), c(lon = 16, lat = 16, sdate = 3))
+#'res2 <- CST_RainFARM(data, weights = wei, slope = slo, nf = 2,
+#'                     time_dim = 'ftime')
+#'@import multiApply
+#'@import rainfarmr
+#'@importFrom ClimProjDiags Subset
+#'@importFrom abind abind
+#'@export
 CST_RainFARM <- function(data, weights = 1., slope = 0, nf, kmin = 1,
                          nens = 1, fglob = FALSE, fsmooth = TRUE, nprocs = 1,
                          time_dim = NULL, verbose = FALSE,
                          drop_realization_dim = FALSE) {
+  # Check 's2dv_cube'
+  if (!inherits(data, "s2dv_cube")) {
+    stop("Parameter 'data' must be of the class 's2dv_cube', ",
+         "as output by CSTools::CST_Load.")
+  }
+  # Check 'data' object structure
+  if (!all(c('data', 'coords') %in% names(data))) {
+    stop("Parameter 'data' must have 'data' and 'coords' elements ",
+         "within the 's2dv_cube' structure.")
+  }
+  # Check coordinates
+  if (!any(names(data$coords) %in% .KnownLonNames()) |
+      !any(names(data$coords) %in% .KnownLatNames())) {
+    stop("Spatial coordinate names do not match any of the names accepted by ",
+         "the package.")
+  }
+  # Check dimensions
+  if (!any(names(dim(data$data)) %in% .KnownLonNames()) |
+      !any(names(dim(data$data)) %in% .KnownLatNames())) {
+    stop("Spatial dimension names do not match any of the names accepted by ",
+         "the package.")
+  }
+
+  lon <- names(data$coords)[[which(names(data$coords) %in% .KnownLonNames())]]
+  lat <- names(data$coords)[[which(names(data$coords) %in% .KnownLatNames())]]
+
+  lon_name <- names(dim(data$data))[[which(names(dim(data$data)) %in% .KnownLonNames())]]
+  lat_name <- names(dim(data$data))[[which(names(dim(data$data)) %in% .KnownLatNames())]]
 
-  res <- RainFARM(data$data, data$lon, data$lat,
-                  nf = nf, weights = weights, nens, slope, kmin, fglob, fsmooth,
-                  nprocs, time_dim, lon_dim = "lon", lat_dim = "lat",
+  res <- RainFARM(data = data$data,
+                  lon = as.vector(data$coords[[lon]]),
+                  lat = as.vector(data$coords[[lat]]),
+                  nf = nf, weights = weights, nens, slope, kmin,
+                  fglob, fsmooth, nprocs, time_dim,
+                  lon_dim = lon_name, lat_dim = lat_name,
                   drop_realization_dim, verbose)
-  att_lon <- attributes(data$lon)[-1]
-  att_lat <- attributes(data$lat)[-1]
+
   data$data <- res$data
-  data$lon <- res$lon
-  attributes(data$lon) <- att_lon
-  data$lat <- res$lat
-  attributes(data$lat) <- att_lat
+  data$coords[[lon]] <- res[[lon_name]]
+  data$coords[[lat]] <- res[[lat_name]]
 
   return(data)
 }
 
-#' @rdname RainFARM
-#' @title RainFARM stochastic precipitation downscaling (reduced version)
-#' @author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
-#' @description This
function implements the RainFARM stochastic precipitation downscaling method -#' and accepts in input an array with named dims ("lon", "lat") -#' and one or more dimension (such as "ftime", "sdate" or "time") -#' over which to average automatically determined spectral slopes. -#' Adapted for climate downscaling and including orographic correction. -#' References: -#' Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. http://doi.org/10.5194/nhess-18-2825-2018, -#' D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al. (2006), JHM 7, 724. -#' @param data Precipitation array to downscale. -#' The input array is expected to have at least two dimensions named "lon" and "lat" by default -#' (these default names can be changed with the \code{lon_dim} and \code{lat_dim} parameters) -#' and one or more dimensions over which to average these slopes, -#' which can be specified by parameter \code{time_dim}. -#' The number of longitudes and latitudes in the input data is expected to be even and the same. If not -#' the function will perform a subsetting to ensure this condition. -#' @param lon Vector or array of longitudes. -#' @param lat Vector or array of latitudes. -#' @param weights multi-dimensional array with climatological weights which can be obtained using -#' the \code{CST_RFWeights} function. If \code{weights=1.} (default) no weights are used. -#' The names of these dimensions must be at least 'lon' and 'lat'. -#' @param nf Refinement factor for downscaling (the output resolution is increased by this factor). -#' @param slope Prescribed spectral slope. The default is \code{slope=0.} -#' meaning that the slope is determined automatically over the dimensions specified by \code{time_dim}. A 1D array with named dimension can be provided (see details and examples) -#' @param kmin First wavenumber for spectral slope (default: \code{kmin=1}). -#' @param nens Number of ensemble members to produce (default: \code{nens=1}). -#' @param fglob Logical to conseve global precipitation over the domain (default: FALSE) -#' @param fsmooth Logical to conserve precipitation with a smoothing kernel (default: TRUE) -#' @param time_dim String or character array with name(s) of time dimension(s) -#' (e.g. "ftime", "sdate", "time" ...) over which to compute spectral slopes. -#' If a character array of dimension names is provided, the spectral slopes -#' will be computed over all elements belonging to those dimensions. -#' If omitted one of c("ftime", "sdate", "time") -#' is searched and the first one with more than one element is chosen. -#' @param lon_dim Name of lon dimension ("lon" by default). -#' @param lat_dim Name of lat dimension ("lat" by default). -#' @param verbose logical for verbose output (default: FALSE). -#' @param drop_realization_dim Logical to remove the "realization" stochastic ensemble dimension (default: FALSE) -#' with the following behaviour if set to TRUE: -#' -#' 1) if \code{nens==1}: the dimension is dropped; -#' -#' 2) if \code{nens>1} and a "member" dimension exists: -#' the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member"; -#' -#' 3) if \code{nens>1} and a "member" dimension does not exist: the "realization" dimension is renamed to "member". -#' -#' @param nprocs The number of parallel processes to spawn for the use for parallel computation in multiple cores. 
(default: 1)
-#' @return RainFARM() returns a list containing the fine-scale longitudes, latitudes
-#' and the sequence of \code{nens} downscaled fields.
-#' If \code{nens>1} an additional dimension named "realization" is added to the output array
-#' after the "member" dimension (if it exists and unless \code{drop_realization_dim=TRUE} is specified).
-#' The ordering of the remaining dimensions in the \code{exp} element of the input object is maintained.
-#' @details Wether parameter 'slope' and 'weights' presents seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See example 2 below where weights and slope vary with 'sdate' dimension.
-#' @import multiApply
-#' @importFrom ClimProjDiags Subset
-#' @importFrom abind abind
-#' @export
-#' @examples
-#' # Example for the 'reduced' RainFARM function
-#' nf <- 8 # Choose a downscaling by factor 8
-#' nens <- 3 # Number of ensemble members
-#' # create a test array with dimension 8x8 and 20 timesteps
-#' # or provide your own read from a netcdf file
-#' pr <- rnorm(8 * 8 * 20)
-#' dim(pr) <- c(lon = 8, lat = 8, ftime = 20)
-#' lon_mat <- seq(10, 13.5, 0.5) # could also be a 2d matrix
-#' lat_mat <- seq(40, 43.5, 0.5)
-#' # Create a test array of weights
-#' ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
-#' # or create proper weights using an external fine-scale climatology file
-#' # Specify a weightsfn filename if you wish to save the weights
-#' \dontrun{
-#' ww <- CST_RFWeights("./worldclim.nc", nf, lon = lon_mat, lat = lat_mat,
-#'                     fsmooth = TRUE)
-#' }
-#' # downscale using weights (ww=1. means do not use weights)
-#' res <- RainFARM(pr, lon_mat, lat_mat, nf,
-#'                 fsmooth = TRUE, fglob = FALSE,
-#'                 weights = ww, nens = 2, verbose = TRUE)
-#' str(res)
-#' #List of 3
-#' # $ data: num [1:3, 1:20, 1:64, 1:64] 0.186 0.212 0.138 3.748 0.679 ...
-#' # $ lon : num [1:64] 9.78 9.84 9.91 9.97 10.03 ...
-#' # $ lat : num [1:64] 39.8 39.8 39.9 40 40 ...
-#' dim(res$data)
-#' # lon lat ftime realization
-#' # 64 64 20 2
-#' # Example 2:
-#' slo <- array(c(0.1, 0.5, 0.7), c(sdate= 3))
-#' wei <- array(rnorm(8*8*3), c(lon = 8, lat = 8, sdate = 3))
-#' res <- RainFARM(lonlat_prec$data, lon = lonlat_prec$lon,
-#'                 lat = lonlat_prec$lat, weights = wei, slope = slo, nf = 2)
-RainFARM <- function(data, lon, lat, nf, weights = 1., nens = 1,
-                     slope = 0, kmin = 1, fglob = FALSE, fsmooth = TRUE,
-                     nprocs = 1, time_dim = NULL, lon_dim = "lon", lat_dim = "lat",
+#'@rdname RainFARM
+#'@title RainFARM stochastic precipitation downscaling (reduced version)
+#'@author Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
+#'@description This function implements the RainFARM stochastic precipitation downscaling method
+#'and accepts as input an array with named dims ("lon", "lat")
+#'and one or more dimensions (such as "ftime", "sdate" or "time")
+#'over which to average automatically determined spectral slopes.
+#'Adapted for climate downscaling and including orographic correction.
+#'References:
+#'Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. \doi{10.5194/nhess-18-2825-2018},
+#'D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al.
+#'(2006), JHM 7, 724.
+#'@param data Precipitation array to downscale. The input array is expected to
+#' have at least two dimensions named "lon" and "lat" by default (these default
+#' names can be changed with the \code{lon_dim} and \code{lat_dim} parameters)
+#' and one or more dimensions over which to average these slopes, which can be
+#' specified by parameter \code{time_dim}. The number of longitudes and
+#' latitudes in the input data is expected to be even and the same. If not
+#' the function will perform a subsetting to ensure this condition.
+#'@param lon Vector or array of longitudes.
+#'@param lat Vector or array of latitudes.
+#'@param weights Multi-dimensional array with climatological weights which can
+#' be obtained using the \code{CST_RFWeights} function. If \code{weights = 1.}
+#' (default) no weights are used. These dimensions must include at least the
+#' same longitudinal and latitudinal dimension names as 'data'.
+#'@param nf Refinement factor for downscaling (the output resolution is
+#' increased by this factor).
+#'@param slope Prescribed spectral slope. The default is \code{slope = 0.}
+#' meaning that the slope is determined automatically over the dimensions
+#' specified by \code{time_dim}. A 1D array with named dimension can be
+#' provided (see details and examples).
+#'@param kmin First wavenumber for spectral slope (default: \code{kmin = 1}).
+#'@param nens Number of ensemble members to produce (default: \code{nens = 1}).
+#'@param fglob Logical to conserve global precipitation over the domain
+#' (default: FALSE).
+#'@param fsmooth Logical to conserve precipitation with a smoothing kernel
+#' (default: TRUE).
+#'@param time_dim String or character array with name(s) of time dimension(s)
+#' (e.g. "ftime", "sdate", "time" ...) over which to compute spectral slopes.
+#' If a character array of dimension names is provided, the spectral slopes
+#' will be computed over all elements belonging to those dimensions.
+#' If omitted one of c("ftime", "sdate", "time") is searched and the first one
+#' with more than one element is chosen.
+#'@param lon_dim Name of lon dimension ("lon" by default).
+#'@param lat_dim Name of lat dimension ("lat" by default).
+#'@param verbose Logical for verbose output (default: FALSE).
+#'@param drop_realization_dim Logical to remove the "realization" stochastic
+#' ensemble dimension (default: FALSE) with the following behaviour if set to
+#' TRUE:
+#' \enumerate{
+#'   \item{if \code{nens == 1}: the dimension is dropped;}
+#'   \item{if \code{nens > 1} and a "member" dimension exists: the "realization"
+#'   and "member" dimensions are compacted (multiplied) and the resulting
+#'   dimension is named "member";}
+#'   \item{if \code{nens > 1} and a "member" dimension does not exist: the
+#'   "realization" dimension is renamed to "member".}
+#' }
+#'@param nprocs The number of parallel processes to spawn for parallel
+#' computation on multiple cores (default: 1).
+#'@return RainFARM() returns a list containing the fine-scale longitudes,
+#' latitudes and the sequence of \code{nens} downscaled fields. If
+#' \code{nens > 1} an additional dimension named "realization" is added to the
+#' output array after the "member" dimension (if it exists and unless
+#' \code{drop_realization_dim = TRUE} is specified). The ordering of the
+#' remaining dimensions in the \code{exp} element of the input object is
+#' maintained.
+#'@details If the parameters 'slope' and 'weights' present a seasonality
+#'dependency, a dimension name should match between these parameters and the
+#'input data in parameter 'data'. See Example 2 below, where weights and slope
+#'vary with the 'sdate' dimension.
+#'@examples
+#'# Example 1 for the 'reduced' RainFARM function
+#'nf <- 8 # Choose a downscaling by factor 8
+#'exp <- 1 : (2 * 3 * 4 * 8 * 8)
+#'dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+#'lon <- seq(10, 13.5, 0.5)
+#'lat <- seq(40, 43.5, 0.5)
+#'# Create a test array of weights
+#'ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
+#'res <- RainFARM(data = exp, lon = lon, lat = lat, nf = nf,
+#'                weights = ww, nens = 3, time_dim = 'ftime')
+#'# Example 2: 'slope' and 'weights' varying along the 'sdate' dimension
+#'slo <- array(c(0.1, 0.5, 0.7), c(sdate = 3))
+#'wei <- array(rnorm(16 * 16 * 3), c(lon = 16, lat = 16, sdate = 3))
+#'res2 <- RainFARM(data = exp, lon = lon, lat = lat, nf = 2,
+#'                 weights = wei, slope = slo)
+#'@import multiApply
+#'@import rainfarmr
+#'@importFrom ClimProjDiags Subset
+#'@importFrom abind abind
+#'@export
+RainFARM <- function(data, lon, lat, nf, weights = 1., nens = 1, slope = 0,
+                     kmin = 1, fglob = FALSE, fsmooth = TRUE, nprocs = 1,
+                     time_dim = NULL, lon_dim = "lon", lat_dim = "lat",
                      drop_realization_dim = FALSE, verbose = FALSE) {
+  # Check 'lon_dim' and 'lat_dim' parameters
+  if (!all(c(lon_dim, lat_dim) %in% names(dim(data)))) {
+    stop("Parameters 'lon_dim' and 'lat_dim' do not match with 'data' ",
+         "dimension names.")
+  }
+  if (length(dim(weights)) > 0) {
+    if (!all(c(lon_dim, lat_dim) %in% names(dim(weights)))) {
+      stop("Parameters 'lon_dim' and 'lat_dim' do not match with 'weights' ",
+           "dimension names.")
+    }
+  }
+  # Ensure input grid is square and with even dimensions
   if ( (dim(data)[lon_dim] != dim(data)[lat_dim]) |
        (dim(data)[lon_dim] %% 2 == 1)) {
     warning("Warning: input data are expected to be on a square grid",
-        " with an even number of pixels per side.")
+            " with an even number of pixels per side.")
     nmin <- min(dim(data)[lon_dim], dim(data)[lat_dim])
     nmin <- floor(nmin / 2) * 2
     data <- .subset(data, lat_dim, 1:nmin)
@@ -223,19 +255,19 @@ RainFARM <- function(data, lon, lat, nf, weights = 1., nens = 1,
   }
   warning("The input data have been cut to the range.")
   warning(paste0("lon: [", lon[1], ", ", lon[length(lon)], "] ",
-      " lat: [", lat[1], ", ", lat[length(lat)], "]"))
+                 " lat: [", lat[1], ", ", lat[length(lat)], "]"))
   }
   if (length(dim(weights)) > 0) {
     if (length(names(dim(weights))) == 0) {
       stop("Parameter 'weights' must have dimension names when it is not a scalar.")
     } else {
-      if (length(which(names(dim(weights)) == 'lon')) > 0 &
-          length(which(names(dim(weights)) == 'lat')) > 0) {
-        lonposw <- which(names(dim(weights)) == 'lon')
-        latposw <- which(names(dim(weights)) == 'lat')
+      if (length(which(names(dim(weights)) == lon_dim)) > 0 &
+          length(which(names(dim(weights)) == lat_dim)) > 0) {
+        lonposw <- which(names(dim(weights)) == lon_dim)
+        latposw <- which(names(dim(weights)) == lat_dim)
       } else {
-        stop("Parameter 'weights' must have dimension names 'lon' and 'lat' when",
-             " it is not a scalar.")
+        stop("Parameter 'weights' must have dimension names equal to the latitudinal",
+             " and longitudinal dimension names of 'data' when it is not a scalar.")
      }
     }
   }
@@ -275,8 +307,8 @@ RainFARM <- function(data, lon, lat, nf, weights = 1., nens = 1,
 
   # Perform common calls
   r <- lon_lat_fine(lon, lat, nf)
-  lon_f <- r$lon
-  lat_f <- r$lat
+  lon_f <- r[['lon']]
+  lat_f <- r[['lat']]
 
   # reorder and group time_dim together at the end
   cdim0 <- dim(data)
@@ -342,23 +374,34 @@ RainFARM <- function(data, lon, lat, nf, weights = 1., nens = 1,
       names(dim(result))[ind] <- "member"
     }
   }
-  return(list(data = result, lon = lon_f, lat = lat_f))
+
+  res <- NULL
+  res[['data']] <- result
+  res[[lon_dim]] <- lon_f
+  res[[lat_dim]] <- lat_f
+
+  return(res)
 }
 
-#' Atomic RainFARM
-#' @param pr Precipitation array to downscale with dimensions (lon, lat, time).
-#' @param weights Matrix with climatological weights which can be obtained using
-#' the \code{CST_RFWeights} function (default: \code{weights=1.} i.e. no weights).
-#' @param slope Prescribed spectral slope (default: \code{slope=0.}
-#' @param nf Refinement factor for downscaling (the output resolution is increased by this factor).
-#' meaning that the slope is determined automatically over the dimensions specified by \code{time_dim}.
-#' @param kmin First wavenumber for spectral slope (default: \code{kmin=1}).
-#' @param nens Number of ensemble members to produce (default: \code{nens=1}).
-#' @param fglob Logical to conseve global precipitation over the domain (default: FALSE).
-#' @param fsmooth Logical to conserve precipitation with a smoothing kernel (default: TRUE).
-#' @param verbose Logical for verbose output (default: FALSE).
-#' @return .RainFARM returns a downscaled array with dimensions (lon, lat, time, realization)
-#' @noRd
+#'Atomic RainFARM
+#'@param pr Precipitation array to downscale with dimensions (lon, lat, time).
+#'@param weights Matrix with climatological weights which can be obtained using
+#' the \code{CST_RFWeights} function (default: \code{weights = 1.} i.e. no
+#' weights).
+#'@param slope Prescribed spectral slope (default: \code{slope = 0.}, meaning
+#' that the slope is determined automatically over the dimensions specified by
+#' \code{time_dim}).
+#'@param nf Refinement factor for downscaling (the output resolution is
+#' increased by this factor).
+#'@param kmin First wavenumber for spectral slope (default: \code{kmin = 1}).
+#'@param nens Number of ensemble members to produce (default: \code{nens = 1}).
+#'@param fglob Logical to conserve global precipitation over the domain
+#' (default: FALSE).
+#'@param fsmooth Logical to conserve precipitation with a smoothing kernel
+#' (default: TRUE).
+#'@param verbose Logical for verbose output (default: FALSE).
+#'@return .RainFARM returns a downscaled array with dimensions (lon, lat, time,
+#' realization).
+#'@noRd
 .RainFARM <- function(pr, weights, slope, nf, nens, kmin, fglob, fsmooth, verbose) {
   posna <- NULL
diff --git a/R/CST_RegimesAssign.R b/R/CST_RegimesAssign.R
index 99e1380482e886154377845eefb66156933381e6..4621f867e9cd703a8a6b560c503a646843e51e48 100644
--- a/R/CST_RegimesAssign.R
+++ b/R/CST_RegimesAssign.R
@@ -1,68 +1,88 @@
-#' @rdname CST_RegimesAssign
-#' @title Function for matching a field of anomalies with
-#' a set of maps used as a reference (e.g. clusters obtained from the WeatherRegime function)
+#'@rdname CST_RegimesAssign
+#'@title Function for matching a field of anomalies with
+#'a set of maps used as a reference (e.g. clusters obtained from the WeatherRegime function)
 #'
-#' @author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
+#'@author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
 #'
-#' @description This function performs the matching between a field of anomalies and a set
-#' of maps which will be used as a reference. The anomalies will be assigned to the reference map
-#' for which the minimum Eucledian distance (method=’distance’) or highest spatial correlation
-#' (method = 'ACC') is obtained.
-#'
-#'@references Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools
-#' for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/}
+#'@description This function performs the matching between a field of anomalies
+#'and a set of maps which will be used as a reference. The anomalies will be
+#'assigned to the reference map for which the minimum Euclidean distance
+#'(method = 'distance') or highest spatial correlation (method = 'ACC') is
+#'obtained.
 #'
-#'@param data a 's2dv_cube' object.
-
-#'@param ref_maps a 's2dv_cube' object as the output of CST_WeatherRegimes.
-#'@param method whether the matching will be performed in terms of minimum distance (default = 'distance') or
-#' the maximum spatial correlation (method = 'ACC') between the maps.
-#'@param composite a logical parameter indicating if the composite maps are computed or not (default = FALSE).
-#'@param memb a logical value indicating whether to compute composites for separate members (default FALSE) or as unique ensemble (TRUE).
-#'This option is only available for when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'.
-#'@param ncores the number of multicore threads to use for parallel computation.
-#'@return A list with two elements \code{$data} (a 's2dv_cube' object containing the composites cluster=1,..,K for case (*1)
-# or only k=1 for any specific cluster, i.e., case (*2)) (only when composite = 'TRUE') and \code{$statistics} that includes
-#' \code{$pvalue} (array with the same structure as \code{$data} containing the pvalue of the composites obtained through a t-test
-#' that accounts for the serial dependence of the data with the same structure as Composite.)(only when composite = 'TRUE'),
-#' \code{$cluster} (array with the same dimensions as data (except latitude and longitude which are removed) indicating the ref_maps to which each point is allocated.) ,
-#' \code{$frequency} (A vector of integers (from k=1,...k n reference maps) indicating the percentage of assignations corresponding to each map.),
+#'@references Torralba, V. (2019) Seasonal climate prediction for the wind
+#'energy sector: methods and tools for the development of a climate service.
+#'Thesis. Available online: \url{https://eprints.ucm.es/56841/}
+#'
+#'@param data An 's2dv_cube' object.
+#'@param ref_maps An 's2dv_cube' object as the output of CST_WeatherRegimes.
+#'@param method Whether the matching will be performed in terms of minimum
+#' distance (default = 'distance') or the maximum spatial correlation
+#' (method = 'ACC') between the maps.
+#'@param composite A logical parameter indicating if the composite maps are
+#' computed or not (default = FALSE).
+#'@param memb A logical value indicating whether to compute composites for
+#' separate members (default FALSE) or as unique ensemble (TRUE). This option
+#' is only available when parameter 'composite' is set to TRUE and the data
+#' object has a dimension named 'member'.
+#'@param ncores The number of multicore threads to use for parallel computation.
+#'@return A list with two elements \code{$data} (an 's2dv_cube' object containing
+#'the composites cluster=1,..,K for case (*1) or only k=1 for any specific
+#'cluster, i.e., case (*2)) (only when composite = 'TRUE') and \code{$statistics}
+#'that includes \code{$pvalue} (array with the same structure as \code{$data}
+#'containing the pvalue of the composites obtained through a t-test that
+#'accounts for the serial dependence of the data with the same structure as
+#'Composite.)(only when composite = 'TRUE'), \code{$cluster} (array with the
+#'same dimensions as data (except latitude and longitude which are removed)
+#'indicating the ref_maps to which each point is allocated.), \code{$frequency}
+#'(A vector of integers (from k=1,...k n reference maps) indicating the
+#'percentage of assignations corresponding to each map.).
+#'@examples
+#'data <- array(abs(rnorm(1280, 282.7, 6.4)), dim = c(dataset = 2, member = 2,
+#'                                                    sdate = 3, ftime = 3,
+#'                                                    lat = 4, lon = 4))
+#'coords <- list(lon = seq(0, 3), lat = seq(47, 44))
+#'exp <- list(data = data, coords = coords)
+#'class(exp) <- 's2dv_cube'
+#'regimes <- CST_WeatherRegimes(data = exp, EOFs = FALSE,
+#'                              ncenters = 4)
+#'res1 <- CST_RegimesAssign(data = exp, ref_maps = regimes,
+#'                          composite = FALSE)
 #'@importFrom s2dv ACC MeanDims InsertDim
 #'@import multiApply
-#'@examples
-#'\dontrun{
-#'regimes <- CST_WeatherRegimes(data = lonlat_temp$obs, EOFs = FALSE, ncenters = 4)
-#'res1 <- CST_RegimesAssign(data = lonlat_temp$exp, ref_maps = regimes, composite = FALSE)
-#'res2 <- CST_RegimesAssign(data = lonlat_temp$exp, ref_maps = regimes, composite = TRUE)
-#'}
 #'@export
-#'
-
-CST_RegimesAssign <- function(data, ref_maps,
+CST_RegimesAssign <- function(data, ref_maps,
                               method = "distance",
                               composite = FALSE,
-                              memb = FALSE, ncores = NULL) {
+                              memb = FALSE, ncores = NULL) {
+  # Check 's2dv_cube'
   if (!inherits(data, 's2dv_cube')) {
     stop("Parameter 'data' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
   }
-
   if (!inherits(ref_maps, 's2dv_cube')) {
     stop("Parameter 'ref_maps' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
   }
-
-  if ('lat' %in% names(data)){
-    lat <- data$lat
+  # Check 'data' object structure
+  if (!all(c('data', 'coords') %in% names(data))) {
+    stop("Parameter 'data' must have 'data' and 'coords' elements ",
+         "within the 's2dv_cube' structure.")
+  }
+  # Check coordinates
+  if (!any(names(data$coords) %in% .KnownLatNames())) {
+    stop("Spatial coordinate names do not match any of the names accepted by ",
+         "the package.")
   } else {
-    lat <- NULL
+    lat_name <- names(data$coords)[[which(names(data$coords) %in% .KnownLatNames())]]
+    lat <- as.vector(data$coords[[lat_name]])
  }
-  result <- Apply(data = list(data = data$data, ref_maps = ref_maps$data),
-                  lat = lat, fun = RegimesAssign,
-                  target_dims = list(names(dim(data$data)), c('lat', 'lon', 'cluster')),
-                  method = method, memb = memb, composite = composite, ncores = ncores)
+
+  result <- RegimesAssign(data = data$data, ref_maps = ref_maps$data, lat = lat,
+                          method = method, composite = composite,
+                          memb = memb, ncores = ncores)
 
-  if (composite){
+  if (composite) {
     data$data <- result$composite
     data$statistics <- result[-1]
   } else {
@@ -73,88 +93,114 @@ CST_RegimesAssign <- function(data, ref_maps,
   return(data)
 }
 
-#' @rdname RegimesAssign
-#' @title Function for matching a field of anomalies with
-#' a set of maps used as a reference (e.g. clusters obtained from the WeatherRegime function).
+#'@rdname RegimesAssign
+#'@title Function for matching a field of anomalies with
+#'a set of maps used as a reference (e.g. clusters obtained from the WeatherRegime function).
 #'
-#' @author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
+#'@author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
 #'
-#' @description This function performs the matching between a field of anomalies and a set
-#' of maps which will be used as a reference. The anomalies will be assigned to the reference map
-#' for which the minimum Eucledian distance (method = 'distance') or highest spatial correlation
-#' (method = 'ACC') is obtained.
+#'@description This function performs the matching between a field of anomalies
+#'and a set of maps which will be used as a reference. The anomalies will be
+#'assigned to the reference map for which the minimum Euclidean distance
+#'(method = 'distance') or highest spatial correlation (method = 'ACC') is
+#'obtained.
 #'
-#'@references Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/}
+#'@references Torralba, V. (2019) Seasonal climate prediction for the wind
+#'energy sector: methods and tools for the development of a climate service.
+#'Thesis. Available online: \url{https://eprints.ucm.es/56841/}
 #'
-#'@param data an array containing anomalies with named dimensions: dataset, member, sdate, ftime, lat and lon.
-#'@param ref_maps array with 3-dimensions ('lon', 'lat', 'cluster') containing the maps/clusters that will be used as a reference for the matching.
-#'@param method whether the matching will be performed in terms of minimum distance (default = 'distance') or
-#' the maximum spatial correlation (method = 'ACC') between the maps.
-#'@param lat a vector of latitudes corresponding to the positions provided in data and ref_maps.
-#'@param composite a logical parameter indicating if the composite maps are computed or not (default = FALSE).
-#'@param memb a logical value indicating whether to compute composites for separate members (default FALSE) or as unique ensemble (TRUE).
-#'This option is only available for when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'.
-#'@param ncores the number of multicore threads to use for parallel computation.
-#'@return A list with elements \code{$composite} (3-d array (lon, lat, k) containing the composites k=1,..,K for case (*1)
-# or only k=1 for any specific cluster, i.e., case (*2)) (only if composite='TRUE'),
-#' \code{$pvalue} ( array with the same structure as \code{$composite} containing the pvalue of the composites obtained through a t-test
-#' that accounts for the serial dependence of the data with the same structure as Composite.) (only if composite='TRUE'),
-#' \code{$cluster} (array with the same dimensions as data (except latitude and longitude which are removed) indicating the ref_maps to which each point is allocated.) ,
-#' \code{$frequency} (A vector of integers (from k = 1, ... k n reference maps) indicating the percentage of assignations corresponding to each map.),
-#'
-#'@importFrom s2dv ACC MeanDims Eno InsertDim
-#'@import multiApply
+#'@param data An array containing anomalies with named dimensions: dataset,
+#' member, sdate, ftime, lat and lon.
+#'@param ref_maps Array with 3-dimensions ('lon', 'lat', 'cluster') containing
+#' the maps/clusters that will be used as a reference for the matching.
+#'@param method Whether the matching will be performed in terms of minimum
+#' distance (default = 'distance') or the maximum spatial correlation
+#' (method = 'ACC') between the maps.
+#'@param lat A vector of latitudes corresponding to the positions provided in
+#' data and ref_maps.
+#'@param composite A logical parameter indicating if the composite maps are
+#' computed or not (default = FALSE).
+#'@param memb A logical value indicating whether to compute composites for
+#' separate members (default FALSE) or as unique ensemble (TRUE). This option
+#' is only available when parameter 'composite' is set to TRUE and the data
+#' object has a dimension named 'member'.
+#'@param ncores The number of multicore threads to use for parallel computation.
+#'@return A list with elements \code{$composite} (3-d array (lon, lat, k)
+#'containing the composites k = 1,..,K for case (*1) or only k = 1 for any specific
+#'cluster, i.e., case (*2)) (only if composite = 'TRUE'), \code{$pvalue} (array
+#'with the same structure as \code{$composite} containing the pvalue of the
+#'composites obtained through a t-test that accounts for the serial dependence
+#'of the data with the same structure as Composite.) (only if composite='TRUE'),
+#'\code{$cluster} (array with the same dimensions as data (except latitude and
+#'longitude which are removed) indicating the ref_maps to which each point is
+#'allocated.), \code{$frequency} (A vector of integers (from k = 1, ... k n
+#'reference maps) indicating the percentage of assignations corresponding to
+#'each map.).
+#'
 #'@examples
-#'\dontrun{
-#'regimes <- WeatherRegime(data = lonlat_temp$obs$data, lat = lonlat_temp$obs$lat,
+#'data <- array(abs(rnorm(1280, 282.7, 6.4)), dim = c(dataset = 2, member = 2,
+#'                                                    sdate = 3, ftime = 3,
+#'                                                    lat = 4, lon = 4))
+#'regimes <- WeatherRegime(data = data, lat = seq(47, 44),
 #'                         EOFs = FALSE, ncenters = 4)$composite
-#'res1 <- RegimesAssign(data = lonlat_temp$exp$data, ref_maps = drop(regimes),
-#'                      lat = lonlat_temp$exp$lat, composite = FALSE)
-#'}
+#'res1 <- RegimesAssign(data = data, ref_maps = drop(regimes),
+#'                      lat = seq(47, 44), composite = FALSE)
+#'@importFrom s2dv ACC MeanDims Eno InsertDim
+#'@import multiApply
 #'@export
-
 RegimesAssign <- function(data, ref_maps, lat, method = "distance",
                           composite = FALSE, memb = FALSE, ncores = NULL) {
-
+  ## Initial checks
+  # data
   if (is.null(names(dim(data)))) {
     stop("Parameter 'data' must be an array with named dimensions.")
   }
+  # ref_maps
   if (is.null(ref_maps)) {
     stop("Parameter 'ref_maps' must be specified.")
   }
-
-  if (is.null(lat)) {
-    stop("Parameter 'lat' must be specified.")
-  }
   if (is.null(names(dim(ref_maps)))) {
     stop("Parameter 'ref_maps' must be an array with named dimensions.")
   }
+  # lat
+  if (is.null(lat)) {
+    stop("Parameter 'lat' must be specified.")
+  }
+  # memb
   if (!is.logical(memb)) {
     stop("Parameter 'memb' must be logical.")
   }
+  # composite
   if (!is.logical(composite)) {
-    stop("Parameter 'memb' must be logical.")
+    stop("Parameter 'composite' must be logical.")
   }
   dimData <- names(dim(data))
-
-  if (!all( c('lat', 'lon') %in% dimData)) {
-    stop("Parameter 'data' must contain the named dimensions 'lat' and 'lon'.")
+  # Check spatial coordinate names
+  if (!any(dimData %in% .KnownLonNames()) |
+      !any(dimData %in% .KnownLatNames())) {
+    stop("Spatial coordinate dimension names do not match any of the names ",
+         "accepted by the package.")
   }
-
+  lon_name <- dimData[[which(dimData %in% .KnownLonNames())]]
+  lat_name <- dimData[[which(dimData %in% .KnownLatNames())]]
   dimRef <- names(dim(ref_maps))
-
-  if (!all( c('cluster', 'lat', 'lon') %in% dimRef)) {
c('cluster', 'lat', 'lon') %in% dimRef)) { + if (!any(dimRef %in% .KnownLonNames()) | + !any(dimRef %in% .KnownLatNames())) { + stop("Spatial coordinate dimension names do not match any of the names ", + "accepted by the package.") + } + lon_name_ref <- dimRef[[which(dimRef %in% .KnownLonNames())]] + lat_name_ref <- dimRef[[which(dimRef %in% .KnownLatNames())]] + if (!all( c('cluster', lat_name_ref, lon_name_ref) %in% dimRef)) { stop("Parameter 'ref_maps' must contain the named dimensions - 'cluster','lat' and 'lon'.") + 'cluster', and the spatial coordinates accepted names.") } - - - if (length(lat) != dim(data)['lat'] | (length(lat) != dim(ref_maps)['lat']) ) { - stop(" Parameter 'lat' does not match with the dimension 'lat' in the - parameter 'data' or in the parameter 'ref_maps'.") + if (length(lat) != dim(data)[lat_name] | + (length(lat) != dim(ref_maps)[lat_name_ref])) { + stop("Parameter 'lat' does not match with the latitudinal dimension", + " in the parameter 'data' or in the parameter 'ref_maps'.") } - - + # Temporal dimensions if ('sdate' %in% dimData && 'ftime' %in% dimData) { nsdates <- dim(data)['sdate'] nftimes <- dim(data)['ftime'] @@ -170,9 +216,12 @@ RegimesAssign <- function(data, ref_maps, lat, method = "distance", composite = } ref_maps <- drop(ref_maps) index <- Apply(data = list(ref = ref_maps, target = data), - target_dims = list(c('lat', 'lon', 'cluster'), c('lat', 'lon')), + target_dims = list(c(lat_name_ref, lon_name_ref, 'cluster'), + c(lat_name, lon_name)), fun = .RegimesAssign, lat = lat, method = method, + lon_name = lon_name, lat_name = lat_name, + lon_name_ref = lon_name_ref, lat_name_ref = lat_name_ref, ncores = ncores)[[1]] nclust <- dim(ref_maps)['cluster'] @@ -182,8 +231,8 @@ RegimesAssign <- function(data, ref_maps, lat, method = "distance", composite = } if (composite) { - poslon <- which(names(dim(data)) == 'lon') - poslat <- which(names(dim(data)) == 'lat') + poslon <- which(names(dim(data)) == lon_name) + poslat <- which(names(dim(data)) == lat_name) postime <- which(names(dim(data)) == 'time') posdim <- setdiff(1:length(dim(data)), c(postime, poslat, poslon)) dataComp <- aperm(data, c(poslon, poslat, postime, posdim)) @@ -199,19 +248,16 @@ RegimesAssign <- function(data, ref_maps, lat, method = "distance", composite = dataComp <- MergeDims(dataComp, merge_dims = c('time', 'member'), rename_dim = 'time') index <- MergeDims(index, merge_dims = c('time', 'member'), rename_dim = 'time') } - recon <- - Apply(data = list(var = dataComp, occ = index), - target_dims = list(c('lon', 'lat', 'time'), c('time')), - fun = Composite, - K = dim(ref_maps)['cluster']) + recon <- Apply(data = list(var = dataComp, occ = index), + target_dims = list(c(lon_name, lat_name, 'time'), c('time')), + fun = Composite, + K = dim(ref_maps)['cluster']) } - output <- list(composite = recon$composite, pvalue = recon$pvalue, cluster = index, frequency = freqs) } else { - output <- list(cluster = index, frequency = freqs) } @@ -219,82 +265,77 @@ RegimesAssign <- function(data, ref_maps, lat, method = "distance", composite = return(output) } -.RegimesAssign <- function(ref, target, method = 'distance', lat, composite = FALSE) { +.RegimesAssign <- function(ref, target, method = 'distance', lat, + composite = FALSE, + lon_name = 'lon', lat_name = 'lat', + lon_name_ref = 'lon', lat_name_ref = 'lat') { - # ref: c('lat', 'lon', 'cluster') - # target: c('lat', 'lon') + # ref: [lat_name_ref, lon_name_ref, 'cluster'] + # target: [lat_name, lon_name] posdim <- which(names(dim(ref)) == 
'cluster') - poslat <- which(names(dim(ref)) == 'lat') - poslon <- which(names(dim(ref)) == 'lon') + poslat <- which(names(dim(ref)) == lat_name_ref) + poslon <- which(names(dim(ref)) == lon_name_ref) nclust <- dim(ref)[posdim] if (all(dim(ref)[-posdim] != dim(target))) { - stop('The target should have the same dimensions [lat,lon] that - the reference ') + stop('The target should have the same dimensions [lat_name, lon_name] that', + 'the reference ') } - if (is.null(names(dim(ref))) | is.null(names(dim(target)))) { - stop( - 'The arrays should include dimensions names ref[cluster,lat,lon] - and target [lat,lon]' + stop('The arrays should include dimensions names ref[cluster, lat_name, ', + 'lon_name] and target [lat_name, lon_name]' ) } - - if (length(lat) != dim(ref)[poslat]) { stop('latitudes do not match with the maps') } - if (is.na(max(target))){ assign <- NA - - } else{ - - - # This dimensions are reorganized - ref <- aperm(ref, c(posdim, poslat, poslon)) - target <- aperm(target, - c(which(names(dim(target)) == 'lat'), - which(names(dim(target)) == 'lon'))) - - # weights are defined - latWeights <- InsertDim(sqrt(cos(lat * pi / 180)), 2, dim(ref)[3]) - - - rmsdiff <- function(x, y) { - dims <- dim(x) - ndims <- length(dims) - if (ndims != 2 | ndims != length(dim(y))) { - stop('x and y should be maps') - } - map_diff <- NA * x - for (i in 1:dims[1]) { - for (j in 1:dims[2]) { - map_diff[i, j] <- (x[i, j] - y[i, j]) ^ 2 + } else { + # This dimensions are reorganized + ref <- aperm(ref, c(posdim, poslat, poslon)) + target <- aperm(target, + c(which(names(dim(target)) == lat_name), + which(names(dim(target)) == lon_name))) + + # weights are defined + latWeights <- InsertDim(sqrt(cos(lat * pi / 180)), 2, dim(ref)[3]) + + rmsdiff <- function(x, y) { + dims <- dim(x) + ndims <- length(dims) + if (ndims != 2 | ndims != length(dim(y))) { + stop('x and y should be maps') + } + map_diff <- NA * x + for (i in 1:dims[1]) { + for (j in 1:dims[2]) { + map_diff[i, j] <- (x[i, j] - y[i, j]) ^ 2 + } } + rmsdiff <- sqrt(mean(map_diff)) + return(rmsdiff) } - rmsdiff <- sqrt(mean(map_diff)) - return(rmsdiff) - } - - if (method == 'ACC') { - corr <- rep(NA, nclust) - for (i in 1:nclust) { - #NOTE: s2dv::ACC returns centralized and weighted result. - corr[i] <- - ACC(ref[i, , ], target, lat = lat, dat_dim = NULL, avg_dim = NULL, memb_dim = NULL)$acc + + if (method == 'ACC') { + corr <- rep(NA, nclust) + for (i in 1:nclust) { + #NOTE: s2dv::ACC returns centralized and weighted result. 
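+        #      Called here with 'dat_dim', 'avg_dim' and 'memb_dim' set to
+        #      NULL, so '$acc' reduces to a single anomaly correlation value
+        #      between the target map and reference map i.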
+ corr[i] <- + ACC(ref[i, , ], target, lat = lat, dat_dim = NULL, avg_dim = NULL, + memb_dim = NULL)$acc + } + assign <- which(corr == max(corr)) } - assign <- which(corr == max(corr)) - } - - if (method == 'distance') { - rms <- rep(NA, nclust) - for (i in 1:nclust) { - rms[i] <- rmsdiff(ref[i, , ] * latWeights, target * latWeights) + + if (method == 'distance') { + rms <- rep(NA, nclust) + for (i in 1:nclust) { + rms[i] <- rmsdiff(ref[i, , ] * latWeights, target * latWeights) + } + assign <- which(rms == min(rms)) } - assign <- which(rms == min(rms)) - } } return(assign) diff --git a/R/CST_SaveExp.R b/R/CST_SaveExp.R index a15cfaa4842e3e26295fe7b4ad003c520f61d7d4..1c419be56453ef5e9515cd4a0b7130a4b0826ec4 100644 --- a/R/CST_SaveExp.R +++ b/R/CST_SaveExp.R @@ -1,288 +1,909 @@ -#'Save CSTools objects of class 's2dv_cube' containing experiments or observed -#'data in NetCDF format +#'Save objects of class 's2dv_cube' to data in NetCDF format #' #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es} #' #'@description This function allows to divide and save a object of class #''s2dv_cube' into a NetCDF file, allowing to reload the saved data using -#'\code{CST_Load} function. +#'\code{Start} function from StartR package. If the original 's2dv_cube' object +#'has been created from \code{CST_Load()}, then it can be reloaded with +#'\code{Load()}. #' -#'@param data an object of class \code{s2dv_cube}. -#'@param destination a character string containing the directory name in which -#'to save the data. NetCDF file for each starting date are saved into the -#'folder tree: destination/experiment/variable/. By default the function -#'creates and saves the data into the folder "CST_Data" in the working -#'directory. -#'@param extra_string a character string to be include as part of the file name, for instance, to identify member or realization. It would be added to the file name between underscore characters. +#'@param data An object of class \code{s2dv_cube}. +#'@param destination A character string containing the directory name in which +#' to save the data. NetCDF file for each starting date are saved into the +#' folder tree: \cr +#' destination/Dataset/variable/. By default the function +#' creates and saves the data into the working directory. +#'@param sdate_dim A character string indicating the name of the start date +#' dimension. By default, it is set to 'sdate'. It can be NULL if there is no +#' start date dimension. +#'@param ftime_dim A character string indicating the name of the forecast time +#' dimension. By default, it is set to 'time'. It can be NULL if there is no +#' forecast time dimension. +#'@param dat_dim A character string indicating the name of dataset dimension. +#' By default, it is set to 'dataset'. It can be NULL if there is no dataset +#' dimension. +#'@param var_dim A character string indicating the name of variable dimension. +#' By default, it is set to 'var'. It can be NULL if there is no variable +#' dimension. +#'@param memb_dim A character string indicating the name of the member dimension. +#' By default, it is set to 'member'. It can be NULL if there is no member +#' dimension. +#'@param single_file A logical value indicating if all object is saved in a +#' single file (TRUE) or in multiple files (FALSE). When it is FALSE, +#' the array is separated for Datasets, variable and start date. It is FALSE +#' by default. +#'@param extra_string A character string to be include as part of the file name, +#' for instance, to identify member or realization. 
It would be added to the
+#'  file name between underscore characters.
 #'
-#'@seealso \code{\link{CST_Load}}, \code{\link{as.s2dv_cube}} and \code{\link{s2dv_cube}}
-#'
-#'@import ncdf4
-#'@importFrom s2dv Reorder InsertDim
-#'@import multiApply
+#'@return Multiple or single NetCDF files containing the data array.\cr
+#'\item{\code{single_file = TRUE}}{
+#'  All data is saved in a single file located in the specified destination
+#'  path with the following name:
+#'  <variable_name>_<extra_string>_<first_sdate>_<last_sdate>.nc. Multiple
+#'  variables are saved separately in the same file. The forecast time units
+#'  are extracted from the frequency of the time steps (hours, days, months).
+#'  The first value of forecast time is 1. If no frequency is found, the units
+#'  will be 'hours since' each start date and the time steps are assumed to be
+#'  equally spaced.
+#'}
+#'\item{\code{single_file = FALSE}}{
+#'  The data array is subset and stored into multiple files. Each file
+#'  contains the data subset for each start date, variable and dataset. Files
+#'  with different variables and Datasets are stored in separated directories
+#'  within the following directory tree: destination/Dataset/variable/.
+#'  The name of each file will be:
+#'  <variable_name>_<extra_string>_<start_date>.nc.
+#'}
+#'
+#'@seealso \code{\link[startR]{Start}}, \code{\link{as.s2dv_cube}} and
+#'\code{\link{s2dv_cube}}
 #'
 #'@examples
 #'\dontrun{
-#'library(CSTools)
 #'data <- lonlat_temp$exp
-#'destination <- "./path2/"
-#'CST_SaveExp(data = data, destination = destination)
+#'destination <- "./"
+#'CST_SaveExp(data = data, destination = destination, ftime_dim = 'ftime',
+#'            var_dim = NULL)
 #'}
 #'
+#'@import ncdf4
+#'@importFrom s2dv Reorder
+#'@importFrom ClimProjDiags Subset
+#'@import multiApply
 #'@export
-CST_SaveExp <- function(data, destination = "./CST_Data", extra_string = NULL) {
-  if (!is.character(destination) & length(destination) > 1) {
-    stop("Parameter 'destination' must be a character string of one element ",
-         "indicating the name of the file (including the folder if needed) ",
-         "where the data will be saved.")
-  }
+CST_SaveExp <- function(data, destination = "./", sdate_dim = 'sdate',
+                        ftime_dim = 'time', dat_dim = 'dataset',
+                        var_dim = 'var', memb_dim = 'member',
+                        single_file = FALSE, extra_string = NULL) {
+  # Check 's2dv_cube'
   if (!inherits(data, 's2dv_cube')) {
     stop("Parameter 'data' must be of the class 's2dv_cube', ",
          "as output by CSTools::CST_Load.")
   }
-  sdates <- lapply(1:length(data$Datasets), function(x) {
-              unique(data$Datasets[[x]]$InitializationDates)})[[1]]
-  if (!is.character(attributes(data$Variable)$units)) {
-    units <- attributes(data$Variable)$variable$units
+  # Check object structure
+  if (!all(c('data', 'attrs') %in% names(data))) {
+    stop("Parameter 'data' must have at least 'data' and 'attrs' elements ",
+         "within the 's2dv_cube' structure.")
+  }
+  if (!inherits(data$attrs, 'list')) {
+    stop("Level 'attrs' must be a list with at least 'Dates' element.")
+  }
+  if (!all(c('coords') %in% names(data))) {
+    warning("Element 'coords' not found.
No coordinates will be used.") + } + # metadata + if (is.null(data$attrs$Variable$metadata)) { + warning("No metadata found in element Variable from attrs.") + } else { + if (!inherits(data$attrs$Variable$metadata, 'list')) { + stop("Element metadata from Variable element in attrs must be a list.") + } + if (!any(names(data$attrs$Variable$metadata) %in% names(data$coords))) { + warning("Metadata is not found for any coordinate.") + } else if (!any(names(data$attrs$Variable$metadata) %in% + data$attrs$Variable$varName)) { + warning("Metadata is not found for any variable.") + } + } + # Dates + if (is.null(data$attrs$Dates)) { + stop("Element 'Dates' from 'attrs' level cannot be NULL.") + } + if (is.null(dim(data$attrs$Dates))) { + stop("Element 'Dates' from 'attrs' level must have time dimensions.") + } + # sdate_dim + if (!is.null(sdate_dim)) { + if (!is.character(sdate_dim)) { + stop("Parameter 'sdate_dim' must be a character string.") + } + if (length(sdate_dim) > 1) { + warning("Parameter 'sdate_dim' has length greater than 1 and ", + "only the first element will be used.") + sdate_dim <- sdate_dim[1] + } } else { - units <- attributes(data$Variable)$units - } - cdo_grid_name = attr(data$lon, 'cdo_grid_name') - projection = attr(data$lon, 'projection') - var_name <- data$Variable$varName - time_values <- data$Dates$start - dim(time_values) <- c(time = length(time_values) / length(sdates), - sdate = length(sdates)) - SaveExp(data = data$data, lon = data$lon, lat = data$lat, - Dataset = names(data$Datasets), var_name = var_name, - units = units, cdo_grid_name = cdo_grid_name, projection = projection, - startdates = sdates, Dates = time_values, destination, - extra_string = extra_string) -} -#'Save an experiment in a format compatible with CST_Load -#'@description This function is created for compatibility with CST_Load/Load for saving post-processed datasets such as those calibrated of downscaled with CSTools functions + if (length(dim(data$attrs$Dates)) == 1) { + sdate_dim <- 'sdate' + dim(data$data) <- c(sdate = 1, dim(data$data)) + data$dims <- dim(data$data) + dim(data$attrs$Dates) <- c(sdate = 1, dim(data$attrs$Dates)) + data$coords[[sdate_dim]] <- data$attrs$Dates[1] + } + } + + SaveExp(data = data$data, + destination = destination, + Dates = data$attrs$Dates, + coords = data$coords, + varname = data$attrs$Variable$varName, + metadata = data$attrs$Variable$metadata, + Datasets = data$attrs$Datasets, + startdates = data$coords[[sdate_dim]], + dat_dim = dat_dim, sdate_dim = sdate_dim, + ftime_dim = ftime_dim, var_dim = var_dim, + memb_dim = memb_dim, + extra_string = extra_string, + single_file = single_file) +} +#'Save a multidimensional array with metadata to data in NetCDF format +#'@description This function allows to save a data array with metadata into a +#'NetCDF file, allowing to reload the saved data using \code{Start} function +#'from StartR package. If the original 's2dv_cube' object has been created from +#'\code{CST_Load()}, then it can be reloaded with \code{Load()}. 
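As a quick reference for the refactored interface above, a minimal sketch of saving a toy 's2dv_cube'-like object into a single NetCDF file (all dimension sizes, dates and the variable name are invented for illustration; the object is assembled by hand as a stand-in for a loaded cube):

    library(CSTools)
    # Toy cube: 2 start dates x 3 daily lead times x 4 lat x 5 lon
    dat <- array(rnorm(120), dim = c(sdate = 2, time = 3, lat = 4, lon = 5))
    dates <- as.POSIXct(c("2000-01-01", "2000-01-02", "2000-01-03",
                          "2000-02-01", "2000-02-02", "2000-02-03"), tz = "UTC")
    dim(dates) <- c(time = 3, sdate = 2)  # [ftime_dim, sdate_dim]
    cube <- list(data = dat,
                 coords = list(sdate = c("20000101", "20000201"),
                               time = 1:3, lat = 40:43, lon = 0:4),
                 attrs = list(Dates = dates,
                              Variable = list(varName = "tas", metadata = NULL)))
    class(cube) <- 's2dv_cube'
    CST_SaveExp(data = cube, destination = tempdir(), dat_dim = NULL,
                var_dim = NULL, memb_dim = NULL, single_file = TRUE)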
#'
 #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es}
 #'
-#'@param data an multi-dimensional array with named dimensions (longitude, latitude, time, member, sdate)
-#'@param lon vector of logitud corresponding to the longitudinal dimension in data
-#'@param lat vector of latitud corresponding to the latitudinal dimension in data
-#'@param Dataset a vector of character string indicating the names of the datasets
-#'@param var_name a character string indicating the name of the variable to be saved
-#'@param units a character string indicating the units of the variable
-#'@param startdates a vector of dates indicating the initialization date of each simulations
-#'@param Dates a matrix of dates with two dimension 'time' and 'sdate'.
-#'@param cdo_grid_name a character string indicating the name of the grid e.g.: 'r360x181'
-#'@param projection a character string indicating the projection name
-#'@param destination a character string indicating the path where to store the NetCDF files
-#'@param extra_string a character string to be include as part of the file name, for instance, to identify member or realization.
+#'@param data A multi-dimensional array with named dimensions.
+#'@param destination A character string indicating the path where to store the
+#'  NetCDF files.
+#'@param Dates A named array of dates with the corresponding start date and
+#'  forecast time dimensions.
+#'@param coords A named list with elements of the coordinates corresponding to
+#'  the dimensions of the data parameter. The names and length of each element
+#'  must correspond to the names of the dimensions. If any coordinate is not
+#'  provided, it is set as an index vector with the values from 1 to the length
+#'  of the corresponding dimension.
+#'@param varname A character string indicating the name of the variable to be
+#'  saved.
+#'@param metadata A named list where each element is a variable containing the
+#'  corresponding information. The information must be contained in a list of
+#'  lists for each variable.
+#'@param Datasets A vector of character strings indicating the names of the
+#'  datasets.
+#'@param startdates A vector of dates indicating the initialization date of
+#'  each simulation.
+#'@param sdate_dim A character string indicating the name of the start date
+#'  dimension. By default, it is set to 'sdate'. It can be NULL if there is no
+#'  start date dimension.
+#'@param ftime_dim A character string indicating the name of the forecast time
+#'  dimension. By default, it is set to 'time'. It can be NULL if there is no
+#'  forecast time dimension.
+#'@param dat_dim A character string indicating the name of the dataset
+#'  dimension. By default, it is set to 'dataset'. It can be NULL if there is
+#'  no dataset dimension.
+#'@param var_dim A character string indicating the name of the variable
+#'  dimension. By default, it is set to 'var'. It can be NULL if there is no
+#'  variable dimension.
+#'@param memb_dim A character string indicating the name of the member dimension.
+#'  By default, it is set to 'member'. It can be NULL if there is no member
+#'  dimension.
+#'@param single_file A logical value indicating whether the whole object is
+#'  saved in a single file (TRUE) or in separate directories (FALSE). When it
+#'  is FALSE, the array is split by dataset, variable and start date. It is
+#'  FALSE by default.
+#'@param extra_string A character string to be included as part of the file
+#'  name, for instance, to identify member or realization. It would be added
+#'  to the file name between underscore characters.
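The standalone interface works on a bare array; a comparable minimal sketch (values invented; 'Dates' carries the forecast time and start date dimensions, while 'startdates' labels the output files):

    library(CSTools)
    dat <- array(rnorm(48), dim = c(sdate = 2, time = 2, lat = 3, lon = 4))
    dates <- as.POSIXct(c("2010-05-01", "2010-05-02",
                          "2011-05-01", "2011-05-02"), tz = "UTC")
    dim(dates) <- c(time = 2, sdate = 2)
    # One file per start date under <destination>/XXX/psl/ since 'Datasets'
    # is not provided and 'single_file' is FALSE
    SaveExp(data = dat, destination = tempdir(), Dates = dates, varname = "psl",
            startdates = c("20100501", "20110501"), dat_dim = NULL,
            var_dim = NULL, memb_dim = NULL, single_file = FALSE)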
#' -#'@return the function creates as many files as sdates per dataset. Each file could contain multiple members. It would be added to the file name between underscore characters. -#' The path will be created with the name of the variable and each Datasets. +#'@return Multiple or single NetCDF files containing the data array.\cr +#'\item{\code{single_file = TRUE}}{ +#' All data is saved in a single file located in the specified destination +#' path with the following name: +#' ___.nc. Multiple +#' variables are saved separately in the same file. The forecast time units +#' is extracted from the frequency of the time steps (hours, days, months). +#' The first value of forecast time is 1. If no frequency is found, the units +#' will be 'hours since' each start date and the time steps are assumed to be +#' equally spaced. +#'} +#'\item{\code{single_file = FALSE}}{ +#' The data array is subset and stored into multiple files. Each file +#' contains the data subset for each start date, variable and dataset. Files +#' with different variables and Datasets are stored in separated directories +#' within the following directory tree: destination/Dataset/variable/. +#' The name of each file will be: +#' __.nc. +#'} #' -#'@import ncdf4 -#'@importFrom s2dv Reorder InsertDim -#'@import multiApply -#' #'@examples #'\dontrun{ #'data <- lonlat_temp$exp$data -#'lon <- lonlat_temp$exp$lon -#'lat <- lonlat_temp$exp$lat -#'Dataset <- 'XXX' -#'var_name <- 'tas' -#'units <- 'k' -#'startdates <- lapply(1:length(lonlat_temp$exp$Datasets), -#' function(x) { -#' lonlat_temp$exp$Datasets[[x]]$InitializationDates[[1]]})[[1]] -#'Dates <- lonlat_temp$exp$Dates$start -#'dim(Dates) <- c(time = length(Dates)/length(startdates), sdate = length(startdates)) -#'cdo_grid_name = attr(lonlat_temp$exp$lon, 'cdo_grid_name') -#'projection = attr(lonlat_temp$exp$lon, 'projection') -#'destination = './path/' -#'SaveExp(data, lon, lat, Dataset, var_name, units, startdates, Dates, -#' cdo_grid_name, projection, destination) +#'lon <- lonlat_temp$exp$coords$lon +#'lat <- lonlat_temp$exp$coords$lat +#'coords <- list(lon = lon, lat = lat) +#'Datasets <- lonlat_temp$exp$attrs$Datasets +#'varname <- 'tas' +#'Dates <- lonlat_temp$exp$attrs$Dates +#'destination = './' +#'metadata <- lonlat_temp$exp$attrs$Variable$metadata +#'SaveExp(data = data, destination = destination, coords = coords, +#' Datasets = Datasets, varname = varname, Dates = Dates, +#' metadata = metadata, single_file = TRUE, ftime_dim = 'ftime', +#' var_dim = NULL) #'} +#' +#'@import ncdf4 +#'@importFrom s2dv Reorder +#'@import multiApply +#'@importFrom ClimProjDiags Subset #'@export -SaveExp <- function(data, lon, lat, Dataset, var_name, units, startdates, Dates, - cdo_grid_name, projection, destination, - extra_string = NULL) { - dimname <- names(dim(data)) - if (any(dimname == "ftime")) { - dimname[which(dimname == "ftime")] <- "time" - names(dim(data))[which(dimname == "ftime")] <- "time" - } - if (any(dimname == "memb")) { - dimname[which(dimname == "memb")] <- "member" - names(dim(data))[which(dimname == "memb")] <- "member" - } - if (any(dimname == "ensemble")) { - dimname[which(dimname == "ensemble")] <- "member" - names(dim(data))[which(dimname == "ensemble")] <- "member" - } - if (any(dimname == "lon")) { - dimname[which(dimname == "lon")] <- "longitude" - names(dim(data))[which(dimname == "lon")] <- "longitude" - } - if (any(dimname == "lat")) { - dimname[which(dimname == "lat")] <- "latitude" - names(dim(data))[which(dimname == "lat")] <- "latitude" - } - 
names(dim(data)) <- dimname - if (is.null(dimname)) { - stop("Element 'data' in parameter 'data' must have named dimensions.") - } - sdate_pos <- which(dimname == "sdate") - - if (length(sdate_pos) == 0) { - stop("Element 'data' in parameter 'data' hasn't 'sdate' dimension.") - } else if (length(sdate_pos) > 1) { - stop("Element 'data' in parameter 'data' has more than one 'sdate'", - " dimension.") +SaveExp <- function(data, destination = "./", Dates = NULL, coords = NULL, + varname = NULL, metadata = NULL, Datasets = NULL, + startdates = NULL, dat_dim = 'dataset', sdate_dim = 'sdate', + ftime_dim = 'time', var_dim = 'var', memb_dim = 'member', + single_file = FALSE, extra_string = NULL) { + ## Initial checks + # data + if (is.null(data)) { + stop("Parameter 'data' cannot be NULL.") + } + dimnames <- names(dim(data)) + if (is.null(dimnames)) { + stop("Parameter 'data' must be an array with named dimensions.") + } + # destination + if (!is.character(destination) | length(destination) > 1) { + stop("Parameter 'destination' must be a character string of one element ", + "indicating the name of the file (including the folder if needed) ", + "where the data will be saved.") + } + # Dates + if (!is.null(Dates)) { + if (!inherits(Dates, "POSIXct") & !inherits(Dates, "Date")) { + stop("Parameter 'Dates' must be of 'POSIXct' or 'Dates' class.") + } + if (is.null(dim(Dates))) { + stop("Parameter 'Dates' must have dimension names.") + } } + # coords + if (!is.null(coords)) { + if (!all(names(coords) %in% dimnames)) { + coords <- coords[-which(!names(coords) %in% dimnames)] + } + for (i_coord in dimnames) { + if (i_coord %in% names(coords)) { + if (length(coords[[i_coord]]) != dim(data)[i_coord]) { + warning(paste0("Coordinate '", i_coord, "' has different lenght as ", + "its dimension and it will not be used.")) + coords[[i_coord]] <- 1:dim(data)[i_coord] + } + } else { + warning(paste0("Coordinate '", i_coord, "' is not provided ", + "and it will be set as index in element coords.")) + coords[[i_coord]] <- 1:dim(data)[i_coord] + } + } + } else { + coords <- sapply(dimnames, function(x) 1:dim(data)[x]) + } + # varname + if (is.null(varname)) { + warning("Parameter 'varname' is NULL. It will be assigned to 'X'.") + varname <- 'X' + } else if (length(varname) > 1) { + multiple_vars <- TRUE + } else { + multiple_vars <- FALSE + } + if (!all(sapply(varname, is.character))) { + stop("Parameter 'varname' must be a character string with the ", + "variable names.") + } + # metadata + if (is.null(metadata)) { + warning("Parameter 'metadata' is not provided so the metadata saved ", + "will be incomplete.") + } + # single_file + if (!inherits(single_file, 'logical')) { + warning("Parameter 'single_file' must be a logical value. It will be ", + "set as FALSE.") + single_file <- FALSE + } + # extra_string if (!is.null(extra_string)) { if (!is.character(extra_string)) { stop("Parameter 'extra_string' must be a character string.") } } - dataset_pos <- which(dimname == "dataset" | dimname == "dat") - dims <- dim(data) - if (length(dataset_pos) == 0) { - warning("Element 'data' in parameter 'data' hasn't 'dataset' dimension. 
", - "All data is stored in the same 'dataset' folder.") - data$data <- InsertDim(data, posdim = 1, lendim = 1) - names(dim(data))[1] <- "dataset" - dimname <- c("dataset", dimname) - dataset_pos = 1 - } else if (length(dataset_pos) > 1) { - stop("Element 'data' in parameter 'data' has more than one 'dataset'", - " dimension.") - } - n_datasets <- dim(data)[dataset_pos] # number of folder by dataset - # dataset names: - datasets <- Dataset - if (n_datasets > length(datasets)) { - warning("Dimension 'dataset' in element 'data' from parameter 'data' ", - "is greater than those listed in element 'Datasets' and the ", - "first element is reused.") - datasets <- c(datasets, rep(datasets[1], n_datasets - length(datasets))) - } else if (n_datasets < length(datasets)) { - warning("Dimension 'dataset' in element 'data' from parameter 'data', ", - "is smaller than those listed in element 'Datasets' and only the", - " first element will be used.") - datasets <- datasets[1 : n_datasets] - } - # var names: - if ('var' %in% dimname) { - var_pos <- which(dimname == 'var') - if (dims[var_pos] == 1) { - data <- adrop(data, drop = var_pos) - dimname <- names(dim(data)) + + ## Dimensions checks + # Spatial coordinates + if (!any(dimnames %in% .KnownLonNames()) | + !any(dimnames %in% .KnownLatNames())) { + warning("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + lon_dim <- NULL + lat_dim <- NULL + } else { + lon_dim <- dimnames[which(dimnames %in% .KnownLonNames())] + lat_dim <- dimnames[which(dimnames %in% .KnownLatNames())] + if (length(lon_dim) > 1) { + warning("Found more than one longitudinal dimension. Only the first one ", + "will be used.") + lon_dim <- lon_dim[1] + } + if (length(lat_dim) > 1) { + warning("Found more than one latitudinal dimension. Only the first one ", + "will be used.") + lat_dim <- lat_dim[1] + } + } + # ftime_dim + if (!is.null(ftime_dim)) { + if (!is.character(ftime_dim)) { + stop("Parameter 'ftime_dim' must be a character string.") + } + if (!all(ftime_dim %in% dimnames)) { + stop("Parameter 'ftime_dim' is not found in 'data' dimension.") + } + if (length(ftime_dim) > 1) { + warning("Parameter 'ftime_dim' has length greater than 1 and ", + "only the first element will be used.") + ftime_dim <- ftime_dim[1] + } + } + # sdate_dim + if (!is.null(sdate_dim)) { + if (!is.character(sdate_dim)) { + stop("Parameter 'sdate_dim' must be a character string.") + } + if (length(sdate_dim) > 1) { + warning("Parameter 'sdate_dim' has length greater than 1 and ", + "only the first element will be used.") + sdate_dim <- sdate_dim[1] + } + if (!all(sdate_dim %in% dimnames)) { + stop("Parameter 'sdate_dim' is not found in 'data' dimension.") + } + } + # memb_dim + if (!is.null(memb_dim)) { + if (!is.character(memb_dim)) { + stop("Parameter 'memb_dim' must be a character string.") + } + if (!all(memb_dim %in% dimnames)) { + stop("Parameter 'memb_dim' is not found in 'data' dimension. Set it ", + "as NULL if there is no member dimension.") + } + } + # dat_dim + if (!is.null(dat_dim)) { + if (!is.character(dat_dim)) { + stop("Parameter 'dat_dim' must be a character string.") + } + if (!all(dat_dim %in% dimnames)) { + stop("Parameter 'dat_dim' is not found in 'data' dimension. 
Set it ", + "as NULL if there is no Datasets dimension.") + } + if (length(dat_dim) > 1) { + warning("Parameter 'dat_dim' has length greater than 1 and ", + "only the first element will be used.") + dat_dim <- dat_dim[1] + } + n_datasets <- dim(data)[dat_dim] + } else { + n_datasets <- 1 + } + # var_dim + if (!is.null(var_dim)) { + if (!is.character(var_dim)) { + stop("Parameter 'var_dim' must be a character string.") + } + if (!all(var_dim %in% dimnames)) { + stop("Parameter 'var_dim' is not found in 'data' dimension. Set it ", + "as NULL if there is no variable dimension.") + } + if (length(var_dim) > 1) { + warning("Parameter 'var_dim' has length greater than 1 and ", + "only the first element will be used.") + var_dim <- var_dim[1] + } + n_vars <- dim(data)[var_dim] + } else { + n_vars <- 1 + } + # minimum dimensions + if (all(dimnames %in% c(var_dim, dat_dim))) { + if (!single_file) { + warning("Parameter data has only ", + paste(c(var_dim, dat_dim), collapse = ' and '), " dimensions ", + "and it cannot be splitted in multiple files. All data will ", + "be saved in a single file.") + single_file <- TRUE + } + } + # Dates dimension check + if (!is.null(Dates)) { + if (all(names(dim(Dates)) == c(ftime_dim, sdate_dim)) | + all(names(dim(Dates)) == c(sdate_dim, ftime_dim))) { + if (is.null(startdates)) { + startdates <- Subset(Dates, along = ftime_dim, 1, drop = 'selected') + } else if ((!inherits(startdates, "POSIXct") & !inherits(startdates, "Date")) && + (!is.character(startdates) | (all(nchar(startdates) != 10) & + all(nchar(startdates) != 8) & all(nchar(startdates) != 6)))) { + warning("Parameter 'startdates' should be a character string containing ", + "the start dates in the format 'yyyy-mm-dd', 'yyyymmdd', 'yyyymm', ", + "'POSIXct' or 'Dates' class.") + startdates <- Subset(Dates, along = ftime_dim, 1, drop = 'selected') } + } else { + stop("Parameter 'Dates' must have start date dimension and ", + "forecast time dimension.") + } } - if (length(var_name) != 1) { - stop("One variable name must be included in element 'Variable$varName' ", - "of parameter 'data'.") + # startdates + if (is.null(startdates)) { + if (is.null(sdate_dim)) { + startdates <- 'XXX' + } else { + startdates <- rep('XXX', dim(data)[sdate_dim]) + } + } else { + if (is.null(sdate_dim)) { + if (length(startdates) != 1) { + warning("Parameter 'startdates' has length more than 1. Only first ", + "value will be used.") + startdates <- startdates[[1]] + } + } + } + # Datasets + if (is.null(Datasets)) { + if (!single_file) { + warning("Parameter 'Datasets' is NULL. 
Files will be saved with a ", + "directory name of 'XXX'.") + } + Datasets <- rep('XXX', n_datasets ) } - if (!is.character(var_name)) { - stop("Element 'Variable$varName' of parameter 'data' ", - "must be a character string.") + if (inherits(Datasets, 'list')) { + Datasets <- names(Datasets) + } + if (n_datasets > length(Datasets)) { + warning("Dimension 'Datasets' in 'data' is greater than those listed in ", + "element 'Datasets' and the first element will be reused.") + Datasets <- c(Datasets, rep(Datasets[1], n_datasets - length(Datasets))) + } else if (n_datasets < length(Datasets)) { + warning("Dimension 'Datasets' in 'data' is smaller than those listed in ", + "element 'Datasets' and only the firsts elements will be used.") + Datasets <- Datasets[1:n_datasets] } - known_dim_names <- c("var", "lat", "latitude", "lon", "longitude", "time", - "ftime", "sdate", "dataset", "dat", "nlevel", "levels") - dims_var <- NULL - list_pos <- 1 + ## Unknown dimensions check + alldims <- c(dat_dim, var_dim, sdate_dim, lon_dim, lat_dim, memb_dim, ftime_dim) + if (!all(dimnames %in% alldims)) { + unknown_dims <- dimnames[which(!dimnames %in% alldims)] + warning("Detected unknown dimension: ", paste(unknown_dims, collapse = ', ')) + memb_dim <- c(memb_dim, unknown_dims) + alldims <- c(dat_dim, var_dim, sdate_dim, lon_dim, lat_dim, memb_dim, ftime_dim) + } + # Reorder + if (any(dimnames != alldims)) { + data <- Reorder(data, alldims) + dimnames <- names(dim(data)) + if (!is.null(attr(data, 'dimensions'))) { + attr(data, 'dimensions') <- dimnames + } + } - if (any(dimname == 'longitude') | any(dimname == 'lon')) { - dim_lon <- ncdim_def(name = 'lon', units = 'degrees', - vals = as.vector(lon), longname = 'longitude') - dims_var[[list_pos]] <- dim_lon - list_pos <- list_pos + 1 - } - if (any(dimname == 'latitude') | any(dimname == 'lat')) { - dim_lat <- ncdim_def(name = 'lat', units = 'degrees_north', - vals = as.vector(lat), longname = 'latitude') - dims_var[[list_pos]] <- dim_lat - list_pos <- list_pos + 1 - } - if (any(!(dimname %in% known_dim_names))) { - dims_member <- dimname[!(dimname %in% known_dim_names)] - if (length(dims_member) > 1) { - stop("Ask for saving realizations or further dimensions to the mantainer.") + ## NetCDF dimensions definition + defined_dims <- NULL + extra_info_dim <- NULL + if (is.null(Dates)) { + filedims <- dimnames[which(!dimnames %in% c(dat_dim, var_dim))] + } else { + filedims <- dimnames[which(!dimnames %in% c(dat_dim, var_dim, sdate_dim, ftime_dim))] + } + for (i_coord in filedims) { + dim_info <- list() + # vals + if (i_coord %in% names(coords)) { + if (is.numeric(coords[[i_coord]])) { + dim_info[['vals']] <- as.vector(coords[[i_coord]]) + } else { + dim_info[['vals']] <- 1:dim(data)[i_coord] + } } else { - dim_memb <- ncdim_def(name = 'ensemble', units = "adim", - vals = 1 : dim(data)[which(dimname == 'member')], - longname = 'ensemble', create_dimvar = TRUE) - dims_var[[list_pos]] <- dim_memb - list_pos <- list_pos + 1 + dim_info[['vals']] <- 1:dim(data)[i_coord] } + # name + dim_info[['name']] <- i_coord + # len + dim_info[['len']] <- as.numeric(dim(data)[i_coord]) + # unlim + dim_info[['unlim']] <- FALSE + # create_dimvar + dim_info[['create_dimvar']] <- TRUE + ## metadata + if (i_coord %in% names(metadata)) { + if ('variables' %in% names(attributes(metadata[[i_coord]]))) { + # from Start: 'lon' or 'lat' + attrs <- attributes(metadata[[i_coord]])[['variables']][[i_coord]] + i_coord_info <- attrs[!sapply(attrs, inherits, 'list')] + } else if 
(inherits(metadata[[i_coord]], 'list')) { + # from Start and Load: main var + i_coord_info <- metadata[[i_coord]] + } else if (!is.null(attributes(metadata[[i_coord]]))) { + # from Load + i_coord_info <- attributes(metadata[[i_coord]]) + } else { + stop("Metadata is not correct.") + } + # len + if ('size' %in% names(i_coord_info)) { + if (i_coord_info[['size']] != dim(data)[i_coord]) { + dim_info[['original_len']] <- i_coord_info[['size']] + i_coord_info[['size']] <- NULL + } + } + # units + if (!('units' %in% names(i_coord_info))) { + dim_info[['units']] <- '' + } else { + dim_info[['units']] <- i_coord_info[['units']] + i_coord_info[['units']] <- NULL + } + # calendar + if (!('calendar' %in% names(i_coord_info))) { + dim_info[['calendar']] <- NA + } else { + dim_info[['calendar']] <- i_coord_info[['calendar']] + i_coord_info[['calendar']] <- NULL + } + # longname + if ('long_name' %in% names(i_coord_info)) { + dim_info[['longname']] <- i_coord_info[['long_name']] + i_coord_info[['long_name']] <- NULL + } else if ('longname' %in% names(i_coord_info)) { + dim_info[['longname']] <- i_coord_info[['longname']] + i_coord_info[['longname']] <- NULL + } else { + if (i_coord %in% .KnownLonNames()) { + dim_info[['longname']] <- 'longitude' + } else if (i_coord %in% .KnownLatNames()) { + dim_info[['longname']] <- 'latitude' + } + } + # extra information + if (!is.null(names(i_coord_info))) { + extra_info_dim[[i_coord]] <- i_coord_info + } + } else { + # units + dim_info[['units']] <- "adim" + # longname + dim_info[['longname']] <- i_coord + # calendar + dim_info[['calendar']] <- NA + } + new_dim <- list(ncdim_def(name = dim_info[['name']], units = dim_info[['units']], + vals = dim_info[['vals']], unlim = dim_info[['unlim']], + create_dimvar = dim_info[['create_dimvar']], + calendar = dim_info[['calendar']], + longname = dim_info[['longname']])) + names(new_dim) <- i_coord + defined_dims <- c(defined_dims, new_dim) } + defined_vars <- list() + if (!single_file) { + for (i in 1:n_datasets) { + path <- file.path(destination, Datasets[i], varname) + for (j in 1:n_vars) { + dir.create(path[j], recursive = TRUE) + startdates <- gsub("-", "", startdates) + dim(startdates) <- c(length(startdates)) + names(dim(startdates)) <- sdate_dim + if (is.null(dat_dim) & is.null(var_dim)) { + data_subset <- data + } else if (is.null(dat_dim)) { + data_subset <- Subset(data, c(var_dim), list(j), drop = 'selected') + } else if (is.null(var_dim)) { + data_subset <- Subset(data, along = c(dat_dim), list(i), drop = 'selected') + } else { + data_subset <- Subset(data, c(dat_dim, var_dim), list(i, j), drop = 'selected') + } + if (is.null(Dates)) { + input_data <- list(data_subset, startdates) + target_dims <- list(c(lon_dim, lat_dim, memb_dim, ftime_dim), NULL) + } else { + input_data <- list(data_subset, startdates, Dates) + target_dims = list(c(lon_dim, lat_dim, memb_dim, ftime_dim), NULL, ftime_dim) + } + Apply(data = input_data, + target_dims = target_dims, + fun = .saveExp, + destination = path[j], + defined_dims = defined_dims, + ftime_dim = ftime_dim, + varname = varname[j], + metadata_var = metadata[[varname[j]]], + extra_info_dim = extra_info_dim, + extra_string = extra_string) + } + } + } else { + # Datasets definition + # From here + if (!is.null(dat_dim)) { + new_dim <- list(ncdim_def(name = dat_dim, units = "adim", + vals = 1 : dim(data)[dat_dim], + longname = 'Datasets', create_dimvar = TRUE)) + names(new_dim) <- dat_dim + defined_dims <- c(new_dim, defined_dims) + extra_info_dim[[dat_dim]] <- list(Datasets = 
paste(Datasets, collapse = ', ')) + } + first_sdate <- last_sdate <- NULL + if (!is.null(Dates)) { + # sdate definition + sdates <- Subset(Dates, along = ftime_dim, 1, drop = 'selected') + differ <- as.numeric((sdates - sdates[1])/3600) + new_dim <- list(ncdim_def(name = sdate_dim, units = paste('hours since', sdates[1]), + vals = differ, + longname = sdate_dim, create_dimvar = TRUE)) + names(new_dim) <- sdate_dim + defined_dims <- c(defined_dims, new_dim) + first_sdate <- sdates[1] + last_sdate <- sdates[length(sdates)] + # ftime definition + Dates <- Reorder(Dates, c(ftime_dim, sdate_dim)) + differ_ftime <- apply(Dates, 2, function(x){as.numeric((x - x[1])/3600)}) + dim(differ_ftime) <- dim(Dates) + differ_ftime_subset <- Subset(differ_ftime, along = sdate_dim, 1, drop = 'selected') + if (all(apply(differ_ftime, 1, function(x){length(unique(x)) == 1}))) { + if (all(diff(differ_ftime_subset/24) == 1)) { + # daily values + dim_time <- list(ncdim_def(name = ftime_dim, units = 'days', + vals = round(differ_ftime_subset/24) + 1, + calendar = 'proleptic_gregorian', + longname = ftime_dim, unlim = TRUE)) + names(dim_time) <- ftime_dim + defined_dims <- c(defined_dims, dim_time) + } else if (all(diff(differ_ftime_subset/24) %in% c(28, 29, 30, 31))) { + # monthly values + dim_time <- list(ncdim_def(name = ftime_dim, units = 'months', + vals = round(differ_ftime_subset/730) + 1, + calendar = 'proleptic_gregorian', + longname = ftime_dim, unlim = TRUE)) + names(dim_time) <- ftime_dim + defined_dims <- c(defined_dims, dim_time) + } else { + # other frequency + dim_time <- list(ncdim_def(name = ftime_dim, units = 'hours', + vals = differ_ftime_subset + 1, + calendar = 'proleptic_gregorian', + longname = ftime_dim, unlim = TRUE)) + names(dim_time) <- ftime_dim + defined_dims <- c(defined_dims, dim_time) + } + } else { + warning("Time steps are not equal for all start dates. 
Only ", + "forecast time values for the first start date will be saved ", + "correctly.") + dim_time <- list(ncdim_def(name = ftime_dim, + units = paste('hours since', + paste(sdates, collapse = ', ')), + vals = differ_ftime_subset, + calendar = 'proleptic_gregorian', + longname = ftime_dim, unlim = TRUE)) + names(dim_time) <- ftime_dim + defined_dims <- c(defined_dims, dim_time) + } + } - if (any(dimname == 'level')) { - stop("Ask for saving 3Dim fields to the mantainer.") + # var definition + defined_vars <- list() + extra_info_var <- NULL + for (j in 1:n_vars) { + var_info <- list() + i_var_info <- metadata[[varname[j]]][!sapply(metadata[[varname[j]]], inherits, 'list')] + ## Define metadata + # name + var_info[['name']] <- varname[j] + # units + if ('units' %in% names(i_var_info)) { + var_info[['units']] <- i_var_info[['units']] + i_var_info[['units']] <- NULL + } else { + var_info[['units']] <- '' + } + # dim + var_info[['dim']] <- defined_dims + # missval + if ('missval' %in% names(i_var_info)) { + var_info[['missval']] <- i_var_info[['missval']] + i_var_info[['missval']] <- NULL + } else { + var_info[['missval']] <- NULL + } + # longname + if (any(c('longname', 'long_name') %in% names(i_var_info))) { + longname <- names(i_var_info)[which(names(i_var_info) %in% c('longname', 'long_name'))] + var_info[['longname']] <- i_var_info[[longname]] + i_var_info[[longname]] <- NULL + } else { + var_info[['longname']] <- varname[j] + } + # prec + if ('prec' %in% names(i_var_info)) { + var_info[['prec']] <- i_var_info[['prec']] + i_var_info[['prec']] <- NULL + } else { + prec <- typeof(data) + if (prec == 'character') { + var_info[['prec']] <- 'char' + } + if (any(prec %in% c('short', 'float', 'double', 'integer', 'char', 'byte'))) { + var_info[['prec']] <- prec + } else { + var_info[['prec']] <- 'double' + } + } + # extra information + if (!is.null(names(i_var_info))) { + extra_info_var[[varname[j]]] <- i_var_info + } + new_var <- list(ncvar_def(name = var_info[['name']], + units = var_info[['units']], + dim = var_info[['dim']], + missval = var_info[['missval']], + longname = var_info[['longname']], + prec = var_info[['prec']])) + + names(new_var) <- varname[j] + defined_vars <- c(defined_vars, new_var) + } + if (is.null(extra_string)) { + gsub("-", "", first_sdate) + file_name <- paste0(paste(c(varname, + gsub("-", "", first_sdate), + gsub("-", "", last_sdate)), + collapse = '_'), ".nc") + } else { + file_name <- paste0(paste(c(varname, extra_string, + gsub("-", "", first_sdate), + gsub("-", "", last_sdate)), + collapse = '_'), ".nc") + } + full_filename <- file.path(destination, file_name) + file_nc <- nc_create(full_filename, defined_vars) + if (is.null(var_dim)) { + ncvar_put(file_nc, varname, vals = data) + } else { + for (j in 1:n_vars) { + ncvar_put(file_nc, defined_vars[[j]]$name, + vals = Subset(data, var_dim, j, drop = 'selected')) + } + } + # Additional dimension attributes + for (dim in names(defined_dims)) { + if (dim %in% names(extra_info_dim)) { + for (info_dim in names(extra_info_dim[[dim]])) { + ncatt_put(file_nc, dim, info_dim, as.character(extra_info_dim[[dim]][[info_dim]])) + } + } + } + # Additional dimension attributes + for (var in names(defined_vars)) { + if (var %in% names(extra_info_var)) { + for (info_var in names(extra_info_var[[var]])) { + ncatt_put(file_nc, var, info_var, as.character(extra_info_var[[var]][[info_var]])) + } + } + } + nc_close(file_nc) } +} - for (i in 1 : n_datasets) { - path <- file.path(destination, datasets[i], var_name) - dir.create(path, 
recursive = TRUE) - startdate <- gsub("-", "", startdates) +.saveExp <- function(data, startdates = NULL, dates = NULL, destination = "./", + defined_dims, ftime_dim = 'time', varname = 'var', + metadata_var = NULL, extra_info_dim = NULL, + extra_string = NULL) { + # ftime_dim + if (!is.null(dates)) { + differ <- as.numeric((dates - dates[1])/3600) + dim_time <- list(ncdim_def(name = ftime_dim, units = paste('hours since', dates[1]), + vals = differ, calendar = 'proleptic_gregorian', + longname = ftime_dim, unlim = TRUE)) + names(dim_time) <- ftime_dim + defined_dims <- c(defined_dims, dim_time) + } + + ## Define var metadata + var_info <- NULL + extra_info_var <- NULL + i_var_info <- metadata_var[!sapply(metadata_var, inherits, 'list')] - dim(startdate) <- c(sdate = length(startdate)) - Apply(list(data, startdate, Dates), - target_dims = list(c('member', 'time', 'latitude', 'longitude'), - NULL, 'time'), - fun = .saveExp, var_name = var_name, units = units, - dims_var = dims_var, cdo_grid_name = cdo_grid_name, projection = projection, - destination = path, extra_string = extra_string) + # name + var_info[['name']] <- varname + # units + if ('units' %in% names(i_var_info)) { + var_info[['units']] <- i_var_info[['units']] + i_var_info[['units']] <- NULL + } else { + var_info[['units']] <- '' } -} + # dim + var_info[['dim']] <- defined_dims + # missval + if ('missval' %in% names(i_var_info)) { + var_info[['missval']] <- i_var_info[['missval']] + i_var_info[['missval']] <- NULL + } else { + var_info[['missval']] <- NULL + } + # longname + if (any(c('longname', 'long_name') %in% names(i_var_info))) { + longname <- names(i_var_info)[which(names(i_var_info) %in% c('longname', 'long_name'))] + var_info[['longname']] <- i_var_info[[longname]] + i_var_info[[longname]] <- NULL + } else { + var_info[['longname']] <- varname + } + # prec + if ('prec' %in% names(i_var_info)) { + var_info[['prec']] <- i_var_info[['prec']] + i_var_info[['prec']] <- NULL + } else { + prec <- typeof(data) + if (prec == 'character') { + var_info[['prec']] <- 'char' + } + if (any(prec %in% c('short', 'float', 'double', 'integer', 'char', 'byte'))) { + var_info[['prec']] <- prec + } else { + var_info[['prec']] <- 'double' + } + } + # extra information + if (!is.null(names(i_var_info))) { + extra_info_var <- i_var_info + } + + datanc <- ncvar_def(name = var_info[['name']], + units = var_info[['units']], + dim = var_info[['dim']], + missval = var_info[['missval']], + longname = var_info[['longname']], + prec = var_info[['prec']]) -# data is an array with dimensions: member, time, lat, lon: -# Dates is a vector of the dates for the time dimension -# dims_var is a list with the ncdim_def of common variables in dataset: member, lat and lon: -# data <- 1:(3 * 4 * 5 * 6) -# dim(data) <- c(longitude = 3, latitude = 4, time = 5, member = 6) -# var_name <- 'tas' -# units <- 'K' -# lon <- 1:3 -# lat <- 1:4 -# sdate = '19001101' -# destination = '/esarchive/scratch/nperez/git/Flor/cstools/' -# dims_var = list(ncdim_def(name = 'lon', units = 'degrees', -# vals = as.vector(lon), longname = 'longitude'), -# ncdim_def(name = 'lat', units = 'degrees_north', -# vals = as.vector(lat), longname = 'latitude'), -# ncdim_def(name = 'ensemble', units = "adim", -# vals = 1 : 6, -# longname = 'ensemble', create_dimvar = TRUE)) -#Dates <- as.Date(c("1900-11-01", "1900-12-01", "1901-01-01", "1901-02-01", "1901-03-01")) -#.saveExp(data, sdate, Dates, var_name, units, dims_var, cdo_grid_name = 'r360x181', projection = 'none', destination) -.saveExp <- 
function(data, sdate, Dates, var_name, units, dims_var, - cdo_grid_name, projection, destination, extra_string) { - dim_names <- names(dim(data)) - if (any(dim_names != c('longitude', 'latitude', 'member', 'time'))) { - data <- Reorder(data, c('longitude', 'latitude', 'member', 'time')) - } - differ <- as.numeric((Dates - Dates[1])/3600) - dim_time <- ncdim_def(name = 'time', units = paste('hours since', Dates[1]), - vals = differ, calendar = 'proleptic_gregorian', - longname = 'time', unlim = TRUE) - list_pos = length(dims_var) + 1 - dims_var[[list_pos]] <- dim_time - datanc <- ncvar_def(name = var_name, - units = units, - dim = dims_var, missval = -99999) if (is.null(extra_string)) { - file_name <- paste0(var_name, "_", sdate, ".nc") + file_name <- paste0(varname, "_", startdates, ".nc") } else { - file_name <- paste0(var_name, "_", extra_string, "_", sdate, ".nc") + file_name <- paste0(varname, "_", extra_string, "_", startdates, ".nc") } full_filename <- file.path(destination, file_name) file_nc <- nc_create(full_filename, datanc) ncvar_put(file_nc, datanc, data) - ncatt_put(file_nc, datanc, 'coordinates', cdo_grid_name) - ncatt_put(file_nc, datanc, 'projection', projection) + + # Additional attributes + for (dim in names(defined_dims)) { + if (dim %in% names(extra_info_dim)) { + for (info_dim in names(extra_info_dim[[dim]])) { + ncatt_put(file_nc, dim, info_dim, as.character(extra_info_dim[[dim]][[info_dim]])) + } + } + } + # Additional dimension attributes + if (!is.null(extra_info_var)) { + for (info_var in names(extra_info_var)) { + ncatt_put(file_nc, varname, info_var, as.character(extra_info_var[[info_var]])) + } + } + nc_close(file_nc) } diff --git a/R/CST_SplitDim.R b/R/CST_SplitDim.R index a2d93f7d576f4a11a369fb390a80a1edf653d889..25d610da8ffb14fae278e7bc2ba559c12464b299 100644 --- a/R/CST_SplitDim.R +++ b/R/CST_SplitDim.R @@ -2,20 +2,38 @@ #' #'@author Nuria Perez-Zanon, \email{nuria.perez@bsc.es} #' -#'@description This function split a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension or dates and the frequency expected (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division. +#'@description This function split a dimension in two. The user can select the +#'dimension to split and provide indices indicating how to split that dimension +#'or dates and the frequency expected (monthly or by day, month and year). The +#'user can also provide a numeric frequency indicating the length of each +#'division. #' -#'@param data a 's2dv_cube' object -#'@param split_dim a character string indicating the name of the dimension to split -#'@param indices a vector of numeric indices or dates. If left at NULL, the dates provided in the s2dv_cube object (element Dates) will be used. -#'@param freq a character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differenciates months from different years. -#'@param new_dim_name a character string indicating the name of the new dimension. -#'@param insert_ftime an integer indicating the number of time steps to add at the begining of the time series. +#'@param data A 's2dv_cube' object +#'@param split_dim A character string indicating the name of the dimension to +#' split. +#'@param indices A vector of numeric indices or dates. 
If left at NULL, the
+#'  dates provided in the s2dv_cube object (element Dates) will be used.
+#'@param freq A character string indicating the frequency: by 'day', 'month' and
+#'  'year' or 'monthly' (by default). 'month' identifies months between 1 and 12
+#'  independently of the year they belong to, while 'monthly' differentiates
+#'  months from different years.
+#'@param new_dim_name A character string indicating the name of the new
+#'  dimension.
+#'@param insert_ftime An integer indicating the number of time steps to add at
+#'  the beginning of the time series.
 #'
-#'@details Parameter 'insert_ftime' has been included for the case of using daily data, requiring split the temporal dimensions by months (or similar) and the first lead time doesn't correspondt to the 1st day of the month. In this case, the insert_ftime could be used, to get a final output correctly organized. E.g.: leadtime 1 is the 2nd of November and the input time series extend to the 31st of December. When requiring split by month with \code{inset_ftime = 1}, the 'monthly' dimension of length two will indicate the month (position 1 for November and position 2 for December), dimension 'time' will be length 31. For November, the position 1 and 31 will be NAs, while from positon 2 to 30 will be filled with the data provided. This allows to select correctly days trhough time dimension.
-#'@import abind
-#'@importFrom ClimProjDiags Subset
+#'@details Parameter 'insert_ftime' has been included for the case of using
+#'daily data when the temporal dimension must be split by months (or similar)
+#'and the first lead time does not correspond to the 1st day of the month. In
+#'this case, 'insert_ftime' can be used to get a final output correctly
+#'organized. E.g.: lead time 1 is the 2nd of November and the input time series
+#'extends to the 31st of December. When requiring the split by month with
+#'\code{insert_ftime = 1}, the 'monthly' dimension of length two will indicate
+#'the month (position 1 for November and position 2 for December), and
+#'dimension 'time' will be of length 31. For November, positions 1 and 31 will
+#'be NAs, while positions 2 to 30 will be filled with the data provided. This
+#'allows selecting the correct days through the time dimension.
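The scenario in the details above can be reproduced with a toy daily series (dates and values invented; the 's2dv_cube' is assembled by hand only for illustration):

    library(CSTools)
    # 60 daily values starting on 2 November: 29 November days + 31 December days
    dat <- array(1:60, dim = c(sdate = 1, ftime = 60))
    dates <- seq(as.POSIXct("2000-11-02", tz = "UTC"), by = "day",
                 length.out = 60)
    cube <- list(data = dat, attrs = list(Dates = dates))
    class(cube) <- 's2dv_cube'
    # insert_ftime = 1 prepends one NA step (1 November) before the monthly split
    res <- CST_SplitDim(cube, split_dim = 'ftime', freq = 'monthly',
                        insert_ftime = 1)
    dim(res$data)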
#'@examples -#' #'data <- 1 : 20 #'dim(data) <- c(time = 10, lat = 2) #'data <-list(data = data) @@ -28,89 +46,90 @@ #'data <- list(data = data$data, Dates = time) #'class(data) <- 's2dv_cube' #'new_data <- CST_SplitDim(data, indices = time) -#'dim(new_data$data) #'new_data <- CST_SplitDim(data, indices = time, freq = 'day') -#'dim(new_data$data) #'new_data <- CST_SplitDim(data, indices = time, freq = 'month') -#'dim(new_data$data) #'new_data <- CST_SplitDim(data, indices = time, freq = 'year') -#'dim(new_data$data) +#'@import abind +#'@importFrom ClimProjDiags Subset #'@export CST_SplitDim <- function(data, split_dim = 'time', indices = NULL, - freq = 'monthly', new_dim_name = NULL, insert_ftime = NULL) { - if (!inherits(data, 's2dv_cube')) { - stop("Parameter 'data' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") - } - if (!is.null(insert_ftime)) { - if (!is.numeric(insert_ftime)) { - stop("Parameter 'insert_ftime' should be an integer.") + freq = 'monthly', new_dim_name = NULL, + insert_ftime = NULL) { + # Check 's2dv_cube' + if (!inherits(data, 's2dv_cube')) { + stop("Parameter 'data' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + } + if (!is.null(insert_ftime)) { + if (!is.numeric(insert_ftime)) { + stop("Parameter 'insert_ftime' should be an integer.") + } else { + if (length(insert_ftime) > 1) { + warning("Parameter 'insert_ftime' must be of length 1, and only the", + " first element will be used.") + insert_ftime <- insert_ftime[1] + } + # adding NAs at the begining of the data in ftime dim + ftimedim <- which(names(dim(data$data)) == 'ftime') + dims <- dim(data$data) + dims[ftimedim] <- insert_ftime + empty_array <- array(NA, dims) + data$data <- abind(empty_array, data$data, along = ftimedim) + names(dim(data$data)) <- names(dims) + # adding dates to Dates for the new NAs introduced + if ((data$attrs$Dates[2] - data$attrs$Dates[1]) == 1) { + timefreq <- 'days' } else { - if (length(insert_ftime) > 1) { - warning("Parameter 'insert_ftime' must be of length 1, and only the", - " first element will be used.") - insert_ftime <- insert_ftime[1] - } - # adding NAs at the begining of the data in ftime dim - ftimedim <- which(names(dim(data$data)) == 'ftime') - dims <- dim(data$data) - dims[ftimedim] <- insert_ftime - empty_array <- array(NA, dims) - data$data <- abind(empty_array, data$data, along = ftimedim) - names(dim(data$data)) <- names(dims) - # adding dates to Dates for the new NAs introduced - if ((data$Dates[[1]][2] - data$Dates[[1]][1]) == 1) { - timefreq <- 'days' - } else { - timefreq <- 'months' - warning("Time frequency of forecast time is considered monthly.") - } - start <- data$Dates[[1]] - dim(start) <- c(ftime = length(start)/dims['sdate'], sdate = dims['sdate']) - #new <- array(NA, prod(dim(data$data)[c('ftime', 'sdate')])) - # Pending fix transform to UTC when concatenaiting - data$Dates$start <- do.call(c, lapply(1:dim(start)[2], function(x) { - seq(start[1,x] - as.difftime(insert_ftime, - units = timefreq), - start[dim(start)[1],x], by = timefreq, tz = "UTC")})) + timefreq <- 'months' + warning("Time frequency of forecast time is considered monthly.") } - } - if (is.null(indices)) { - if (any(split_dim %in% c('ftime', 'time', 'sdate'))) { - if (is.list(data$Dates)) { - indices <- data$Dates[[1]] - } else { - indices <- data$Dates - } - if (any(names(dim(data$data)) %in% 'sdate')) { - if (!any(names(dim(data$data)) %in% split_dim)) { - stop("Parameter 'split_dims' must be one of the dimension ", - "names in 
parameter 'data'.") - } - indices <- indices[1 : dim(data$data)[which(names(dim(data$data)) == - split_dim)]] + start <- data$attrs$Dates + dim(start) <- c(ftime = length(start)/dims['sdate'], sdate = dims['sdate']) + # new <- array(NA, prod(dim(data$data)[c('ftime', 'sdate')])) + # Pending fix transform to UTC when concatenaiting + data$attrs$Dates <- do.call(c, lapply(1:dim(start)[2], function(x) { + seq(start[1,x] - as.difftime(insert_ftime, + units = timefreq), + start[dim(start)[1],x], by = timefreq, tz = "UTC")})) + } + } + if (is.null(indices)) { + if (any(split_dim %in% c('ftime', 'time', 'sdate'))) { + indices <- data$attrs$Dates + if (any(names(dim(data$data)) %in% 'sdate')) { + if (!any(names(dim(data$data)) %in% split_dim)) { + stop("Parameter 'split_dims' must be one of the dimension ", + "names in parameter 'data'.") } + indices <- indices[1 : dim(data$data)[which(names(dim(data$data)) == split_dim)]] } } - data$data <- SplitDim(data$data, split_dim = split_dim, indices = indices, + } + data$data <- SplitDim(data$data, split_dim = split_dim, indices = indices, freq = freq, new_dim_name = new_dim_name) - return(data) + return(data) } #'Function to Split Dimension #' #'@author Nuria Perez-Zanon, \email{nuria.perez@bsc.es} #' -#'@description This function split a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension or dates and the frequency expected (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division. +#'@description This function split a dimension in two. The user can select the +#'dimension to split and provide indices indicating how to split that dimension +#'or dates and the frequency expected (monthly or by day, month and year). The +#'user can also provide a numeric frequency indicating the length of each division. #' -#'@param data an n-dimensional array with named dimensions -#'@param split_dim a character string indicating the name of the dimension to split -#'@param indices a vector of numeric indices or dates -#'@param freq a character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independetly of the year they belong to, while 'monthly' differenciates months from different years. Parameter 'freq' can also be numeric indicating the length in which to subset the dimension. -#'@param new_dim_name a character string indicating the name of the new dimension. -#'@import abind -#'@importFrom ClimProjDiags Subset +#'@param data An n-dimensional array with named dimensions. +#'@param split_dim A character string indicating the name of the dimension to +#' split. +#'@param indices A vector of numeric indices or dates. +#'@param freq A character string indicating the frequency: by 'day', 'month' and +#' 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 +#' independetly of the year they belong to, while 'monthly' differenciates +#' months from different years. Parameter 'freq' can also be numeric indicating +#' the length in which to subset the dimension. +#'@param new_dim_name A character string indicating the name of the new +#' dimension. 
#'@examples -#' #'data <- 1 : 20 #'dim(data) <- c(time = 10, lat = 2) #'indices <- c(rep(1,5), rep(2,5)) @@ -122,140 +141,142 @@ CST_SplitDim <- function(data, split_dim = 'time', indices = NULL, #'new_data <- SplitDim(data, indices = time, freq = 'day') #'new_data <- SplitDim(data, indices = time, freq = 'month') #'new_data <- SplitDim(data, indices = time, freq = 'year') +#'@import abind +#'@importFrom ClimProjDiags Subset #'@export SplitDim <- function(data, split_dim = 'time', indices, freq = 'monthly', new_dim_name = NULL) { - # check data - if (is.null(data)) { - stop("Parameter 'data' cannot be NULL.") - } - if (is.null(dim(data))) { - dim(data) = c(time = length(data)) - } - if (is.null(names(dim(data)))) { - stop("Parameter 'data' must have dimension names.") - } - dims <- dim(data) - # check split_dim - if (!is.character(split_dim)) { - stop("Parameter 'split_dim' must be a character.") - } - if (length(split_dim) > 1) { - split_dim <- split_dim[1] - warning("Parameter 'split_dim' has length greater than ", - "one and only the first element will be used.") - } - if (!any(names(dims) %in% split_dim)) { - stop("Parameter 'split_dims' must be one of the dimension ", - "names in parameter 'data'.") - } - pos_split <- which(names(dims) == split_dim) - # check indices and freq - if (is.null(indices)) { - if (!is.numeric(freq)) { - stop("Parameter 'freq' must be a integer number indicating ", - " the length of each chunk.") - } else { - if (!((dims[pos_split] / freq) %% 1 == 0)) { - stop("Parameter 'freq' must be proportional to the ", - "length of the 'split_dim' in parameter 'data'.") - } - indices <- rep(1 : (dims[pos_split] / freq), freq) - indices <- sort(indices) - repited <- sort(unique(indices)) - } - } else if (is.numeric(indices)) { - if (!is.null(freq)) { - if (freq != 'monthly') { - warning("Parameter 'freq' is not being used since ", - "parameter 'indices' is numeric.") - } - } - repited <- sort(unique(indices)) + # check data + if (is.null(data)) { + stop("Parameter 'data' cannot be NULL.") + } + if (is.null(dim(data))) { + dim(data) <- c(time = length(data)) + } + if (is.null(names(dim(data)))) { + stop("Parameter 'data' must have dimension names.") + } + dims <- dim(data) + # check split_dim + if (!is.character(split_dim)) { + stop("Parameter 'split_dim' must be a character.") + } + if (length(split_dim) > 1) { + split_dim <- split_dim[1] + warning("Parameter 'split_dim' has length greater than ", + "one and only the first element will be used.") + } + if (!any(names(dims) %in% split_dim)) { + stop("Parameter 'split_dim' must be one of the dimension ", + "names in parameter 'data'.") + } + pos_split <- which(names(dims) == split_dim) + # check indices and freq + if (is.null(indices)) { + if (!is.numeric(freq)) { + stop("Parameter 'freq' must be an integer number indicating ", + "the length of each chunk.") } else { - # Indices should be Dates and freq character - if (!is.character(freq)) { - stop("Parameter 'freq' must be a character indicating ", - "how to divide the dates provided in parameter 'indices'", - ", 'monthly', 'anually' or 'daily'.") - } - if (!(any(class(indices) %in% c('POSIXct')))) { - indices <- try( { - if (is.character(indices)) { - as.POSIXct(indices) - } else { - as.POSIXct(indices) - } - }) - if ('try-error' %in% class(indices) | - sum(is.na(indices)) == length(indices)) { - stop("Dates provided in parameter 'indices' must be of class", - " 'POSIXct' or convertable to 'POSIXct'.") - } - } + if (!((dims[pos_split] / freq) %% 1 == 0)) { + 
stop("Parameter 'freq' must be proportional to the ", + "length of the 'split_dim' in parameter 'data'.") + } + indices <- rep(1 : (dims[pos_split] / freq), freq) + indices <- sort(indices) + repited <- sort(unique(indices)) + } + } else if (is.numeric(indices)) { + if (!is.null(freq)) { + if (freq != 'monthly') { + warning("Parameter 'freq' is not being used since ", + "parameter 'indices' is numeric.") + } } - if (length(indices) != dims[pos_split]) { - stop("Parameter 'indices' has different length of parameter ", - "data in the dimension supplied in 'split_dim'.") + repited <- sort(unique(indices)) + } else { + # Indices should be Dates and freq character + if (!is.character(freq)) { + stop("Parameter 'freq' must be a character indicating ", + "how to divide the dates provided in parameter 'indices'", + ", 'monthly', 'anually' or 'daily'.") } - # check indices as dates: - if (!is.numeric(indices)) { - if (freq == 'day') { - indices <- as.numeric(strftime(indices, format = "%d")) - repited <- unique(indices) - } else if (freq == 'month') { - indices <- as.numeric(strftime(indices, format = "%m")) - repited <- unique(indices) - } else if (freq == 'year') { - indices <- as.numeric(strftime(indices, format = "%Y")) - repited <- unique(indices) - } else if (freq == 'monthly' ) { - indices <- as.numeric(strftime(indices, format = "%m%Y")) - repited <- unique(indices) + if (!inherits(indices, 'POSIXct')) { + indices <- try({ + if (is.character(indices)) { + as.POSIXct(indices) } else { - stop("Parameter 'freq' must be numeric or a character: ", - "by 'day', 'month', 'year' or 'monthly' (for ", - "distinguishable month).") - } - } - # check new_dim_name - if (!is.null(new_dim_name)) { - if (!is.character(new_dim_name)) { - stop("Parameter 'new_dim_name' must be character string") + as.POSIXct(indices) } - if (length(new_dim_name) > 1) { - new_dim_name <- new_dim_name[1] - warning("Parameter 'new_dim_name' has length greater than 1 ", - "and only the first elemenst is used.") + }) + if ('try-error' %in% class(indices) | + sum(is.na(indices)) == length(indices)) { + stop("Dates provided in parameter 'indices' must be of class", + " 'POSIXct' or convertable to 'POSIXct'.") } } - max_times <- max(unlist(lapply(repited, - function(x){sum(indices == x)}))) - data <- lapply(repited, function(x) {rebuild(x, data, along = split_dim, - indices = indices, max_times)}) - data <- abind(data, along = length(dims) + 1) - if (is.character(freq)) { - names(dim(data)) <- c(names(dims), freq) + } + if (length(indices) != dims[pos_split]) { + stop("Parameter 'indices' has different length of parameter ", + "data in the dimension supplied in 'split_dim'.") + } + # check indices as dates: + if (!is.numeric(indices)) { + if (freq == 'day') { + indices <- as.numeric(strftime(indices, format = "%d")) + repited <- unique(indices) + } else if (freq == 'month') { + indices <- as.numeric(strftime(indices, format = "%m")) + repited <- unique(indices) + } else if (freq == 'year') { + indices <- as.numeric(strftime(indices, format = "%Y")) + repited <- unique(indices) + } else if (freq == 'monthly' ) { + indices <- as.numeric(strftime(indices, format = "%m%Y")) + repited <- unique(indices) } else { - names(dim(data)) <- c(names(dims), 'index') + stop("Parameter 'freq' must be numeric or a character: ", + "by 'day', 'month', 'year' or 'monthly' (for ", + "distinguishable month).") } - if (!is.null(new_dim_name)) { - names(dim(data)) <- c(names(dims), new_dim_name) + } + # check new_dim_name + if (!is.null(new_dim_name)) { + 
if (!is.character(new_dim_name)) { + stop("Parameter 'new_dim_name' must be a character string.") } -return(data) + if (length(new_dim_name) > 1) { + new_dim_name <- new_dim_name[1] + warning("Parameter 'new_dim_name' has length greater than 1 ", + "and only the first element is used.") + } + } + max_times <- max(unlist(lapply(repited, + function(x){sum(indices == x)}))) + data <- lapply(repited, function(x) {rebuild(x, data, along = split_dim, + indices = indices, max_times)}) + data <- abind(data, along = length(dims) + 1) + if (is.character(freq)) { + names(dim(data)) <- c(names(dims), freq) + } else { + names(dim(data)) <- c(names(dims), 'index') + } + if (!is.null(new_dim_name)) { + names(dim(data)) <- c(names(dims), new_dim_name) + } + return(data) } rebuild <- function(x, data, along, indices, max_times) { - a <- Subset(data, along = along, indices = which(indices == x)) - pos_dim <- which(names(dim(a)) == along) - if (dim(a)[pos_dim] != max_times) { - adding <- max_times - dim(a)[pos_dim] - new_dims <- dim(a) - new_dims[pos_dim] <- adding - extra <- array(NA, dim = new_dims) - a <- abind(a, extra, along = pos_dim) - names(dim(a)) <- names(dim(data)) - } - return(a) + a <- Subset(data, along = along, indices = which(indices == x)) + pos_dim <- which(names(dim(a)) == along) + if (dim(a)[pos_dim] != max_times) { + adding <- max_times - dim(a)[pos_dim] + new_dims <- dim(a) + new_dims[pos_dim] <- adding + extra <- array(NA, dim = new_dims) + a <- abind(a, extra, along = pos_dim) + names(dim(a)) <- names(dim(data)) + } + return(a) } diff --git a/R/CST_Subset.R b/R/CST_Subset.R new file mode 100644 index 0000000000000000000000000000000000000000..35717757c0679bec88b8cdc293b99a24425f4116 --- /dev/null +++ b/R/CST_Subset.R @@ -0,0 +1,162 @@ +#'Subset an object of class s2dv_cube +#' +#'This function allows subsetting (i.e. slicing, taking a chunk of) the data inside +#'an object of class \code{s2dv_cube} and modify the dimensions, coordinates and +#'attributes accordingly, removing any variables, time steps and spatial +#'coordinates that are dropped when subsetting. It ensures that the information +#'inside the s2dv_cube remains coherent with the data it contains.\cr\cr +#'As in the function \code{Subset} from the ClimProjDiags package, the +#'dimensions to subset along can be specified via the parameter \code{along} +#'either with integer indices or by their name.\cr\cr +#'There are additional ways to adjust which dimensions are dropped in the +#'resulting object: either to drop all, to drop none, to drop only the ones that +#'have been sliced or to drop only the ones that have not been sliced.\cr\cr +#'The \code{load_parameters} and \code{when} attributes of the original cube +#'are preserved. The \code{source_files} attribute is subset along the +#'\code{var_dim} and \code{dat_dim} dimensions. +#' +#'@author Agudetse Roures Victoria, \email{victoria.agudetse@bsc.es} +#' +#'@param x An object of class \code{s2dv_cube} to be sliced. +#'@param along A vector with references to the dimensions to take the subset +#' from: either integers or dimension names. +#'@param indices A list of indices to take from each dimension specified in +#' 'along'. If a single dimension is specified in 'along', it can be directly +#' provided as an integer or a vector. +#'@param drop Whether to drop all the dimensions of length 1 in the resulting +#' array, none, only those that are specified in 'along', or only those that +#' are not specified in 'along'. 
The possible values are: 'all' or TRUE, 'none' +#' or FALSE, 'selected', and 'non-selected'. The default value is FALSE. +#'@param dat_dim A character string indicating the name of the dataset dimension. +#' The default value is NULL. +#'@param var_dim A character string indicating the name of the variable +#' dimension. The default value is NULL. +#' +#'@return An object of class \code{s2dv_cube} with similar data, coordinates and +#' attributes as the \code{x} input, but with trimmed or dropped dimensions. +#' +#'@examples +#'#Example with sample data: +#'# Check original dimensions and coordinates +#'lonlat_temp$exp$dims +#'names(lonlat_temp$exp$coords) +#'# Subset the s2dv_cube +#'exp_subset <- CST_Subset(lonlat_temp$exp, +#' along = c("lat", "lon"), +#' indices = list(1:10, 1:10), +#' drop = 'non-selected') +#'# Check new dimensions and coordinates +#'exp_subset$dims +#'names(exp_subset$coords) +#' +#'@seealso \link[ClimProjDiags]{Subset} +#' +#'@importFrom ClimProjDiags Subset +#'@export +CST_Subset <- function(x, along, indices, drop = FALSE, var_dim = NULL, + dat_dim = NULL) { + # Check that x is s2dv_cube + if (!inherits(x, 's2dv_cube')) { + stop("Parameter 'x' must be of the class 's2dv_cube'.") + } + # Check var_dim + if (!is.null(var_dim)) { + if ((!is.character(var_dim)) || (length(var_dim) > 1)) { + stop("Parameter 'var_dim' must be a character string.") + } + } + # Check dat_dim + if (!is.null(dat_dim)) { + if ((!is.character(dat_dim)) || (length(dat_dim) > 1)) { + stop("Parameter 'dat_dim' must be a character string.") + } + } + + # Subset data + x$data <- ClimProjDiags::Subset(x$data, along = along, + indices = indices, + drop = drop) + # Adjust dimensions + x$dims <- dim(x$data) + # Adjust coordinates + for (dimension in 1:length(along)) { + dim_name <- along[dimension] + index <- indices[[dimension]] + # Only subset coordinates that have not been dropped + if (dim_name %in% names(x$dims)) { + # Subset coordinate by indices + if (is.null(dim(x$coords[[dim_name]])) | length(dim(x$coords[[dim_name]])) == 1) { + x$coords[[dim_name]] <- .subset_with_attrs(x$coords[[dim_name]], index) + } else { + x$coords[[dim_name]] <- ClimProjDiags::Subset(x$coords[[dim_name]], along = dim_name, + indices = index) + } + } + } + # Remove dropped coordinates + for (coordinate in names(x$coords)) { + if (!(coordinate %in% names(x$dims))) { + x$coords[[coordinate]] <- NULL + } + } + # Adjust attributes + # Variable + for (dimension in 1:length(along)) { + dim_name <- along[dimension] + index <- indices[[dimension]] + if ((!is.null(var_dim)) && (dim_name == var_dim)) { + x$attrs$Variable$varName <- as.vector(x$coords[[dim_name]]) + } + if ((!is.null(dat_dim)) && (dim_name == dat_dim)) { + x$attrs$Datasets <- as.vector(x$coords[[dim_name]]) + } + if ((!is.null(x$attrs$source_files)) && + (dim_name %in% names(dim(x$attrs$source_files)))) { + x$attrs$source_files <- ClimProjDiags::Subset(x$attrs$source_files, + along = dim_name, + indices = index, + drop = drop) + } + if ((dim_name %in% names(x$dims)) && + (dim_name %in% names(x$attrs$Variable$metadata))) { + variable <- x$attrs$Variable$metadata[[dim_name]] + # Subset coords by indices + if (is.null(dim(variable)) | length(dim(variable)) == 1) { + x$attrs$Variable$metadata[[dim_name]] <- .subset_with_attrs(variable, index) + } else { + x$attrs$Variable$metadata[[dim_name]] <- ClimProjDiags::Subset(variable, along = dim_name, + indices = index) + } + } + } + # Remove metadata from variables that were dropped + vars_to_keep <- 
na.omit(match(c(names(x$dims), (x$attrs$Variable$varName)), + names(x$attrs$Variable$metadata))) + x$attrs$Variable$metadata <- x$attrs$Variable$metadata[vars_to_keep] + # Subset Dates + time_along <- intersect(along, names(dim(x$attrs$Dates))) + if (!(length(time_along) == 0)) { + time_indices <- indices[match(time_along, along)] + x$attrs$Dates <- ClimProjDiags::Subset(x$attrs$Dates, + along = time_along, + indices = time_indices, + drop = drop) + } + return(x) +} + +# Function to subset vectors with attributes +.subset_with_attrs <- function(x, ...) { + l <- x[...] + x.dims <- names(dim(x)) + attr.names <- names(attributes(x)) + attr.names <- attr.names[attr.names != 'names'] + attr.names <- attr.names[attr.names != 'dim'] + attributes(l)[attr.names] <- attributes(x)[attr.names] + if (is.null(dim(l))) { + dim(l) <- length(l) + } + names(dim(l)) <- x.dims + return(l) +} diff --git a/R/CST_WeatherRegimes.R b/R/CST_WeatherRegimes.R index c0c85c07db3afc3921bf155b3099b685bfe66e52..56783aebb44f8aaac9b282c403b7a52441e1af12 100644 --- a/R/CST_WeatherRegimes.R +++ b/R/CST_WeatherRegimes.R @@ -1,140 +1,193 @@ -#' @rdname CST_WeatherRegimes -#' @title Function for Calculating the Cluster analysis +#'@rdname CST_WeatherRegimes +#'@title Function for Calculating the Cluster analysis #' -#' @author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es} +#'@author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es} #' -#' @description This function computes the weather regimes from a cluster analysis. -#'It is applied on the array \code{data} in a 's2dv_cube' object. The dimensionality of this object can be also reduced -#'by using PCs obtained from the application of the #'EOFs analysis to filter the dataset. -#'The cluster analysis can be performed with the traditional k-means or those methods +#'@description This function computes the weather regimes from a cluster +#'analysis. It is applied on the array \code{data} in a 's2dv_cube' object. The +#'dimensionality of this object can also be reduced by using PCs obtained from +#'the application of the EOFs analysis to filter the dataset. The cluster +#'analysis can be performed with the traditional k-means or those methods #'included in the hclust (stats package). #' -#'@references Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019). -#' Characterization of European wind speed variability using weather regimes. Climate Dynamics,53, -#' 4961–4976, doi:10.1007/s00382-019-04839-5. -#'@references Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools -#' for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/} +#'@references Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and +#'F.J., Doblas-Reyes (2019). Characterization of European wind speed variability +#'using weather regimes. Climate Dynamics, 53, 4961–4976, +#'\doi{10.1007/s00382-019-04839-5}. +#'@references Torralba, V. (2019) Seasonal climate prediction for the wind +#'energy sector: methods and tools for the development of a climate service. +#'Thesis. Available online: \url{https://eprints.ucm.es/56841/}. #' -#'@param data a 's2dv_cube' object -#'@param ncenters Number of clusters to be calculated with the clustering function. -#'@param EOFs Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to filter the data. -#'@param neofs number of modes to be kept (default = 30). 
-#'@param varThreshold Value with the percentage of variance to be explained by the PCs. -#' Only sufficient PCs to explain this much variance will be used in the clustering. -#'@param method Different options to estimate the clusters. The most traditional approach is the k-means analysis (default=’kmeans’) -#'but the function also support the different methods included in the hclust . These methods are: -#'"ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). -#' For more details about these methods see the hclust function documentation included in the stats package. -#'@param iter.max Parameter to select the maximum number of iterations allowed (Only if method='kmeans' is selected). -#'@param nstart Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). +#'@param data An 's2dv_cube' object. +#'@param ncenters Number of clusters to be calculated with the clustering +#' function. +#'@param EOFs Whether to compute the EOFs (default = TRUE) or not (FALSE) to +#' filter the data. +#'@param neofs Number of modes to be kept (default = 30). +#'@param varThreshold Value with the percentage of variance to be explained by +#' the PCs. Only sufficient PCs to explain this much variance will be used in +#' the clustering. +#'@param method Different options to estimate the clusters. The most traditional +#' approach is the k-means analysis (default = 'kmeans') but the function also +#' supports the different methods included in the hclust function. These methods are: +#' "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" +#' (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details +#' about these methods see the hclust function documentation included in the +#' stats package. +#'@param iter.max Parameter to select the maximum number of iterations allowed +#' (only if method = 'kmeans' is selected). +#'@param nstart Parameter for the cluster analysis determining how many random +#' sets to choose (only if method = 'kmeans' is selected). #'@param ncores The number of multicore threads to use for parallel computation. 
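A hedged sketch of the non-default clustering path follows; it assumes the hclust branch accepts the linkage names listed above, and it rebuilds the same toy cube used in the @examples block just below:

# Sketch (assumption: 'method' is forwarded to hclust as documented):
# Ward clustering instead of the default k-means.
library(CSTools)
data <- array(abs(rnorm(1280, 283.7, 6)), dim = c(dataset = 2, member = 2,
                                                  sdate = 3, ftime = 3,
                                                  lat = 4, lon = 4))
obs <- list(data = data, coords = list(lon = seq(0, 3), lat = seq(47, 44)))
class(obs) <- 's2dv_cube'
res_ward <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4,
                               method = "ward.D2")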
-#'@return A list with two elements \code{$data} (a 's2dv_cube' object containing the composites cluster=1,..,K for case (*1) -# or only k=1 for any specific cluster, i.e., case (*2)) and \code{$statistics} that includes -#' \code{$pvalue} (array with the same structure as \code{$data} containing the pvalue of the composites obtained through a t-test that accounts for the serial dependence.), -#' \code{cluster} (A matrix or vector with integers (from 1:k) indicating the cluster to which each time step is allocated.), -#' \code{persistence} (Percentage of days in a month/season before a cluster is replaced for a new one (only if method=’kmeans’ has been selected.)), -#' \code{frequency} (Percentage of days in a month/season belonging to each cluster (only if method=’kmeans’ has been selected).), +#'@return A list with two elements \code{$data} (a 's2dv_cube' object containing +#'the composites cluster = 1,..,K for case (*1) or only k = 1 for any specific +#'cluster, i.e., case (*2)) and \code{$statistics} that includes \code{$pvalue} +#'(array with the same structure as \code{$data} containing the pvalue of the +#'composites obtained through a t-test that accounts for the serial dependence), +#'\code{cluster} (a matrix or vector with integers (from 1:k) indicating the +#'cluster to which each time step is allocated), \code{persistence} (percentage +#'of days in a month/season before a cluster is replaced for a new one, only if +#'method = 'kmeans' has been selected), and \code{frequency} (percentage of days in +#'a month/season belonging to each cluster, only if method = 'kmeans' has been +#'selected). +#'@examples +#'data <- array(abs(rnorm(1280, 283.7, 6)), dim = c(dataset = 2, member = 2, +#' sdate = 3, ftime = 3, +#' lat = 4, lon = 4)) +#'coords <- list(lon = seq(0, 3), lat = seq(47, 44)) +#'obs <- list(data = data, coords = coords) +#'class(obs) <- 's2dv_cube' +#' +#'res1 <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4) +#'res2 <- CST_WeatherRegimes(data = obs, EOFs = TRUE, ncenters = 3) +#' #'@importFrom s2dv EOF #'@import multiApply -#'@examples -#'\dontrun{ -#'res1 <- CST_WeatherRegimes(data = lonlat_temp$obs, EOFs = FALSE, ncenters = 4) -#'res2 <- CST_WeatherRegimes(data = lonlat_temp$obs, EOFs = TRUE, ncenters = 3) -#'} #'@export -#' CST_WeatherRegimes <- function(data, ncenters = NULL, - EOFs = TRUE, neofs = 30, - varThreshold = NULL, - method = "kmeans", - iter.max = 100, nstart = 30, - ncores = NULL) { + EOFs = TRUE, neofs = 30, + varThreshold = NULL, + method = "kmeans", + iter.max = 100, nstart = 30, + ncores = NULL) { + # Check 's2dv_cube' if (!inherits(data, 's2dv_cube')) { stop("Parameter 'data' must be of the class 's2dv_cube', ", "as output by CSTools::CST_Load.") } - if ('lon' %in% names(data)){ - lon <- data$lon - }else { - lon <- NULL + # Check 'data' object structure + if (!all(c('data', 'coords') %in% names(data))) { + stop("Parameter 'data' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + } + # Check coordinates + if (!any(names(data$coords) %in% .KnownLonNames()) | + !any(names(data$coords) %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + } else { + lon_name <- names(data$coords)[[which(names(data$coords) %in% .KnownLonNames())]] + lat_name <- names(data$coords)[[which(names(data$coords) %in% .KnownLatNames())]] + lon <- as.vector(data$coords[[lon_name]]) + lat <- as.vector(data$coords[[lat_name]]) } - result <- WeatherRegime(data$data,ncenters = 
ncenters, + + result <- WeatherRegime(data$data, ncenters = ncenters, EOFs = EOFs, neofs = neofs, varThreshold = varThreshold, lon = lon, - lat = data$lat, method = method, - iter.max=iter.max, nstart = nstart, + lat = lat, method = method, + iter.max = iter.max, nstart = nstart, ncores = ncores) data$data <- result$composite data$statistics <- result[-1] return(data) } -#' @rdname WeatherRegimes -#' @title Function for Calculating the Cluster analysis +#'@rdname WeatherRegimes +#'@title Function for Calculating the Cluster analysis #' -#' @author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es} +#'@author Verónica Torralba - BSC, \email{veronica.torralba@bsc.es} #' -#' @description This function computes the weather regimes from a cluster analysis. -#'It can be applied over the dataset with dimensions -#'c(year/month, month/day, lon, lat), or by using PCs obtained from the application of the -#'EOFs analysis to filter the dataset. -#'The cluster analysis can be performed with the traditional k-means or those methods -#'included in the hclust (stats package). +#'@description This function computes the weather regimes from a cluster analysis. +#'It can be applied over the dataset with dimensions c(year/month, month/day, +#'lon, lat), or by using PCs obtained from the application of the EOFs analysis +#'to filter the dataset. The cluster analysis can be performed with the +#'traditional k-means or those methods included in the hclust function (stats package). #' -#'@references Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019). -#' Characterization of European wind speed variability using weather regimes. Climate Dynamics,53, -#' 4961–4976, doi:10.1007/s00382-019-04839-5. -#'@references Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools -#' for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/} +#'@references Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and +#'F.J., Doblas-Reyes (2019). Characterization of European wind speed variability +#'using weather regimes. Climate Dynamics, 53, 4961–4976, +#'\doi{10.1007/s00382-019-04839-5}. +#'@references Torralba, V. (2019) Seasonal climate prediction for the wind +#'energy sector: methods and tools for the development of a climate service. +#'Thesis. Available online: \url{https://eprints.ucm.es/56841/} #' -#'@param data an array containing anomalies with named dimensions with at least start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude 'lon'. -#'@param ncenters Number of clusters to be calculated with the clustering function. -#'@param EOFs Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to filter the data. -#'@param neofs number of modes to be kept only if EOFs = TRUE has been selected. (default = 30). -#'@param varThreshold Value with the percentage of variance to be explained by the PCs. -#' Only sufficient PCs to explain this much variance will be used in the clustering. +#'@param data An array containing anomalies with named dimensions with at least +#' start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude +#' 'lon'. +#'@param ncenters Number of clusters to be calculated with the clustering +#' function. +#'@param EOFs Whether to compute the EOFs (default = TRUE) or not (FALSE) to +#' filter the data. +#'@param neofs Number of modes to be kept only if EOFs = TRUE has been selected +#' (default = 30). 
+#'@param varThreshold Value with the percentage of variance to be explained by +#' the PCs. Only sufficient PCs to explain this much variance will be used in +#' the clustering. #'@param lon Vector of longitudes. #'@param lat Vector of latitudes. -#'@param method Different options to estimate the clusters. The most traditional approach is the k-means analysis (default=’kmeans’) -#'but the function also support the different methods included in the hclust . These methods are: -#'"ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). -#' For more details about these methods see the hclust function documentation included in the stats package. -#'@param iter.max Parameter to select the maximum number of iterations allowed (Only if method='kmeans' is selected). -#'@param nstart Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). +#'@param method Different options to estimate the clusters. The most traditional +#' approach is the k-means analysis (default = 'kmeans') but the function also +#' supports the different methods included in the hclust function. These methods are: +#' "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" +#' (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details +#' about these methods see the hclust function documentation included in the +#' stats package. +#'@param iter.max Parameter to select the maximum number of iterations allowed +#' (only if method = 'kmeans' is selected). +#'@param nstart Parameter for the cluster analysis determining how many random +#' sets to choose (only if method = 'kmeans' is selected). #'@param ncores The number of multicore threads to use for parallel computation. 
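When EOFs = FALSE, the fields are latitude-weighted before clustering (see the cos-weighting step in the function body below). A stand-alone sketch of that step, with illustrative values only:

# Each grid row is scaled by the cosine of its latitude, so the smaller
# high-latitude grid cells contribute less to the cluster distances.
lat <- seq(70, 40, by = -10)                        # 4 latitudes
field <- array(rnorm(4 * 5), dim = c(lat = 4, lon = 5))
weighted <- sweep(field, 1, cos(lat * pi / 180), `*`)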
-#'@return A list with elements \code{$composite} (array with at least 3-d ('lat', 'lon', 'cluster') containing the composites k=1,..,K for case (*1) -# or only k=1 for any specific cluster, i.e., case (*2)), -#' \code{pvalue} (array with at least 3-d ('lat','lon','cluster') with the pvalue of the composites obtained through a t-test that accounts for the serial -# dependence of the data with the same structure as Composite.), -#' \code{cluster} (A matrix or vector with integers (from 1:k) indicating the cluster to which each time step is allocated.), -#' \code{persistence} (Percentage of days in a month/season before a cluster is replaced for a new one (only if method=’kmeans’ has been selected.)), -#' \code{frequency} (Percentage of days in a month/season belonging to each cluster (only if method=’kmeans’ has been selected).), -#'@importFrom s2dv EOF -#'@import multiApply +#'@return A list with elements \code{$composite} (array with at least 3-d ('lat', +#''lon', 'cluster') containing the composites k = 1,..,K for case (*1) or only k = 1 +#'for any specific cluster, i.e., case (*2)), \code{pvalue} (array with at least +#'3-d ('lat','lon','cluster') with the pvalue of the composites obtained through +#'a t-test that accounts for the serial dependence of the data with the same +#'structure as Composite), \code{cluster} (a matrix or vector with integers +#'(from 1:k) indicating the cluster to which each time step is allocated), +#'\code{persistence} (percentage of days in a month/season before a cluster is +#'replaced for a new one, only if method = 'kmeans' has been selected), and +#'\code{frequency} (percentage of days in a month/season belonging to each +#'cluster, only if method = 'kmeans' has been +#'selected). +#'@examples -#'\dontrun{ -#'res <- WeatherRegime(data = lonlat_temp$obs$data, lat = lonlat_temp$obs$lat, +#'data <- array(abs(rnorm(1280, 283.7, 6)), dim = c(dataset = 2, member = 2, +#' sdate = 3, ftime = 3, +#' lat = 4, lon = 4)) +#'lat <- seq(47, 44) +#'res <- WeatherRegime(data = data, lat = lat, #' EOFs = FALSE, ncenters = 4) -#'} +#'@importFrom s2dv EOF +#'@import multiApply #'@export - WeatherRegime <- function(data, ncenters = NULL, - EOFs = TRUE,neofs = 30, + EOFs = TRUE, neofs = 30, varThreshold = NULL, lon = NULL, lat = NULL, method = "kmeans", - iter.max=100, nstart = 30, + iter.max = 100, nstart = 30, ncores = NULL) { - + ## Check inputs + # data if (is.null(names(dim(data)))) { stop("Parameter 'data' must be an array with named dimensions.") } - + if (EOFs == TRUE && is.null(lon)) { + stop("Parameter 'lon' must be specified.") + } if (is.null(lat)) { stop("Parameter 'lat' must be specified.") } - dimData <- names(dim(data)) - + # temporal dimensions if ('sdate' %in% dimData && 'ftime' %in% dimData) { nsdates <- dim(data)['sdate'] nftimes <- dim(data)['ftime'] @@ -148,17 +201,35 @@ WeatherRegime <- function(data, ncenters = NULL, stop("Parameter 'data' must have temporal dimensions.") } } - + # spatial dimensions + if (!any(names(dim(data)) %in% .KnownLonNames()) | + !any(names(dim(data)) %in% .KnownLatNames())) { + stop("Spatial coordinate names do not match any of the names accepted ", + "by the package.") + } + + lon_name <- names(dim(data))[[which(names(dim(data)) %in% .KnownLonNames())]] + lat_name <- names(dim(data))[[which(names(dim(data)) %in% .KnownLatNames())]] + + if (!is.null(lat) && dim(data)[lat_name] != length(lat)) { + stop("The length of parameter 'lat' does not match the ['lat'] dimension of + parameter 'data'.") + } + # ncenters + if 
(is.null(ncenters)) { + stop("Parameter 'ncenters' must be specified.") + } output <- Apply(data = list(data), - target_dims = c('time','lat','lon'), + target_dims = c('time', lat_name, lon_name), fun = .WeatherRegime, EOFs = EOFs, neofs = neofs, varThreshold = varThreshold, lon = lon, lat = lat, ncenters = ncenters, method = method, - ncores = ncores) + ncores = ncores, + lon_name = lon_name, lat_name = lat_name) if (method == 'kmeans' && 'sdate' %in% dimData && 'ftime' %in% dimData) { @@ -186,31 +257,15 @@ .WeatherRegime <- function(data, ncenters = NULL, EOFs = TRUE, neofs = 30, varThreshold = NULL, lon = NULL, lat = NULL, method = "kmeans", - iter.max=100, nstart = 30) { - - if (is.null(names(dim(data)))) { - stop("Parameter 'data' must be an array with 'time', 'lat' and 'lon' dimensions.") - } - - if (!is.null(lat) && dim(data)['lat'] != length(lat)) { - stop("The length of the paramter 'lat' does not match with the ['lat'] dimension of - the parameter 'data'.") - } - if (is.null(ncenters)) { - stop("Parameter 'ncenters' must be specified.") - } - if (EOFs == TRUE && is.null(lon)) { - stop("Parameter 'lon' must be specified.") - } - if (is.null(lat)) { - stop("Parameter 'lat' must be specified.") - } + iter.max = 100, nstart = 30, lon_name = 'lon', + lat_name = 'lat') { - nlon <- dim(data)['lat'] - nlat <- dim(data)['lon'] + + nlon <- dim(data)[lat_name] + nlat <- dim(data)[lon_name] if (any(is.na(data))){ - nas_test <- MergeDims(data, merge_dims = c('lat','lon'), + nas_test <- MergeDims(data, merge_dims = c(lat_name, lon_name), rename_dim = 'space', na.rm = TRUE) if (dim(nas_test)['space']== c(nlat*nlon)){ stop("Parameter 'data' contains NAs in the 'time' dimensions.") @@ -240,12 +295,12 @@ WeatherRegime <- function(data, ncenters = NULL, } } else { - dataW <- aperm(Apply(data, target_dims = 'lat', + dataW <- aperm(Apply(data, target_dims = lat_name, function (x, la) { x * cos(la * pi / 180)}, la = lat)[[1]], c(2, 1, 3)) - cluster_input <- MergeDims(dataW, merge_dims = c('lat','lon'), + cluster_input <- MergeDims(dataW, merge_dims = c(lat_name, lon_name), rename_dim = 'space',na.rm = TRUE) } @@ -270,7 +325,7 @@ WeatherRegime <- function(data, ncenters = NULL, } result <- lapply(1:length(result), function (n) { - names(dim(result[[n]])) <- c("lat", "lon", "cluster") + names(dim(result[[n]])) <- c(lat_name, lon_name, "cluster") return (result[[n]]) }) diff --git a/R/PlotForecastPDF.R b/R/PlotForecastPDF.R index 5cb90c16ce4ebbc11dbb3f81738588746421d284..7143294472db2653fff3ca08d4cbc08a07b0a8a5 100644 --- a/R/PlotForecastPDF.R +++ b/R/PlotForecastPDF.R @@ -37,14 +37,16 @@ #'@param add.ensmemb Either to add the ensemble members \code{'above'} (default) #' or \code{'below'} the pdf, or not (\code{'no'}). #'@param color.set A selection of predefined color sets: use \code{'ggplot'} -#' (default) for blue/green/red, \code{'s2s4e'} for blue/grey/orange, or +#' (default) for blue/green/red, \code{'s2s4e'} for blue/grey/orange, #' \code{'hydro'} for yellow/gray/blue (suitable for precipitation and -#' inflows). +#' inflows) or the \code{'vitigeoss'} color set. #'@param memb_dim A character string indicating the name of the member #' dimension. #' #'@return A ggplot object containing the plot. -#' +#'@examples +#'fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2)) +#'PlotForecastPDF(fcsts, c(-1, 1)) #'@importFrom data.table data.table #'@importFrom data.table CJ #'@importFrom data.table setkey @@ -53,20 +55,12 @@ #'@importFrom plyr . 
#'@importFrom plyr dlply #'@importFrom s2dv InsertDim -#'@examples -#'fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2), -#' fcst3 = rnorm(10, -0.5, 0.9)) -#'PlotForecastPDF(fcsts,c(-1,1)) -#'\donttest{ -#'fcsts2 <- array(rnorm(100), dim = c(member = 20, fcst = 5)) -#'PlotForecastPDF(fcsts2, c(-0.66, 0.66), extreme.limits = c(-1.2, 1.2), -#' fcst.names = paste0('random fcst ', 1 : 5), obs = 0.7) -#'} #'@export PlotForecastPDF <- function(fcst, tercile.limits, extreme.limits = NULL, obs = NULL, plotfile = NULL, title = "Set a title", var.name = "Varname (units)", fcst.names = NULL, add.ensmemb = c("above", "below", "no"), - color.set = c("ggplot", "s2s4e", "hydro"), memb_dim = 'member') { + color.set = c("ggplot", "s2s4e", "hydro", "vitigeoss"), + memb_dim = 'member') { value <- init <- extremes <- x <- ymin <- ymax <- tercile <- NULL y <- xend <- yend <- yjitter <- MLT <- lab.pos <- NULL ggColorHue <- function(n) { @@ -95,6 +89,12 @@ PlotForecastPDF <- function(fcst, tercile.limits, extreme.limits = NULL, obs = N colorMember <- c("#ffff7f") colorObs <- "purple" colorLab <- c("red", "blue") + } else if (color.set == "vitigeoss") { + colorFill <- rev(c("#007be2", "#acb2b5", "#f40000")) + colorHatch <- rev(c("#211b79", "#ae0003")) + colorMember <- c("#ffff7f") + colorObs <- "purple" + colorLab <- colorHatch } else { - stop("Parameter 'color.set' should be one of ggplot/s2s4e/hydro") + stop("Parameter 'color.set' should be one of ggplot/s2s4e/hydro/vitigeoss") } @@ -574,3 +574,4 @@ PlotForecastPDF <- function(fcst, tercile.limits, extreme.limits = NULL, obs = N } return(do.call("rbind", hatch.ls)) } + diff --git a/R/PlotMostLikelyQuantileMap.R b/R/PlotMostLikelyQuantileMap.R index 31cde14f5ecd5e0caa3771804ac34376cfa9a2f3..18e4a8bad36bf193441d55cdd15fb392ca216cfe 100644 --- a/R/PlotMostLikelyQuantileMap.R +++ b/R/PlotMostLikelyQuantileMap.R @@ -1,26 +1,46 @@ #'Plot Maps of Most Likely Quantiles #' -#'@author Veronica Torralba, \email{veronica.torralba@bsc.es}, Nicolau Manubens, \email{nicolau.manubens@bsc.es} -#'@description This function receives as main input (via the parameter \code{probs}) a collection of longitude-latitude maps, each containing the probabilities (from 0 to 1) of the different grid cells of belonging to a category. As many categories as maps provided as inputs are understood to exist. The maps of probabilities must be provided on a common rectangular regular grid, and a vector with the longitudes and a vector with the latitudes of the grid must be provided. The input maps can be provided in two forms, either as a list of multiple two-dimensional arrays (one for each category) or as a three-dimensional array, where one of the dimensions corresponds to the different categories. +#'@author Veronica Torralba, \email{veronica.torralba@bsc.es}, Nicolau Manubens, +#'\email{nicolau.manubens@bsc.es} +#'@description This function receives as main input (via the parameter +#'\code{probs}) a collection of longitude-latitude maps, each containing the +#'probabilities (from 0 to 1) of the different grid cells of belonging to a +#'category. As many categories as maps provided as inputs are understood to +#'exist. The maps of probabilities must be provided on a common rectangular +#'regular grid, and a vector with the longitudes and a vector with the latitudes +#'of the grid must be provided. The input maps can be provided in two forms, +#'either as a list of multiple two-dimensional arrays (one for each category) or +#'as a three-dimensional array, where one of the dimensions corresponds to the +#'different categories. 
#' -#'@param probs a list of bi-dimensional arrays with the named dimensions 'latitude' (or 'lat') and 'longitude' (or 'lon'), with equal size and in the same order, or a single tri-dimensional array with an additional dimension (e.g. 'bin') for the different categories. The arrays must contain probability values between 0 and 1, and the probabilities for all categories of a grid cell should not exceed 1 when added. -#'@param lon a numeric vector with the longitudes of the map grid, in the same order as the values along the corresponding dimension in \code{probs}. -#'@param lat a numeric vector with the latitudes of the map grid, in the same order as the values along the corresponding dimension in \code{probs}. -#'@param cat_dim the name of the dimension along which the different categories are stored in \code{probs}. This only applies if \code{probs} is provided in the form of 3-dimensional array. The default expected name is 'bin'. -#'@param bar_titles vector of character strings with the names to be drawn on top of the color bar for each of the categories. As many titles as categories provided in \code{probs} must be provided. -#'@param col_unknown_cat character string with a colour representation of the colour to be used to paint the cells for which no category can be clearly assigned. Takes the value 'white' by default. +#'@param probs A list of bi-dimensional arrays with the named dimensions +#' 'latitude' (or 'lat') and 'longitude' (or 'lon'), with equal size and in the +#' same order, or a single tri-dimensional array with an additional dimension +#' (e.g. 'bin') for the different categories. The arrays must contain +#' probability values between 0 and 1, and the probabilities for all categories +#' of a grid cell should not exceed 1 when added. +#'@param lon A numeric vector with the longitudes of the map grid, in the same +#' order as the values along the corresponding dimension in \code{probs}. +#'@param lat A numeric vector with the latitudes of the map grid, in the same +#' order as the values along the corresponding dimension in \code{probs}. +#'@param cat_dim The name of the dimension along which the different categories +#' are stored in \code{probs}. This only applies if \code{probs} is provided in +#' the form of a 3-dimensional array. The default expected name is 'bin'. +#'@param bar_titles Vector of character strings with the names to be drawn on +#' top of the color bar for each of the categories. As many titles as +#' categories provided in \code{probs} must be provided. +#'@param col_unknown_cat Character string with a colour representation of the +#' colour to be used to paint the cells for which no category can be clearly +#' assigned. Takes the value 'white' by default. #'@param drawleg Where to draw the common colour bar. Can take values TRUE, #' FALSE or:\cr #' 'up', 'u', 'U', 'top', 't', 'T', 'north', 'n', 'N'\cr #' 'down', 'd', 'D', 'bottom', 'b', 'B', 'south', 's', 'S' (default)\cr #' 'right', 'r', 'R', 'east', 'e', 'E'\cr #' 'left', 'l', 'L', 'west', 'w', 'W' -#'@param ... additional parameters to be sent to \code{PlotCombinedMap} and \code{PlotEquiMap}. +#'@param ... Additional parameters to be sent to \code{PlotCombinedMap} and +#' \code{PlotEquiMap}. 
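A sketch of the two equivalent 'probs' input forms described above; the uniform toy probabilities (summing to 1 per grid cell) and the grid are assumptions for illustration:

library(CSTools)
lon <- seq(0, 19)
lat <- seq(40, 49)
p_below  <- array(0.2, dim = c(lat = 10, lon = 20))
p_normal <- array(0.3, dim = c(lat = 10, lon = 20))
p_above  <- array(0.5, dim = c(lat = 10, lon = 20))
# Form 1: a list with one lat-lon map per category
PlotMostLikelyQuantileMap(list(p_below, p_normal, p_above), lon, lat)
# Form 2: a single array with a category dimension named 'bin'
probs3d <- array(c(p_below, p_normal, p_above),
                 dim = c(lat = 10, lon = 20, bin = 3))
PlotMostLikelyQuantileMap(probs3d, lon, lat, cat_dim = 'bin')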
#'@seealso \code{PlotCombinedMap} and \code{PlotEquiMap} -#' -#'@importFrom maps map -#'@importFrom graphics box image layout mtext par plot.new -#'@importFrom grDevices adjustcolor bmp colorRampPalette dev.cur dev.new dev.off hcl jpeg pdf png postscript svg tiff #'@examples #'# Simple example #'x <- array(1:(20 * 10), dim = c(lat = 10, lon = 20)) / 200 @@ -108,7 +128,9 @@ #' mask = 1 - (w1 + w2 / max(c(w1, w2))), #' brks = 20, width = 10, height = 8) #'} -#' +#'@importFrom maps map +#'@importFrom graphics box image layout mtext par plot.new +#'@importFrom grDevices adjustcolor bmp colorRampPalette dev.cur dev.new dev.off hcl jpeg pdf png postscript svg tiff #'@export PlotMostLikelyQuantileMap <- function(probs, lon, lat, cat_dim = 'bin', bar_titles = NULL, diff --git a/R/PlotPDFsOLE.R b/R/PlotPDFsOLE.R index 25c669a471bb96392ebfea4cc614764f67e07a1f..bf95abb76c745410447d0cdf59c22f34b2509231 100644 --- a/R/PlotPDFsOLE.R +++ b/R/PlotPDFsOLE.R @@ -1,63 +1,47 @@ -#' Plotting two probability density gaussian functions and the optimal linear -#' estimation (OLE) as result of combining them. +#'Plotting two probability density Gaussian functions and the optimal linear +#'estimation (OLE) as a result of combining them. #' -#' @author Eroteida Sanchez-Garcia - AEMET, //email{esanchezg@aemet.es} +#'@author Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} #' -#' @description This function plots two probability density gaussian functions -#' and the optimal linear estimation (OLE) as result of combining them. +#'@description This function plots two probability density Gaussian functions +#'and the optimal linear estimation (OLE) as a result of combining them. #' -#' @param pdf_1 A numeric array with a dimension named 'statistic', containg -#' two parameters: mean' and 'standard deviation' of the first gaussian pdf -#' to combining. -#' @param pdf_2 A numeric array with a dimension named 'statistic', containg -#' two parameters: mean' and 'standard deviation' of the second gaussian pdf +#'@param pdf_1 A numeric array with a dimension named 'statistic', containing +#' two parameters: 'mean' and 'standard deviation' of the first Gaussian pdf #' to combine. -#' @param nsigma (optional) A numeric value for setting the limits of X axis. -#' (Default nsigma = 3). -#' @param legendPos (optional) A character value for setting the position of the -#' legend ("bottom", "top", "right" or "left")(Default 'bottom'). -#' @param legendSize (optional) A numeric value for setting the size of the -#' legend text. (Default 1.0). -#' @param plotfile (optional) A filename where the plot will be saved. -#' (Default: the plot is not saved). -#' @param width (optional) A numeric value indicating the plot width in -#' units ("in", "cm", or "mm"). (Default width = 30). -#' @param height (optional) A numeric value indicating the plot height. -#' (Default height = 15). -#' @param units (optional) A character value indicating the plot size -#' unit. (Default units = 'cm'). -#' @param dpi (optional) A numeric value indicating the plot resolution. -#' (Default dpi = 300). -#' -#' @return PlotPDFsOLE() returns a ggplot object containing the plot. 
-#' -#' @import ggplot2 -#' -#' @examples -#' # Example 1 -#' pdf_1 <- c(1.1,0.6) -#' attr(pdf_1, "name") <- "NAO1" -#' dim(pdf_1) <- c(statistic = 2) -#' pdf_2 <- c(1,0.5) -#' attr(pdf_2, "name") <- "NAO2" -#' dim(pdf_2) <- c(statistic = 2) -#' -#' PlotPDFsOLE(pdf_1, pdf_2) -#' -#' # Example 2 -#' Glosea5PDF <- c(2.25, 0.67) -#' attr(Glosea5PDF, "name") <- "Glosea5" -#' dim(Glosea5PDF) <- c(statistic = 2) -#' ECMWFPDF <- c(2.38, 0.61) -#' attr(ECMWFPDF, "name") <- "ECMWF" -#' dim(ECMWFPDF) <- c(statistic = 2) -#' MFPDF <- c(4.52, 0.34) -#' attr(MFPDF, "name") <- "MF" -#' dim(MFPDF) <- c(statistic = 2) -#' PlotPDFsOLE(pdf_1 = Glosea5PDF, pdf_2 = ECMWFPDF, legendPos = 'left') -#' PlotPDFsOLE(pdf_1 = Glosea5PDF, pdf_2 = MFPDF, legendPos = 'top') -#' PlotPDFsOLE(pdf_1 = ECMWFPDF, pdf_2 = MFPDF, legendSize = 1.2) - +#'@param pdf_2 A numeric array with a dimension named 'statistic', containing +#' two parameters: 'mean' and 'standard deviation' of the second Gaussian pdf +#' to combine. +#'@param nsigma (optional) A numeric value for setting the limits of X axis. +#' (Default nsigma = 3). +#'@param legendPos (optional) A character value for setting the position of the +#' legend ("bottom", "top", "right" or "left") (Default: 'bottom'). +#'@param legendSize (optional) A numeric value for setting the size of the +#' legend text. (Default 1.0). +#'@param plotfile (optional) A filename where the plot will be saved. +#' (Default: the plot is not saved). +#'@param width (optional) A numeric value indicating the plot width in +#' units ("in", "cm", or "mm"). (Default width = 30). +#'@param height (optional) A numeric value indicating the plot height. +#' (Default height = 15). +#'@param units (optional) A character value indicating the plot size +#' unit. (Default units = 'cm'). +#'@param dpi (optional) A numeric value indicating the plot resolution. +#' (Default dpi = 300). +#' +#'@return PlotPDFsOLE() returns a ggplot object containing the plot. +#' +#'@examples +#'# Example 1 +#'pdf_1 <- c(1.1,0.6) +#'attr(pdf_1, "name") <- "NAO1" +#'dim(pdf_1) <- c(statistic = 2) +#'pdf_2 <- c(1,0.5) +#'attr(pdf_2, "name") <- "NAO2" +#'dim(pdf_2) <- c(statistic = 2) +#' +#'PlotPDFsOLE(pdf_1, pdf_2) +#'@import ggplot2 #'@export PlotPDFsOLE <- function(pdf_1, pdf_2, nsigma = 3, legendPos = 'bottom', legendSize = 1.0, plotfile = NULL, width = 30, diff --git a/R/PlotTriangles4Categories.R b/R/PlotTriangles4Categories.R index 16abb421a24bc8bcb0e4be891c5945132e92cb8b..c851c9e0d7faf1e1acf07ccbee31972aee5e2f32 100644 --- a/R/PlotTriangles4Categories.R +++ b/R/PlotTriangles4Categories.R @@ -1,11 +1,12 @@ #'Function to convert any 3-d numerical array to a grid of coloured triangles. #' #'This function converts a 3-d numerical data array into a coloured -#'grid with triangles. It is useful for a slide or article to present tabular results as -#'colors instead of numbers. This can be used to compare the outputs of two or four categories ( -#'e.g. modes of variability, clusters, or forecast systems). +#'grid with triangles. It is useful for a slide or article to present tabular +#'results as colors instead of numbers. This can be used to compare the outputs +#'of two or four categories (e.g. modes of variability, clusters, or forecast +#'systems). #' -#'@param data array with three named dimensions: 'dimx', 'dimy', 'dimcat', +#'@param data Array with three named dimensions: 'dimx', 'dimy', 'dimcat', #' containing the values to be displayed in a coloured image with triangles. #'@param brks A vector of the color bar intervals. 
The length must be one more #' than the parameter 'cols'. Use ColorBar() to generate default values. @@ -13,14 +14,13 @@ #' must be one less than the parameter 'brks'. Use ColorBar() to generate #' default values. #'@param toptitle A string of the title of the grid. Set NULL as default. -#'@param sig_data logical array with the same dimensions as 'data' to add layers +#'@param sig_data Logical array with the same dimensions as 'data' to add layers #' to the plot. A value of TRUE at a grid cell will draw a dot/symbol on the #' corresponding triangle of the plot. Set NULL as default. -#'@param pch_sig symbol to be used to represent sig_data. Takes 18 -#' (diamond) by default. See 'pch' in par() for additional -#' accepted options. -#'@param col_sig colour of the symbol to represent sig_data. -#'@param cex_sig parameter to increase/reduce the size of the symbols used +#'@param pch_sig Symbol to be used to represent sig_data. Takes 18 +#' (diamond) by default. See 'pch' in par() for additional accepted options. +#'@param col_sig Colour of the symbol to represent sig_data. +#'@param cex_sig Parameter to increase/reduce the size of the symbols used #' to represent sig_data. #'@param xlab A logical value (TRUE) indicating if xlabels should be plotted #'@param ylab A logical value (TRUE) indicating if ylabels should be plotted @@ -37,18 +37,21 @@ #'@param lab_legend A vector of labels indicating what is represented in each #'category (i.e. triangle). Set the sequence from 1 to the length of #' the categories (2 or 4). -#'@param cex_leg a number to indicate the increase/reductuion of the lab_legend used -#' to represent sig_data. -#'@param col_leg color of the legend (triangles). -#'@param cex_axis a number to indicate the increase/reduction of the axis labels. +#'@param cex_leg A number to indicate the increase/reduction of the lab_legend +#' used to represent sig_data. +#'@param col_leg Color of the legend (triangles). +#'@param cex_axis A number to indicate the increase/reduction of the axis labels. #'@param fileout A string of full directory path and file name indicating where #' to save the plot. If not specified (default), a graphics device will pop up. -#'@param mar A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot. +#'@param mar A numerical vector of the form c(bottom, left, top, right) which +#' gives the number of lines of margin to be specified on the four sides of the +#' plot. #'@param size_units A string indicating the units of the size of the device #' (file or window) to plot in. Set 'px' as default. See ?Devices and the #' creator function of the corresponding device. -#'@param res A positive number indicating resolution of the device (file or window) -#' to plot in. See ?Devices and the creator function of the corresponding device. +#'@param res A positive number indicating resolution of the device (file or +#' window) to plot in. See ?Devices and the creator function of the +#' corresponding device. #'@param figure.width a numeric value to control the width of the plot. #'@param ... The additional parameters to be passed to function ColorBar() in #' s2dv for color legend creation. 
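The docs above note that two as well as four categories are supported; a minimal sketch of the two-category case (random values; the colour and label choices are arbitrary):

library(CSTools)
arr <- array(runif(4 * 5 * 2, min = -1, max = 1),
             dim = c(dimx = 4, dimy = 5, dimcat = 2))
PlotTriangles4Categories(data = arr,
                         brks = c(-1, -0.5, 0, 0.5, 1),
                         cols = c('#2166ac', '#d1e5f0', '#fddbc7', '#b2182b'),
                         lab_legend = c('cat1', 'cat2'))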
@@ -58,19 +61,17 @@ #'1.0 - 2020-10 (V.Torralba, \email{veronica.torralba@bsc.es}) - Original code #' #'@examples -#'#Example with random data -#' arr1<- arr1<- array(runif(n = 12 * 7 * 4, min=-1, max=1),dim = c(12,7,4)) -#' names(dim(arr1)) <- c('dimx','dimy','dimcat') -#'arr2<- array(TRUE,dim = dim(arr1)) -#'arr2[which(arr1 < 0.3)] = FALSE +#'# Example with random data +#'arr1 <- array(runif(n = 4 * 5 * 4, min = -1, max = 1), dim = c(4, 5, 4)) +#'names(dim(arr1)) <- c('dimx', 'dimy', 'dimcat') +#'arr2 <- array(TRUE, dim = dim(arr1)) +#'arr2[which(arr1 < 0.3)] <- FALSE +#'PlotTriangles4Categories(data = arr1, +#' cols = c('white', '#fef0d9', '#fdd49e', '#fdbb84', '#fc8d59'), +#' brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4), +#' lab_legend = c('NAO+', 'BL', 'AR', 'NAO-'), +#' xtitle = "Target month", ytitle = "Lead time", +#' xlabels = c("Jan", "Feb", "Mar", "Apr")) #'@importFrom grDevices dev.new dev.off dev.cur #'@importFrom graphics plot points polygon text title axis #'@importFrom RColorBrewer brewer.pal diff --git a/R/PlotWeeklyClim.R b/R/PlotWeeklyClim.R new file mode 100644 index 0000000000000000000000000000000000000000..6f103d9641ca7f02d19f853dfabd4bfbc0587408 --- /dev/null +++ b/R/PlotWeeklyClim.R @@ -0,0 +1,231 @@ +#'Plots the observed weekly means and climatology of timeseries data +#' +#'@description This function plots the observed weekly means and climatology of +#'timeseries data using the ggplot2 package. It compares the weekly climatology in +#'a specified period (reference period) to the observed conditions during the +#'target period analyzed in the case study (included in the reference period). +#' +#'@param data A multidimensional array with named dimensions with at least 'sdate' +#' and 'time' dimensions containing observed daily data. It can also be a +#' data frame with computed percentiles as input for ggplot. The target year +#' must be included in the input data. +#'@param first_date The first date of the target period of the timeseries. It can be +#' of class 'Date', 'POSIXct' or a character string in the format 'yyyy-mm-dd'. +#' It must be a date included in the reference period. +#'@param ref_period_ini A numeric value indicating the first year of the +#' reference period. +#'@param ref_period_end A numeric value indicating the last year of the +#' reference period. +#'@param time_dim A character string indicating the daily time dimension name. +#' The default value is 'time'. +#'@param sdate_dim A character string indicating the start year dimension name. +#' The default value is 'sdate'. +#'@param title The text for the top title of the plot. +#'@param palette A palette name from the RColorBrewer package. The default +#' value is 'Blues'. +#'@param fileout A character string indicating the file name where to save the +#' plot. If not specified (default) a graphics device will pop up. +#'@param device A character string indicating the device to use. Can either be +#' a device function (e.g. png), or one of "eps", "ps", "tex" (pictex), +#' "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only). +#'@param width A numeric value of the plot width in units ("in", "cm", "mm", or +#' "px"). It is set to 8 by default. 
+#'@param height A numeric value of the plot height in units ("in", "cm", "mm", +#' or "px"). It is set to 6 by default. +#'@param units Units of the size of the device (file or window) to plot in. +#' Inches ('in') by default. +#'@param dpi A numeric value of the plot resolution. It is set to 300 by +#' default. +#' +#'@return A ggplot object containing the plot. +#' +#'@examples +#'data <- array(rnorm(49*20, 274, 7), dim = c(time = 49, sdate = 20)) +#'PlotWeeklyClim(data = data, first_date = '2010-08-09', +#' ref_period_ini = 1998, +#' ref_period_end = 2020) +#' +#'@import multiApply +#'@import lubridate +#'@import ggplot2 +#'@import RColorBrewer +#'@import scales +#'@importFrom ClimProjDiags Subset +#'@importFrom s2dv MeanDims +#'@export +PlotWeeklyClim <- function(data, first_date, ref_period_ini, ref_period_end, + time_dim = 'time', sdate_dim = 'sdate', + title = "Observed weekly means and climatology", + palette = "Blues", fileout = NULL, + device = NULL, width = 8, height = 6, + units = 'in', dpi = 300) { + # Check input arguments + # data + if (is.array(data)) { + if (is.null(names(dim(data)))) { + stop("Parameter 'data' must have named dimensions.") + } + is_array <- TRUE + } else if (is.data.frame(data)) { + col_names <- c("week", "clim", "p10", "p90", "p33", "p66", + "week_mean", "day", "data") + if (!all(col_names %in% names(data))) { + stop(paste0("If parameter 'data' is a data frame, it must contain the ", + "following column names: 'week', 'clim', 'p10', 'p90', 'p33', ", + "'p66', 'week_mean', 'day' and 'data'.")) + } + is_array <- FALSE + } else { + stop("Parameter 'data' must be an array or a data frame.") + } + if (is_array) { + # time_dim + if (!is.character(time_dim)) { + stop("Parameter 'time_dim' must be a character string.") + } + if (!all(time_dim %in% names(dim(data)))) { + stop("Parameter 'time_dim' is not found in 'data' dimension.") + } + if (dim(data)[time_dim] < 7) { + stop(paste0("Parameter 'data' must have the dimension 'time_dim' of ", + "length equal to or greater than 7 to compute the weekly means.")) + } + # sdate_dim + if (!is.character(sdate_dim)) { + stop("Parameter 'sdate_dim' must be a character string.") + } + if (!sdate_dim %in% names(dim(data))) { + warning(paste0("Parameter 'sdate_dim' is not found in 'data' dimension. 
", + "A dimension of length 1 has been added.")) + data <- InsertDim(data, 1, lendim = 1, name = sdate_dim) + } + # ref_period_ini and ref_period_end + if (!is.numeric(ref_period_ini) | !is.numeric(ref_period_end)) { + stop("Parameters 'ref_period_ini' and 'ref_period_end' must be numeric.") + } + # first_date + if ((!inherits(first_date, "POSIXct") & !inherits(first_date, "Date")) && + (!is.character(first_date) | nchar(first_date) != 10)) { + stop(paste0("Parameter 'first_date' must be a character string ", + "indicating the date in the format 'yyyy-mm-dd', 'POSIXct' ", + "or 'Dates' class.")) + } + first_date <- ymd(first_date) + target_year <- year(first_date) + if (target_year < ref_period_ini | target_year > ref_period_end) { + stop("Parameter 'first_date' must be a date included in the reference period.") + } + + # Dates creation + dates <- seq(first_date, first_date + days(dim(data)[time_dim]-1), by = "1 day") + index_first_date <- which(dates == first_date) + index_last_date <- length(dates) - (length(dates) %% 7) + last_date <- dates[index_last_date] + + # Weekly aggregations + data_subset <- Subset(data, along = time_dim, + indices = index_first_date:index_last_date) + weekly_aggre <- SplitDim(data_subset, split_dim = time_dim, + indices = sort(rep(1:(length(index_first_date:index_last_date)/7), 7)), + new_dim_name = 'week') + weekly_means <- MeanDims(weekly_aggre, time_dim) + weekly_clim <- MeanDims(weekly_means, sdate_dim) + + weekly_p10 <- Apply(weekly_means, target_dims = sdate_dim, + fun = function(x) {quantile(x, 0.10)})$output1 + weekly_p90 <- Apply(weekly_means, target_dims = sdate_dim, + fun = function(x) {quantile(x, 0.90)})$output1 + weekly_p33 <- Apply(weekly_means, target_dims = sdate_dim, + fun = function(x) {quantile(x, 0.33)})$output1 + weekly_p66 <- Apply(weekly_means, target_dims = sdate_dim, + fun = function(x) {quantile(x, 0.66)})$output1 + + clim <- p10 <- p90 <- p33 <- p66 <- NULL + weekly_data <- data.frame(clim = as.vector(weekly_clim), + p10 = as.vector(weekly_p10), + p90 = as.vector(weekly_p90), + p33 = as.vector(weekly_p33), + p66 = as.vector(weekly_p66), + week = 1:(length(index_first_date:index_last_date)/7)) + + daily <- Subset(data_subset, along = sdate_dim, + indices = which(ref_period_ini:ref_period_end == target_year), + drop = TRUE) + + dims_subset <- names(dim(daily))[which(!names(dim(daily)) %in% c(time_dim, sdate_dim))] + + if (!identical(dims_subset, character(0))) { + daily <- Subset(daily, dims_subset, as.list(rep(1, length(dims_subset))), drop = TRUE) + } + + daily_data <- data.frame(day = seq(first_date, last_date, by = "1 day"), + data = daily, + week = sort(rep(1:(length(index_first_date:index_last_date)/7), 7))) + week_mean <- aggregate(data ~ week, data = daily_data, mean) + weekly_data <- cbind(weekly_data, week_mean$data) + colnames(weekly_data)[7] <- 'week_mean' + all <- merge(weekly_data, daily_data, by = 'week') + } else { + all <- data + } + + # Create a ggplot object + cols <- colorRampPalette(brewer.pal(9, palette))(6) + + p <- ggplot(all, aes(x = day)) + + geom_ribbon(aes(ymin = p10, ymax = p90, group = week, fill = "p10-p90"), + alpha = 0.7) + # extremes clim + geom_ribbon(aes(ymin = p33, ymax = p66, group = week, fill = "p33-p66"), + alpha = 0.7) + # terciles clim + geom_line(aes(y = clim, group = week, color = "climatological mean", + linetype = "climatological mean"), + alpha = 1.0, size = 0.7) + # mean clim + geom_line(aes(y = data, color = "observed daily mean", + linetype = "observed daily mean"), + alpha = 1, size = 
0.2) + # daily evolution + geom_line(aes(y = week_mean, group = week, color = "observed weekly mean", + linetype = "observed weekly mean"), + alpha = 1, size = 0.7) + # weekly evolution + theme_bw() + ylab(paste0('tas', " (", "deg.C", ")")) + xlab(NULL) + + ggtitle(title) + + scale_x_date(breaks = seq(min(all$day), max(all$day), by = "7 days"), + minor_breaks = NULL, expand = c(0.03, 0.03), + labels = date_format("%d %b %Y")) + + theme(axis.text.x = element_text(angle = 45, hjust = 1), + panel.grid.major = element_line(size = 0.5, linetype = 'solid', + colour = "gray92"), + panel.grid.minor = element_line(size = 0.25, linetype = 'solid', + colour = "gray92"), + legend.spacing = unit(-0.2, "cm")) + + scale_fill_manual(name = NULL, + values = c("p10-p90" = cols[3], "p33-p66" = cols[4])) + + scale_color_manual(name = NULL, values = c("climatological mean" = cols[5], + "observed daily mean" = "grey20", + "observed weekly mean" = "black")) + + scale_linetype_manual(name = NULL, values = c("climatological mean" = "solid", + "observed daily mean" = "dashed", + "observed weekly mean" = "solid"), + guide = guide_legend(override.aes = list(lwd = c(0.7, 0.2, 0.7)))) + + guides(fill = guide_legend(order = 1)) + + # Return the ggplot object + if (is.null(fileout)) { + return(p) + } else { + ggsave(filename = fileout, plot = p, device = device, height = height, + width = width, units = units, dpi = dpi) + } +} \ No newline at end of file diff --git a/R/Predictability.R b/R/Predictability.R index 12cd6b410f26b197da67df65e82f42f626d29668..680666df92b5c779e79c7488f5d6906672f88eb3 100644 --- a/R/Predictability.R +++ b/R/Predictability.R @@ -16,18 +16,17 @@ #'@references Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., #'and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large #'scale atmospheric predictability.Nature Communications, 10(1), 1316. -#'DOI = https://doi.org/10.1038/s41467-019-09305-8 " +#'\doi{10.1038/s41467-019-09305-8} #'@references Faranda, D., Gabriele Messori and Pascal Yiou. (2017). -#' Dynamical proxies of North Atlantic predictability and extremes. -#' Scientific Reports, 7-41278, 2017. +#'Dynamical proxies of North Atlantic predictability and extremes. +#'Scientific Reports, 7-41278, 2017. #' #'@param dim An array of N named dimensions containing the local dimension as -#'the output of CST_ProxiesAttractor or ProxiesAttractor. -# +#' the output of CST_ProxiesAttractor or ProxiesAttractor. #'@param theta An array of N named dimensions containing the inverse of the -#'persistence 'theta' as the output of CST_ProxiesAttractor or ProxiesAttractor. -#' -#'@param ncores The number of cores to use in parallel computation +#' persistence 'theta' as the output of CST_ProxiesAttractor or +#' ProxiesAttractor. +#'@param ncores The number of cores to use in parallel computation.
#' #'@return A list of length 2: #' \itemize{ @@ -59,16 +58,8 @@ #' attractor <- ProxiesAttractor(dat = m, quanti = 0.60) #' predyn <- Predictability(dim = attractor$dim, theta = attractor$theta) #'@export -#' -Predictability<- function(dim, theta, ncores = NULL) { - # if (!inherits(dim, 's2dv_cube')) { - # stop("Parameter 'dim' must be of the class 's2dv_cube', ", - # "as output by CSTools::CST_Load.") - # } - # if (!inherits(theta, 's2dv_cube')) { - # stop("Parameter 'theta' must be of the class 's2dv_cube', ", - # "as output by CSTools::CST_Load.") - # } +Predictability <- function(dim, theta, ncores = NULL) { + if (any(names(dim(dim)) %in% 'sdate')) { if (any(names(dim(dim)) %in% 'ftime')) { dim <- MergeDims(dim, merge_dims = c('ftime', 'sdate'), diff --git a/R/as.s2dv_cube.R b/R/as.s2dv_cube.R index 5f31ad31f6242fcfaa5bbe48e22ac27700b419c1..e39e5cadb2130830631c994291c8a30ec19a71f6 100644 --- a/R/as.s2dv_cube.R +++ b/R/as.s2dv_cube.R @@ -1,183 +1,357 @@ #'Conversion of 'startR_array' or 'list' objects to 's2dv_cube' #' -#'This function converts data loaded using startR package or s2dv Load function into a 's2dv_cube' object. +#'This function converts data loaded using the Start function from the startR +#'package or the Load function from s2dv into an 's2dv_cube' object. #' #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es} #'@author Nicolau Manubens, \email{nicolau.manubens@bsc.es} #' -#'@param object an object of class 'startR_array' generated from function \code{Start} from startR package (version 0.1.3 from earth.bsc.es/gitlab/es/startR) or a list output from function \code{Load} from s2dv package. -#' -#'@return The function returns a 's2dv_cube' object to be easily used with functions \code{CST} from CSTools package. +#'@param object An object of class 'startR_array' generated from the function +#' \code{Start} of the startR package, or a list output from the function +#' \code{Load} of the s2dv package. Any other object class will not be accepted. +#'@param remove_attrs_coords A logical value indicating whether to remove the +#' attributes of the coordinates (TRUE) or not (FALSE). The default value is +#' FALSE. +#'@param remove_null Optional. A logical value indicating whether to remove the +#' elements that are NULL (TRUE) or not (FALSE) from the output object. It is +#' only used when the object is an output from function \code{Load}. The +#' default value is FALSE. +#' +#'@return The function returns an 's2dv_cube' object to be easily used with +#'functions with the prefix \code{CST} from CSTools and CSIndicators packages. +#'The object is mainly a list with the following elements:\cr +#'\itemize{ +#' \item{'data', array with named dimensions.} +#' \item{'dims', named vector of the data dimensions.} +#' \item{'coords', named list with elements of the coordinates corresponding to +#' the dimensions of the data parameter. If any coordinate is not provided, it +#' is set as an index vector with the values from 1 to the length of the +#' corresponding dimension. The attribute 'indices' indicates whether the +#' coordinate is an index vector (TRUE) or not (FALSE).} +#' \item{'attrs', named list with elements: +#' \itemize{ +#' \item{'Dates', array with named temporal dimensions of class 'POSIXct' +#' from time values in the data.} +#' \item{'Variable', has the following components: +#' \itemize{ +#' \item{'varName', character vector of the short variable name.
It is +#' usually specified in the parameter 'var' from the functions +#' Start and Load.} +#' \item{'metadata', named list of elements with variable metadata. +#' They can be from coordinates variables (e.g. longitude) or +#' main variables (e.g. 'var').} +#' } +#' } +#' \item{'Datasets', character strings indicating the names of the +#' datasets.} +#' \item{'source_files', a vector of character strings with complete paths +#' to all the found files involved in loading the data.} +#' \item{'when', a time stamp of the date issued by the Start() or Load() +#' call to obtain the data.} +#' \item{'load_parameters', it contains the components used in the +#' arguments to load the data from Start() or Load() functions.} +#' } +#' } +#'} #' -#'@seealso \code{\link{s2dv_cube}}, \code{\link[s2dv]{Load}}, \code{\link[startR]{Start}} and \code{\link{CST_Load}} +#'@seealso \code{\link{s2dv_cube}}, \code{\link[s2dv]{Load}}, +#'\code{\link[startR]{Start}} and \code{\link{CST_Load}} #'@examples #'\dontrun{ +#'# Example 1: convert an object from startR::Start function to 's2dv_cube' #'library(startR) #'repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc' #'data <- Start(dat = repos, #' var = 'tas', #' sdate = c('20170101', '20180101'), -#' ensemble = indices(1:20), +#' ensemble = indices(1:5), #' time = 'all', -#' latitude = 'all', -#' longitude = indices(1:40), +#' latitude = indices(1:5), +#' longitude = indices(1:5), #' return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'), #' retrieve = TRUE) #'data <- as.s2dv_cube(data) -#'class(data) +#'# Example 2: convert an object from s2dv::Load function to 's2dv_cube' #'startDates <- c('20001101', '20011101', '20021101', -#' '20031101', '20041101', '20051101') +#' '20031101', '20041101', '20051101') #'data <- Load(var = 'tas', exp = 'system5c3s', -#' nmember = 15, sdates = startDates, -#' leadtimemax = 3, latmin = 27, latmax = 48, -#' lonmin = -12, lonmax = 40, output = 'lonlat') +#' nmember = 2, sdates = startDates, +#' leadtimemax = 3, latmin = 10, latmax = 30, +#' lonmin = -10, lonmax = 10, output = 'lonlat') #'data <- as.s2dv_cube(data) -#'class(data) #'} #'@export -as.s2dv_cube <- function(object) { - if (is.list(object)) { +as.s2dv_cube <- function(object, remove_attrs_coords = FALSE, + remove_null = FALSE) { + + if (is.list(object) & length(object) == 11) { if (is.null(object) || (is.null(object$mod) && is.null(object$obs))) { stop("The s2dv::Load call did not return any data.") } obs <- object obs$mod <- NULL object$obs <- NULL - names(object)[[1]] <- 'data' - names(obs)[[1]] <- 'data' - remove_matches <- function(v, patterns) { - if (length(v) > 0) { - matches <- c() - for (pattern in patterns) { - matches <- c(matches, which(grepl(pattern, v))) - } - if (length(matches) > 0) { - v <- v[-matches] - } - } - v - } - - harmonize_patterns <- function(v) { - matches <- grepl('.*\\.nc$', v) - if (sum(!matches) > 0) { - match_indices <- which(!matches) - v[match_indices] <- sapply(v[match_indices], function(x) paste0(x, '*')) - } - v <- glob2rx(v) - v <- gsub('\\$.*\\$', '*', v) - v - } - + names(object)[[1]] <- 'data' # exp + names(obs)[[1]] <- 'data' # obs + # obs if (!is.null(obs$data)) { + obs_exist <- TRUE obs$Datasets$exp <- NULL obs$Datasets <- obs$Datasets$obs - obs_path_patterns <- sapply(obs$Datasets, function(x) attr(x, 'source')) - obs_path_patterns <- harmonize_patterns(obs_path_patterns) + } else { + obs_exist <- FALSE } - + # object if (!is.null(object$data)) { + exp_exist <- TRUE 
object$Datasets$obs <- NULL object$Datasets <- object$Datasets$exp - exp_path_patterns <- sapply(object$Datasets, function(x) attr(x, 'source')) - exp_path_patterns <- harmonize_patterns(exp_path_patterns) + } else { + exp_exist <- FALSE } - - if (!is.null(obs$data) && !is.null(object$data)) { - obs$source_files <- remove_matches(obs$source_files, - exp_path_patterns) - obs$not_found_files <- remove_matches(obs$not_found_files, - exp_path_patterns) - - object$source_files <- remove_matches(object$source_files, - obs_path_patterns) - object$not_found_files <- remove_matches(object$not_found_files, - obs_path_patterns) - } - result <- list() - if (!is.null(object$data)) { - class(object) <- 's2dv_cube' - result$exp <- object - } - if (!is.null(obs$data)) { - class(obs) <- 's2dv_cube' - result$obs <- obs + # obs and exp + if (obs_exist & exp_exist) { + obs_exp = list(exp = object, obs = obs) + } else if (obs_exist & !exp_exist) { + obs_exp = list(obs = obs) + } else { + obs_exp = list(exp = object) } - if (is.list(result)) { - if (is.null(result$exp)) { - result <- result$obs - } else if (is.null(result$obs)) { - result <- result$exp + i <- 0 + for (obj_i in obs_exp) { + i <- i + 1 + # attrs + obj_i$attrs <- within(obj_i, rm(list = c('data'))) + obj_i <- within(obj_i, rm(list = names(obj_i$attrs))) + dates <- obj_i$attrs$Dates$start + attr(dates, 'end') <- obj_i$attrs$Dates$end + if (!is.null(dates)) { + dim(dates) <- dim(obj_i$data)[c('ftime', 'sdate')] + obj_i$attrs$Dates <- dates + } + # Variable + varname <- obj_i$attrs$Variable$varName + varmetadata <- NULL + varmetadata[[varname]] <- attributes(obj_i$attrs$Variable)[-1] + obj_i$attrs$Variable <- list(varName = varname, metadata = varmetadata) + # dims + obj_i$dims <- dim(obj_i$data) + # coords + obj_i$coords <- sapply(names(dim(obj_i$data)), function(x) NULL) + # sdate + obj_i$coords$sdate <- obj_i$attrs$load_parameters$sdates + if (!remove_attrs_coords) attr(obj_i$coords$sdate, 'indices') <- FALSE + # lon + if (!is.null(obj_i$attrs$lon)) { + if (remove_attrs_coords) { + obj_i$coords$lon <- as.vector(obj_i$attrs$lon) + } else { + obj_i$coords$lon <- obj_i$attrs$lon + attr(obj_i$coords$lon, 'indices') <- FALSE + } + obj_i$attrs$Variable$metadata$lon <- obj_i$attrs$lon + obj_i$attrs <- within(obj_i$attrs, rm(list = 'lon')) + } + # lat + if (!is.null(obj_i$attrs$lat)) { + if (remove_attrs_coords) { + obj_i$coords$lat <- as.vector(obj_i$attrs$lat) } else { - warning("The output is a list of two 's2dv_cube' objects", - " corresponding to 'exp' and 'obs'.") + obj_i$coords$lat <- obj_i$attrs$lat + attr(obj_i$coords$lat, 'indices') <- FALSE } + obj_i$attrs$Variable$metadata$lat <- obj_i$attrs$lat + obj_i$attrs <- within(obj_i$attrs, rm(list = 'lat')) + } + # member + obj_i$coords$member <- 1:obj_i$dims['member'] + if (!remove_attrs_coords) attr(obj_i$coords$member, 'indices') <- TRUE + # dataset + if (!is.null(names(obj_i$attrs$Datasets))) { + obj_i$coords$dataset <- names(obj_i$attrs$Datasets) + if (!remove_attrs_coords) attr(obj_i$coords$dataset, 'indices') <- FALSE + obj_i$attrs$Datasets <- names(obj_i$attrs$Datasets) + } else { + obj_i$coords$dataset <- 1:obj_i$dims['dataset'] + if (!remove_attrs_coords) attr(obj_i$coords$dataset, 'indices') <- TRUE + } + # ftime + obj_i$coords$ftime <- 1:obj_i$dims['ftime'] + if (!remove_attrs_coords) attr(obj_i$coords$ftime, 'indices') <- TRUE + # remove NULL values + if (isTRUE(remove_null)) { + obj_i$attrs$load_parameters <- .rmNullObs(obj_i$attrs$load_parameters) + } + obj_i <- 
obj_i[c('data','dims','coords','attrs')] + class(obj_i) <- 's2dv_cube' + if (names(obs_exp)[[i]] == 'exp') { + result$exp <- obj_i + } else { + result$obs <- obj_i + } + } + if (is.list(result)) { + if (is.null(result$exp)) { + result <- result$obs + } else if (is.null(result$obs)) { + result <- result$exp + } else { + warning("The output is a list of two 's2dv_cube' objects", + " corresponding to 'exp' and 'obs'.") + } } - + } else if (inherits(object, 'startR_array')) { + # From Start: result <- list() result$data <- as.vector(object) - dim(result$data) <- dim(object) - - dat_attr_names <- names(attributes(object)$Variables$dat1) - common_attr_names <- names(attributes(object)$Variables$common) - # $lon - known_lon_names <- utils::getFromNamespace(".KnownLonNames", "s2dv")() - if (!is.null(dat_attr_names[which(dat_attr_names %in% known_lon_names)]) & - !identical(dat_attr_names[which(dat_attr_names %in% known_lon_names)], character(0))) { - result$lon <- attributes(object)$Variables$dat1[[dat_attr_names[which(dat_attr_names %in% known_lon_names)]]] - } else if (!is.null(common_attr_names[which(common_attr_names %in% known_lon_names)]) & - !identical(common_attr_names[which(common_attr_names %in% known_lon_names)], character(0))) { - result$lon <- attributes(object)$Variables$common[[common_attr_names[which(common_attr_names %in% known_lon_names)]]] - } else { - warning("'lon' is not found in this object.") - result$lon <- NULL - } - # $lat - known_lat_names <- utils::getFromNamespace(".KnownLatNames", "s2dv")() - if (!is.null(dat_attr_names[which(dat_attr_names %in% known_lat_names)]) & - !identical(dat_attr_names[which(dat_attr_names %in% known_lat_names)], character(0))) { - result$lat <- attributes(object)$Variables$dat1[[dat_attr_names[which(dat_attr_names %in% known_lat_names)]]] - } else if (!is.null(common_attr_names[which(common_attr_names %in% known_lat_names)]) & - !identical(common_attr_names[which(common_attr_names %in% known_lat_names)], character(0))) { - result$lat <- attributes(object)$Variables$common[[common_attr_names[which(common_attr_names %in% known_lat_names)]]] - } else { - warning("'lat' is not found in this object.") - result$lat <- NULL + ## dims + dims <- dim(object) + dim(result$data) <- dims + result$dims <- dims + ## coords + result$coords <- sapply(names(dims), function(x) NULL) + # Find coordinates + FileSelector <- attributes(object)$FileSelectors + VariablesCommon <- names(attributes(object)$Variables$common) + dat <- names(FileSelector)[1] + VariablesDat <- names(attributes(object)$Variables[[dat]]) + varName <- NULL + for (i_coord in names(dims)) { + if (i_coord %in% names(FileSelector[[dat]])) { # coords in FileSelector + coord_in_fileselector <- FileSelector[[dat]][which(i_coord == names(FileSelector[[dat]]))] + if (length(coord_in_fileselector) == 1) { + if (length(coord_in_fileselector[[i_coord]][[1]]) == dims[i_coord]) { + # TO DO: add var_dim parameter + if (i_coord %in% c('var', 'vars')) { + varName <- as.vector(coord_in_fileselector[[i_coord]][[1]]) + } + if (remove_attrs_coords) { + result$coords[[i_coord]] <- as.vector(coord_in_fileselector[[i_coord]][[1]]) + } else { + result$coords[[i_coord]] <- coord_in_fileselector[[i_coord]][[1]] + } + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- FALSE + } else { + result$coords[[i_coord]] <- 1:dims[i_coord] + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- TRUE + } + } + } else if (i_coord %in% VariablesCommon) { # coords in common + coord_in_common <- 
attributes(object)$Variables$common[[which(i_coord == VariablesCommon)]] + if (inherits(coord_in_common, "POSIXct")) { + result$attrs$Dates <- coord_in_common + } + if (length(coord_in_common) == dims[i_coord]) { + if (remove_attrs_coords) { + if (inherits(coord_in_common, "POSIXct")) { + result$coords[[i_coord]] <- coord_in_common + } else { + result$coords[[i_coord]] <- as.vector(coord_in_common) + } + } else { + result$coords[[i_coord]] <- coord_in_common + } + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- FALSE + } else { + result$coords[[i_coord]] <- 1:dims[i_coord] + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- TRUE + } + } else if (!is.null(VariablesDat)) { # coords in dat + if (i_coord %in% VariablesDat) { + coord_in_dat <- attributes(object)$Variables[[dat]][[which(i_coord == VariablesDat)]] + if (inherits(coord_in_dat, "POSIXct")) { + result$attrs$Dates <- coord_in_dat + } + if (length(coord_in_dat) == dims[i_coord]) { + if (remove_attrs_coords) { + if (inherits(coord_in_dat, "POSIXct")) { + result$coords[[i_coord]] <- coord_in_dat + } else { + result$coords[[i_coord]] <- as.vector(coord_in_dat) + } + } else { + result$coords[[i_coord]] <- coord_in_dat + } + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- FALSE + } else { + result$coords[[i_coord]] <- 1:dims[i_coord] + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- TRUE + } + } else { + result$coords[[i_coord]] <- 1:dims[i_coord] + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- TRUE + } + } else { # missing other dims + result$coords[[i_coord]] <- 1:dims[i_coord] + if (!remove_attrs_coords) attr(result$coords[[i_coord]], 'indices') <- TRUE + } } - vars <- which(!common_attr_names %in% c("time", known_lon_names, known_lat_names)) - - if (length(vars) > 1) { - warning("More than one variable has been provided and ", - "only the first one '", common_attr_names[vars[1]],"' will be used.") - vars <- vars[1] + # attrs + ## varName + if (!is.null(varName)) { + result$attrs$Variable$varName <- varName } - - Variable <- list() - Variable$varName <- names(attributes(object)$Variables$common)[vars] - attr(Variable, 'variable') <- attributes(object)$Variables$common[[vars]] - result$Variable <- Variable - dims <- dim(object) - if (any(c('sdate', 'sdates') %in% names(dims))) { - n_sdates <- dims[which(names(dims) == 'sdate' | names(dims) == 'sdates')] - sdates <- attributes(object)$Variables$common$time[1 : n_sdates] + ## Variables + for (var_type in names(attributes(object)$Variables)) { + if (!is.null(attributes(object)$Variables[[var_type]])) { + for (var in names(attributes(object)$Variables[[var_type]])) { + attr_variable <- attributes(object)$Variables[[var_type]][[var]] + if (is.null(result$attrs$Dates)) { + if (inherits(attr_variable, "POSIXct")) { + result$attrs$Dates <- attr_variable + } + } + if (is.null(result$attrs$Variable$metadata[[var]])) { + result$attrs$Variable$metadata[[var]] <- attr_variable + } + } + } + } + ## Datasets + if (length(names(FileSelector)) > 1) { + # lon name + known_lon_names <- .KnownLonNames() + lon_name_dat <- names(dims)[which(names(dims) %in% known_lon_names)] + # lat name + known_lat_names <- .KnownLatNames() + lat_name_dat <- names(dims)[which(names(dims) %in% known_lat_names)] + result$attrs$Datasets <- names(FileSelector) + # TO DO: add dat_dim parameter + if (any(names(dims) %in% c('dat', 'dataset'))) { + dat_dim <- names(dims)[which(names(dims) %in% c('dat', 'dataset'))] + 
result$coords[[dat_dim]] <- names(FileSelector) + if (!remove_attrs_coords) attr(result$coords[[dat_dim]], 'indices') <- FALSE + } + for (i in 2:length(names(FileSelector))) { + if (!is.null(lon_name_dat)) { + if (any(result$coords[[lon_name_dat]] != as.vector(attributes(object)$Variables[[names(FileSelector)[i]]][[lon_name_dat]]))) { + warning("'lon' values are different for different datasets. ", + "Only values from the first will be used.") + } + } + if (!is.null(lat_name_dat)) { + if (any(result$coords[[lat_name_dat]] != as.vector(attributes(object)$Variables[[names(FileSelector)[i]]][[lat_name_dat]]))) { + warning("'lat' values are different for different datasets. ", + "Only values from the first will be used.") + } + } + } } else { - sdates <- attributes(object)$Variables$common$time[1] + result$attrs$Datasets <- names(FileSelector) } - Dataset <- list(list(InitializationDates = list(Member_1 = sdates))) - names(Dataset) <- list(deparse(substitute(object))) - result$Datasets <- Dataset - result$Dates$start <- attributes(object)$Variables$common$time - result$when <- Sys.time() - result$source_files <- as.vector(attributes(object)$Files) - result$load_parameters <- attributes(object)$FileSelectors + ## when + result$attrs$when <- Sys.time() + ## source_files + result$attrs$source_files <- attributes(object)$Files + ## load_parameters + result$attrs$load_parameters <- attributes(object)$FileSelectors class(result) <- 's2dv_cube' } else { stop("The class of parameter 'object' is not implemented", " to be converted into 's2dv_cube' class yet.") } - result - -} + return(result) +} \ No newline at end of file diff --git a/R/s2dv_cube.R b/R/s2dv_cube.R index 9cd83480944a15e3ee1fd4f06ee9801f259c8a9a..5f9c465b8490dda0c6635ba43cbd2a65b7f7d33c 100644 --- a/R/s2dv_cube.R +++ b/R/s2dv_cube.R @@ -1,44 +1,71 @@ #'Creation of a 's2dv_cube' object #' -#'@description This function allows to create a 's2dv_cube' object by passing +#'@description This function allows you to create an 's2dv_cube' object by passing #'information through its parameters. This function will be needed if the data #'hasn't been loaded using CST_Load or has been transformed with other methods. -#'A 's2dv_cube' object has many different components including metadata. This +#'An 's2dv_cube' object has many different components including metadata. This #'function will allow to create 's2dv_cube' objects even if not all elements #'are defined and for each expected missed parameter a warning message will be #'returned. #' #'@author Perez-Zanon Nuria, \email{nuria.perez@bsc.es} #' -#'@param data an array with any number of named dimensions, typically an object -#' output from CST_Load, with the following dimensions: dataset, member, sdate, -#'ftime, lat and lon. -#'@param lon an array with one dimension containing the longitudes and -#'attributes: dim, cdo_grid_name, data_across_gw, array_across_gw, first_lon, -#'last_lon and projection. -#'@param lat an array with one dimension containing the latitudes and -#'attributes: dim, cdo_grid_name, first_lat, last_lat and projection. -#'@param Variable a list of two elements: \code{varName} a character string -#'indicating the abbreviation of a variable name and \code{level} a character -#'string indicating the level (e.g., "2m"), if it is not required it could be -#' set as NULL.
-#'@param Datasets a named list with the dataset model with two elements: -#'\code{InitiatlizationDates}, containing a list of the start dates for each -#'member named with the names of each member, and \code{Members} containing a -#'vector with the member names (e.g., "Member_1") -#'@param Dates a named list of one to two elements: The first element, -#'\code{start}, is an array of dimensions (sdate, time) with the POSIX initial -#'date of each forecast time of each starting date. The second element, -#'\code{end} (optional), is an array of dimensions (sdate, time) with the POSIX -# final date of each forecast time of each starting date. -#'@param time_dims a vector of strings containing the names of the temporal -#'dimensions found in \code{data}. -#'@param when a time stamp of the date issued by the Load() call to obtain the -#'data. -#'@param source_files a vector of character strings with complete paths to all -#'the found files involved in the Load() call. +#'@param data A multidimensional array with named dimensions, typically with +#' dimensions: dataset, member, sdate, ftime, lat and lon. +#'@param coords A named list with elements of the coordinates corresponding to +#' the dimensions of the data parameter. The names and length of each element +#' must correspond to the names of the dimensions. If any coordinate is not +#' provided, it is set as an index vector with the values from 1 to the length +#' of the corresponding dimension. +#'@param varName A character string indicating the abbreviation of the variable +#' name. +#'@param metadata A named list where each element is a variable containing the +#' corresponding information. The information can be contained in a list of +#' lists for each variable. +#'@param Datasets Character strings indicating the names of the datasets. If +#' there are multiple datasets, it can be a vector of their names or a list of +#' lists with additional information. +#'@param Dates A POSIXct array of time dimensions containing the Dates. +#'@param when A time stamp of the date when the data has been loaded. This +#' parameter is also found in the output of the Load() and Start() functions. +#'@param source_files A vector of character strings with complete paths to all +#' the found files involved in loading the data. +#'@param \dots Additional elements to be added to the object. They will be +#' stored at the end of the 'attrs' element. Multiple elements are accepted. #' -#'@return The function returns an object of class 's2dv_cube'. +#'@return The function returns an object of class 's2dv_cube' with the following +#' elements in the structure:\cr +#'\itemize{ +#' \item{'data', array with named dimensions.} +#' \item{'dims', named vector of the data dimensions.} +#' \item{'coords', named list with elements of the coordinates corresponding to +#' the dimensions of the data parameter. If any coordinate is not provided, it +#' is set as an index vector with the values from 1 to the length of the +#' corresponding dimension. The attribute 'indices' indicates whether the +#' coordinate is an index vector (TRUE) or not (FALSE).} +#' \item{'attrs', named list with elements: +#' \itemize{ +#' \item{'Dates', array with named temporal dimensions of class 'POSIXct' from +#' time values in the data.} +#' \item{'Variable', has the following components: +#' \itemize{ +#' \item{'varName', with the short name of the loaded variable as specified +#' in the parameter 'var'.} +#' \item{'metadata', named list of elements with variable metadata.
+#' They can be from coordinates variables (e.g. longitude) or +#' main variables (e.g. 'var').} +#' } +#' } +#' \item{'Datasets', character strings indicating the names of the dataset.} +#' \item{'source_files', a vector of character strings with complete paths to +#' all the found files involved in loading the data.} +#' \item{'when', a time stamp of the date issued by the Start() or Load() call to +#' obtain the data.} +#' \item{'load_parameters', it contains the components used in the arguments to +#' load the data from Start() or Load() functions.} +#' } +#' } +#'} #' #'@seealso \code{\link[s2dv]{Load}} and \code{\link{CST_Load}} #'@examples @@ -46,219 +73,154 @@ #'dim(exp_original) <- c(lat = 2, time = 10, lon = 5) #'exp1 <- s2dv_cube(data = exp_original) #'class(exp1) -#'exp2 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50)) +#'coords <- list(lon = seq(-10, 10, 5), lat = c(45, 50)) +#'exp2 <- s2dv_cube(data = exp_original, coords = coords) #'class(exp2) -#'exp3 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), -#' Variable = list(varName = 'tas', level = '2m')) +#'metadata <- list(tas = list(level = '2m')) +#'exp3 <- s2dv_cube(data = exp_original, coords = coords, +#' varName = 'tas', metadata = metadata) #'class(exp3) -#'exp4 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), -#' Variable = list(varName = 'tas', level = '2m'), -#' Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), -#' end = paste0(rep("31", 10), rep("01", 10), 1990:1999))) +#'Dates = as.POSIXct(paste0(rep("01", 10), rep("01", 10), 1990:1999), format = "%d%m%Y") +#'dim(Dates) <- c(time = 10) +#'exp4 <- s2dv_cube(data = exp_original, coords = coords, +#' varName = 'tas', metadata = metadata, +#' Dates = Dates) #'class(exp4) -#'exp5 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), -#' Variable = list(varName = 'tas', level = '2m'), -#' Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), -#' end = paste0(rep("31", 10), rep("01", 10), 1990:1999)), -#' when = "2019-10-23 19:15:29 CET") +#'exp5 <- s2dv_cube(data = exp_original, coords = coords, +#' varName = 'tas', metadata = metadata, +#' Dates = Dates, when = "2019-10-23 19:15:29 CET") #'class(exp5) -#'exp6 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), -#' Variable = list(varName = 'tas', level = '2m'), -#' Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), -#' end = paste0(rep("31", 10), rep("01", 10), 1990:1999)), +#'exp6 <- s2dv_cube(data = exp_original, coords = coords, +#' varName = 'tas', metadata = metadata, +#' Dates = Dates, #' when = "2019-10-23 19:15:29 CET", -#' source_files = c("/path/to/file1.nc", "/path/to/file2.nc")) +#' source_files = c("/path/to/file1.nc", "/path/to/file2.nc")) #'class(exp6) -#'exp7 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), -#' Variable = list(varName = 'tas', level = '2m'), -#' Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), -#' end = paste0(rep("31", 10), rep("01", 10), 1990:1999)), +#'exp7 <- s2dv_cube(data = exp_original, coords = coords, +#' varName = 'tas', metadata = metadata, +#' Dates = Dates, #' when = "2019-10-23 19:15:29 CET", #' source_files = c("/path/to/file1.nc", "/path/to/file2.nc"), #' Datasets = list( #' exp1 = list(InitializationsDates = list(Member_1 = "01011990", #' Members = "Member_1")))) #'class(exp7) -#'dim(exp_original) <- c(dataset = 1, member = 1, sdate = 2, ftime = 5, lat = 2, 
lon = 5) -#'exp8 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), -#' Variable = list(varName = 'tas', level = '2m'), -#' Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), -#' end = paste0(rep("31", 10), rep("01", 10), 1990:1999))) +#'dim(exp_original) <- c(dataset = 1, member = 1, time = 10, lat = 2, lon = 5) +#'exp8 <- s2dv_cube(data = exp_original, coords = coords, +#' varName = 'tas', metadata = metadata, +#' Dates = Dates, original_dates = Dates) #'class(exp8) #'@export -s2dv_cube <- function(data, lon = NULL, lat = NULL, Variable = NULL, Datasets = NULL, - Dates = NULL, time_dims = NULL, when = NULL, source_files = NULL) { - +s2dv_cube <- function(data, coords = NULL, varName = NULL, metadata = NULL, + Datasets = NULL, Dates = NULL, when = NULL, + source_files = NULL, ...) { + + # data if (is.null(data) | !is.array(data) | is.null(names(dim(data)))) { stop("Parameter 'data' must be an array with named dimensions.") } + # dims dims <- dim(data) - if (is.null(lon)) { - if (any(c('lon', 'longitude') %in% names(dims))) { - warning("Parameter 'lon' is not provided but data contains a ", - "longitudinal dimension.") - } else { - warning("Parameter 'lon' is not provided so the data is from an ", - "unknown location.") + + ## coords + if (!is.null(coords)) { + if (!all(names(coords) %in% names(dims))) { + coords <- coords[-which(!names(coords) %in% names(dims))] } - } - if (is.null(lat)) { - if (any(c('lat', 'latitude') %in% names(dims))) { - warning("Parameter 'lat' is not provided but data contains a ", - "latitudinal dimension.") - } else { - warning("Parameter 'lat' is not provided so the data is from an ", - "unknown location.") + for (i_coord in names(dims)) { + if (i_coord %in% names(coords)) { + if (length(coords[[i_coord]]) != dims[i_coord]) { + warning(paste0("Coordinate '", i_coord, "' has a different length than ", + "its dimension and it will not be used.")) + coords[[i_coord]] <- 1:dims[i_coord] + attr(coords[[i_coord]], 'indices') <- TRUE + } else { + attr(coords[[i_coord]], 'indices') <- FALSE + } + } else { + warning(paste0("Coordinate '", i_coord, "' is not provided ", + "and it will be set as index in element coords.")) + coords[[i_coord]] <- 1:dims[i_coord] + attr(coords[[i_coord]], 'indices') <- TRUE + } + } + } else { + coords <- sapply(names(dims), function(x) 1:dims[x]) + for (i in 1:length(coords)) { + attr(coords[[i]], "indices") <- TRUE } } - if (is.null(Variable)) { - warning("Parameter 'Variable' is not provided so the metadata ", - "of 's2dv_cube' object will be incomplete.") - } - if (is.null(Datasets)) { - warning("Parameter 'Datasets' is not provided so the metadata ", - "of 's2dv_cube' object will be incomplete.") - } + + ## attrs + attrs <- list() + # Dates if (is.null(Dates)) { - if (!is.null(time_dims)) { - if (any(time_dims %in% names(dims))) { - warning("Parameter 'Dates' is not provided but data contains a ", - "temporal dimension.") + warning("Parameter 'Dates' is not provided so the metadata ", + "of 's2dv_cube' object will be incomplete.") + attrs$Dates <- NULL + } else if (length(Dates) == 1 & inherits(Dates[1], "POSIXct")) { + attrs$Dates <- Dates + } else { + if (!is.array(Dates)) { + warning("Parameter 'Dates' must be an array with named time dimensions.") + } else { + if (is.null(names(dim(Dates)))) { + warning("Parameter 'Dates' must have dimension names.") + } else if (!all(names(dim(Dates)) %in% names(dims))) { + warning("Parameter 'Dates' must have the corresponding time dimension names in 'data'.")
} else { - warning("Data does not contain any of the temporal dimensions ", - "in 'time_dims'.") + if (inherits(Dates[1], "POSIXct")) { + attrs$Dates <- Dates + } else { + warning("Parameter 'Dates' must be of class 'POSIXct'.") + } } - } else if (any(c('time', 'ftime', 'sdate') %in% names(dims))) { - warning("Parameter 'Dates' is not provided but data contains a ", - "temporal dimension.") - } else { - warning("Parameter 'Dates' is not provided so the data is from an ", - "unknown time period.") } } - if (is.null(when)) { - warning("Parameter 'when' is not provided so the metadata ", + # Variable + if (is.null(varName)) { + warning("Parameter 'varName' is not provided so the metadata ", "of 's2dv_cube' object will be incomplete.") + attrs$Variable$varName <- NULL + } else { + if (!is.character(varName)) { + warning("Parameter 'varName' must be a character.") + } else { + attrs$Variable$varName <- varName + } } - if (is.null(source_files)) { - warning("Parameter 'source_files' is not provided so the metadata ", + if (is.null(metadata)) { + warning("Parameter 'metadata' is not provided so the metadata ", "of 's2dv_cube' object will be incomplete.") - } - if (!is.null(Variable)) { - if (!is.list(Variable)) { - Variable <- list(Variable) - } - if (names(Variable)[1] != 'varName' | names(Variable)[2] != 'level') { - warning("The name of the first element of parameter 'Variable' is ", - "expected to be 'varName' and the second 'level'.") - } - if (!is.character(Variable[[1]])) { - warning("The element 'Varname' of parameter 'Variable' must be ", - "a character.") + attrs$Variable$metadata <- NULL + } else { + if (!is.list(metadata)) { + metadata <- list(metadata) } + attrs$Variable$metadata <- metadata } - # Dimensions comparison - ## lon - if (!is.null(lon)) { - if (any(names(dims) %in% c('lon', 'longitude'))) { - name_lon <- names(dims[names(dims) %in% c('lon', 'longitude')]) - if (dims[name_lon] != length(lon) & dims[name_lon] != 1) { - stop("Length of parameter 'lon' doesn't match the length of ", - "longitudinal dimension in parameter 'data'.") - } - if (!is.null(names(dim(lon))) && !identical(name_lon, names(dim(lon)))) { - stop("The dimension name of parameter 'lon' is not consistent ", - "with data dimension name for longitude.") - } else { - dim(lon) <- length(lon) - names(dim(lon)) <- name_lon - } - } else if (!is.null(names(dim(lon))) && names(dim(lon)) %in% names(dims)) { - name_lon <- names(dims[names(dim(lon))]) - if (length(lon) != dims[name_lon]) { - stop("The length of the longitudinal dimension doesn't match ", - "with the length of 'lon' parameter.") - } else { - warning(paste0("Detected the longitude dimension name to be ", names(dim(lon)), - ", which is not the expected names ('lon' or 'longitude') by s2dv_cube.")) - } - } else { - stop("Parameter 'lon' is provided but data doesn't contain a ", - "longitudinal dimension.") - } + # Datasets + if (!is.null(Datasets)) { + attrs$Datasets <- Datasets } - - ## lat - if (!is.null(lat)) { - if (any(names(dims) %in% c('lat', 'latitude'))) { - name_lat <- names(dims[names(dims) %in% c('lat', 'latitude')]) - if (dims[name_lat] != length(lat) & dims[name_lat] != 1) { - stop("Length of parameter 'lat' doesn't match the length of ", - "longitudinal dimension in parameter 'data'.") - } - if (!is.null(names(dim(lat))) && !identical(name_lat, names(dim(lat)))) { - stop("The dimension name of parameter 'lat' is not consistent ", - "with data dimension name for latitude.") - } else { - dim(lat) <- length(lat) - names(dim(lat)) <- name_lat - } 
- } else if (!is.null(names(dim(lat))) && names(dim(lat)) %in% names(dims)) { - name_lat <- names(dims[names(dim(lat))]) - if (length(lat) != dims[name_lat]) { - stop("The length of the latgitudinal dimension doesn't match ", - "with the length of 'lat' parameter.") - } else { - warning(paste0("Detected the latitude dimension name to be ", names(dim(lat)), - ", which is not the expected names ('lat' or 'latitude') by s2dv_cube.")) - } - } else { - stop("Parameter 'lat' is provided but data doesn't contain a ", - "latitudinal dimension.") - } + # when + if (!is.null(when)) { + attrs$when <- when } - - ## Dates - if (!is.null(Dates)) { - if (!is.list(Dates)) { - stop("Parameter 'Dates' must be a list.") - } else { - if (length(Dates) > 2) { - warning("Parameter 'Dates' is a list with more than 2 ", - "elements and only the first two will be used.") - Dates <- Dates[1 : 2] - } - if (names(Dates)[1] != 'start') { - warning("The name of the first element of parameter 'Dates' ", - "is expected to be 'start'.") - } - if (length(Dates) == 2) { - if (names(Dates)[2] != 'end') { - warning("The name of the second element of parameter 'Dates' ", - "is expected to be 'end'.") - } - if (length(Dates[[1]]) != length(Dates[[2]])) { - stop("The length of the elements in parameter 'Dates' must ", - "be equal.") - } - } - if (!is.null(time_dims)) { - time_dims <- dims[names(dims) %in% time_dims] - } else { - warning("Parameter 'time_dims' is not provided, assigning 'sdate', ", - "'time' and 'ftime' as default time dimension names.") - time_dims <- dims[names(dims) %in% c('sdate', 'time', 'ftime')] - } - if (prod(time_dims) != length(Dates[[1]])) { - stop("The length of the temporal dimension doesn't match ", - "the length of elements in parameter 'Dates'.") - } + # source_files + if (!is.null(source_files)) { + attrs$source_files <- source_files + } + # dots + dots <- list(...) 
+ if (length(dots) != 0) { + for (i_arg in 1:length(dots)) { + attrs[[names(dots)[[i_arg]]]] <- dots[[i_arg]] } } - object <- list(data = data, lon = lon, lat = lat, Variable = Variable, - Datasets = Datasets, Dates = Dates, time_dims = time_dims, - when = when, source_files = source_files) + ## object + object <- list(data = data, dims = dims, coords = coords, attrs = attrs) class(object) <- 's2dv_cube' return(object) } diff --git a/R/zzz.R b/R/zzz.R index a30e8bfb49fe14941113acfdc8d2f3ac96da571e..b0c8b259b6434bbc11ad9c3d99a30bb22ae8b319 100644 --- a/R/zzz.R +++ b/R/zzz.R @@ -163,3 +163,36 @@ GradientCatsColorBar <- function(nmap, brks = NULL, cols = NULL, vertical = TRUE } +.KnownLonNames <- function() { + known_lon_names <- c('lon', 'lons', 'longitude', 'x', 'i', 'nav_lon') +} + +.KnownLatNames <- function() { + known_lat_names <- c('lat', 'lats', 'latitude', 'y', 'j', 'nav_lat') +} + +.KnownTimeNames <- function() { + known_time_names <- c('time', 'ftime', 'sdate', 'sdates', 'syear', 'sweek', 'sday', 'leadtimes') +} + +.KnownForecastTimeNames <- function() { + known_time_names <- c('time', 'ftime', 'ltime', 'leadtimes') +} + +.KnownStartDateNames <- function() { + known_time_names <- c('sdate', 'sdates', 'syear', 'sweek', 'sday') +} + +.KnownMemberNames <- function() { + known_time_names <- c('memb', 'member', 'members', 'ensemble', 'ensembles') +} + +.isNullOb <- function(x) is.null(x) | all(sapply(x, is.null)) + +.rmNullObs <- function(x) { + x <- base::Filter(Negate(.isNullOb), x) + lapply(x, function(x) if (is.list(x)) .rmNullObs(x) else x) +} + +# Definition of a global variable to store the warning message used in Calibration +warning_shown <- FALSE \ No newline at end of file diff --git a/README.md b/README.md index bc54095ee7c0ac1cdc8c56fedac625b24fb33dd5..2dfbb5b7e20d691c468c889909a7c50180ba9297 100644 --- a/README.md +++ b/README.md @@ -1,24 +1,24 @@ -Welcome to the CSTools GitLab website -====================================== +CSTools +======= -The Climate Services Tools, CSTools, is an easy-to-use R package designed and built to assess and improve the quality of climate forecasts for seasonal to multi–annual scales. The package contains process-based state-of-the-art methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. +#### Assessing Skill of Climate Forecasts on Seasonal-to-Decadal Timescales -This package was developed in the context of the ERA4CS project MEDSCOPE and the H2020 S2S4E project. This GitLab project allows you to monitor its progress and to interact with other developers via the Issues section. +The Climate Services Tools, CSTools, is an easy-to-use R package designed and built to assess and improve the quality of climate forecasts for seasonal to multi–annual scales. The package contains process-based state-of-the-art methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. -A scientific publication including use cases was published in the Geoscientific Model Development Journal, and it can be cited as follows: -> Pérez-Zanón, N., Caron, L.-P., Terzago, S., Van Schaeybroeck, B., Lledó, L., Manubens, N., Roulin, E., Alvarez-Castro, M. 
C., Batté, L., Bretonnière, P.-A., Corti, S., Delgado-Torres, C., Domínguez, M., Fabiano, F., Giuntoli, I., von Hardenberg, J., Sánchez-García, E., Torralba, V., and Verfaillie, D.: Climate Services Toolbox (CSTools) v4.0: from climate forecasts to climate forecast information, Geosci. Model Dev., 15, 6115–6142, https://doi.org/10.5194/gmd-15-6115-2022, 2022. +This package was developed in the context of the ERA4CS project MEDSCOPE and the H2020 S2S4E project and includes contributions from the ArticXchange project funded by EU-PolarNet 2. This GitLab project allows you to monitor its progress and to interact with other developers via the Issues section. + +A scientific publication including use cases was published in the Geoscientific Model Development Journal, and it can be cited as follows: +> Pérez-Zanón, N., Caron, L.-P., Terzago, S., Van Schaeybroeck, B., Lledó, L., Manubens, N., Roulin, E., Alvarez-Castro, M. C., Batté, L., Bretonnière, P.-A., Corti, S., Delgado-Torres, C., Domínguez, M., Fabiano, F., Giuntoli, I., von Hardenberg, J., Sánchez-García, E., Torralba, V., and Verfaillie, D.: Climate Services Toolbox (CSTools) v4.0: from climate forecasts to climate forecast information, Geosci. Model Dev., 15, 6115–6142, https://doi.org/10.5194/gmd-15-6115-2022, 2022. On-line resources ----------------- Apart from this GitLab project, which allows you to monitor CSTools progress, to interact with other developers via the Issues section and to contribute, you can find: -- The CRAN repository which includes the user manual and vignettes - -- Video tutorials - -- Other resources are under-development such [training material](https://earth.bsc.es/gitlab/external/cstools/-/tree/MEDCOF2022/inst/doc/MEDCOF2022) and a [full reproducible use case for forecast calibration](https://earth.bsc.es/gitlab/external/cstools/-/tree/develop-CalibrationVignette/FOCUS_7_2) +- The CRAN repository [https://CRAN.R-project.org/package=CSTools](https://CRAN.R-project.org/package=CSTools) which includes the user manual and vignettes. +- Video tutorials [https://www.medscope-project.eu/products/tool-box/cstools-video-tutorials/](https://www.medscope-project.eu/products/tool-box/cstools-video-tutorials/). +- Other resources are under development, such as [training material](https://earth.bsc.es/gitlab/external/cstools/-/tree/MEDCOF2022/inst/doc/MEDCOF2022) and a [full reproducible use case for forecast calibration](https://earth.bsc.es/gitlab/external/cstools/-/tree/develop-CalibrationVignette/FOCUS_7_2). Installation ------------ @@ -26,29 +26,73 @@ CSTools has a system dependency, the CDO libraries, for interpolation of grid data and retrieval of metadata. Make sure you have these libraries installed in the system or download and install from -. +[https://code.zmaw.de/projects/cdo](https://code.zmaw.de/projects/cdo). You can then install the public released version of CSTools from CRAN: + ```r install.packages("CSTools") ``` + Or the development version from the GitLab repository: + ```r # install.packages("devtools") devtools::install_git("https://earth.bsc.es/gitlab/external/cstools.git") ``` -How to contribute ------------------ +Overview +-------- + +The CSTools package functions can be grouped into the following categories: + +- **Data retrieval and formatting:** CST_Load, CST_Anomaly, CST_MergeDims, CST_SplitDim, CST_Subset, as.s2dv_cube, s2dv_cube, CST_SaveExp. +- **Classification:** CST_MultiEOF, CST_WeatherRegimes, CST_RegimesAssign, CST_CategoricalEnsCombination, CST_EnsClustering.
+- **Downscaling:** CST_Analogs, CST_RainFARM, CST_RFTemp, CST_AdamontAnalog, CST_AnalogsPredictors. +- **Correction:** CST_BEI_Weighting, CST_BiasCorrection, CST_Calibration, CST_QuantileMapping, CST_DynBiasCorrection. +- **Assessment:** CST_MultiMetric, CST_MultivarRMSE. +- **Visualization:** PlotCombinedMap, PlotForecastPDF, PlotMostLikelyQuantileMap, PlotPDFsOLE, PlotTriangles4Categories, PlotWeeklyClim. -Before adding a development, we suggest to contact the package mantainer. Details on the procedure and development guidelines can be found in [this issue](https://earth.bsc.es/gitlab/external/cstools/-/issues/3) +This package is designed to be compatible with other R packages such as [s2dv](https://CRAN.R-project.org/package=s2dv), [startR](https://CRAN.R-project.org/package=startR), [CSIndicators](https://CRAN.R-project.org/package=CSIndicators) and [CSDownscale](https://earth.bsc.es/gitlab/es/csdownscale). Functions with the prefix **CST_** take a common object called `s2dv_cube` as input. Also, this object can be created from Load (s2dv) and from Start (startR) directly. Multiple functions from different packages can operate on this common data structure to easily define a complete post-processing workflow. + +The class `s2dv_cube` is mainly a list of named elements to keep data and metadata in a single object. Basic structure of the object: + +```r +$ data: [data array] +$ dims: [dimensions vector] +$ coords: [List of coordinates vectors] + $ sdate + $ time + $ lon + [...] +$ attrs: [List of the attributes] + $ Variable: + $ varName + $ metadata + $ Datasets + $ Dates + $ source_files + $ when + $ load_parameters +``` + +More information about the `s2dv_cube` object class can be found here: [description of the s2dv_cube object structure document](https://docs.google.com/document/d/1ko37JFl_h6mOjDKM5QSQGikfLBKZq1naL11RkJIwtMM/edit?usp=sharing). + +The current `s2dv_cube` object (CSTools 5.0.0) differs from the original object used in the previous versions of the package. If you have **questions** about this change, you can check the points below: + +- [New s2dv_cube object discussion](https://earth.bsc.es/gitlab/external/cstools/-/issues/94) +- [How to deal with the compatibility break](https://earth.bsc.es/gitlab/external/cstools/-/issues/112) +- [Testing issue and specifications](https://earth.bsc.es/gitlab/external/cstools/-/issues/110)
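A minimal sketch of the new structure in practice, built with the `s2dv_cube()` constructor (toy dimensions; warnings about the missing dates and variable metadata are expected):

```r
library(CSTools)
# Toy data array with named dimensions
dat <- array(rnorm(2 * 10 * 5), dim = c(lat = 2, time = 10, lon = 5))
cube <- s2dv_cube(data = dat,
                  coords = list(lat = c(45, 50), lon = seq(-10, 10, 5)))
cube$dims        # named vector of the data dimensions
cube$coords$lon  # coordinates live under $coords
cube$attrs       # Dates, Variable, Datasets, etc. live under $attrs
```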
+ +Contribute +---------- + +Before adding a development, we suggest contacting the package maintainer. Details on the procedure and development guidelines can be found in [this issue](https://earth.bsc.es/gitlab/external/cstools/-/issues/3). If you plan on contributing, you should clone the project on your workstation and modify it using the basic Git commands (clone, branch, add, commit, push, merge, ...). The code of each function should live in a separate file with the .R extension under the R folder, and the documentation of each function should live in a separate file with the .Rd extension under the man folder. -For an introductory video on Git, you can have a look at https://vimeo.com/41027679. - -You can also find all the necessary documentation on git here: https://git-scm.com/book/en/v2 -A lot of it may be a bit complicated for beginners (and not necessary for us), but the "Getting started" and "Git basics" sections are a good resources. +For an introductory video on Git, you can have a look at [https://vimeo.com/41027679](https://vimeo.com/41027679). +You can also find all the necessary documentation on git here: [https://git-scm.com/book/en/v2](https://git-scm.com/book/en/v2). A lot of it may be a bit complicated for beginners (and not necessary for us), but the "Getting started" and "Git basics" sections are good resources. diff --git a/data/lonlat_prec.rda b/data/lonlat_prec.rda index 4c566a4af69dd006258e6e9901aa24f6b8d162dc..db5f1f06b3f4be4c9336de8d7281d3d43f8cc59e 100644 Binary files a/data/lonlat_prec.rda and b/data/lonlat_prec.rda differ diff --git a/data/lonlat_temp.rda b/data/lonlat_temp.rda index 92005f1df553dbaa6a1e27fb2eb6b0cab4fac925..94db7af93b455d8899346cfd62977e051489bb4e 100644 Binary files a/data/lonlat_temp.rda and b/data/lonlat_temp.rda differ diff --git a/inst/doc/UseCase1_WindEvent_March2018.R b/inst/doc/UseCase1_WindEvent_March2018.R index d3bbb9363ba0c9058efa573bae6f0a9512163a80..e0416a90df616948cacddf4b5bd160f0fe899eb6 100644 --- a/inst/doc/UseCase1_WindEvent_March2018.R +++ b/inst/doc/UseCase1_WindEvent_March2018.R @@ -10,6 +10,7 @@ rm(list=ls()); gc(); # in December 2017, January 2018 and February 2018 # This code includes the bias adjustent and the results visualization # ---------------------------------------- +# Update date: "March 2023" #library(CSTools) library(s2dv) @@ -55,7 +56,7 @@ for (mm in 1:3) { latmin = 36, latmax = 44, lonmin = -10, lonmax = 4, # latmin = 42, latmax = 44, lonmin = -10, lonmax = 1, output = 'lonlat', nprocs = 1) - str(wind_hcst$Dates) + str(wind_hcst$attrs$Dates) dim(wind_hcst$data) fcst_sdates <- paste0(year, months_in_advance[mm], '01') wind_fcst <- CSTools::CST_Load(var = 'sfcWind', exp = list(exp_path), @@ -66,7 +67,7 @@ # latmin = 42, latmax = 44, lonmin = -10, lonmax = 1, output = 'lonlat', nprocs = 1) - str(wind_fcst$Dates) + str(wind_fcst$attrs$Dates) dim(wind_fcst$data) wind_ref <- CSTools::CST_Load(var = 'windagl100', obs = list(obs_path), @@ -77,15 +78,15 @@ # latmin = 42, latmax = 44, lonmin = -10, lonmax = 1, output = 'lonlat', nprocs = 1, grid = 'r360x181') - str(wind_ref$Dates) + str(wind_ref$attrs$Dates) dim(wind_ref$data) - print(wind_ref$Dates$start) + print(wind_ref$attrs$Dates) wind_ref_terciles <- rbind(wind_ref_terciles, quantile(MeanDims(wind_ref$data, c('lat', 'lon')), c(0.3, 0.6))) wind_ref_extremes <- rbind(wind_ref_extremes, quantile(MeanDims(wind_ref$data, c('lat', 'lon')), c(0.1, 0.9))) - source("/esarchive/scratch/nperez/git/cstools/R/CST_BiasCorrection.R") + # source("/esarchive/scratch/nperez/git/cstools/R/CST_BiasCorrection.R") wind_fsct <- CST_BiasCorrection(exp = wind_hcst, obs = wind_ref, exp_cor = wind_fcst) @@ -140,8 +141,8 @@ for (mm in 1:3) { source("/esarchive/scratch/nperez/git/cstools/R/PlotMostLikelyQuantileMap.R") agg_png(paste0(output_dir, "Wind_MostLikely_", mm, "_obstercile.png"), width = 1050, height = 1000, units = 'px', res = 144) - PlotMostLikelyQuantileMap(probs = Mean_PB, lon = wind_fsct$lon, - lat = wind_fsct$lat, sizetit = 1.5, + PlotMostLikelyQuantileMap(probs = Mean_PB, lon = wind_fsct$coords$lon, + lat = wind_fsct$coords$lat, sizetit = 1.5, intylat = 2, intxlon = 2, coast_width = 1.5, legend_scale = 0.8, cat_dim = 'bin', dot_size = 2.5, @@ -197,7 +198,7 @@ agg_png(paste0(output_dir, "MostLikely_Observed_obstercile.png"), width = 1000, height = 1000, units = 'px', res = 144) s2dv::PlotEquiMap(wind_obs_obstercile, - lon = wind_obs$lon, lat = wind_obs$lat, + lon = wind_obs$coords$lon, lat = wind_obs$coords$lat, brks = c(0,1,2,3), cols = c("#6BAED6FF", "#FFEDA0FF", "#FC4E2AFF"), intylat
= 2, intxlon = 2, diff --git a/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF100.R b/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF100.R index 146382dc8983caa3873a8b13665214b6d0dfb7d0..0a85b7548fe7fd9038cf96d3ca49599bc04f86ea 100644 --- a/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF100.R +++ b/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF100.R @@ -18,6 +18,7 @@ rm(list=ls()); gc(); # In this file, the lines are commented since they have been run and the # result saved on disk, then the result is loaded. # ---------------------------------------- +# Update: "March 2023" # # Load required libraries and setup output directory: library(CSTools) @@ -102,7 +103,7 @@ load(paste0(dir_output, 'weightsRF100.RDS')) # -------------------------------------------- agg_png(paste0(dir_output, "RF100_WeightsDec.png"), width = 1000, height = 1100, units = 'px',res = 144) -PlotEquiMap(weight$data[,,12], lon = weight$lon, lat = weight$lat, +PlotEquiMap(weight$data[,,12], lon = weight$coords$lon, lat = weight$coords$lat, filled.continents = FALSE, title_scale = 1, intylat = 2, intxlon = 2, toptitle = 'December Weights RF 100') @@ -135,20 +136,20 @@ for (realizations in 1:10) { # ---------------------------- agg_png(paste0(dir_output, "RF100_Down_11dec.png"), width = 1000, height = 1100, units = 'px',res = 144) - PlotEquiMap(fs$data[1,11,,],lon = fs$lon, lat = fs$lat, + PlotEquiMap(fs$data[1,11,,],lon = fs$coords$lon, lat = fs$coords$lat, filled.continents = FALSE, bar_limits = c(0,40), intylat = 2, intxlon = 2, title_scale = 1, triangle_ends = c(TRUE, FALSE), toptitle = 'Downsacaled RF 100', units = 'precipitation (mm)') dev.off() } - result$lon <- fs$lon - result$lat <- fs$lat + result$coords$lon <- fs$coords$lon + result$coords$lat <- fs$coords$lat result <- CST_MergeDims(result, merge_dims = c("ftime", "monthly"), na.rm = TRUE) - result$Dataset <- paste0('RF100_ECMWFC3S_QM_member_', member, '_real_', + result$attrs$Dataset <- paste0('RF100_ECMWFC3S_QM_member_', member, '_real_', realizations) - result$Dates[[1]] <- exp$Dates[[1]] + result$attrs$Dates <- exp$attrs$Dates CST_SaveExp(result, destination = dir_output, extra_string = paste0('member', k)) gc() diff --git a/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF4.R b/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF4.R index ee2df1943eef5547d21401f55c8fe10d2853132d..22584d3e6f45f741b726a6f547a78fc52a78f381 100644 --- a/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF4.R +++ b/inst/doc/UseCase2_PrecipitationDownscaling_RainFARM_RF4.R @@ -17,12 +17,14 @@ rm(list=ls()); gc(); # In this file, the lines are commented since they have been run and the # result saved on disk, then the result is loaded. 
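The accessor changes running through these use-case scripts all follow from the nested 's2dv_cube' structure of CSTools 5.0.0: coordinates now live under 'coords' and metadata such as 'Dates' and 'Dataset' under 'attrs'. A minimal sketch of the layout the updated scripts assume (toy values only, not real CST_Load output):

# Minimal sketch of the nested 's2dv_cube' layout (toy values):
cube <- list(
  data = array(rnorm(2 * 3 * 4), dim = c(member = 2, lat = 3, lon = 4)),
  coords = list(lat = c(40, 41, 42), lon = c(0, 1, 2, 3)),
  attrs = list(Dates = as.POSIXct("2018-03-01", tz = "UTC"),
               Dataset = "toy_dataset")
)
class(cube) <- 's2dv_cube'
cube$coords$lon   # was cube$lon before CSTools 5.0.0
cube$attrs$Dates  # was cube$Dates$start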
# ---------------------------------------- +# Update: "March 2023" # # Load required libraries and setup output directory: library(CSTools) library(ClimProjDiags) library(zeallot) library(ragg) + dir_output <- '/esarchive/scratch/nperez/CSTools_manuscript/v20210603/' #slash end # -------------------------------------------- # STEP 1: @@ -110,7 +112,7 @@ fs <- CST_RainFARM(exp.qm, nf = 4, newfs <- CST_MergeDims(fs, merge_dims = c("ftime", "monthly"), na.rm = TRUE) -newfs$Dates[[1]] <- exp$Dates[[1]] +newfs$attrs$Dates <- exp$attrs$Dates CST_SaveExp(newfs, destination = paste0(dir_output, 'RF4/')) Rprof(NULL) @@ -121,7 +123,7 @@ profile.info <- summaryRprof(paste0(dir_output, "Rprof.out")) library(s2dv) agg_png(paste0(dir_output, "EXP_11dec.png"), width = 800, height = 900, units = 'px',res = 144) -PlotEquiMap(exp$data[1,1,1,11,,,2],lon = exp$lon, lat = exp$lat, +PlotEquiMap(exp$data[1,1,1,11,,,2],lon = exp$coords$lon, lat = exp$coords$lat, filled.continents = FALSE, bar_limits = c(0,40), intylat = 2, intxlon = 2, title_scale = 0.8, bar_label_scale = 1.3, axes_label_scale = 1.2, @@ -130,7 +132,7 @@ PlotEquiMap(exp$data[1,1,1,11,,,2],lon = exp$lon, lat = exp$lat, dev.off() agg_png(paste0(dir_output, "EXPQM_11dec.png"), width = 800, height = 900, units = 'px',res = 144) -PlotEquiMap(exp.qm$data[1,1,1,11,,,2],lon = exp$lon, lat = exp$lat, +PlotEquiMap(exp.qm$data[1,1,1,11,,,2],lon = exp$coords$lon, lat = exp$coords$lat, filled.continents = FALSE, bar_limits = c(0,40), intylat = 2, intxlon = 2, title_scale = 0.8, bar_label_scale = 1.3, axes_label_scale = 1.2, @@ -139,7 +141,7 @@ PlotEquiMap(exp.qm$data[1,1,1,11,,,2],lon = exp$lon, lat = exp$lat, dev.off() agg_png(paste0(dir_output, "RF4_Down_11dec.png"), width = 800, height = 900, units = 'px',res = 144) -PlotEquiMap(fs$data[1,1,1,11,,,2],lon = fs$lon, lat = fs$lat, +PlotEquiMap(fs$data[1,1,1,11,,,2],lon = fs$coords$lon, lat = fs$coords$lat, filled.continents = FALSE, bar_limits = c(0,40), intylat = 2, intxlon = 2, title_scale = 0.8, bar_label_scale = 1.3, axes_label_scale = 1.2, @@ -148,7 +150,7 @@ PlotEquiMap(fs$data[1,1,1,11,,,2],lon = fs$lon, lat = fs$lat, dev.off() agg_png(paste0(dir_output, "RF4_WeightsDec.png"), width = 800, height = 900, units = 'px',res = 144) -PlotEquiMap(weight$data[,,12], lon = weight$lon, lat = weight$lat, +PlotEquiMap(weight$data[,,12], lon = weight$coords$lon, lat = weight$coords$lat, filled.continents = FALSE, title_scale = 0.8, bar_label_scale = 1.3, axes_label_scale = 1.2, intylat = 2, intxlon = 2, degree_sym = TRUE, diff --git a/inst/doc/UseCase3_data_preparation_SCHEME_model.R b/inst/doc/UseCase3_data_preparation_SCHEME_model.R index ada24ef2c762ca78eef563a2fca2c97a75a44903..fdcd6acd848cc9b3184af362264a67e7a4a95748 100644 --- a/inst/doc/UseCase3_data_preparation_SCHEME_model.R +++ b/inst/doc/UseCase3_data_preparation_SCHEME_model.R @@ -1,6 +1,8 @@ # Author: Bert Van Schaeybroeck # Use Case 3: Seasonal forecasts for a river flow # ----------------------------------------------- +# Update: "March 2023" + rm(list = ls()) library(CSTools) library(s2dverification) @@ -337,12 +339,12 @@ obs.low.res.merge <- CST_MergeDims( #amount of ensemble members from experiment. 
For ECMWF Sys5 it is 25: amt.mbr <- as.numeric(dim(cal.merge$data)["member"]) -lon.low.res <- as.vector(cal.merge$lon) -lat.low.res <- as.vector(cal.merge$lat) -lon.high.res <- as.vector(obs.high.res$lon) -lat.high.res <- as.vector(obs.high.res$lat) -lon.eur <- as.vector(obs.msl.eur.merge.an$lon) -lat.eur <- as.vector(obs.msl.eur.merge.an$lat) +lon.low.res <- as.vector(cal.merge$coords$lon) +lat.low.res <- as.vector(cal.merge$coords$lat) +lon.high.res <- as.vector(obs.high.res$coords$lon) +lat.high.res <- as.vector(obs.high.res$coords$lat) +lon.eur <- as.vector(obs.msl.eur.merge.an$coords$lon) +lat.eur <- as.vector(obs.msl.eur.merge.an$coords$lat) #amount of lead times in months. For ECMWF Sys5 it is 7: amt.lead.mon <- as.numeric(dim(cal.merge$data)["monthly"]) @@ -448,8 +450,8 @@ for(i.mbr in seq(1, amt.mbr)){ pdf(file = file.fig) PlotEquiMap( exp.low.res.tmp[ , ], - lon = obs.low.res.merge$lon, - lat = obs.low.res.merge$lat, + lon = obs.low.res.merge$coords$lon, + lat = obs.low.res.merge$coords$lat, filled.continents = F, intylat = 2, intxlon = 2, @@ -463,8 +465,8 @@ for(i.mbr in seq(1, amt.mbr)){ pdf(file = file.fig) PlotEquiMap( cal.low.res.tmp, - lon = obs.low.res.merge$lon, - lat = obs.low.res.merge$lat, + lon = obs.low.res.merge$coords$lon, + lat = obs.low.res.merge$coords$lat, filled.continents = F, intylat = 2, intxlon = 2, @@ -477,8 +479,8 @@ for(i.mbr in seq(1, amt.mbr)){ pdf(file = file.fig) PlotEquiMap( obs.low.res.tmp[corr.dex, , ], - lon = obs.low.res.merge$lon, - lat = obs.low.res.merge$lat, + lon = obs.low.res.merge$coords$lon, + lat = obs.low.res.merge$coords$lat, filled.continents = F, intylat = 2, intxlon = 2, @@ -492,8 +494,8 @@ for(i.mbr in seq(1, amt.mbr)){ pdf(file = file.fig) PlotEquiMap( obs.high.res.tmp[corr.dex, , ], - lon = obs.high.res.merge$lon, - lat = obs.high.res.merge$lat, + lon = obs.high.res.merge$coords$lon, + lat = obs.high.res.merge$coords$lat, filled.continents = F, intylat = 2, intxlon = 2, diff --git a/man/AdamontQQCorr.Rd b/man/AdamontQQCorr.Rd index b6442f45456d3ea9dcb1dad3aff2cc088e1794f7..4495527a51317fff6f7749790b2c4cec2e6ad68c 100644 --- a/man/AdamontQQCorr.Rd +++ b/man/AdamontQQCorr.Rd @@ -18,38 +18,38 @@ AdamontQQCorr( ) } \arguments{ -\item{exp}{array with named dimensions (such as \code{$data} array of -experiment data from an object of class \code{s2dv_cube})} +\item{exp}{Array with named dimensions (such as \code{$data} array of +experiment data from an object of class \code{s2dv_cube}).} -\item{wt_exp}{corresponding weather types (same dimensions as \code{exp} but -lat/lon)} +\item{wt_exp}{Corresponding weather types (same dimensions as \code{exp} but +lat/lon).} -\item{obs}{array with named dimensions with reference data (can also be +\item{obs}{Array with named dimensions with reference data (can also be \code{$data} array of class \code{s2dv_cube}). 
lat/lon dimensions can differ from \code{exp} if non rectilinear latlon grids are used, in which case -regrid should be set to TRUE and .NearestNeighbors \code{NN} output should be -provided} +regrid should be set to TRUE and .NearestNeighbors \code{NN} output should +be provided.} -\item{wt_obs}{corresponding weather types (same dimensions as \code{obs} but -lat/lon)} +\item{wt_obs}{Corresponding weather types (same dimensions as \code{obs} but +lat/lon).} -\item{corrdims}{list of dimensions in \code{exp} for which quantile mapping -correction is applied} +\item{corrdims}{List of dimensions in \code{exp} for which quantile mapping +correction is applied.} -\item{londim}{character name of longitude dimension in \code{exp} and -\code{obs}} +\item{londim}{Character name of longitude dimension in \code{exp} and +\code{obs}.} -\item{latdim}{character name of latitude dimension in \code{exp} and -\code{obs}} +\item{latdim}{Character name of latitude dimension in \code{exp} and +\code{obs}.} -\item{regrid}{(optional) boolean indicating whether .NearestNeighbors -regridding is needed} +\item{regrid}{(optional) Boolean indicating whether .NearestNeighbors +regridding is needed.} -\item{NN}{(optional, if regrid=TRUE) list (output from .NearestNeighbors) -maps (nlat, nlon) onto (nlat_o, nlon_o)} +\item{NN}{(optional, if regrid = TRUE) List (output from .NearestNeighbors) +maps (nlat, nlon) onto (nlat_o, nlon_o).} } \value{ -an array (such as \code{$data} array from an object of class +An array (such as \code{$data} array from an object of class \code{s2dv_cube}) with named dimensions, containing experiment data on the lat/lon grid of \code{obs} array, corrected by quantile mapping depending on the weather types \code{wt_exp} @@ -60,15 +60,19 @@ for experiment data (typically a hindcast) onto reference \code{obs}, typically provided by reanalysis data. } \examples{ -\dontrun{ -wt_exp <- sample(1:3, 15*6*3, replace=T) -dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -wt_obs <- sample(1:3, 6*3, replace=T) -dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -exp_corr <- AdamontQQCorr(exp=lonlat_temp$exp$data, wt_exp=wt_exp, - obs=lonlat_temp$obs$data, wt_obs=wt_obs, - corrdims = c('dataset','member','sdate','ftime')) -} +wt_exp <- c(1,1,2,3,3,2,2,1,1,2,2,3) +dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +wt_obs <- c(3,3,1,2,2,2,2,1,3,1,1,2) +dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +exp <- 1 : c(1 * 1 * 4 * 3 * 4 * 4) +dim(exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, + lat = 4, lon = 4) +obs <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4) +dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, + lat = 4, lon = 4) +exp_corr <- AdamontQQCorr(exp = exp, wt_exp = wt_exp, + obs = obs, wt_obs = wt_obs, + corrdims = c('dataset', 'member', 'sdate', 'ftime')) } \author{ Paola Marson, \email{paola.marson@meteo.fr} for PROSNOW version diff --git a/man/Analogs.Rd b/man/Analogs.Rd index a7addc73e5bf4936c06e29be410841aae5519619..6cf62ad42b64d1e6f29ea5320f27bda32e0dd38f 100644 --- a/man/Analogs.Rd +++ b/man/Analogs.Rd @@ -29,7 +29,10 @@ on the large scale for which the analog is aimed. This field is used to in all the criterias. If parameter 'expVar' is not provided, the function will return the expL analog. The element 'data' in the 's2dv_cube' object must have, at least, latitudinal and longitudinal dimensions. 
The object is -expect to be already subset for the desired large scale region.} +expected to be already subset for the desired large scale region. Latitudinal +dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. +Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', +'nav_lon'.} \item{obsL}{An array of N named dimensions containing the observational field on the large scale. The element 'data' in the 's2dv_cube' object must have @@ -37,11 +40,11 @@ the same latitudinal and longitudinal dimensions as parameter 'expL' and a single temporal dimension with the maximum number of available observations.} \item{time_obsL}{A character string indicating the date of the observations -in the format "dd/mm/yyyy". Reference time to search for analogs.} +in the format "dd-mm-yyyy". Reference time to search for analogs.} \item{time_expL}{An array of N named dimensions (coinciding with time dimensions in expL) of character string(s) indicating the date(s) of the -experiment in the format "dd/mm/yyyy". Time(s) to find the analogs.} +experiment in the format "dd-mm-yyyy". Time(s) to find the analogs.} \item{lonL}{A vector containing the longitude of parameter 'expL'.} @@ -55,7 +58,7 @@ function will be the analog of parameter 'expVar'.} \item{obsVar}{An array of N named dimensions containing the field of the same variable as the passed in parameter 'expVar' for the same region.} -\item{criteria}{a character string indicating the criteria to be used for the +\item{criteria}{A character string indicating the criteria to be used for the selection of analogs: \itemize{\item{Large_dist} minimum Euclidean distance in the large scale pattern; \item{Local_dist} minimum Euclidean distance in the large scale pattern @@ -71,14 +74,14 @@ analogs. It can be NULL but if expL is not a forecast (time_expL contained in time_obsL), by default time_expL will be removed during the search of analogs.} -\item{lonVar}{a vector containing the longitude of parameter 'expVar'.} +\item{lonVar}{A vector containing the longitude of parameter 'expVar'.} -\item{latVar}{a vector containing the latitude of parameter 'expVar'.} +\item{latVar}{A vector containing the latitude of parameter 'expVar'.} -\item{region}{a vector of length four indicating the minimum longitude, +\item{region}{A vector of length four indicating the minimum longitude, the maximum longitude, the minimum latitude and the maximum latitude.} -\item{nAnalogs}{number of Analogs to be selected to apply the criterias +\item{nAnalogs}{Number of Analogs to be selected to apply the criteria 'Local_dist' or 'Local_cor'. This is not necessarily the number of analogs that the user will get, but the number of events with minimum distance in which to perform the search of the best Analog. The default value for the @@ -100,7 +103,7 @@ correlation, while for Large_dist criteria the best analog will be the day with minimum Euclidean distance). Set to FALSE to get a single analog, the best analog, for instance for downscaling.} -\item{ncores}{the number of cores to use in parallel computation.} +\item{ncores}{The number of cores to use in parallel computation.} } \value{ An array with the downscaled values of the best analogs for the criteria @@ -139,14 +142,14 @@ the large and local scale based on the observations. The function is an adapted version of the method of Yiou et al. 2013.
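For intuition, the 'Large_dist' criterion described above reduces to ranking the observed days by the Euclidean distance between their large-scale field and the experiment field. A minimal base-R sketch of that ranking (a simplification that ignores the 'Local_dist' and 'Local_cor' refinements handled by Analogs()):

expSLP <- rnorm(20)
dim(expSLP) <- c(lat = 4, lon = 5)
obsSLP <- array(rnorm(200), dim = c(time = 10, lat = 4, lon = 5))
obsSLP[10, , ] <- expSLP * 1.2  # plant a near-copy of the experiment field as day 10
# Euclidean distance of each observed day to the experiment field
dists <- apply(obsSLP, 1, function(day) sqrt(sum((day - c(expSLP))^2)))
which.min(dists)  # 10: the planted near-copy is selected as best analog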
} \examples{ -# Example 1:Downscaling using criteria 'Large_dist' and a single variable: +# Example 1: Downscaling using criteria 'Large_dist' and a single variable: expSLP <- rnorm(1:20) dim(expSLP) <- c(lat = 4, lon = 5) obsSLP <- c(rnorm(1:180), expSLP * 1.2) dim(obsSLP) <- c(time = 10, lat = 4, lon = 5) time_obsSLP <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-") downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, - time_obsL = time_obsSLP,time_expL = "01-01-1994") + time_obsL = time_obsSLP,time_expL = "01-01-1994") # Example 2: Downscaling using criteria 'Large_dist' and 2 variables: obs.pr <- c(rnorm(1:200) * 0.001) @@ -154,23 +157,7 @@ dim(obs.pr) <- dim(obsSLP) downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, obsVar = obs.pr, time_obsL = time_obsSLP, time_expL = "01-01-1994") -# Example 3:List of best Analogs using criteria 'Large_dist' and a single -obsSLP <- c(rnorm(1:1980), expSLP * 1.5) -dim(obsSLP) <- c(lat = 4, lon = 5, time = 100) -time_obsSLP <- paste(rep("01", 100), rep("01", 100), 1920 : 2019, sep = "-") -downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, time_obsSLP, - nAnalogs = 5, time_expL = "01-01-2003", - AnalogsInfo = TRUE, excludeTime = "01-01-2003") - -# Example 4:List of best Analogs using criteria 'Large_dist' and 2 variables: -obsSLP <- c(rnorm(1:180), expSLP * 2) -dim(obsSLP) <- c(lat = 4, lon = 5, time = 10) -time_obsSLP <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-") -downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, obsVar = obs.pr, - time_obsL = time_obsSLP,nAnalogs=5, - time_expL = "01-10-2003", AnalogsInfo = TRUE) - -# Example 5: Downscaling using criteria 'Local_dist' and 2 variables: +# Example 3: Downscaling using criteria 'Local_dist' and 2 variables: # analogs of local scale using criteria 2 region = c(lonmin = -1 ,lonmax = 2, latmin = 30, latmax = 33) Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, @@ -179,65 +166,31 @@ Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, region = region,time_expL = "01-10-2000", nAnalogs = 10, AnalogsInfo = TRUE) -# Example 6: list of best analogs using criteria 'Local_dist' and 2 -Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, - criteria = "Local_dist", lonL = seq(-1, 5, 1.5), - latL = seq(30, 35, 1.5), region = region, - time_expL = "01-10-2000", nAnalogs = 5, - AnalogsInfo = TRUE) - -# Example 7: Downscaling using Local_dist criteria -Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, - criteria = "Local_dist", lonL = seq(-1, 5, 1.5), - latL = seq(30, 35, 1.5), region = region, - time_expL = "01-10-2000", - nAnalogs = 10, AnalogsInfo = FALSE) - -# Example 8: Downscaling using criteria 'Local_cor' and 2 variables: +# Example 4: Downscaling using criteria 'Local_cor' and 2 variables: exp.pr <- c(rnorm(1:20) * 0.001) dim(exp.pr) <- dim(expSLP) -Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP,time_obsL = time_obsSLP, +Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, obsVar = obs.pr, expVar = exp.pr, criteria = "Local_cor", lonL = seq(-1, 5, 1.5), time_expL = "01-10-2000", latL = seq(30, 35, 1.5), lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5), nAnalogs = 8, region = region, AnalogsInfo = FALSE) -# same but without imposing nAnalogs,so nAnalogs will be set by default as 10 -Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP,time_obsL = time_obsSLP, - obsVar = obs.pr, expVar = exp.pr, - lonVar = seq(-1, 
5, 1.5), latVar = seq(30, 35, 1.5), - criteria = "Local_cor", lonL = seq(-1,5,1.5), - time_expL = "01-10-2000", latL =seq(30, 35, 1.5), - region = region, AnalogsInfo = TRUE) -#'Example 9: List of best analogs in the three criterias Large_dist, +# Example 5: List of best analogs in the three criteria Large_dist, Large_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, criteria = "Large_dist", time_expL = "01-10-2000", nAnalogs = 7, AnalogsInfo = TRUE) Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, time_expL = "01-10-2000", criteria = "Local_dist", lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5), - nAnalogs = 7,region = region, AnalogsInfo = TRUE) -Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP,time_obsL = time_obsSLP, + nAnalogs = 7, region = region, AnalogsInfo = TRUE) +Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP, obsVar = obsSLP, expVar = expSLP, - time_expL = "01-10-2000",criteria = "Local_cor", + time_expL = "01-10-2000", criteria = "Local_cor", lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5), lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5), - nAnalogs = 7,region = region, + nAnalogs = 7, region = region, AnalogsInfo = TRUE) -#Example 10: Downscaling using criteria 'Large_dist' and a single variable, -# more than 1 sdate: -expSLP <- rnorm(1:40) -dim(expSLP) <- c(sdate = 2, lat = 4, lon = 5) -obsSLP <- c(rnorm(1:180), expSLP * 1.2) -dim(obsSLP) <- c(time = 11, lat = 4, lon = 5) -time_obsSLP <- paste(rep("01", 11), rep("01", 11), 1993 : 2003, sep = "-") -time_expSLP <- paste(rep("01", 2), rep("01", 2), 1994 : 1995, sep = "-") -excludeTime <- c("01-01-2003", "01-01-2003") -dim(excludeTime) <- c(sdate = 2) -downscale_field_exclude <- Analogs(expL = expSLP, obsL = obsSLP, - time_obsL = time_obsSLP, time_expL = time_expSLP, - excludeTime = excludeTime, AnalogsInfo = TRUE) } \references{ Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, diff --git a/man/BEI_EMWeighting.Rd b/man/BEI_EMWeighting.Rd new file mode 100644 index 0000000000000000000000000000000000000000..72282ec1464b2c42c8173d6d10b110b3e43b7186 --- /dev/null +++ b/man/BEI_EMWeighting.Rd @@ -0,0 +1,58 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CST_BEI_Weighting.R +\name{BEI_EMWeighting} +\alias{BEI_EMWeighting} +\title{Computing the weighted ensemble means for SFSs.} +\usage{ +BEI_EMWeighting(var_exp, aweights, time_dim_name = "time", memb_dim = "member") +} +\arguments{ +\item{var_exp}{Variable (e.g. precipitation, temperature, NAO index) +array from a SFS with at least dimensions (time, member) for a spatially +aggregated variable or dimensions (time, member, lat, lon) for a spatial +variable, with 'time' as the temporal dimension by default.} + +\item{aweights}{Normalized weights array with at least dimensions +(time, member), when 'time' is the temporal dimension as default.} + +\item{time_dim_name}{A character string indicating the name of the +temporal dimension, by default 'time'.} + +\item{memb_dim}{A character string indicating the name of the +member dimension, by default 'member'.} +} +\value{ +BEI_EMWeighting() returns an array with at least one or three +dimensions, depending on whether the variable is a spatially aggregated variable +(as e.g. NAO index) (time) or a spatial variable (as e.g. precipitation +or temperature) (time, lat, lon), containing the ensemble means computed +with weighted members.
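In essence, the weighted ensemble mean documented here is the member dimension collapsed with the normalized weights. A minimal sketch for the aggregated (time, member) case, assuming the weights already sum to one over 'member' at each time step:

var_exp <- 1:(2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
# Weighted ensemble mean per time step: sum over members of weight * value
em <- rowSums(var_exp * aweights)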
+} +\description{ +This function implements the computation to obtain the weighted +ensemble means for SFSs using a normalized weights array. +} +\examples{ +# Example 1 +var_exp <- 1 : (2 * 3 * 4) +dim(var_exp) <- c(time = 2, dataset = 3, member = 4) +aweights <- runif(24, min = 0.001, max = 0.999) +dim(aweights) <- c(time = 2, dataset = 3, member = 4) +res <- BEI_EMWeighting(var_exp, aweights) + +# Example 2 +var_exp <- 1 : (2 * 4 * 2 * 3) +dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3) +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +dim(aweights) <- c(time = 2, member = 4) +res <- BEI_EMWeighting(var_exp, aweights) + +} +\references{ +Regionally improved seasonal forecast of precipitation through Best +estimation of winter NAO, Sanchez-Garcia, E. et al., +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} +} +\author{ +Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +} diff --git a/man/BEI_PDFBest.Rd b/man/BEI_PDFBest.Rd index 0ba24a8487daf4c3f47462b96ac7fbbe829d0700..0f8da8e9a9cfa2dd74afae1351f0a2c1d34d091e 100644 --- a/man/BEI_PDFBest.Rd +++ b/man/BEI_PDFBest.Rd @@ -23,43 +23,47 @@ which must be greater than 2.} \item{index_hind1}{Index (e.g. NAO index) array from a SFS (named SFS1) with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. -The dimension 'member' must be greater than 1. -The dimension 'statistic' must be equal to 2, for containing the two paramenters of -a normal distribution (mean and sd) representing the ensemble of a SFS. -It is not possible to have the dimension 'member' and 'statistic' at the same time.} +The dimension 'member' must be greater than 1. The dimension 'statistic' +must be equal to 2, for containing the two parameters of a normal +distribution (mean and sd) representing the ensemble of a SFS. It is not +possible to have the dimension 'member' and 'statistic' +at the same time.} \item{index_hind2}{Index (e.g. NAO index) array from a SFS (named SFS2) with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. The dimension 'member' must be greater than 1. -The dimension 'statistic' must be equal to 2, for containing the two paramenters of -a normal distribution (mean and sd) representing the ensemble of a SFS. -It is not possible to have the dimension 'member' and 'statistic' together.} +The dimension 'statistic' must be equal to 2, for containing the two +parameters of a normal distribution (mean and sd) representing the ensemble +of a SFS. It is not possible to have the dimension 'member' and 'statistic' +together.} -\item{index_fcst1}{(optional, default = NULL) Index (e.g. NAO index) array from forescating of SFS1 -with at least two dimensions (time , member) or (time, statistic). -The temporal dimension, by default 'time', must be equal to 1, the forecast year target. -The dimension 'member' must be greater than 1. -The dimension 'statistic' must be equal to 2, for containing the two paramenters of -a normal distribution (mean and sd) representing the ensemble of a SFS. -It is not possible to have the dimension 'member' and 'statistic' together.} +\item{index_fcst1}{(optional, default = NULL) Index (e.g. NAO index) array +from forecasting of SFS1 with at least two dimensions (time, member) or +(time, statistic). The temporal dimension, by default 'time', must be equal +to 1, the forecast year target. The dimension 'member' must be greater than +1.
The dimension 'statistic' must be equal to 2, for containing the two +parameters of a normal distribution (mean and sd) representing the ensemble +of a SFS. It is not possible to have the dimension 'member' and 'statistic' +together.} -\item{index_fcst2}{(optional, default = NULL) Index (e.g. NAO index) array from forescating of SFS2 -with at least two dimensions (time , member) or (time, statistic). -The temporal dimension, by default 'time', must be equal to 1, the forecast year target. -The dimension 'member' must be greater than 1. -The dimension 'statistic' must be equal to 2, for containing the two paramenters of -a normal distribution (mean and sd) representing the ensemble of a SFS. -It is not possible to have the dimension 'member' and 'statistic' together.} +\item{index_fcst2}{(optional, default = NULL) Index (e.g. NAO index) array +from forecasting of SFS2 with at least two dimensions (time, member) or +(time, statistic). The temporal dimension, by default 'time', must be equal +to 1, the forecast year target. The dimension 'member' must be greater than +1. The dimension 'statistic' must be equal to 2, for containing the two +parameters of a normal distribution (mean and sd) representing the ensemble +of a SFS. It is not possible to have the dimension 'member' and 'statistic' +together.} \item{method_BC}{A character vector of maximum length 2 indicating the bias correction methodology to be applied on each SFS. If it is 'none' or any of -its elements is 'none', the bias correction won't be applied. -Available methods developped are "ME" (a bias correction scheme based on the -mean error or bias between observation and predictions to correct the -predicted values), and "LMEV" (a bias correction scheme based on a linear -model using ensemble variance of index as predictor). (see Sanchez-Garcia, -E. et al (2019), https://doi.org/10.5194/asr-16-165-2019 for more details).} +its elements is 'none', the bias correction won't be applied. Available +methods developed are "ME" (a bias correction scheme based on the mean +error or bias between observation and predictions to correct the predicted +values), and "LMEV" (a bias correction scheme based on a linear model using +ensemble variance of index as predictor). (see Sanchez-Garcia, E. et al +(2019), \doi{10.5194/asr-16-165-2019} for more details).} \item{time_dim_name}{A character string indicating the name of the temporal dimension, by default 'time'.} @@ -69,20 +73,19 @@ dimension, by default 'time'.} \value{ BEI_PDFBest() returns an array with the parameters that characterize the PDFs, with at least a temporal dimension, by default 'time' and dimension -'statistic' equal to 2. -The firt statistic is the parameter 'mean' of the PDF for the best estimation -combining the two SFSs PDFs. -The second statistic is the parameter 'standard deviation' of the PDF for -the best estimation combining the two SFSs PDFs. -If index_fcst1 and/or index_fcst2 are null, returns the values for hindcast period. -Otherwise, it returns the values for a forecast year. +'statistic' equal to 2. The first statistic is the parameter 'mean' of the PDF +for the best estimation combining the two SFSs PDFs. The second statistic is +the parameter 'standard deviation' of the PDF for the best estimation +combining the two SFSs PDFs. If index_fcst1 and/or index_fcst2 are null, +returns the values for the hindcast period. Otherwise, it returns the values for a +forecast year.
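Because the output packs the Gaussian parameters into the 'statistic' dimension, downstream probabilities can be read off directly with pnorm(). A small self-contained sketch with toy parameter values (not real BEI_PDFBest output):

# Toy best-estimate parameters, in the documented 'statistic' order:
# statistic 1 = mean, statistic 2 = standard deviation.
res <- c(0.2, 1.1)
dim(res) <- c(time = 1, statistic = 2)
# Probability of a negative index value under the best-estimate PDF:
pnorm(0, mean = res[1, 1], sd = res[1, 2])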
} \description{ This function implements the computation to obtain the index Probability Density Functions (PDFs) (e.g. NAO index) obtained by combining the Index PDFs for two Seasonal Forecast Systems (SFSs), the Best Index estimation (see Sanchez-Garcia, E. et al (2019), -https://doi.org/10.5194/asr-16-165-2019 for more details about the +\doi{10.5194/asr-16-165-2019} for more details about the methodology applied to estimate the Best Index). } \examples{ @@ -99,10 +102,7 @@ index_fcst2 <- rnorm(18, mean = -0.5, sd = 4) dim(index_fcst2) <- c(time = 1, member = 9, season = 2) method_BC <- 'ME' res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1, -index_fcst2, method_BC) -dim(res) -# time statistic season -# 1 2 2 +index_fcst2, method_BC) # Example 2 for the BEI_PDFBest function index_obs<- rnorm(10, sd = 3) dim(index_obs) <- c(time = 5, season = 2) @@ -115,15 +115,13 @@ dim(index_fcst1) <- c(time = 1, member = 8, season = 2) index_fcst2 <- rnorm(18, mean = -0.5, sd = 4) dim(index_fcst2) <- c(time = 1, member = 9, season = 2) method_BC <- c('LMEV', 'ME') -res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1, index_fcst2, method_BC) -dim(res) -# time statistic season -# 1 2 2 +res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1, + index_fcst2, method_BC) } \references{ Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., -Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} } \author{ Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} diff --git a/man/BEI_ProbsWeighting.Rd b/man/BEI_ProbsWeighting.Rd new file mode 100644 index 0000000000000000000000000000000000000000..d14321b897cd4c661f87384d90b4865b938ec8cc --- /dev/null +++ b/man/BEI_ProbsWeighting.Rd @@ -0,0 +1,73 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CST_BEI_Weighting.R +\name{BEI_ProbsWeighting} +\alias{BEI_ProbsWeighting} +\title{Computing the weighted tercile probabilities for SFSs.} +\usage{ +BEI_ProbsWeighting( + var_exp, + aweights, + terciles, + time_dim_name = "time", + memb_dim = "member" +) +} +\arguments{ +\item{var_exp}{Variable (e.g. precipitation, temperature, NAO index) +array from a SFS with at least dimensions (time, member) for a spatially +aggregated variable or dimensions (time, member, lat, lon) for a spatial +variable, with 'time' as the temporal dimension by default.} + +\item{aweights}{Normalized weights array with at least dimensions +(time, member), when 'time' is the temporal dimension as default.} + +\item{terciles}{A numeric array with at least one dimension 'tercil' equal to +2, the first element is the lower tercile for a hindcast period, and the second +element is the upper tercile.} + +\item{time_dim_name}{A character string indicating the name of the +temporal dimension, by default 'time'.} + +\item{memb_dim}{A character string indicating the name of the +member dimension, by default 'member'.} +} +\value{ +BEI_ProbsWeighting() returns an array with at least two or four +dimensions, depending on whether the variable is a spatially aggregated variable +(as e.g. NAO index) (time, tercil) or a spatial variable (as e.g. +precipitation or temperature) (time, tercil, lat, lon), containing the +tercile probabilities computed with weighted members. +The first tercile is the lower tercile, the second is the normal tercile and +the third is the upper tercile.
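Conceptually, these weighted probabilities are the summed weights of the members falling below, between and above the two terciles. A minimal sketch for the aggregated case (the boundary handling is illustrative; the function's exact treatment of ties may differ):

var_exp <- 1:(2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
terciles <- c(2.5, 5)
# Per time step, sum the weights of the members in each tercile bin
probs <- t(sapply(1:2, function(t) {
  c(below  = sum(aweights[t, var_exp[t, ] <= terciles[1]]),
    normal = sum(aweights[t, var_exp[t, ] > terciles[1] &
                             var_exp[t, ] <= terciles[2]]),
    above  = sum(aweights[t, var_exp[t, ] > terciles[2]]))
}))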
+} +\description{ +This function implements the computation to obtain the tercile +probabilities for a weighted variable for SFSs using a normalized weights array. +} +\examples{ +# Example 1 +var_exp <- 1 : (2 * 4) +dim(var_exp) <- c(time = 2, member = 4) +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +dim(aweights) <- c(time = 2, member = 4) +terciles <- c(2.5,5) +dim(terciles) <- c(tercil = 2) +res <- BEI_ProbsWeighting(var_exp, aweights, terciles) + +# Example 2 +var_exp <- rnorm(48, 50, 9) +dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3) +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +dim(aweights) <- c(time = 2, member = 4) +terciles <- rep(c(48,50), 2*3) +dim(terciles) <- c(tercil = 2, lat = 2, lon = 3) +res <- BEI_ProbsWeighting(var_exp, aweights, terciles) +} +\references{ +Regionally improved seasonal forecast of precipitation through Best +estimation of winter NAO, Sanchez-Garcia, E. et al., +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} +} +\author{ +Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +} diff --git a/man/BEI_TercilesWeighting.Rd b/man/BEI_TercilesWeighting.Rd new file mode 100644 index 0000000000000000000000000000000000000000..31d5f7319303eeb21d2bf4fb3f38f665d6835a44 --- /dev/null +++ b/man/BEI_TercilesWeighting.Rd @@ -0,0 +1,63 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CST_BEI_Weighting.R +\name{BEI_TercilesWeighting} +\alias{BEI_TercilesWeighting} +\title{Computing the weighted terciles for SFSs.} +\usage{ +BEI_TercilesWeighting( + var_exp, + aweights, + time_dim_name = "time", + memb_dim = "member" +) +} +\arguments{ +\item{var_exp}{Variable (e.g. precipitation, temperature, NAO index) +array from a SFS with at least dimensions (time, member) for a spatially +aggregated variable or dimensions (time, member, lat, lon) for a spatial +variable, with 'time' as the temporal dimension by default.} + +\item{aweights}{Normalized weights array with at least dimensions +(time, member), when 'time' is the temporal dimension as default.} + +\item{time_dim_name}{A character string indicating the name of the +temporal dimension, by default 'time'.} + +\item{memb_dim}{A character string indicating the name of the +member dimension, by default 'member'.} +} +\value{ +BEI_TercilesWeighting() returns an array with at least one +dimension, depending on whether the variable is a spatially aggregated variable +(as e.g. NAO index) (tercil) or a spatial variable (as e.g. +precipitation or temperature) (tercil, lat, lon), containing the +terciles computed with weighted members. +The first tercile is the lower tercile, the second is the upper tercile. +} +\description{ +This function implements the computation to obtain the terciles +for a weighted variable for SFSs using a normalized weights array. +} +\examples{ +# Example 1 +var_exp <- 1 : (2 * 4) +dim(var_exp) <- c(time = 2, member = 4) +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +dim(aweights) <- c(time = 2, member = 4) +res <- BEI_TercilesWeighting(var_exp, aweights) + +# Example 2 +var_exp <- rnorm(48, 50, 9) +dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3) +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3) +dim(aweights) <- c(time = 2, member = 4) +res <- BEI_TercilesWeighting(var_exp, aweights) +} +\references{ +Regionally improved seasonal forecast of precipitation through Best +estimation of winter NAO, Sanchez-Garcia, E. et al., +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} +} +\author{ +Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +} diff --git a/man/BEI_Weights.Rd b/man/BEI_Weights.Rd index 867a4eb0102a41b88f8ac64ac7707d6fb7b3b37a..fdaacca762105bccc8abb16dfc86559e7d54470b 100644 --- a/man/BEI_Weights.Rd +++ b/man/BEI_Weights.Rd @@ -10,15 +10,15 @@ BEI_Weights(index_weight, pdf_weight, time_dim_name = "time") \item{index_weight}{Index (e.g. NAO index) array, from a dataset of SFSs for a period of years, with at least dimensions 'member'. Additional dimensions, for instance, a temporal dimension as 'time', -must have the same lenght in both parameters, -'index_weight' and 'pdf_weight'.} +must have the same length in both parameters, 'index_weight' and +'pdf_weight'.} \item{pdf_weight}{Statistics array to define a Gaussian PDF with at least -dimensions 'statistic'. -The firt statistic is the parameter 'mean' of the PDF and -the second statistic is the parameter 'standard deviation' of the PDF.} +dimensions 'statistic'. The first statistic is the parameter 'mean' of the PDF +and the second statistic is the parameter 'standard deviation' of the PDF.} -\item{time_dim_name}{A character string indicating the name of the temporal dimension, by default 'time'.} +\item{time_dim_name}{A character string indicating the name of the temporal +dimension, by default 'time'.} } \value{ BEI_Weights() returns a normalized weights array with the same @@ -45,9 +45,9 @@ dim(res) } \references{ -Regionally improved seasonal forecast of precipitation through Best -estimation of winter NAO, Sanchez-Garcia, E. et al., -Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +Regionally improved seasonal forecast of precipitation through +Best estimation of winter NAO, Sanchez-Garcia, E. et al., +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} } \author{ Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} diff --git a/man/BiasCorrection.Rd b/man/BiasCorrection.Rd index d944a32fa7f61e738e907bdc3c95e1971912f707..fa087478548e9b9f6bff5ba7394f8bd00b0ca472 100644 --- a/man/BiasCorrection.Rd +++ b/man/BiasCorrection.Rd @@ -11,20 +11,23 @@ BiasCorrection( na.rm = FALSE, memb_dim = "member", sdate_dim = "sdate", + dat_dim = NULL, ncores = NULL ) } \arguments{ \item{exp}{A multidimensional array with named dimensions containing the -seasonal forecast experiment data with at least 'member' and 'sdate' -dimensions.} +seasonal forecast experiment data with at least time and member dimensions.} \item{obs}{A multidimensional array with named dimensions containing the -observed data with at least 'sdate' dimension.} +observed data with at least time dimension.} \item{exp_cor}{A multidimensional array with named dimensions containing the -seasonl forecast experiment to be corrected. If it is NULL, the 'exp' -forecast will be corrected.} +seasonal forecast experiment to be corrected with at least time and member +dimension. If it is NULL, the 'exp' forecast will be corrected. If there is +only one corrected dataset, it should not have dataset dimension. If there +is a corresponding corrected dataset for each 'exp' forecast, the dataset +dimension must have the same length as in 'exp'. The default value is NULL.} \item{na.rm}{A logical value indicating whether missing values should be stripped before the computation proceeds, by default it is set to FALSE.} @@ -35,12 +38,20 @@ dimension. By default, it is set to 'member'.} \item{sdate_dim}{A character string indicating the name of the start date dimension. By default, it is set to 'sdate'.} +\item{dat_dim}{A character string indicating the name of the dataset dimension. +The length of this dimension can be different between 'exp' and 'obs'. +The default value is NULL.} + \item{ncores}{An integer that indicates the number of cores for parallel computations using multiApply function. The default value is NULL.} } \value{ -An array containing the bias corrected forecasts with the same -dimensions of the experimental data. +An array containing the bias corrected forecasts with the dimensions +nexp, nobs and same dimensions as in the 'exp' object. nexp is the number of +experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations +(i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If +'exp_cor' is provided, the returned array will have the same dimensions as +'exp_cor'. } \description{ This function applies the simple bias adjustment technique @@ -58,7 +69,7 @@ a <- BiasCorrection(exp = mod1, obs = obs1) Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. Davis (2017). Seasonal climate prediction: a new source of information for the management of wind energy resources. Journal of Applied Meteorology and -Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, +Climatology, 56, 1231-1247, \doi{10.1175/JAMC-D-16-0204.1}. (CLIM4ENERGY, EUPORIAS, NEWA, RESILIENCE, SPECS) } \author{ diff --git a/man/CST_AdamontAnalog.Rd b/man/CST_AdamontAnalog.Rd index 889258a19176ba267cedd7db715467678fd1390d..e593b6a02ac26885eb69a9dc8c590acf9e5f9a76 100644 --- a/man/CST_AdamontAnalog.Rd +++ b/man/CST_AdamontAnalog.Rd @@ -33,68 +33,83 @@ AdamontAnalog( ) } \arguments{ -\item{exp}{\itemize{ -\item\code{CST_AdamontAnalog}{experiment data an object of class \code{s2dv_cube}, can be output -from quantile correction using CST_AdamontQQCorr} -\item\code{AdamontAnalog}{experiment data array with named dimension}}} +\item{exp}{A multidimensional array with named dimensions containing the +experiment data.} -\item{obs}{\itemize{ -\item\code{CST_AdamontAnalog}{reference data, also of class \code{s2dv_cube}.} -\item\code{AdamontAnalog}{reference data array with named dimension.}} -Note that lat/lon dimensions need to be the same as \code{exp}} +\item{obs}{A multidimensional array with named dimensions containing the +reference data. Note that lat/lon dimensions need to be the same as +\code{exp}.} -\item{wt_exp}{corresponding weather types (same dimensions as \code{exp$data} -but lat/lon)} +\item{wt_exp}{Corresponding weather types (same dimensions as \code{exp$data} +but lat/lon).} -\item{wt_obs}{corresponding weather types (same dimensions as \code{obs$data} -but lat/lon)} +\item{wt_obs}{Corresponding weather types (same dimensions as \code{obs$data} +but lat/lon).} -\item{nanalogs}{integer defining the number of analog values to return -(default: 5)} +\item{nanalogs}{Integer defining the number of analog values to return +(default: 5).} -\item{method}{a character string indicating the method used for analog -definition - Coded are 'pattcorr': pattern correlation - 'rain1' (for precip patterns): rain occurrence consistency - 'rain01' (for precip patterns): rain occurrence/non - occurrence consistency} +\item{method}{A character string indicating the method used for analog +definition.
It can be: +\itemize{ + \item{'pattcorr': pattern correlation.} + \item{'rain1' (for precip patterns): rain occurrence consistency.} + \item{'rain01' (for precip patterns): rain occurrence/non occurrence + consistency} +}} -\item{thres}{real number indicating the threshold to define rain -occurrence/non occurrence in rain(0)1} +\item{thres}{Real number indicating the threshold to define rain +occurrence/non occurrence in rain (0)1.} -\item{search_obsdims}{list of dimensions in \code{obs} along which analogs are -searched for} +\item{search_obsdims}{List of dimensions in \code{obs} along which analogs are +searched for.} -\item{londim}{name of longitude dimension} +\item{londim}{Name of longitude dimension.} -\item{latdim}{name of latitude dimension} +\item{latdim}{Name of latitude dimension.} } \value{ -analog_vals -\itemize{ -\item\code{CST_AdamontAnalog}{an object of class \code{s2dv_cube} containing nanalogs -analog values for each value of \code{exp} input data} -\item\code{AdamontAnalog}{an array containing nanalogs analog values}} +analog_vals An object of class \code{s2dv_cube} containing + nanalogs analog values for each value of \code{exp} input data. + +analog_vals An array containing nanalogs analog values. } \description{ +This function searches for analogs in a reference dataset for +experiment data, based on corresponding weather types. The experiment data is +typically a hindcast, observations are typically provided by reanalysis data. + This function searches for analogs in a reference dataset for experiment data, based on corresponding weather types. The experiment data is typically a hindcast, observations are typically provided by reanalysis data. } \examples{ -\dontrun{ -wt_exp <- sample(1:3, 15*6*3, replace=T) -dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -wt_obs <- sample(1:3, 6*3, replace=T) -dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -} -\dontrun{ -wt_exp <- sample(1:3, 15*6*3, replace=T) -dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -wt_obs <- sample(1:3, 6*3, replace=T) -dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) - obs=lonlat_temp$obs$data, wt_exp=wt_exp, wt_obs=wt_obs, nanalogs=2) -} +wt_exp <- sample(1:3, 15*6*3, replace = TRUE) +dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3) +wt_obs <- sample(1:3, 6*3, replace = TRUE) +dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3) +exp <- NULL +exp$data <- 1 : c(1 * 15 * 6 * 3 * 8 * 8) +dim(exp$data) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3, + lat = 8, lon = 8) +class(exp) <- 's2dv_cube' +obs <- NULL +obs$data <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8) +dim(obs$data) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3, + lat = 8, lon = 8) +class(obs) <- 's2dv_cube' +analog_vals <- CST_AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp, + wt_obs = wt_obs, nanalogs = 2) +wt_exp <- sample(1:3, 15*6*3, replace = TRUE) +dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3) +wt_obs <- sample(1:3, 6*3, replace = TRUE) +dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3) +exp <- 1 : c(1 * 15 * 6 * 3 * 8 * 8) +dim(exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 8, lon = 8) +obs <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8) +dim(obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3, lat = 8, lon = 8) +analog_vals <- AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp, + wt_obs = wt_obs, nanalogs = 2) } \author{ Paola Marson, \email{paola.marson@meteo.fr} for PROSNOW version diff --git a/man/CST_AdamontQQCorr.Rd 
b/man/CST_AdamontQQCorr.Rd index eb750644b2a161b59841482a36da65077f8413eb..fcea5fc2ae61597b22fc9ff4e868ee76c6ec6199 100644 --- a/man/CST_AdamontQQCorr.Rd +++ b/man/CST_AdamontQQCorr.Rd @@ -16,32 +16,32 @@ CST_AdamontQQCorr( ) } \arguments{ -\item{exp}{experiment data an object of class \code{s2dv_cube}} +\item{exp}{Experiment data, an object of class \code{s2dv_cube}.} -\item{wt_exp}{corresponding weather types (same dimensions as \code{exp$data} -but lat/lon)} +\item{wt_exp}{Corresponding weather types (same dimensions as \code{exp$data} +but lat/lon).} -\item{obs}{reference data, also of class \code{s2dv_cube}. lat/lon dimensions +\item{obs}{Reference data, also of class \code{s2dv_cube}. lat/lon dimensions can differ from \code{exp} if non rectilinear latlon grids are used, in which case regrid should be set to TRUE and .NearestNeighbors \code{NN} -output should be provided} +output should be provided.} -\item{wt_obs}{corresponding weather types (same dimensions as \code{obs} but -lat/lon)} +\item{wt_obs}{Corresponding weather types (same dimensions as \code{obs} but +lat/lon).} -\item{corrdims}{list of dimensions in \code{exp} for which quantile mapping -correction is applied} +\item{corrdims}{List of dimensions in \code{exp} for which quantile mapping +correction is applied.} -\item{londim}{character name of longitude dimension in \code{exp} and -\code{obs}} +\item{londim}{Character name of longitude dimension in \code{exp} and +\code{obs}.} -\item{latdim}{character name of latitude dimension in \code{exp} and -\code{obs}} +\item{latdim}{Character name of latitude dimension in \code{exp} and +\code{obs}.} } \value{ -an object of class \code{s2dv_cube} containing experiment data on the - lat/lon grid of \code{obs} input data, corrected by quantile mapping - depending on the weather types \code{wt_exp} +An object of class \code{s2dv_cube} containing experiment data on the +lat/lon grid of \code{obs} input data, corrected by quantile mapping +depending on the weather types \code{wt_exp}. } \description{ This function computes a quantile mapping based on weather types @@ -49,15 +49,23 @@ for experiment data (typically a hindcast) onto reference \code{obs}, typically provided by reanalysis data.
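The correction described here amounts to an empirical quantile mapping computed separately for each weather type: experiment values carrying type k are mapped onto the distribution of reference values carrying the same type. A base-R sketch of that per-type mapping for a single series (a simplification; the function additionally loops over the 'corrdims' dimensions and can regrid first):

set.seed(1)
exp_val <- rnorm(100, mean = 2)             # experiment values at one grid point
obs_val <- rnorm(100)                       # reference values at the same point
wt_exp <- sample(1:3, 100, replace = TRUE)  # weather type of each experiment value
wt_obs <- sample(1:3, 100, replace = TRUE)  # weather type of each reference value
corrected <- exp_val
for (k in 1:3) {
  in_exp <- wt_exp == k
  in_obs <- wt_obs == k
  # Empirical quantile of each experiment value within its own weather type ...
  p <- ecdf(exp_val[in_exp])(exp_val[in_exp])
  # ... mapped onto the reference quantiles of the same weather type
  corrected[in_exp] <- quantile(obs_val[in_obs], probs = p, names = FALSE)
}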
} \examples{ -\dontrun{ -wt_exp <- sample(1:3, 15*6*3, replace=T) -dim(wt_exp) <- c(dataset=1, member=15, sdate=6, ftime=3) -wt_obs <- sample(1:3, 6*3, replace=T) -dim(wt_obs) <- c(dataset=1, member=1, sdate=6, ftime=3) -exp_corr <- CST_AdamontQQCorr(exp = lonlat_temp$exp, wt_exp = wt_exp, - obs=lonlat_temp$obs, wt_obs = wt_obs, - corrdims = c('dataset','member','sdate','ftime')) -} +wt_exp <- c(1,1,2,3,3,2,2,1,1,2,2,3) +dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +wt_obs <- c(3,3,1,2,2,2,2,1,3,1,1,2) +dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3) +exp <- NULL +exp$data <- 1 : c(1 * 1 * 4 * 3 * 4 * 4) +dim(exp$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, + lat = 4, lon = 4) +class(exp) <- 's2dv_cube' +obs <- NULL +obs$data <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4) +dim(obs$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, + lat = 4, lon = 4) +class(obs) <- 's2dv_cube' +exp_corr <- CST_AdamontQQCorr(exp = exp, wt_exp = wt_exp, + obs = obs, wt_obs = wt_obs, + corrdims = c('dataset','member','sdate','ftime')) } \author{ Lauriane Batté, \email{lauriane.batte@meteo.fr} diff --git a/man/CST_Analogs.Rd b/man/CST_Analogs.Rd index b0242f584023df4d157937e7f92d4c972791fa9e..cac70cdc023cf2898a37628c6acc4320fc540e45 100644 --- a/man/CST_Analogs.Rd +++ b/man/CST_Analogs.Rd @@ -25,7 +25,9 @@ large scale for which the analog is aimed. This field is used to in all the criterias. If parameter 'expVar' is not provided, the function will return the expL analog. The element 'data' in the 's2dv_cube' object must have, at least, latitudinal and longitudinal dimensions. The object is expect to be -already subset for the desired large scale region.} +already subset for the desired large scale region. Latitudinal dimension +accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal +dimension accepted names: 'lon', 'lons','longitude', 'x', 'i', 'nav_lon'.} \item{obsL}{An 's2dv_cube' object containing the observational field on the large scale. The element 'data' in the 's2dv_cube' object must have the same @@ -63,11 +65,11 @@ time_obsL), by default time_expL will be removed during the search of analogs.} \item{time_expL}{A character string indicating the date of the experiment in the same format than time_obsL (i.e. "yyyy-mm-dd"). By default it is NULL -and dates are taken from element \code{$Dates$start} from expL.} +and dates are taken from element \code{$attrs$Dates} from expL.} \item{time_obsL}{A character string indicating the date of the observations in the date format (i.e. "yyyy-mm-dd"). By default it is NULL and dates are -taken from element \code{$Dates$start} from obsL.} +taken from element \code{$attrs$Dates} from obsL.} \item{nAnalogs}{Number of Analogs to be selected to apply the criterias 'Local_dist' or 'Local_cor'. This is not the necessary the number of analogs @@ -126,17 +128,22 @@ function within 'CSTools' package. 
} \examples{ expL <- rnorm(1:200) -dim(expL) <- c(member = 10,lat = 4, lon = 5) -obsL <- c(rnorm(1:180),expL[1,,]*1.2) -dim(obsL) <- c(time = 10,lat = 4, lon = 5) -time_obsL <- paste(rep("01", 10), rep("01", 10), 1994:2003, sep = "-") +dim(expL) <- c(member = 10, lat = 4, lon = 5) +obsL <- c(rnorm(1:180), expL[1, , ]*1.2) +dim(obsL) <- c(time = 10, lat = 4, lon = 5) +time_obsL <- as.POSIXct(paste(rep("01", 10), rep("01", 10), 1994:2003, sep = "-"), + format = "\%d-\%m-\%y") +dim(time_obsL) <- c(time = 10) time_expL <- time_obsL[1] -lon <- seq(-1,5,1.5) -lat <- seq(30,35,1.5) -expL <- s2dv_cube(data = expL, lat = lat, lon = lon, - Dates = list(start = time_expL, end = time_expL)) -obsL <- s2dv_cube(data = obsL, lat = lat, lon = lon, - Dates = list(start = time_obsL, end = time_obsL)) +lon <- seq(-1, 5, 1.5) +lat <- seq(30, 35, 1.5) +coords <- list(lon = seq(-1, 5, 1.5), lat = seq(30, 35, 1.5)) +attrs_expL <- list(Dates = time_expL) +attrs_obsL <- list(Dates = time_obsL) +expL <- list(data = expL, coords = coords, attrs = attrs_expL) +obsL <- list(data = obsL, coords = coords, attrs = attrs_obsL) +class(expL) <- 's2dv_cube' +class(obsL) <- 's2dv_cube' region <- c(min(lon), max(lon), min(lat), max(lat)) downscaled_field <- CST_Analogs(expL = expL, obsL = obsL, region = region) @@ -148,7 +155,7 @@ from surface pressure using analogues. Clim. Dyn., 41, 1419-1437. \email{pascal.yiou@lsce.ipsl.fr} } \seealso{ -code{\link{CST_Load}}, \code{\link[s2dv]{Load}} and +\code{\link{CST_Load}}, \code{\link[s2dv]{Load}} and \code{\link[s2dv]{CDORemap}} } \author{ diff --git a/man/CST_AnalogsPredictors.Rd b/man/CST_AnalogsPredictors.Rd index 152b0c8a7060026f06ff1c72b04b0a9b2ae12015..c2196534ba3efdad6d88e6419f323a83a9dbf83a 100644 --- a/man/CST_AnalogsPredictors.Rd +++ b/man/CST_AnalogsPredictors.Rd @@ -25,124 +25,138 @@ CST_AnalogsPredictors( ) } \arguments{ -\item{exp}{List of arrays with downscaled period seasonal forecast data. The list -has to contain model atmospheric variables (instantaneous 12h data) that must -be indentify by parenthesis name. -For precipitation: -- u component of wind at 500 hPa (u500_mod) in m/s -- v component of wind at 500 hPa (v500_mod) in m/s -- temperature at 500 hPa (t500_mod) in K -- temperature at 850 hPa (t850_mod) in K -- specific humidity at 700 hPa (q700_mod) in g/kg +\item{exp}{List of arrays with downscaled period seasonal forecast data. The +list has to contain model atmospheric variables (instantaneous 12h data) +that must be indentify by parenthesis name. For precipitation: +\itemize{ + \item{u component of wind at 500 hPa (u500_mod) in m/s.} + \item{v component of wind at 500 hPa (v500_mod) in m/s.} + \item{temperature at 500 hPa (t500_mod) in K.} + \item{temperature at 850 hPa (t850_mod) in K.} + \item{specific humidity at 700 hPa (q700_mod) in g/kg. } +} For temperature: -- u component of wind at 500 hPa (u500_mod) in m/s -- v component of wind at 500 hPa (v500_mod) in m/s -- temperature at 500 hPa (t500_mod) in K -- temperature at 700 hPa (t700_mod) in K -- temperature at 850 hPa (t850_mod) in K -- specific humidity at 700 hPa (q700_mod) in g/kg -- 2 meters temperature (tm2m_mod) in K -The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'. 
-(lon = gridpoints of longitude, lat = gridpoints of latitude, time = number of downscaling days) -Seasonal forecast variables must have the same resolution and -domain as reanalysis variables ('obs' parameter, below).} +\itemize{ + \item{u component of wind at 500 hPa (u500_mod) in m/s.} + \item{v component of wind at 500 hPa (v500_mod) in m/s.} + \item{temperature at 500 hPa (t500_mod) in K.} + \item{temperature at 700 hPa (t700_mod) in K. } + \item{temperature at 850 hPa (t850_mod) in K.} + \item{specific humidity at 700 hPa (q700_mod) in g/kg. } + \item{2 meters temperature (tm2m_mod) in K.} +} +The arrays must have at least three dimensions with names 'lon', 'lat' and +'time'. (lon = gridpoints of longitude, lat = gridpoints of latitude, +time = number of downscaling days) Seasonal forecast variables must have the +same resolution and domain as reanalysis variables ('obs' parameter, below).} \item{slp}{Array with atmospheric seasonal forecast model sea level pressure -(instantaneous 12h data) that must be indentify as 'slp' (hPa). It has the same -resolution as 'exp' and 'obs' paremeters but with an extended domain. -This domain contains extra degrees (most in the north and west part) compare to -synoptic domain. The array must have at least three dimensions -with names 'lon', 'lat' and 'time'.} +(instantaneous 12h data) that must be indentify as 'slp' (hPa). It has the +same resolution as 'exp' and 'obs' paremeters but with an extended domain. +This domain contains extra degrees (most in the north and west part) compare +to synoptic domain. The array must have at least three dimensions with +names 'lon', 'lat' and 'time'.} \item{obs}{List of arrays with training period reanalysis data. The list has to contain reanalysis atmospheric variables (instantaneous -12h data) that must be indentify by parenthesis name. -For precipitation: -- u component of wind at 500 hPa (u500) in m/s -- v component of wind at 500 hPa (v500) in m/s -- temperature at 500 hPa (t500) in K -- temperature at 850 hPa (t850) in K -- sea level pressure (slp) in hPa -- specific humidity at 700 hPa (q700) in g/kg +12h data) that must be indentify by parenthesis name. For precipitation: +\itemize{ + \item{u component of wind at 500 hPa (u500) in m/s.} + \item{v component of wind at 500 hPa (v500) in m/s.} + \item{temperature at 500 hPa (t500) in K.} + \item{temperature at 850 hPa (t850) in K.} + \item{sea level pressure (slp) in hPa.} + \item{specific humidity at 700 hPa (q700) in g/kg.} +} For maximum and minimum temperature: -- u component of wind at 500 hPa (u500) in m/s -- v component of wind at 500 hPa (v500) in m/s -- temperature at 500 hPa (t500) in K -- temperature at 700 hPa (t700) in K -- temperature at 850 hPa (t850) in K -- sea level pressure (slp) in hPa -- specific humidity at 700 hPa (q700) in g/kg -- 2 meters temperature (tm2m) in K -The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'.} +\itemize{ + \item{u component of wind at 500 hPa (u500) in m/s.} + \item{v component of wind at 500 hPa (v500) in m/s.} + \item{temperature at 500 hPa (t500) in K.} + \item{temperature at 700 hPa (t700) in K.} + \item{temperature at 850 hPa (t850) in K.} + \item{sea level pressure (slp) in hPa.} + \item{specific humidity at 700 hPa (q700) in g/kg} + \item{2 meters temperature (tm2m) in K} +} +The arrays must have at least three dimensions with names 'lon', 'lat' and +'time'.} \item{lon}{Vector of the synoptic longitude (from (-180º) to 180º), The vector must go from west to east. 
The same as for the training function.} -\item{lat}{Vector of the synoptic latitude. The vector must go from north to south. -The same as for the training function.} +\item{lat}{Vector of the synoptic latitude. The vector must go from north to +south. The same as for the training function.} \item{slp_lon}{Vector of the extended longitude (from (-180º) to 180º). The vector must go from west to east. The same as for the training function.} -\item{slp_lat}{Vector of the extended latitude. The vector must go from north to south. -The same as for the training function.} +\item{slp_lat}{Vector of the extended latitude. The vector must go from north +to south. The same as for the training function.} \item{var_name}{Variable name to downscale. There are two options: 'prec' for precipitation and 'temp' for maximum and minimum temperature.} \item{hr_obs}{Local path of HR observational files (maestro and pcp/tmx-tmn). -For precipitation can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/v2/Serie_AEMET_v2_pcp_1951a202006_txt.tar.gz -For maximum and minimum temperature can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmax/Serie_AEMET_v1_tmax_1951a202006_txt.tar.gz and http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmin/Serie_AEMET_v1_tmin_1951a202006_txt.tar.gz respetively. -Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), longitude (lon), latitude (lat) and -altitude (alt) in columns (vector structure). -Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5km resolution spanish daily data -(precipitation or maximum and minimum temperature from january 1951 to june 2020. See README -file for more information. -IMPORTANT!: HR observational period must be the same as for reanalysis variables. -It is assumed that the training period is smaller than the HR original one (1951-2019), so it is -needed to make a new ascii file with the new period and the same structure as original, -specifying the training dates in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for -'19810101-19961231' period).} +Data for precipitation and temperature can be downloaded from the following link: +\url{https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2}. +Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), +longitude (lon), latitude (lat) and altitude (alt) in columns (vector +structure). Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5km +resolution Spanish daily data (precipitation or maximum and minimum +temperature) from January 1951 to June 2020. See README file for more +information. IMPORTANT!: HR observational period must be the same as for +reanalysis variables. It is assumed that the training period is smaller than +the HR original one (1951-2019), so it is necessary to make a new ASCII file +with the new period and the same structure as the original, specifying the +training dates in the name +(e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for '19810101-19961231' period).} \item{tdates}{Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-20181231).} -\item{ddates}{Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 20191001-20200331).} +\item{ddates}{Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end) +(e.g.
20191001-20200331).} -\item{restrain}{Output (list of matrix) obtained from 'training_analogs' function. -For precipitation, 'restrain' object must contains um, vm, nger, gu92, gv92, -gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred variables. -For maximum and minimum temperature, 'restrain' object must contains um, vm, -insol, neni, vdmin y vref. See 'AnalogsPred_train.R' for more information.} +\item{restrain}{Output (list of matrix) obtained from 'training_analogs' +function. For precipitation, the 'restrain' object must contain um, vm, nger, +gu92, gv92, gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred +variables. For maximum and minimum temperature, the 'restrain' object must +contain um, vm, insol, neni, vdmin and vref. See 'AnalogsPred_train.R' for +more information.} -\item{dim_name_longitude}{A character string indicating the name of the longitude -dimension, by default 'longitude'.} +\item{dim_name_longitude}{A character string indicating the name of the +longitude dimension, by default 'longitude'.} -\item{dim_name_latitude}{A character string indicating the name of the latitude -dimension, by default 'latitude'.} +\item{dim_name_latitude}{A character string indicating the name of the +latitude dimension, by default 'latitude'.} \item{dim_name_time}{A character string indicating the name of the time dimension, by default 'time'.} } \value{ -Matrix with seasonal forecast precipitation (mm) or -maximum and minimum temperature (dozens of ºC) in a 5km x 5km regular grid -over peninsular Spain and Balearic Islands. The resulted matrices have two -dimensions ('ddates' x 'nptos').(ddates = number of downscaling days -and nptos = number of 'hr_obs' gridpoints). +Matrix with seasonal forecast precipitation (mm) or maximum and +minimum temperature (dozens of ºC) in a 5km x 5km regular grid over peninsular +Spain and the Balearic Islands. The resulting matrices have two dimensions +('ddates' x 'nptos'). (ddates = number of downscaling days and nptos = number +of 'hr_obs' gridpoints). } \description{ -This function downscales low resolution precipitation data (e.g. from -Seasonal Forecast Models) through the association with an observational high -resolution (HR) dataset (AEMET 5 km gridded data of daily precipitation (Peral et al., 2017)) -and a collection of predictors and past synoptic situations similar to estimated day. -The method uses three domains: -- peninsular Spain and Balearic Islands domain (5 km resolution): HR precipitation - and the downscaling result domain. -- synoptic domain (low resolution, e.g. 1.5º x 1.5º): it should be centered over Iberian Peninsula - and cover enough extension to detect as much synoptic situations as possible. -- extended domain (low resolution, e.g. 1.5º x 1.5º): it should have the same resolution -as synoptic domain. It is used for SLP Seasonal Forecast Models. +This function downscales low resolution precipitation data (e.g. +from Seasonal Forecast Models) through the association with an observational +high resolution (HR) dataset (AEMET 5 km gridded data of daily precipitation +(Peral et al., 2017)) and a collection of predictors and past synoptic +situations similar to the estimated day. The method uses three domains: +\itemize{ + \item{Peninsular Spain and Balearic Islands domain (5 km resolution): HR precipitation + and the downscaling result domain.} + \item{Synoptic domain (low resolution, e.g.
1.5º x 1.5º): it should be + centered over the Iberian Peninsula and cover enough extension to detect + as many synoptic situations as possible.} + \item{Extended domain (low resolution, e.g. 1.5º x 1.5º): it should have the + same resolution as the synoptic domain. It is used for SLP Seasonal + Forecast Models.} +} } \author{ Marta Dominguez Alonso - AEMET, \email{mdomingueza@aemet.es} diff --git a/man/CST_Anomaly.Rd b/man/CST_Anomaly.Rd index 3af85b5fc6ca2d742cf29bd6e0a1159f54dc088e..5a2784e1dfb9387a4c5a4b70c56b9d44b55ccca8 100644 --- a/man/CST_Anomaly.Rd +++ b/man/CST_Anomaly.Rd @@ -2,7 +2,8 @@ % Please edit documentation in R/CST_Anomaly.R \name{CST_Anomaly} \alias{CST_Anomaly} -\title{Anomalies relative to a climatology along selected dimension with or without cross-validation} +\title{Anomalies relative to a climatology along selected dimension with or without +cross-validation} \usage{ CST_Anomaly( exp = NULL, @@ -68,28 +69,24 @@ computation is carried out independently for experimental and observational data products. } \examples{ -# Example 1: mod <- 1 : (2 * 3 * 4 * 5 * 6 * 7) dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) obs <- 1 : (1 * 1 * 4 * 5 * 6 * 7) dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) lon <- seq(0, 30, 5) lat <- seq(0, 25, 5) -exp <- list(data = mod, lat = lat, lon = lon) -obs <- list(data = obs, lat = lat, lon = lon) +coords <- list(lon = lon, lat = lat) +exp <- list(data = mod, coords = coords) +obs <- list(data = obs, coords = coords) attr(exp, 'class') <- 's2dv_cube' attr(obs, 'class') <- 's2dv_cube' -anom1 <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = TRUE) -anom2 <- CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = TRUE) -anom3 <- CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = FALSE) -anom4 <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = FALSE) -anom5 <- CST_Anomaly(lonlat_temp$exp) -anom6 <- CST_Anomaly(obs = lonlat_temp$obs) +anom <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = TRUE) } \seealso{ -\code{\link[s2dv]{Ano_CrossValid}}, \code{\link[s2dv]{Clim}} and \code{\link{CST_Load}} +\code{\link[s2dv]{Ano_CrossValid}}, \code{\link[s2dv]{Clim}} and +\code{\link{CST_Load}} } \author{ Perez-Zanon Nuria, \email{nuria.perez@bsc.es} diff --git a/man/CST_BEI_Weighting.Rd b/man/CST_BEI_Weighting.Rd index d6f65bb52123d014377037e435f5c33893756b9f..fa9cf10bef068829f4c79162ae12c6a81fcc01ff 100644 --- a/man/CST_BEI_Weighting.Rd +++ b/man/CST_BEI_Weighting.Rd @@ -9,7 +9,8 @@ CST_BEI_Weighting( aweights, terciles = NULL, type = "ensembleMean", - time_dim_name = "time" + time_dim_name = "time", + memb_dim = "member" ) } \arguments{ @@ -36,16 +37,18 @@ or four dimensions depending on whether the variable is a spatially aggregated variable (as e.g. precipitation or temperature), dimension (time, tercile, lat, lon), containing the terciles probabilities computed with weighted members. The first tercile is the lower tercile, the second is the normal tercile and -the third is the upper tercile. -If 'type' = 'ensembleMean', the function returns, in the element data from -'var_exp' parameter, an array with at least one or three dimensions -depending if the variable is a spatially aggregated variable -(as e.g. NAO index)(time) or it is spatial variable (as e.g. precipitation -or temperature) (time, lat, lon), containing the ensemble means computing -with weighted members.} +the third is the upper tercile.
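Before the description of the 'ensembleMean' option continues below: the weighted quantities this \value section refers to reduce to a weighted average along the member dimension. A toy base-R illustration (an editorial sketch with invented weights; each time step's weights sum to one):

# Weighted ensemble mean, one value per time step.
var_exp  <- matrix(rnorm(2 * 4), nrow = 2)    # time x member
aweights <- matrix(c(0.2, 0.1, 0.3, 0.4,
                     0.1, 0.2, 0.4, 0.3),
                   nrow = 2, byrow = TRUE)    # rows sum to 1
ens_mean <- rowSums(var_exp * aweights)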
If 'type' = 'ensembleMean', the function +returns, in the element data from 'var_exp' parameter, an array with at +least one or three dimensions depending on whether the variable is a spatially +aggregated variable (as e.g. NAO index) (time) or a spatial variable (as +e.g. precipitation or temperature) (time, lat, lon), containing the ensemble +means computed with weighted members.} \item{time_dim_name}{A character string indicating the name of the temporal dimension, by default 'time'.} + +\item{memb_dim}{A character string indicating the name of the +member dimension, by default 'member'.} } \value{ CST_BEI_Weighting() returns a CSTools object (i.e., of the @@ -64,19 +67,17 @@ Systems (SFSs). \examples{ var_exp <- 1 : (2 * 4 * 3 * 2) dim(var_exp) <- c(time = 2, member = 4, lat = 3, lon = 2) -aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, 0.1, 0.2, 0.4, 0.4, 0.1, 0.2, 0.4, 0.2) +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, 0.1, 0.2, 0.4, 0.4, 0.1, + 0.2, 0.4, 0.2) dim(aweights) <- c(time = 2, member = 4, dataset = 2) var_exp <- list(data = var_exp) class(var_exp) <- 's2dv_cube' res_CST <- CST_BEI_Weighting(var_exp, aweights) -dim(res_CST$data) -# time lat lon dataset -# 2 3 2 2 } \references{ Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., -Adv. Sci. Res., 16, 165174, 2019, https://doi.org/10.5194/asr-16-165-2019 +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} } \author{ Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} diff --git a/man/CST_BiasCorrection.Rd b/man/CST_BiasCorrection.Rd index fb96babd17b91ea27c8b4c4c0d794f33d8c5a734..4de9257784afa9affd1f9ca9f45b5f2425fe421b 100644 --- a/man/CST_BiasCorrection.Rd +++ b/man/CST_BiasCorrection.Rd @@ -11,20 +11,26 @@ CST_BiasCorrection( na.rm = FALSE, memb_dim = "member", sdate_dim = "sdate", + dat_dim = NULL, ncores = NULL ) } \arguments{ \item{exp}{An object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal forecast experiment data in the element -named \code{$data}} +named \code{$data} with at least time and member dimensions.} \item{obs}{An object of class \code{s2dv_cube} as returned by \code{CST_Load} -function, containing the observed data in the element named \code{$data}.} +function, containing the observed data in the element named \code{$data} +with at least time dimension.} \item{exp_cor}{An object of class \code{s2dv_cube} as returned by -\code{CST_Load} function, containing the seasonl forecast experiment to be -corrected. If it is NULL, the 'exp' forecast will be corrected.} +\code{CST_Load} function, containing the seasonal forecast experiment to be +corrected with at least time dimension. If it is NULL, the 'exp' forecast +will be corrected. If there is only one corrected dataset, it should not +have dataset dimension. If there is a corresponding corrected dataset for +each 'exp' forecast, the dataset dimension must have the same length as in +'exp'. The default value is NULL.} \item{na.rm}{A logical value indicating whether missing values should be stripped before the computation proceeds, by default it is set to FALSE.} @@ -35,12 +41,20 @@ dimension. By default, it is set to 'member'.} \item{sdate_dim}{A character string indicating the name of the start date dimension. By default, it is set to 'sdate'.} +\item{dat_dim}{A character string indicating the name of the dataset dimension. +The length of this dimension can be different between 'exp' and 'obs'.
+The default value is NULL.} + \item{ncores}{An integer that indicates the number of cores for parallel computations using multiApply function. The default value is NULL.} } \value{ An object of class \code{s2dv_cube} containing the bias corrected -forecasts with the same dimensions of the experimental data. +forecasts with the dimensions nexp, nobs and same dimensions as in the 'exp' +object. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is +the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp +and nobs are omitted. If 'exp_cor' is provided, the returned array will have +the same dimensions as 'exp_cor'. } \description{ This function applies the simple bias adjustment technique @@ -54,8 +68,9 @@ obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) lon <- seq(0, 30, 5) lat <- seq(0, 25, 5) -exp <- list(data = mod1, lat = lat, lon = lon) -obs <- list(data = obs1, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +exp <- list(data = mod1, coords = coords) +obs <- list(data = obs1, coords = coords) attr(exp, 'class') <- 's2dv_cube' attr(obs, 'class') <- 's2dv_cube' a <- CST_BiasCorrection(exp = exp, obs = obs) @@ -64,7 +79,7 @@ a <- CST_BiasCorrection(exp = exp, obs = obs) Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. Davis (2017). Seasonal climate prediction: a new source of information for the management of wind energy resources. Journal of Applied Meteorology and -Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, +Climatology, 56, 1231-1247, \doi{10.1175/JAMC-D-16-0204.1}. (CLIM4ENERGY, EUPORIAS, NEWA, RESILIENCE, SPECS) } \author{ diff --git a/man/CST_Calibration.Rd b/man/CST_Calibration.Rd index 73cac8ab543a701708cab5c0f2d5b936152043a4..c8d1320f8e674f6c5d89ef9353c7a487010e94d8 100644 --- a/man/CST_Calibration.Rd +++ b/man/CST_Calibration.Rd @@ -17,41 +17,115 @@ CST_Calibration( alpha = NULL, memb_dim = "member", sdate_dim = "sdate", - ncores = 1 + dat_dim = NULL, + ncores = NULL ) } \arguments{ -\item{exp}{an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal hindcast experiment data in the element named \code{$data}. The hindcast is used to calibrate the forecast in case the forecast is provided; if not, the same hindcast will be calibrated instead.} +\item{exp}{An object of class \code{s2dv_cube} as returned by \code{CST_Load} +function with at least 'sdate' and 'member' dimensions, containing the +seasonal hindcast experiment data in the element named \code{data}. The +hindcast is used to calibrate the forecast in case the forecast is provided; +if not, the same hindcast will be calibrated instead.} -\item{obs}{an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the observed data in the element named \code{$data}.} +\item{obs}{An object of class \code{s2dv_cube} as returned by \code{CST_Load} +function with at least 'sdate' dimension, containing the observed data in +the element named \code{$data}.} -\item{exp_cor}{an optional object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal forecast experiment data in the element named \code{$data}.
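The 'simple bias adjustment technique' cited here (Torralba et al., 2017) can be caricatured in a few lines of base R. This is a rough editorial sketch of the idea only, not the package's exact implementation:

# Shift and rescale toy hindcast members so their climatology matches the
# observed mean, scaling anomalies by the observed/ensemble-mean sd ratio.
fcst <- matrix(rnorm(40, mean = 2, sd = 2), nrow = 8)  # member x sdate
obs  <- rnorm(5)                                       # sdate
corrected <- (fcst - mean(fcst)) * sd(obs) / sd(colMeans(fcst)) + mean(obs)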
If the forecast is provided, it will be calibrated using the hindcast and observations; if not, the hindcast will be calibrated instead.} +\item{exp_cor}{An optional object of class \code{s2dv_cube} as returned by +\code{CST_Load} function with at least 'sdate' and 'member' dimensions, +containing the seasonal forecast experiment data in the element named +\code{data}. If the forecast is provided, it will be calibrated using the +hindcast and observations; if not, the hindcast will be calibrated instead. +If there is only one corrected dataset, it should not have dataset dimension. +If there is a corresponding corrected dataset for each 'exp' forecast, the +dataset dimension must have the same length as in 'exp'. The default value +is NULL.} -\item{cal.method}{is the calibration method used, can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or \code{rpc-based}. Default value is \code{mse_min}.} +\item{cal.method}{A character string indicating the calibration method used, +can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or +\code{rpc-based}. Default value is \code{mse_min}.} -\item{eval.method}{is the sampling method used, can be either \code{in-sample} or \code{leave-one-out}. Default value is the \code{leave-one-out} cross validation. In case the forecast is provided, any chosen eval.method is over-ruled and a third option is used.} +\item{eval.method}{A character string indicating the sampling method used, it +can be either \code{in-sample} or \code{leave-one-out}. Default value is the +\code{leave-one-out} cross validation. In case the forecast is provided, any +chosen eval.method is overruled and a third option is used.} -\item{multi.model}{is a boolean that is used only for the \code{mse_min} method. If multi-model ensembles or ensembles of different sizes are used, it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences between the two approaches are generally small but may become large when using small ensemble sizes. Using multi.model when the calibration method is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result.} +\item{multi.model}{A boolean that is used only for the \code{mse_min} +method. If multi-model ensembles or ensembles of different sizes are used, +it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences +between the two approaches are generally small but may become large when +using small ensemble sizes. Using multi.model when the calibration method is +\code{bias}, \code{evmos} or \code{crps_min} will not affect the result.} -\item{na.fill}{is a boolean that indicates what happens in case calibration is not possible or will yield unreliable results. This happens when three or less forecasts-observation pairs are available to perform the training phase of the calibration. By default \code{na.fill} is set to true such that NA values will be returned. If \code{na.fill} is set to false, the uncorrected data will be returned.} +\item{na.fill}{A boolean that indicates what happens in case calibration is +not possible or will yield unreliable results. This happens when three or +fewer forecast-observation pairs are available to perform the training phase +of the calibration. By default \code{na.fill} is set to true such that NA +values will be returned. If \code{na.fill} is set to false, the uncorrected +data will be returned.} -\item{na.rm}{is a boolean that indicates whether to remove the NA values or not. The default value is \code{TRUE}.
See Details section for further information about its use and compatibility with \code{na.fill}.} +\item{na.rm}{A boolean that indicates whether to remove the NA values or not. +The default value is \code{TRUE}. See Details section for further +information about its use and compatibility with \code{na.fill}.} -\item{apply_to}{is a character string that indicates whether to apply the calibration to all the forecast (\code{"all"}) or only to those where the correlation between the ensemble mean and the observations is statistically significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}.} +\item{apply_to}{A character string that indicates whether to apply the +calibration to all the forecasts (\code{"all"}) or only to those where the +correlation between the ensemble mean and the observations is statistically +significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}.} -\item{alpha}{is a numeric value indicating the significance level for the correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == "sign"}.} +\item{alpha}{A numeric value indicating the significance level for the +correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to +== "sign"}.} -\item{memb_dim}{is a character string indicating the name of the member dimension. By default, it is set to 'member'.} +\item{memb_dim}{A character string indicating the name of the member dimension. +By default, it is set to 'member'.} -\item{sdate_dim}{is a character string indicating the name of the start date dimension. By default, it is set to 'sdate'.} +\item{sdate_dim}{A character string indicating the name of the start date +dimension. By default, it is set to 'sdate'.} -\item{ncores}{is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.} +\item{dat_dim}{A character string indicating the name of the dataset dimension. +The length of this dimension can be different between 'exp' and 'obs'. +The default value is NULL.} + +\item{ncores}{An integer that indicates the number of cores for parallel +computations using multiApply function. The default value is NULL.} } \value{ -an object of class \code{s2dv_cube} containing the calibrated forecasts in the element \code{$data} with the same dimensions as the one in the exp object. +An object of class \code{s2dv_cube} containing the calibrated +forecasts in the element \code{data} with the dimensions nexp, nobs and same +dimensions as in the 'exp' object. nexp is the number of experiments +(i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., +'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' +is provided, the returned array will have the same dimensions as 'exp_cor'. } \description{ -Equivalent to function \code{Calibration} but for objects of class \code{s2dv_cube}. +Five types of member-by-member bias correction can be performed. +The \code{"bias"} method corrects the bias only, the \code{"evmos"} method +applies a variance inflation technique to ensure the correction of the bias +and the correspondence of variance between forecast and observation (Van +Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods +\code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast +variance and the ensemble spread as described in Doblas-Reyes et al. (2005) +and Van Schaeybroeck and Vannitsem (2015), respectively.
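To fix ideas before the description continues, the two simplest of the methods just listed can be sketched in base R (an editorial paraphrase of their intent, not the package code):

# "bias": shift the forecast mean onto the observed mean.
bias_adj  <- function(fc, ob) fc - mean(fc) + mean(ob)
# "evmos"-style variance inflation: additionally match the overall variance.
evmos_adj <- function(fc, ob) (fc - mean(fc)) * sd(ob) / sd(fc) + mean(ob)
fc <- rnorm(20, mean = 1, sd = 2); ob <- rnorm(20)
c(mean(evmos_adj(fc, ob)), sd(evmos_adj(fc, ob)))  # equals mean(ob), sd(ob)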
While the +\code{"mse_min"} method minimizes a constrained mean-squared error using three +parameters, the \code{"crps_min"} method features four parameters and +minimizes the Continuous Ranked Probability Score (CRPS). The +\code{"rpc-based"} method adjusts the forecast variance ensuring that the +ratio of predictable components (RPC) is equal to one, as in Eade et al. +(2014). It is equivalent to function \code{Calibration} but for objects +of class \code{s2dv_cube}. +} +\details{ +Both the \code{na.fill} and \code{na.rm} parameters can be used to +indicate how the function has to handle the NA values. The \code{na.fill} +parameter checks whether there are more than three forecast-observation pairs +to perform the computation. In case there are three or fewer pairs, the +computation is not carried out, and the value returned by the function depends +on the value of this parameter (either NA if \code{na.fill == TRUE} or the +uncorrected value if \code{na.fill == FALSE}). On the other hand, \code{na.rm} +is used to indicate whether the function should remove the missing values +during the computation of the parameters needed to perform the calibration. } \examples{ # Example 1: @@ -61,12 +135,12 @@ obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) lon <- seq(0, 30, 5) lat <- seq(0, 25, 5) -exp <- list(data = mod1, lat = lat, lon = lon) -obs <- list(data = obs1, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +exp <- list(data = mod1, coords = coords) +obs <- list(data = obs1, coords = coords) attr(exp, 'class') <- 's2dv_cube' attr(obs, 'class') <- 's2dv_cube' a <- CST_Calibration(exp = exp, obs = obs, cal.method = "mse_min", eval.method = "in-sample") -str(a) # Example 2: mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) @@ -77,14 +151,34 @@ obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) lon <- seq(0, 30, 5) lat <- seq(0, 25, 5) -exp <- list(data = mod1, lat = lat, lon = lon) -obs <- list(data = obs1, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +exp <- list(data = mod1, coords = coords) +obs <- list(data = obs1, coords = coords) exp_cor <- list(data = mod2, coords = coords) attr(exp, 'class') <- 's2dv_cube' attr(obs, 'class') <- 's2dv_cube' attr(exp_cor, 'class') <- 's2dv_cube' a <- CST_Calibration(exp = exp, obs = obs, exp_cor = exp_cor, cal.method = "evmos") -str(a) + +} \references{ +Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the +success of multi-model ensembles in seasonal forecasting - II. Calibration and +combination. Tellus A. 2005;57:234-252. \doi{10.1111/j.1600-0870.2005.00104.x} + +Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., +Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate +predictions underestimate the predictability of the real world? Geophysical +Research Letters, 41(15), 5620-5628. \doi{10.1002/2014GL061146} + +Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing +through linear regression. Nonlinear Processes in Geophysics, 18(2), +147. \doi{10.5194/npg-18-147-2011} + +Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble +post-processing using member-by-member approaches: theoretical aspects. +Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818.
+\doi{10.1002/qj.2397} } \seealso{ \code{\link{CST_Load}} diff --git a/man/CST_CategoricalEnsCombination.Rd b/man/CST_CategoricalEnsCombination.Rd index c23f8341c201921984cdebb6bf05a43997042752..85ebb7f8f00c0c393418d341af002cfee464a726 100644 --- a/man/CST_CategoricalEnsCombination.Rd +++ b/man/CST_CategoricalEnsCombination.Rd @@ -2,7 +2,8 @@ % Please edit documentation in R/CST_CategoricalEnsCombination.R \name{CST_CategoricalEnsCombination} \alias{CST_CategoricalEnsCombination} -\title{Make categorical forecast based on a multi-model forecast with potential for calibrate} +\title{Make categorical forecast based on a multi-model forecast with potential for +calibration} \usage{ CST_CategoricalEnsCombination( exp, @@ -14,87 +15,106 @@ CST_CategoricalEnsCombination( ) } \arguments{ -\item{exp}{an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the seasonal forecast experiment data in the element named \code{$data}. The amount of forecasting models is equal to the size of the \code{dataset} dimension of the data array. The amount of members per model may be different. The size of the \code{member} dimension of the data array is equal to the maximum of the ensemble members among the models. Models with smaller ensemble sizes have residual indices of \code{member} dimension in the data array filled with NA values.} +\item{exp}{An object of class \code{s2dv_cube} as returned by \code{CST_Load} +function, containing the seasonal forecast experiment data in the element +named \code{$data}. The number of forecasting models is equal to the size of +the \code{dataset} dimension of the data array. The number of members per +model may be different. The size of the \code{member} dimension of the data +array is equal to the maximum of the ensemble members among the models. +Models with smaller ensemble sizes have residual indices of \code{member} +dimension in the data array filled with NA values.} -\item{obs}{an object of class \code{s2dv_cube} as returned by \code{CST_Load} function, containing the observed data in the element named \code{$data}.} +\item{obs}{An object of class \code{s2dv_cube} as returned by \code{CST_Load} +function, containing the observed data in the element named \code{$data}.} -\item{cat.method}{method used to produce the categorical forecast, can be either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool assumes equal weight for all ensemble members while the method comb assumes equal weight for each model. The weighting method is descirbed in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). Finally, the \code{obs} method classifies the observations into the different categories and therefore contains only 0 and 1 values.} +\item{cat.method}{Method used to produce the categorical forecast, can be +either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool +assumes equal weight for all ensemble members while the method comb assumes +equal weight for each model. The weighting method is described in +Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and +Vannitsem (2019). Finally, the \code{obs} method classifies the observations +into the different categories and therefore contains only 0 and 1 values.} -\item{eval.method}{is the sampling method used, can be either \code{"in-sample"} or \code{"leave-one-out"}.
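The 'pool' option described above has a simple interpretation: every member counts equally, and the forecast probability of a category is a member fraction. A toy base-R illustration (editorial sketch with invented numbers, three categories):

# Tercile probabilities from a pooled ensemble.
members <- rnorm(30)                             # pooled ensemble members
terc <- quantile(rnorm(300), probs = c(1, 2)/3)  # toy climatological terciles
probs <- c(below  = mean(members <  terc[1]),
           normal = mean(members >= terc[1] & members < terc[2]),
           above  = mean(members >= terc[2]))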
Default value is the \code{"leave-one-out"} cross validation.} +\item{eval.method}{The sampling method used; it can be either +\code{"in-sample"} or \code{"leave-one-out"}. Default value is the +\code{"leave-one-out"} cross validation.} -\item{amt.cat}{is the amount of categories. Equally-sized quantiles will be calculated based on the amount of categories.} +\item{amt.cat}{The number of categories. Equally-sized quantiles will be +calculated based on the number of categories.} \item{...}{other parameters to be passed on to the calibration procedure.} } \value{ -an object of class \code{s2dv_cube} containing the categorical forecasts in the element called \code{$data}. The first two dimensions of the returned object are named dataset and member and are both of size one. An additional dimension named category is introduced and is of size amt.cat. +An object of class \code{s2dv_cube} containing the categorical +forecasts in the element called \code{$data}. The first two dimensions of the +returned object are named dataset and member and are both of size one. An +additional dimension named category is introduced and is of size amt.cat. } \description{ -This function converts a multi-model ensemble forecast -into a categorical forecast by giving the probability -for each category. Different methods are available to combine -the different ensemble forecasting models into -probabilistic categorical forecasts. +This function converts a multi-model ensemble forecast into a +categorical forecast by giving the probability for each category. Different +methods are available to combine the different ensemble forecasting models +into probabilistic categorical forecasts. -Motivation: -Beyond the short range, the unpredictable component of weather -predictions becomes substantial due to the chaotic nature of the earth -system. Therefore, predictions can mostly be skillful when used in a probabilistic sense. -In practice this is done using ensemble forecasts. It is then common to -convert the ensemble forecasts to occurence probabilities for different categories. -These categories typically are taken as terciles from climatolgical distributions. -For instance for temperature, there is a cold, normal and warm class. -Commonly multiple ensemble forecasting systems -are available but some models may be more competitive than others -for the variable, region and user need under consideration. Therefore, -when calculating the category probabilities, the ensemble members of -the different forecasting system may be differently weighted. -Such weighting is typically done by comparison of the ensemble forecasts -with observations. +Motivation: Beyond the short range, the unpredictable component of weather +predictions becomes substantial due to the chaotic nature of the Earth system. +Therefore, predictions can mostly be skillful when used in a probabilistic +sense. In practice this is done using ensemble forecasts. It is then common to +convert the ensemble forecasts to occurrence probabilities for different +categories. These categories typically are taken as terciles from +climatological distributions. For instance for temperature, there is a cold, +normal and warm class. Commonly multiple ensemble forecasting systems are +available but some models may be more competitive than others for the +variable, region and user need under consideration. Therefore, when +calculating the category probabilities, the ensemble members of the different +forecasting systems may be differently weighted.
Such weighting is typically +done by comparison of the ensemble forecasts with observations. -Description of the tool: -The tool considers all forecasts (all members from all forecasting systems) -and converts them into occurrence probabilities of different categories. -The amount of categories can be changed and are taken as the -climatological quantiles (e.g. terciles), extracted -from the observational data. -The methods that are available to combine the ensemble forecasting models into -probabilistic categorical forecasts are: 1) ensemble pooling where +Description of the tool: The tool considers all forecasts (all members from +all forecasting systems) and converts them into occurrence probabilities of +different categories. The number of categories can be changed; they are taken as +the climatological quantiles (e.g. terciles), extracted from the observational +data. The methods that are available to combine the ensemble forecasting +models into probabilistic categorical forecasts are: 1) ensemble pooling where all ensemble members of all ensemble systems are weighted equally, 2) model combination where each model system is weighted equally, and, 3) model weighting. The model weighting method is described in Rajagopalan et al. (2002), -Robertson et al. 2004 and Van Schaeybroeck and Vannitsem (2019). -More specifically, this method uses different weights for the -occurence probability predicted by the available models and by a climatological model -and optimizes the weights by minimizing the ignorance score. -Finally, the function can also be used to categorize the observations -in the categorical quantiles. +Robertson et al. 2004 and Van Schaeybroeck and Vannitsem (2019). More +specifically, this method uses different weights for the occurrence probability +predicted by the available models and by a climatological model and optimizes +the weights by minimizing the ignorance score. Finally, the function can also +be used to categorize the observations in the categorical quantiles. } \examples{ - -mod1 <- 1 : (2 * 3 * 4 * 5 * 6 * 7) -dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) -mod1[ 2, 3, , , , ] <- NA -dimnames(mod1)[[1]] <- c("MF", "UKMO") -obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) -dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) +mod1 <- 1 : (2 * 2 * 4 * 5 * 2 * 2) +dim(mod1) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2) +mod1[2, 1, , , , ] <- NA +datasets <- c("MF", "UKMO") +obs1 <- 1 : (1 * 1 * 4 * 5 * 2 * 2) +dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2) -lon <- seq(0, 30, 5) -lat <- seq(0, 25, 5) +lon <- seq(0, 5, 5) +lat <- seq(0, 5, 5) -exp <- list(data = mod1, lat = lat, lon = lon) -obs <- list(data = obs1, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +attrs <- list(Datasets = datasets) +exp <- list(data = mod1, coords = coords, attrs = attrs) +obs <- list(data = obs1, coords = coords) attr(exp, 'class') <- 's2dv_cube' attr(obs, 'class') <- 's2dv_cube' -\donttest{ -a <- CST_CategoricalEnsCombination(exp = exp, obs = obs, amt.cat = 3, cat.method = "mmw") -} +a <- CST_CategoricalEnsCombination(exp = exp, obs = obs, amt.cat = 3, + cat.method = "mmw") } \references{ -Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002).
Categorical +climate forecasts through regularization and optimal combination of multiple +GCM ensembles. Monthly Weather Review, 130(7), 1792-1811. -Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744. +Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). +Improved combination of multiple atmospheric GCM ensembles for seasonal +prediction. Monthly Weather Review, 132(12), 2732-2744. -Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290). +Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of +Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290). } \author{ Bert Van Schaeybroeck, \email{bertvs@meteo.be} diff --git a/man/CST_DynBiasCorrection.Rd b/man/CST_DynBiasCorrection.Rd index e2467e72bbc98eee7b45c0b5bd6192506bf22fcf..2197343a469d260183883f471b4daa740743b9b3 100644 --- a/man/CST_DynBiasCorrection.Rd +++ b/man/CST_DynBiasCorrection.Rd @@ -16,29 +16,29 @@ CST_DynBiasCorrection( ) } \arguments{ -\item{exp}{an s2v_cube object with the experiment data} +\item{exp}{An s2dv_cube object with the experiment data.} -\item{obs}{an s2dv_cube object with the reference data} +\item{obs}{An s2dv_cube object with the reference data.} -\item{method}{a character string indicating the method to apply bias -correction among these ones: "PTF","RQUANT","QUANT","SSPLIN"} +\item{method}{A character string indicating the method to apply bias +correction among these ones: "PTF", "RQUANT", "QUANT", "SSPLIN".} -\item{wetday}{logical indicating whether to perform wet day correction +\item{wetday}{Logical indicating whether to perform wet day correction or not OR a numeric threshold below which all values are set to zero (by default is set to 'FALSE').} -\item{proxy}{a character string indicating the proxy for local dimension +\item{proxy}{A character string indicating the proxy for local dimension 'dim' or inverse of persistence 'theta' to apply the dynamical conditioned bias correction method.} -\item{quanti}{a number lower than 1 indicating the quantile to perform -the computation of local dimension and theta} +\item{quanti}{A number lower than 1 indicating the quantile to perform +the computation of local dimension and theta.} -\item{ncores}{The number of cores to use in parallel computation} +\item{ncores}{The number of cores to use in parallel computation.} } \value{ -dynbias an s2dvcube object with a bias correction performed -conditioned by local dimension 'dim' or inverse of persistence 'theta' +An s2dv_cube object with the bias correction performed, +conditioned by local dimension 'dim' or inverse of persistence 'theta'. } \description{ This function performs a bias correction conditioned by the @@ -52,36 +52,36 @@ values with lower 'dim', and the same for theta.
The function gives two options of bias correction: one for 'dim' and/or one for 'theta'. } \examples{ -# example 1: simple data s2dvcube style -set.seed(1) expL <- rnorm(1:2000) -dim (expL) <- c(time =100,lat = 4, lon = 5) -obsL <- c(rnorm(1:1980),expL[1,,]*1.2) -dim (obsL) <- c(time = 100,lat = 4, lon = 5) -time_obsL <- paste(rep("01", 100), rep("01", 100), 1920 : 2019, sep = "-") -time_expL <- paste(rep("01", 100), rep("01", 100), 1929 : 2019, sep = "-") -lon <- seq(-1,5,1.5) -lat <- seq(30,35,1.5) -# qm=0.98 # too high for this short dataset, it is possible that doesn't +dim(expL) <- c(time = 100, lat = 4, lon = 5) +obsL <- c(rnorm(1:1980), expL[1, , ] * 1.2) +dim(obsL) <- c(time = 100, lat = 4, lon = 5) +time_obsL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1920:2019, sep = "-"), + format = "\%d-\%m-\%Y") +time_expL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1929:2019, sep = "-"), + format = "\%d-\%m-\%Y") +lon <- seq(-1, 5, 1.5) +lat <- seq(30, 35, 1.5) +# qm = 0.98 # too high for this short dataset, it is possible that it doesn't # get the requirement; in that case it would be necessary to select a lower qm -# for instance qm=0.60 -expL <- s2dv_cube(data = expL, lat = lat, lon = lon, - Dates = list(start = time_expL, end = time_expL)) -obsL <- s2dv_cube(data = obsL, lat = lat, lon = lon, - Dates = list(start = time_obsL, end = time_obsL)) +# for instance qm = 0.60 +expL <- s2dv_cube(data = expL, coords = list(lon = lon, lat = lat), + Dates = time_expL) +obsL <- s2dv_cube(data = obsL, coords = list(lon = lon, lat = lat), + Dates = time_obsL) # to use DynBiasCorrection dynbias1 <- DynBiasCorrection(exp = expL$data, obs = obsL$data, proxy = "dim", - quanti = 0.6) + quanti = 0.6) # to use CST_DynBiasCorrection dynbias2 <- CST_DynBiasCorrection(exp = expL, obs = obsL, proxy = "dim", - quanti = 0.6) + quanti = 0.6) } \references{ Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. -DOI = https://doi.org/10.1038/s41467-019-09305-8 " +\doi{10.1038/s41467-019-09305-8} Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. diff --git a/man/CST_EnsClustering.Rd b/man/CST_EnsClustering.Rd index 83f5e7a1a1b17c3217044ffaecc09ccb383fd3de..7b1dd6cf4a71ebda8404e98fa93ce1c3d939e49b 100644 --- a/man/CST_EnsClustering.Rd +++ b/man/CST_EnsClustering.Rd @@ -19,112 +19,107 @@ CST_EnsClustering( ) } \arguments{ -\item{exp}{An object of the class 's2dv_cube', containing the variables to be analysed. -Each data object in the list is expected to have an element named \code{$data} with at least two -spatial dimensions named "lon" and "lat", and dimensions "dataset", "member", "ftime", "sdate".} +\item{exp}{An object of the class 's2dv_cube', containing the variables to be +analysed. The element 'data' in the 's2dv_cube' object must have, at +least, spatial and temporal dimensions. Latitudinal dimension accepted +names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal +dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'.} -\item{time_moment}{Decides the moment to be applied to the time dimension.
Can +be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' +(a selected percentile on time). If 'perc', the keyword 'time_percentile' is +also used.} -\item{numclus}{Number of clusters (scenarios) to be calculated. -If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8.} +\item{numclus}{Number of clusters (scenarios) to be calculated. If set to NULL +the number of ensemble members divided by 10 is used, with a minimum of 2 +and a maximum of 8.} \item{lon_lim}{List with the two longitude margins in `c(-180,180)` format.} \item{lat_lim}{List with the two latitude margins.} -\item{variance_explained}{variance (percentage) to be explained by the set of EOFs. -Defaults to 80. Not used if numpcs is specified.} +\item{variance_explained}{Variance (percentage) to be explained by the set of +EOFs. Defaults to 80. Not used if numpcs is specified.} \item{numpcs}{Number of EOFs retained in the analysis (optional).} -\item{time_dim}{String or character array with name(s) of dimension(s) over which to compute statistics. -If omitted c("ftime", "sdate", "time") are searched in this order.} +\item{time_dim}{String or character array with name(s) of dimension(s) over +which to compute statistics. If omitted c("ftime", "sdate", "time") are +searched in this order.} -\item{time_percentile}{Set the percentile in time you want to analyse (used for `time_moment = "perc").} +\item{time_percentile}{Set the percentile in time you want to analyse (used +for `time_moment = "perc"`).} -\item{cluster_dim}{Dimension along which to cluster. Typically "member" or "sdate". -This can also be a list like c("member", "sdate").} +\item{cluster_dim}{Dimension along which to cluster. Typically "member" or +"sdate". This can also be a list like c("member", "sdate").} \item{verbose}{Logical for verbose output} } \value{ -A list with elements \code{$cluster} (cluster assigned for each member), - \code{$freq} (relative frequency of each cluster), \code{$closest_member} - (representative member for each cluster), \code{$repr_field} (list of fields - for each representative member), \code{composites} (list of mean fields for each cluster), - \code{$lon} (selected longitudes of output fields), - \code{$lat} (selected longitudes of output fields). +A list with elements \code{$cluster} (cluster assigned for each +member), \code{$freq} (relative frequency of each cluster), +\code{$closest_member} (representative member for each cluster), +\code{$repr_field} (list of fields for each representative member), +\code{composites} (list of mean fields for each cluster), \code{$lon} +(selected longitudes of output fields), \code{$lat} (selected latitudes of +output fields). } \description{ This function performs a clustering on members/starting dates -and returns a number of scenarios, with representative members for each of them. -The clustering is performed in a reduced EOF space. +and returns a number of scenarios, with representative members for each of +them. The clustering is performed in a reduced EOF space. Motivation: Ensemble forecasts give a probabilistic insight of average weather conditions on extended timescales, i.e. from sub-seasonal to seasonal and beyond. With large ensembles, it is often an advantage to be able to group members -according to similar characteristics and to select the most representative member for each cluster.
-This can be useful to characterize the most probable forecast scenarios in a multi-model -(or single model) ensemble prediction. This approach, applied at a regional level, -can also be used to identify the subset of ensemble members that best represent the -full range of possible solutions for downscaling applications. -The choice of the ensemble members is made flexible in order to meet the requirements -of specific (regional) climate information products, to be tailored for different regions and user needs. +according to similar characteristics and to select the most representative +member for each cluster. This can be useful to characterize the most probable +forecast scenarios in a multi-model (or single model) ensemble prediction. +This approach, applied at a regional level, can also be used to identify the +subset of ensemble members that best represent the full range of possible +solutions for downscaling applications. The choice of the ensemble members is +made flexible in order to meet the requirements of specific (regional) climate +information products, to be tailored for different regions and user needs. Description of the tool: -EnsClustering is a cluster analysis tool, based on the k-means algorithm, for ensemble predictions. -The aim is to group ensemble members according to similar characteristics and -to select the most representative member for each cluster. -The user chooses which feature of the data is used to group the ensemble members by clustering: -time mean, maximum, a certain percentile (e.g., 75% as in the examples below), -standard deviation and trend over the time period. For each ensemble member this value -is computed at each grid point, obtaining N lat-lon maps, where N is the number of ensemble members. -The anomaly is computed subtracting the ensemble mean of these maps to each of the single maps. -The anomaly is therefore computed with respect to the ensemble members (and not with respect to the time) -and the Empirical Orthogonal Function (EOF) analysis is applied to these anomaly maps. -Regarding the EOF analysis, the user can choose either how many Principal Components (PCs) -to retain or the percentage of explained variance to keep. After reducing dimensionality via -EOF analysis, k-means analysis is applied using the desired subset of PCs. - -The major final outputs are the classification in clusters, i.e. which member belongs -to which cluster (in k-means analysis the number k of clusters needs to be defined -prior to the analysis) and the most representative member for each cluster, -which is the closest member to the cluster centroid. -Other outputs refer to the statistics of clustering: in the PC space, the minimum and -the maximum distance between a member in a cluster and the cluster centroid -(i.e. the closest and the furthest member), the intra-cluster standard -deviation for each cluster (i.e. how much the cluster is compact). +EnsClustering is a cluster analysis tool, based on the k-means algorithm, for +ensemble predictions. The aim is to group ensemble members according to +similar characteristics and to select the most representative member for each +cluster. The user chooses which feature of the data is used to group the +ensemble members by clustering: time mean, maximum, a certain percentile +(e.g., 75%), standard deviation and trend over the +time period. For each ensemble member this value is computed at each grid +point, obtaining N lat-lon maps, where N is the number of ensemble members.
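The pipeline described in this paragraph and continued just below can be sketched in base R, with prcomp() and kmeans() standing in for the package's EOF and clustering internals (an editorial sketch; all sizes and values are invented):

# Reduce member anomaly maps with PCA, then cluster in the reduced space.
member_maps <- matrix(rnorm(15 * 100), nrow = 15)     # 15 members x 100 points
anom <- sweep(member_maps, 2, colMeans(member_maps))  # anomaly wrt ens. mean
pcs  <- prcomp(anom)$x[, 1:4]                         # keep 4 PCs
clusters <- kmeans(pcs, centers = 3)$cluster          # 3 scenarios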
+The anomaly is computed by subtracting the ensemble mean of these maps from +each of the single maps. The anomaly is therefore computed with respect to the +ensemble members (and not with respect to the time) and the Empirical +Orthogonal Function (EOF) analysis is applied to these anomaly maps. Regarding +the EOF analysis, the user can choose either how many Principal Components +(PCs) to retain or the percentage of explained variance to keep. After +reducing dimensionality via EOF analysis, k-means analysis is applied using +the desired subset of PCs. + +The major final outputs are the classification in clusters, i.e. which member +belongs to which cluster (in k-means analysis the number k of clusters needs +to be defined prior to the analysis) and the most representative member for +each cluster, which is the closest member to the cluster centroid. Other +outputs refer to the statistics of clustering: in the PC space, the minimum +and the maximum distance between a member in a cluster and the cluster +centroid (i.e. the closest and the furthest member), the intra-cluster +standard deviation for each cluster (i.e. how compact the cluster is). } \examples{ -\donttest{ -exp <- lonlat_temp$exp -# Example 1: Cluster on all start dates, members and models -res <- CST_EnsClustering(exp, numclus = 3, - cluster_dim = c("member", "dataset", "sdate")) -iclus <- res$cluster[2, 1, 3] - -#print(paste("Cluster of 2. member, 1. dataset, 3. sdate:", iclus)) -#print(paste("Frequency (numerosity) of cluster (", iclus, ") :", res$freq[iclus])) -s2dv::PlotEquiMap(res$repr_field[iclus, , ], exp$lon, exp$lat, - filled.continents = FALSE, - toptitle = paste("Representative field of cluster", iclus)) - -# Example 2: Cluster on members retaining 4 EOFs during -# preliminary dimensional reduction -res <- CST_EnsClustering(exp, numclus = 3, numpcs = 4, cluster_dim = "member") - -# Example 3: Cluster on members, retain 80\% of variance during -# preliminary dimensional reduction -res <- CST_EnsClustering(exp, numclus = 3, variance_explained = 80, - cluster_dim = "member") - -# Example 4: Compute percentile in time -res <- CST_EnsClustering(exp, numclus = 3, time_percentile = 90, - time_moment = "perc", cluster_dim = "member") -} +dat_exp <- array(abs(rnorm(1152)) * 275, dim = c(dataset = 1, member = 4, + sdate = 6, ftime = 3, + lat = 4, lon = 4)) +lon <- seq(0, 3) +lat <- seq(48, 45) +coords <- list(lon = lon, lat = lat) +exp <- list(data = dat_exp, coords = coords) +attr(exp, 'class') <- 's2dv_cube' +res <- CST_EnsClustering(exp = exp, numclus = 3, + cluster_dim = c("sdate")) } \author{ diff --git a/man/CST_InsertDim.Rd b/man/CST_InsertDim.Rd new file mode 100644 index 0000000000000000000000000000000000000000..2aa22a5bc84617e92c9ae743a113d871f22faa55 --- /dev/null +++ b/man/CST_InsertDim.Rd @@ -0,0 +1,53 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CST_InsertDim.R +\name{CST_InsertDim} +\alias{CST_InsertDim} +\title{Add a named dimension to an object of class s2dv_cube} +\usage{ +CST_InsertDim(data, posdim, lendim, name, values = NULL) +} +\arguments{ +\item{data}{An object of class \code{s2dv_cube} to which the additional +dimension should be added.} + +\item{posdim}{An integer indicating the position of the new dimension.} + +\item{lendim}{An integer indicating the length of the new dimension.} + +\item{name}{A character string indicating the name for the new dimension.} + +\item{values}{A vector containing the values of the new dimension and any +relevant attributes.
If NULL, a sequence of integers from 1 to lendim will +be added.} +} +\value{ +An object of class \code{s2dv_cube} with similar data, coordinates and +attributes as the \code{data} input, but with an additional dimension. +} +\description{ +Insert an extra dimension into an array at position 'posdim' with length +'lendim'. The array in \code{data} repeats along the new dimension. +The dimensions, coordinates and attributes are modified accordingly. +} +\examples{ +# Example with sample data: +# Check original dimensions and coordinates +lonlat_temp$exp$dims +names(lonlat_temp$exp$coords) +# Add 'variable' dimension +exp <- CST_InsertDim(lonlat_temp$exp, + posdim = 2, + lendim = 1, + name = "variable", + values = c("tas")) +# Check new dimensions and coordinates +exp$dims +exp$coords$variable + +} +\seealso{ +\link[s2dv]{InsertDim} +} +\author{ +Agudetse Roures Victoria, \email{victoria.agudetse@bsc.es} +} diff --git a/man/CST_MergeDims.Rd b/man/CST_MergeDims.Rd index 0762e83f9567f082040a8692ebf18c2c90ca45fd..5f9b4d1301a4bdf34ecdcddffae5015ce7df81b9 100644 --- a/man/CST_MergeDims.Rd +++ b/man/CST_MergeDims.Rd @@ -12,32 +12,34 @@ CST_MergeDims( ) } \arguments{ -\item{data}{a 's2dv_cube' object} +\item{data}{An 's2dv_cube' object.} -\item{merge_dims}{a character vector indicating the names of the dimensions to merge} +\item{merge_dims}{A character vector indicating the names of the dimensions to +merge.} -\item{rename_dim}{a character string indicating the name of the output dimension. If left at NULL, the first dimension name provided in parameter \code{merge_dims} will be used.} +\item{rename_dim}{A character string indicating the name of the output +dimension. If left at NULL, the first dimension name provided in parameter +\code{merge_dims} will be used.} -\item{na.rm}{a logical indicating if the NA values should be removed or not.} +\item{na.rm}{A logical indicating if the NA values should be removed or not.} } \description{ -This function merges two dimensions of the array \code{data} in a 's2dv_cube' object into one. The user can select the dimensions to merge and provide the final name of the dimension. The user can select to remove NA values or keep them. +This function merges two dimensions of the array \code{data} in an +'s2dv_cube' object into one. The user can select the dimensions to merge and +provide the final name of the dimension. The user can select to remove NA +values or keep them. } \examples{ - data <- 1 : c(2 * 3 * 4 * 5 * 6 * 7) dim(data) <- c(time = 7, lat = 2, lon = 3, monthly = 4, member = 6, dataset = 5, var = 1) data[2,,,,,,] <- NA data[c(3,27)] <- NA -data <-list(data = data) +data <- list(data = data) class(data) <- 's2dv_cube' new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly')) -dim(new_data$data) new_data <- CST_MergeDims(data, merge_dims = c('lon', 'lat'), rename_dim = 'grid') -dim(new_data$data) new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly'), na.rm = TRUE) -dim(new_data$data) } \author{ Nuria Perez-Zanon, \email{nuria.perez@bsc.es} diff --git a/man/CST_MultiEOF.Rd b/man/CST_MultiEOF.Rd index 0621112e7932655715184c63d67aa4b038716475..11f8877fa0ada4c68692da0052eba93ad1567bdc 100644 --- a/man/CST_MultiEOF.Rd +++ b/man/CST_MultiEOF.Rd @@ -14,61 +14,64 @@ CST_MultiEOF( ) } \arguments{ -\item{datalist}{A list of objects of the class 's2dv_cube', containing the variables to be analysed.
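Returning briefly to CST_MergeDims documented just above: for adjacent dimensions, the merge of the bare array is just a reshape in base-R terms (an editorial sketch; the package additionally updates coordinates and attributes):

# Merge 'time' (length 4) and 'monthly' (length 3) into one dimension of 12.
x <- array(1:24, dim = c(time = 4, monthly = 3, lat = 2))
merged <- array(x, dim = c(time = 12, lat = 2))  # same data, collapsed index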
-Each data object in the list is expected to have an element named \code{$data} with at least two
-spatial dimensions named "lon" and "lat", a dimension "ftime" and a dimension "sdate".}
+\item{datalist}{A list of objects of the class 's2dv_cube', containing the
+variables to be analysed. Each data object in the list is expected to have
+an element named \code{$data} with at least two spatial dimensions named
+"lon" and "lat", a dimension "ftime" and a dimension "sdate". Latitudinal
+dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'.
+Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i',
+'nav_lon'.}
 
-\item{neof_max}{Maximum number of single eofs considered in the first decomposition}
+\item{neof_max}{Maximum number of single EOFs considered in the first
+decomposition.}
 
-\item{neof_composed}{Number of composed eofs to return in output}
+\item{neof_composed}{Number of composed EOFs to return in output.}
 
-\item{minvar}{Minimum variance fraction to be explained in first decomposition}
+\item{minvar}{Minimum variance fraction to be explained in the first
+decomposition.}
 
-\item{lon_lim}{Vector with longitudinal range limits for the EOF calculation for all input variables}
+\item{lon_lim}{Vector with longitudinal range limits for the EOF calculation
+for all input variables.}
 
-\item{lat_lim}{Vector with latitudinal range limits for the EOF calculation for all input variables}
+\item{lat_lim}{Vector with latitudinal range limits for the EOF calculation
+for all input variables.}
 }
 \value{
-A list with elements \code{$coeff} (an array of time-varying principal component coefficients),
- \code{$variance} (a matrix of explained variances),
- \code{eof_pattern} (a matrix of EOF patterns obtained by regression for each variable).
+A list with elements \code{$coeff} (an array of time-varying principal
+component coefficients), \code{$variance} (a matrix of explained variances),
+\code{$eof_pattern} (a matrix of EOF patterns obtained by regression for each
+variable).
 }
 \description{
 This function performs EOF analysis over multiple variables,
-accepting in input a list of CSTools objects. Based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data for the latitude cosine square root.
+accepting as input a list of CSTools objects. It is based on Singular Value
+Decomposition. For each field the EOFs are computed and the corresponding PCs
+are standardized (unit variance, zero mean); the minimum number of principal
+components needed to reach the user-defined variance is retained. The function
+weights the input data by the square root of the latitude cosine.
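As a rough standalone illustration of the procedure just described (latitude
weighting, SVD, standardized PCs, variance-based truncation), here is a
base-R sketch; toy sizes and names are assumptions, not the CST_MultiEOF
implementation:

set.seed(1)
nt <- 20; nlat <- 6; nlon <- 8                      # toy field: time x grid
lat <- seq(0, 25, 5)
field <- matrix(rnorm(nt * nlat * nlon), nrow = nt)
# Weight each grid point by the square root of the cosine of its latitude
w <- sqrt(cos(rep(lat * pi / 180, times = nlon)))
anom <- scale(field, center = TRUE, scale = FALSE)  # anomalies over time
sv <- svd(sweep(anom, 2, w, "*"))
pcs <- scale(sv$u %*% diag(sv$d))                   # standardized PCs
expl_var <- sv$d^2 / sum(sv$d^2)                    # explained variance fraction
neof <- which(cumsum(expl_var) >= 0.9)[1]           # PCs needed for 90 percent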
} \examples{ -\donttest{ -library(zeallot) -library(ClimProjDiags) -c(exp, obs) \%<-\% lonlat_temp -# Create three datasets (from the members) -exp1 <- exp -exp2 <- exp -exp3 <- exp -exp1$data <- Subset(exp$data, along = 2, indices = 1 : 5) -exp2$data <- Subset(exp$data, along = 2, indices = 6 : 10) -exp3$data <- Subset(exp$data, along = 2, indices = 11 : 15) +seq <- 1 : (2 * 3 * 4 * 5 * 6 * 8) +mod1 <- sin( 0.7 + seq )^2 + cos( seq ^ 2 * 1.22 ) +dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, + lon = 8) +mod2 <- sin( seq * 2 ) ^ 3 + cos( seq ^ 2 ) +dim(mod2) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, + lon = 8) +lon <- seq(0, 35, 5) +lat <- seq(0, 25, 5) +exp1 <- list(data = mod1, coords = list(lat = lat, lon = lon)) +exp2 <- list(data = mod2, coords = list(lat = lat, lon = lon)) +attr(exp1, 'class') <- 's2dv_cube' +attr(exp2, 'class') <- 's2dv_cube' +d = as.POSIXct(c("2017/01/01", "2017/01/02", "2017/01/03", "2017/01/04", + "2017/01/05", "2018/01/01", "2018/01/02", "2018/01/03", + "2018/01/04", "2018/01/05", "2019/01/01", "2019/01/02", + "2019/01/03", "2019/01/04", "2019/01/05", "2020/01/01", + "2020/01/02", "2020/01/03", "2020/01/04", "2020/01/05")) +exp1$attrs$Dates = d +exp2$attrs$Dates = d -cal <- CST_MultiEOF(list(exp1, exp2, exp3), neof_max=5, neof_composed=2) -str(cal) -# List of 3 -# $ coeff : num [1:3, 1:6, 1:2, 1:5] -0.312 -0.588 0.724 1.202 1.181 ... -# $ variance : num [1:2, 1:5] 0.413 0.239 0.352 0.27 0.389 ... -# $ eof_pattern: num [1:3, 1:53, 1:22, 1:2, 1:5] -1.47 -0.446 -0.656 -1.534 -0.464 ... -dim(cal$coeff) -# ftime sdate eof member -# 3 6 2 3 - -cal <- CST_MultiEOF(list(exp1, exp2, exp3) , minvar=0.9) -str(cal) -# $ coeff : num [1:3, 1:6, 1:5, 1:5] 0.338 0.603 -0.736 -1.191 -1.198 ... -# $ variance : num [1:5, 1:5] 0.3903 0.2264 0.1861 0.1032 0.0379 ... -# $ eof_pattern: num [1:3, 1:53, 1:22, 1:5, 1:5] 1.477 0.454 0.651 1.541 0.47 ... - -cal <- CST_MultiEOF(list(exp1, exp2)) -cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim=c(5, 30), lat_lim=c(35, 50), neof_composed=3) -} +cal <- CST_MultiEOF(datalist = list(exp1, exp2), neof_composed = 2) } \author{ Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} diff --git a/man/CST_MultiMetric.Rd b/man/CST_MultiMetric.Rd index 5590ab893945db323d703490692be975c704ceed..7b4bc000e0b90af84ee636d6507ed78cd18115e6 100644 --- a/man/CST_MultiMetric.Rd +++ b/man/CST_MultiMetric.Rd @@ -15,55 +15,67 @@ CST_MultiMetric( ) } \arguments{ -\item{exp}{an object of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the anomaly of the seasonal forecast experiments data in the element named \code{$data}.} +\item{exp}{An object of class \code{s2dv_cube} as returned by +\code{CST_Anomaly} function, containing the anomaly of the seasonal forecast +experiments data in the element named \code{$data}.} -\item{obs}{an object of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the anomaly of observed data in the element named \code{$data}.} +\item{obs}{An object of class \code{s2dv_cube} as returned by +\code{CST_Anomaly} function, containing the anomaly of observed data in the +element named \code{$data}.} -\item{metric}{a character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms', 'rmsss' and 'rpss'. If 'rpss' is chossen the terciles probabilities are evaluated.} +\item{metric}{A character string giving the metric for computing the maximum +skill. 
This must be one of the strings 'correlation', 'rms', 'rmsss' and
+'rpss'. If 'rpss' is chosen, the tercile probabilities are evaluated.}
 
-\item{multimodel}{a logical value indicating whether a Multi-Model Mean should be computed.}
+\item{multimodel}{A logical value indicating whether a Multi-Model Mean should
+be computed.}
 
-\item{time_dim}{name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'.}
+\item{time_dim}{Name of the temporal dimension where a mean will be applied.
+It can be NULL; the default value is 'ftime'.}
 
-\item{memb_dim}{name of the member dimension. It can be NULL, the default value is 'member'.}
+\item{memb_dim}{Name of the member dimension. It can be NULL; the default
+value is 'member'.}
 
-\item{sdate_dim}{name of the start date dimension or a dimension name identifiying the different forecast. It can be NULL, the default value is 'sdate'.}
+\item{sdate_dim}{Name of the start date dimension or a dimension name
+identifying the different forecasts. It can be NULL; the default value is
+'sdate'.}
 }
 \value{
-an object of class \code{s2dv_cube} containing the statistics of the selected metric in the element \code{$data} which is a list of arrays: for the metric requested and others for statistics about its signeificance. The arrays have two dataset dimensions equal to the 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. If \code{multimodel} is TRUE, the first position in the first 'nexp' dimension correspons to the Multi-Model Mean.
+An object of class \code{s2dv_cube} containing the statistics of the
+selected metric in the element \code{$data}, which is a list of arrays: one
+for the metric requested and others for statistics about its significance.
+The arrays have two dataset dimensions equal to the 'dataset' dimension in
+the \code{exp$data} and \code{obs$data} inputs. If \code{multimodel} is TRUE,
+the first position in the first 'nexp' dimension corresponds to the
+Multi-Model Mean.
 }
 \description{
-This function calculates correlation (Anomaly Correlation Coefficient; ACC), root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and multi-models mean (if desired) with the observations.
+This function calculates correlation (Anomaly Correlation
+Coefficient; ACC), root mean square error (RMS) and the root mean square
+error skill score (RMSSS) of individual anomaly models and the multi-model
+mean (if desired) with respect to the observations.
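For a single grid point the three deterministic metrics reduce to a few lines
of base R; the sketch below uses made-up data, and the RMSSS reference (a
zero-anomaly climatology) is an assumption for illustration, not necessarily
the package's internal choice:

set.seed(1)
fc <- rnorm(30)                      # forecast anomalies
ob <- fc + rnorm(30, sd = 0.5)       # pseudo-observed anomalies
acc <- cor(fc, ob)                   # anomaly correlation coefficient (ACC)
rms <- sqrt(mean((fc - ob)^2))       # root mean square error (RMS)
rmsss <- 1 - rms / sqrt(mean(ob^2))  # RMS skill score vs. zero-anomaly reference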
 }
 \examples{
-library(zeallot)
-mod <- 1 : (2 * 3 * 4 * 5 * 6 * 7)
-dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
-obs <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
-dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
+mod <- rnorm(2*2*4*5*2*2)
+dim(mod) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2)
+obs <- rnorm(1*1*4*5*2*2)
+dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2)
 lon <- seq(0, 30, 5)
 lat <- seq(0, 25, 5)
-exp <- list(data = mod, lat = lat, lon = lon)
-obs <- list(data = obs, lat = lat, lon = lon)
+coords <- list(lat = lat, lon = lon)
+exp <- list(data = mod, coords = coords)
+obs <- list(data = obs, coords = coords)
 attr(exp, 'class') <- 's2dv_cube'
 attr(obs, 'class') <- 's2dv_cube'
-c(ano_exp, ano_obs) \%<-\% CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = TRUE)
-a <- CST_MultiMetric(exp = ano_exp, obs = ano_obs)
-str(a)
-\donttest{
-exp <- lonlat_temp$exp
-obs <- lonlat_temp$obs
-a <- CST_MultiMetric(exp, obs, metric = 'rpss', multimodel = FALSE)
-a <- CST_MultiMetric(exp, obs, metric = 'correlation')
-a <- CST_MultiMetric(exp, obs, metric = 'rms')
-a <- CST_MultiMetric(exp, obs, metric = 'rmsss')
-}
+a <- CST_MultiMetric(exp = exp, obs = obs)
 }
 \references{
-Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31.\url{https://link.springer.com/article/10.1007/s00382-018-4404-z}
+Mishra, N., Prodhomme, C., & Guemas, V. (2019). Multi-Model Skill
+Assessment of Seasonal Temperature and Precipitation Forecasts over Europe.
+Climate Dynamics, 52, 4207-4225. \doi{10.1007/s00382-018-4404-z}
 }
 \seealso{
-\code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}}, \code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}}
+\code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}},
+\code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}}
 }
 \author{
 Mishra Niti, \email{niti.mishra@bsc.es}
diff --git a/man/CST_MultivarRMSE.Rd b/man/CST_MultivarRMSE.Rd
index 577c08dbd8290241d3da8606d418bf6b835ca75d..6fa8d1b5c377097d1263d2ddb5f8cd417fbcbddb 100644
--- a/man/CST_MultivarRMSE.Rd
+++ b/man/CST_MultivarRMSE.Rd
@@ -4,57 +4,92 @@
 \alias{CST_MultivarRMSE}
 \title{Multivariate Root Mean Square Error (RMSE)}
 \usage{
-CST_MultivarRMSE(exp, obs, weight = NULL)
+CST_MultivarRMSE(
+  exp,
+  obs,
+  weight = NULL,
+  memb_dim = "member",
+  dat_dim = "dataset",
+  sdate_dim = "sdate",
+  ftime_dim = "ftime"
+)
 }
 \arguments{
-\item{exp}{a list of objects, one for each variable, of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the anomaly of the seasonal forecast experiment data in the element named \code{$data}.}
+\item{exp}{A list of objects, one for each variable, of class \code{s2dv_cube}
+as returned by \code{CST_Anomaly} function, containing the anomaly of the
+seasonal forecast experiment data in the element named \code{$data}.}
 
-\item{obs}{a list of objects, one for each variable (in the same order as the input in 'exp') of class \code{s2dv_cube} as returned by \code{CST_Anomaly} function, containing the observed anomaly data in the element named \code{$data}.}
+\item{obs}{A list of objects, one for each variable (in the same order as
+the input in 'exp') of class \code{s2dv_cube} as returned by
+\code{CST_Anomaly} function, containing the observed anomaly data in the
+element named \code{$data}.}
 
-\item{weight}{(optional) a vector of weight values to assign to each variable.
If no weights are defined, a value of 1 is assigned to every variable.}
+\item{weight}{(optional) A vector of weight values to assign to each variable.
+If no weights are defined, a value of 1 is assigned to every variable.}
+
+\item{memb_dim}{A character string indicating the name of the member
+dimension. It must be one dimension in 'exp' and 'obs'. The default value is
+'member'.}
+
+\item{dat_dim}{A character string indicating the name of the dataset
+dimension. It must be one dimension in 'exp' and 'obs'. If there is no
+dataset dimension, it can be NULL. The default value is 'dataset'.}
+
+\item{sdate_dim}{A character string indicating the name of the start date
+dimension. It must be one dimension in 'exp' and 'obs'. The default value is
+'sdate'.}
+
+\item{ftime_dim}{A character string indicating the name of the forecast time
+dimension. It must be one dimension in 'exp' and 'obs'. The default value is
+'ftime'.}
 }
 \value{
-an object of class \code{s2dv_cube} containing the RMSE in the element \code{$data} which is an array with two datset dimensions equal to the 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. An array with dimensions: c(number of exp, number of obs, 1 (the multivariate RMSE value), number of lat, number of lon)
+An object of class \code{s2dv_cube} containing the RMSE in the element
+ \code{$data}, which is an array with two dataset dimensions equal to the
+ 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs: an
+ array with dimensions c(number of exp, number of obs, 1 (the multivariate
+ RMSE value), number of lat, number of lon).
 }
 \description{
-This function calculates the RMSE from multiple variables, as the mean of each variable's RMSE scaled by its observed standard deviation. Variables can be weighted based on their relative importance (defined by the user).
+This function calculates the RMSE from multiple variables, as the
+mean of each variable's RMSE scaled by its observed standard deviation.
+Variables can be weighted based on their relative importance (defined by the
+user).
 }
 \examples{
-# Creation of sample s2dv objects. These are not complete s2dv objects
-# though. The Load function returns complete objects.
-# using package zeallot is optional: -library(zeallot) # Example with 2 variables -mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) -mod2 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) +mod1 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7)) +mod2 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7)) dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) dim(mod2) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) -obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) -obs2 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) +obs1 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7)) +obs2 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7)) dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) dim(obs2) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) lon <- seq(0, 30, 5) lat <- seq(0, 25, 5) -exp1 <- list(data = mod1, lat = lat, lon = lon, Datasets = "EXP1", - source_files = "file1", Variable = list('pre')) +coords <- list(lat = lat, lon = lon) +exp1 <- list(data = mod1, coords = coords, + attrs = list(Datasets = "EXP1", source_files = "file1", + Variable = list(varName = 'pre'))) +exp2 <- list(data = mod2, coords = coords, + attrs = list(Datasets = "EXP2", source_files = "file2", + Variable = list(varName = 'tas'))) +obs1 <- list(data = obs1, coords = coords, + attrs = list(Datasets = "OBS1", source_files = "file1", + Variable = list(varName = 'pre'))) +obs2 <- list(data = obs2, coords = coords, + attrs = list(Datasets = "OBS2", source_files = "file2", + Variable = list(varName = 'tas'))) attr(exp1, 'class') <- 's2dv_cube' -exp2 <- list(data = mod2, lat = lat, lon = lon, Datasets = "EXP2", - source_files = "file2", Variable = list('tas')) attr(exp2, 'class') <- 's2dv_cube' -obs1 <- list(data = obs1, lat = lat, lon = lon, Datasets = "OBS1", - source_files = "file1", Variable = list('pre')) attr(obs1, 'class') <- 's2dv_cube' -obs2 <- list(data = obs2, lat = lat, lon = lon, Datasets = "OBS2", - source_files = "file2", Variable = list('tas')) attr(obs2, 'class') <- 's2dv_cube' - -c(ano_exp1, ano_obs1) \%<-\% CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE) -c(ano_exp2, ano_obs2) \%<-\% CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE) -ano_exp <- list(exp1, exp2) -ano_obs <- list(ano_obs1, ano_obs2) -weight <- c(1, 2) -a <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = weight) -str(a) +anom1 <- CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE) +anom2 <- CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE) +ano_exp <- list(anom1$exp, anom2$exp) +ano_obs <- list(anom1$obs, anom2$obs) +a <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = c(1, 2)) } \seealso{ \code{\link[s2dv]{RMS}} and \code{\link{CST_Load}} diff --git a/man/CST_ProxiesAttractor.Rd b/man/CST_ProxiesAttractor.Rd index ddf7df27e9431f104d0e127756b462729eee996e..58f949af388ee6c572421514b885d1e487e43f46 100644 --- a/man/CST_ProxiesAttractor.Rd +++ b/man/CST_ProxiesAttractor.Rd @@ -7,12 +7,13 @@ CST_ProxiesAttractor(data, quanti, ncores = NULL) } \arguments{ -\item{data}{a s2dv_cube object with the data to create the attractor. Must be a matrix with the timesteps in nrow -and the grids in ncol(dat(time,grids)} +\item{data}{An s2dv_cube object with the data to create the attractor. 
Must be
+a matrix with the time steps in nrow and the grids in ncol: dat(time, grids).}
 
-\item{quanti}{a number lower than 1 indicating the quantile to perform the computation of local dimension and theta}
+\item{quanti}{A number lower than 1 indicating the quantile to perform the
+computation of local dimension and theta.}
 
-\item{ncores}{The number of cores to use in parallel computation}
+\item{ncores}{The number of cores to use in parallel computation.}
 }
 \value{
 dim and theta
@@ -21,20 +22,27 @@ dim and theta
 This function computes two dinamical proxies of the attractor:
 The local dimension (d) and the inverse of the persistence (theta) for an
 's2dv_cube' object.
-These two parameters will be used as a condition for the computation of dynamical
-scores to measure predictability and to compute bias correction conditioned by
-the dynamics with the function DynBiasCorrection
-Funtion based on the matlab code (davide.faranda@lsce.ipsl.fr) used in
+These two parameters will be used as a condition for the computation of
+dynamical scores to measure predictability and to compute bias correction
+conditioned by the dynamics with the function DynBiasCorrection. This
+function is based on the MATLAB code (davide.faranda@lsce.ipsl.fr) used in
+the references below.
 }
 \examples{
 # Example 1: Computing the attractor using simple s2dv data
-attractor <- CST_ProxiesAttractor(data = lonlat_temp$obs, quanti = 0.6)
-
+obs <- rnorm(2 * 3 * 4 * 8 * 8)
+dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+lon <- seq(10, 13.5, 0.5)
+lat <- seq(40, 43.5, 0.5)
+coords <- list(lon = lon, lat = lat)
+data <- list(data = obs, coords = coords)
+class(data) <- "s2dv_cube"
+attractor <- CST_ProxiesAttractor(data = data, quanti = 0.6)
 }
 \references{
-Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019).
-The hammam effect or how a warm ocean enhances large scale atmospheric predictability.
-Nature Communications, 10(1), 1316. DOI = https://doi.org/10.1038/s41467-019-09305-8 "
+Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D.,
+and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large
+scale atmospheric predictability. Nature Communications, 10(1), 1316.
+\doi{10.1038/s41467-019-09305-8}
 
 Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of
 North Atlantic predictability and extremes.
diff --git a/man/CST_QuantileMapping.Rd b/man/CST_QuantileMapping.Rd
index 71b32da9f87dbca3d93f1c6b33736bb42ccd2a7a..0ca3423f9d64002f088417482ca195593cb8bdff 100644
--- a/man/CST_QuantileMapping.Rd
+++ b/man/CST_QuantileMapping.Rd
@@ -30,16 +30,17 @@ is applied in object 'exp'.}
 cross-validation would be applied when exp_cor is not provided. 'sdate' by
 default.}
 
-\item{memb_dim}{A character string indicating the dimension name where
-ensemble members are stored in the experimental arrays. 'member' by default.}
+\item{memb_dim}{A character string indicating the dimension name where
+ensemble members are stored in the experimental arrays. It can be NULL if
+there is no ensemble member dimension. It is set as 'member' by default.}
 
 \item{window_dim}{A character string indicating the dimension name where
 samples have been stored. It can be NULL (default) in case all samples are
 used.}
 
-\item{method}{A character string indicating the method to be used:'PTF',
-'DIST','RQUANT','QUANT','SSPLIN'. By default, the empirical quantile mapping
-'QUANT' is used.}
+\item{method}{A character string indicating the method to be used: 'PTF',
+'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile
+mapping 'QUANT' is used.}
 
 \item{na.rm}{A logical value indicating if missing values should be removed
 (FALSE by default).}
@@ -67,29 +68,13 @@
 exp$data <- 1 : c(1 * 3 * 5 * 4 * 3 * 2)
 dim(exp$data) <- c(dataset = 1, member = 3, sdate = 5, ftime = 4,
                    lat = 3, lon = 2)
 class(exp) <- 's2dv_cube'
+obs <- NULL
 obs$data <- 101 : c(100 + 1 * 1 * 5 * 4 * 3 * 2)
 dim(obs$data) <- c(dataset = 1, member = 1, sdate = 5, ftime = 4,
                    lat = 3, lon = 2)
 class(obs) <- 's2dv_cube'
 res <- CST_QuantileMapping(exp, obs)
 
-# Use data in package
-\donttest{
-exp <- lonlat_temp$exp
-exp$data <- exp$data[, , 1:4, , 1:2, 1:3]
-dim(exp$data) <- c(dataset = 1, member = 15, sdate = 4, ftime = 3,
-                   lat = 2, lon = 3)
-obs <- lonlat_temp$obs
-obs$data <- obs$data[, , 1:4, , 1:2, 1:3]
-dim(obs$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
-                   lat = 2, lon = 3)
-exp_cor <- lonlat_temp$exp
-exp_cor$data <- exp_cor$data[, 1, 5:6, , 1:2, 1:3]
-dim(exp_cor$data) <- c(dataset = 1, member = 1, sdate = 2, ftime = 3,
-                       lat = 2, lon = 3)
-res <- CST_QuantileMapping(exp, obs, exp_cor, window_dim = 'ftime')
-}
-
 }
 \seealso{
 \code{\link[qmap]{fitQmap}} and \code{\link[qmap]{doQmap}}
diff --git a/man/CST_RFSlope.Rd b/man/CST_RFSlope.Rd
index b76ac93ef98a9ee157a71a664ab86a24824c84c3..3eeb1a0f3a4fc98d93fe2aa13d74491d91dc5095 100644
--- a/man/CST_RFSlope.Rd
+++ b/man/CST_RFSlope.Rd
@@ -4,50 +4,48 @@
 \alias{CST_RFSlope}
 \title{RainFARM spectral slopes from a CSTools object}
 \usage{
-CST_RFSlope(data, kmin = 1, time_dim = NULL, ncores = 1)
+CST_RFSlope(data, kmin = 1, time_dim = NULL, ncores = NULL)
 }
 \arguments{
-\item{data}{An object of the class 's2dv_cube', containing the spatial precipitation fields to downscale.
-The data object is expected to have an element named \code{$data} with at least two
-spatial dimensions named "lon" and "lat" and one or more dimensions over which
-to average these slopes, which can be specified by parameter \code{time_dim}.}
+\item{data}{An object of the class 's2dv_cube', containing the spatial
+precipitation fields to downscale. The data object is expected to have an
+element named \code{$data} with at least two spatial dimensions named "lon"
+and "lat" and one or more dimensions over which to average these slopes,
+which can be specified by parameter \code{time_dim}.}
 
 \item{kmin}{First wavenumber for spectral slope (default \code{kmin=1}).}
 
-\item{time_dim}{String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...)
-over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes
-will be computed as an average over all elements belonging to those dimensions.
-If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen.}
+\item{time_dim}{String or character array with name(s) of dimension(s) (e.g.
+"ftime", "sdate", "member" ...) over which to compute spectral slopes. If a
+character array of dimension names is provided, the spectral slopes will be
+computed as an average over all elements belonging to those dimensions. If
+omitted, one of c("ftime", "sdate", "time") is searched and the first one
+with more than one element is chosen.}
 
-\item{ncores}{is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.}
+\item{ncores}{An integer that indicates the number of cores for parallel
+computations using the multiApply function. The default value is NULL (one
+core).}
 }
 \value{
 CST_RFSlope() returns spectral slopes using the RainFARM convention
-(the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
-The returned array has the same dimensions as the \code{exp} element of the input object,
-minus the dimensions specified by \code{lon_dim}, \code{lat_dim} and \code{time_dim}.
+ (the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
+ The returned array has the same dimensions as the \code{exp} element of the
+ input object, minus the dimensions specified by \code{lon_dim},
+ \code{lat_dim} and \code{time_dim}.
 }
 \description{
-This function computes spatial spectral slopes from a CSTools object
-to be used for RainFARM stochastic precipitation downscaling method and accepts a CSTools object (of the class 's2dv_cube') as input.
+This function computes spatial spectral slopes from a CSTools
+object, to be used for the RainFARM stochastic precipitation downscaling
+method, and accepts a CSTools object (of the class 's2dv_cube') as input.
 }
 \examples{
-#Example using CST_RFSlope for a CSTools object
 exp <- 1 : (2 * 3 * 4 * 8 * 8)
 dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
 lon <- seq(10, 13.5, 0.5)
-dim(lon) <- c(lon = length(lon))
 lat <- seq(40, 43.5, 0.5)
-dim(lat) <- c(lat = length(lat))
-data <- list(data = exp, lon = lon, lat = lat)
+coords <- list(lon = lon, lat = lat)
+data <- list(data = exp, coords = coords)
+class(data) <- 's2dv_cube'
 slopes <- CST_RFSlope(data)
-dim(slopes)
-# dataset member sdate
-# 1 2 3
-slopes
-# [,1] [,2] [,3]
-#[1,] 1.893503 1.893503 1.893503
-#[2,] 1.893503 1.893503 1.893503
 }
 \author{
 Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
diff --git a/man/CST_RFTemp.Rd b/man/CST_RFTemp.Rd
index 8ab5b6f329b63174cab568199ee4e5245b8927e5..ce2bd29e8a6e5c2fc612aaed545e91c63b11b15e 100644
--- a/man/CST_RFTemp.Rd
+++ b/man/CST_RFTemp.Rd
@@ -23,52 +23,53 @@ CST_RFTemp(
 )
 }
 \arguments{
 \item{data}{An object of the class 's2dv_cube' as returned by `CST_Load`,
-containing the temperature fields to downscale.
-The data object is expected to have an element named \code{$data}
-with at least two spatial dimensions named "lon" and "lat".
-(these default names can be changed with the \code{lon_dim} and
-\code{lat_dim} parameters)}
+containing the temperature fields to downscale. The data object is expected
+to have an element named \code{$data} with at least two spatial dimensions
+named "lon" and "lat" (these default names can be changed with the
+\code{lon_dim} and \code{lat_dim} parameters).}
 
 \item{oro}{An object of the class 's2dv_cube' as returned by `CST_Load`,
-containing fine scale orography (in meters).
-The destination downscaling area must be contained in the orography field.}
+containing fine scale orography (in meters). The destination downscaling
+area must be contained in the orography field.}
 
-\item{xlim}{vector with longitude bounds for downscaling;
-the full input field is downscaled if `xlim` and `ylim` are not specified.}
+\item{xlim}{Vector with longitude bounds for downscaling; the full input
+field is downscaled if `xlim` and `ylim` are not specified.}
 
-\item{ylim}{vector with latitude bounds for downscaling}
+\item{ylim}{Vector with latitude bounds for downscaling.}
 
-\item{lapse}{float with environmental lapse rate}
+\item{lapse}{Float with the environmental lapse rate.}
 
-\item{lon_dim}{string with name of longitude dimension}
+\item{lon_dim}{String with the name of the longitude dimension.}
 
-\item{lat_dim}{string with name of latitude dimension}
+\item{lat_dim}{String with the name of the latitude dimension.}
 
-\item{time_dim}{a vector of character string indicating the name of temporal dimension. By default, it is set to NULL and it considers "ftime", "sdate" and "time" as temporal dimensions.}
+\item{time_dim}{A vector of character strings indicating the name of the
+temporal dimension. By default, it is set to NULL and it considers "ftime",
+"sdate" and "time" as temporal dimensions.}
 
-\item{nolapse}{logical, if true `oro` is interpreted as a fine-scale
-climatology and used directly for bias correction}
+\item{nolapse}{Logical; if TRUE, `oro` is interpreted as a fine-scale
+climatology and used directly for bias correction.}
 
-\item{verbose}{logical if to print diagnostic output}
+\item{verbose}{Logical; whether to print diagnostic output.}
 
-\item{compute_delta}{logical if true returns only a delta to be used for
+\item{compute_delta}{Logical; if TRUE, returns only a delta to be used for
 out-of-sample forecasts. Returns an object of the class 's2dv_cube',
 containing a delta. Activates `nolapse = TRUE`.}
 
-\item{method}{string indicating the method used for interpolation:
+\item{method}{String indicating the method used for interpolation:
 "nearest" (nearest neighbours followed by smoothing with a circular
 uniform weights kernel), "bilinear" (bilinear interpolation)
 The two methods provide similar results, but nearest is slightly better
 provided that the fine-scale grid is correctly centered as a subdivision
-of the large-scale grid}
+of the large-scale grid.}
 
 \item{delta}{An object of the class 's2dv_cube', containing a delta
 to be applied to the downscaled input data. Activates `nolapse = TRUE`.
 The grid of this object must coincide with that of the required output.}
 }
 \value{
-CST_RFTemp() returns a downscaled CSTools object
-(i.e., of the class 's2dv_cube').
+CST_RFTemp() returns a downscaled CSTools object (i.e., of the class
+'s2dv_cube').
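The lapse-rate idea behind this function can be written down in a few lines
of R; the numbers below are made up and the sketch ignores the interpolation
and climatology options, so it is an illustration rather than the package
implementation:

lapse <- 6.5                 # assumed environmental lapse rate (K per km)
t_coarse <- 285.0            # coarse-scale temperature (K)
z_coarse <- 500              # coarse-scale orography height (m)
z_fine <- c(350, 800, 1200)  # fine-scale orography heights (m)
# Temperature decreases with height at the prescribed lapse rate
t_fine <- t_coarse - lapse * (z_fine - z_coarse) / 1000
t_fine                       # 285.975 283.050 280.450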
} \description{ This function implements a simple lapse rate correction of a @@ -84,23 +85,27 @@ t <- rnorm(7 * 6 * 2 * 3 * 4)*10 + 273.15 + 10 dim(t) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 7) lon <- seq(3, 9, 1) lat <- seq(42, 47, 1) -exp <- list(data = t, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +exp <- list(data = t, coords = coords) attr(exp, 'class') <- 's2dv_cube' o <- runif(29*29)*3000 -dim(o) <- c(lat = 29, lon = 29) +dim(o) <- c(lats = 29, lons = 29) lon <- seq(3, 10, 0.25) lat <- seq(41, 48, 0.25) -oro <- list(data = o, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +oro <- list(data = o, coords = coords) attr(oro, 'class') <- 's2dv_cube' -res <- CST_RFTemp(exp, oro, xlim=c(4,8), ylim=c(43, 46), lapse=6.5) +res <- CST_RFTemp(data = exp, oro = oro, xlim = c(4,8), ylim = c(43, 46), + lapse = 6.5, time_dim = 'ftime', + lon_dim = 'lon', lat_dim = 'lat') } \references{ Method described in ERA4CS MEDSCOPE milestone M3.2: -High-quality climate prediction data available to WP4 -[https://www.medscope-project.eu/the-project/deliverables-reports/]([https://www.medscope-project.eu/the-project/deliverables-reports/) +High-quality climate prediction data available to WP4 here: +\url{https://www.medscope-project.eu/the-project/deliverables-reports/} and in H2020 ECOPOTENTIAL Deliverable No. 8.1: -High resolution (1-10 km) climate, land use and ocean change scenarios -[https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf](https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf) +High resolution (1-10 km) climate, land use and ocean change scenarios available +here: \url{https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS} } \author{ Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it} diff --git a/man/CST_RFWeights.Rd b/man/CST_RFWeights.Rd index acae8c6a512d775a3f0b0e892f4c121d12f907e0..3afde99133ee9ceea6be7d0ec47be800c09747cf 100644 --- a/man/CST_RFWeights.Rd +++ b/man/CST_RFWeights.Rd @@ -17,59 +17,69 @@ CST_RFWeights( ) } \arguments{ -\item{climfile}{Filename of a fine-scale precipitation climatology. -The file is expected to be in NetCDF format and should contain -at least one precipitation field. If several fields at different times are provided, -a climatology is derived by time averaging. -Suitable climatology files could be for example a fine-scale precipitation climatology -from a high-resolution regional climate model (see e.g. Terzago et al. 2018), a local -high-resolution gridded climatology from observations, or a reconstruction such as those which -can be downloaded from the WORLDCLIM (http://www.worldclim.org) or CHELSA (http://chelsa-climate.org) -websites. The latter data will need to be converted to NetCDF format before being used (see for example the GDAL tools (https://www.gdal.org). -It could also be a 's2dv_cube' object.} +\item{climfile}{Filename of a fine-scale precipitation climatology. The file +is expected to be in NetCDF format and should contain at least one +precipitation field. If several fields at different times are provided, +a climatology is derived by time averaging. Suitable climatology files could +be for example a fine-scale precipitation climatology from a high-resolution +regional climate model (see e.g. Terzago et al. 
2018), a local
+high-resolution gridded climatology from observations, or a reconstruction
+such as those which can be downloaded from the WORLDCLIM
+(\url{https://www.worldclim.org}) or CHELSA (\url{https://chelsa-climate.org/})
+websites. The latter data will need to be converted to NetCDF format before
+being used (see for example the GDAL tools, \url{https://gdal.org/}). It
+could also be an 's2dv_cube' object.}
 
-\item{nf}{Refinement factor for downscaling (the output resolution is increased by this factor).}
+\item{nf}{Refinement factor for downscaling (the output resolution is
+increased by this factor).}
 
 \item{lon}{Vector of longitudes.}
 
-\item{lat}{Vector of latitudes.
-The number of longitudes and latitudes is expected to be even and the same. If not
-the function will perform a subsetting to ensure this condition.}
+\item{lat}{Vector of latitudes. The number of longitudes and latitudes is
+expected to be even and the same. If not, the function will perform a
+subsetting to ensure this condition.}
 
 \item{varname}{Name of the variable to be read from \code{climfile}.}
 
-\item{fsmooth}{Logical to use smooth conservation (default) or large-scale box-average conservation.}
+\item{fsmooth}{Logical to use smooth conservation (default) or large-scale
+box-average conservation.}
 
-\item{lonname}{a character string indicating the name of the longitudinal dimension set as 'lon' by default.}
+\item{lonname}{A character string indicating the name of the longitudinal
+dimension, set as 'lon' by default.}
 
-\item{latname}{a character string indicating the name of the latitudinal dimension set as 'lat' by default.}
+\item{latname}{A character string indicating the name of the latitudinal
+dimension, set as 'lat' by default.}
 
-\item{ncores}{an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.}
+\item{ncores}{An integer that indicates the number of cores for parallel
+computations using the multiApply function. The default value is one.}
 }
 \value{
-An object of class 's2dv_cube' containing in matrix \code{data} the weights with dimensions (lon, lat).
+An object of class 's2dv_cube' containing in matrix \code{data} the
+weights with dimensions (lon, lat).
 }
 \description{
-Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
+Compute climatological ("orographic") weights from a fine-scale
+precipitation climatology file.
 }
 \examples{
 # Create weights to be used with the CST_RainFARM() or RainFARM() functions
-# using an external fine-scale climatology file.
-
-\dontrun{
-# Specify lon and lat of the input
-lon <- seq(10,13.5,0.5)
-lat <- seq(40,43.5,0.5)
-nf <- 8
-ww <- CST_RFWeights("./worldclim.nc", nf, lon, lat, fsmooth = TRUE)
-}
+# using random data in the form of an 's2dv_cube'.
+obs <- rnorm(2 * 3 * 4 * 8 * 8)
+dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+lon <- seq(10, 13.5, 0.5)
+lat <- seq(40, 43.5, 0.5)
+coords <- list(lon = lon, lat = lat)
+data <- list(data = obs, coords = coords)
+class(data) <- "s2dv_cube"
+res <- CST_RFWeights(climfile = data, nf = 3, lon, lat, lonname = 'lon',
+                     latname = 'lat', fsmooth = TRUE)
 }
 \references{
 Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). Stochastic
 downscaling of precipitation in complex orography: A simple method
 to reproduce a realistic fine-scale climatology. Natural Hazards
 and Earth System Sciences, 18(11),
-2825-2840. http://doi.org/10.5194/nhess-18-2825-2018 .
+2825-2840. \doi{10.5194/nhess-18-2825-2018}.
 }
 \author{
 Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
diff --git a/man/CST_RainFARM.Rd b/man/CST_RainFARM.Rd
index f86ab89cb65487fd36c22bd40e6ccddcf6cfc289..71cb54a155ef6270e75fd910f358e96deed24299 100644
--- a/man/CST_RainFARM.Rd
+++ b/man/CST_RainFARM.Rd
@@ -22,59 +22,68 @@ CST_RainFARM(
 \arguments{
 \item{data}{An object of the class 's2dv_cube' as returned by `CST_Load`,
 containing the spatial precipitation fields to downscale.
-The data object is expected to have an element named \code{$data} with at least two
-spatial dimensions named "lon" and "lat" and one or more dimensions over which
-to compute average spectral slopes (unless specified with parameter \code{slope}),
-which can be specified by parameter \code{time_dim}.
-The number of longitudes and latitudes in the input data is expected to be even and the same. If not
-the function will perform a subsetting to ensure this condition.}
+The data object is expected to have an element named \code{$data} with at
+least two spatial dimensions named "lon" and "lat" and one or more
+dimensions over which to compute average spectral slopes (unless specified
+with parameter \code{slope}), which can be specified by parameter
+\code{time_dim}. The number of longitudes and latitudes in the input data is
+expected to be even and the same. If not, the function will perform a
+subsetting to ensure this condition.}
 
 \item{weights}{Matrix with climatological weights which can be obtained using
-the \code{CST_RFWeights} function. If \code{weights=1.} (default) no weights are used.
-The names of these dimensions must be at least 'lon' and 'lat'.}
+the \code{CST_RFWeights} function. If \code{weights = 1.} (default) no
+weights are used. The names of these dimensions must be at least 'lon' and
+'lat'.}
 
-\item{slope}{Prescribed spectral slope. The default is \code{slope=0.}
-meaning that the slope is determined automatically over the dimensions specified by \code{time_dim}. A 1D array with named dimension can be provided (see details and examples)}
+\item{slope}{Prescribed spectral slope. The default is \code{slope = 0.},
+meaning that the slope is determined automatically over the dimensions
+specified by \code{time_dim}. A 1D array with a named dimension can be
+provided (see details and examples).}
 
-\item{nf}{Refinement factor for downscaling (the output resolution is increased by this factor).}
+\item{nf}{Refinement factor for downscaling (the output resolution is
+increased by this factor).}
 
-\item{kmin}{First wavenumber for spectral slope (default: \code{kmin=1}).}
+\item{kmin}{First wavenumber for spectral slope (default: \code{kmin = 1}).}
 
-\item{nens}{Number of ensemble members to produce (default: \code{nens=1}).}
+\item{nens}{Number of ensemble members to produce (default: \code{nens = 1}).}
 
-\item{fglob}{Logical to conserve global precipitation over the domain (default: FALSE).}
+\item{fglob}{Logical to conserve global precipitation over the domain
+(default: FALSE).}
 
-\item{fsmooth}{Logical to conserve precipitation with a smoothing kernel (default: TRUE).}
+\item{fsmooth}{Logical to conserve precipitation with a smoothing kernel
+(default: TRUE).}
 
-\item{nprocs}{The number of parallel processes to spawn for the use for parallel computation in multiple cores. (default: 1)}
+\item{nprocs}{The number of parallel processes to spawn for parallel
+computation on multiple cores (default: 1).}
 
 \item{time_dim}{String or character array with name(s) of dimension(s) (e.g.
"ftime", "sdate", "member" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes -will be computed as an average over all elements belonging to those dimensions. -If omitted one of c("ftime", "sdate", "time") is searched and the first one with more -than one element is chosen.} +will be computed as an average over all elements belonging to those +dimensions. If omitted one of c("ftime", "sdate", "time") is searched and +the first one with more than one element is chosen.} \item{verbose}{Logical for verbose output (default: FALSE).} -\item{drop_realization_dim}{Logical to remove the "realization" stochastic ensemble dimension, -needed for saving data through function CST_SaveData (default: FALSE) -with the following behaviour if set to TRUE: - -1) if \code{nens==1}: the dimension is dropped; - -2) if \code{nens>1} and a "member" dimension exists: - the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member"; - -3) if \code{nens>1} and a "member" dimension does not exist: the "realization" dimension is renamed to "member".} +\item{drop_realization_dim}{Logical to remove the "realization" stochastic +ensemble dimension, needed for saving data through function CST_SaveData +(default: FALSE) with the following behaviour if set to TRUE: +\enumerate{ + \item{if \code{nens == 1}: the dimension is dropped;} + \item{if \code{nens > 1} and a "member" dimension exists: the "realization" + and "member" dimensions are compacted (multiplied) and the resulting + dimension is named "member";} + \item{if \code{nens > 1} and a "member" dimension does not exist: the + "realization" dimension is renamed to "member".} +}} } \value{ CST_RainFARM() returns a downscaled CSTools object (i.e., of the -class 's2dv_cube'). -If \code{nens>1} an additional dimension named "realizatio"n is added to the -\code{$data} array after the "member" dimension (unless -\code{drop_realization_dim=TRUE} is specified). -The ordering of the remaining dimensions in the \code{$data} element of the input object is maintained. +class 's2dv_cube'). If \code{nens > 1} an additional dimension named +"realization" is added to the \code{$data} array after the "member" dimension +(unless \code{drop_realization_dim = TRUE} is specified). The ordering of the +remaining dimensions in the \code{$data} element of the input object is +maintained. } \description{ This function implements the RainFARM stochastic precipitation @@ -84,40 +93,30 @@ Adapted for climate downscaling and including orographic correction as described in Terzago et al. 2018. } \details{ -Wether parameter 'slope' and 'weights' presents seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See example 2 below where weights and slope vary with 'sdate' dimension. +Wether parameter 'slope' and 'weights' presents seasonality +dependency, a dimension name should match between these parameters and the +input data in parameter 'data'. See example 2 below where weights and slope +vary with 'sdate' dimension. 
 }
 \examples{
-#Example 1: using CST_RainFARM for a CSTools object
+# Example 1: using CST_RainFARM for a CSTools object
 nf <- 8   # Choose a downscaling by factor 8
 exp <- 1 : (2 * 3 * 4 * 8 * 8)
 dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
 lon <- seq(10, 13.5, 0.5)
-dim(lon) <- c(lon = length(lon))
 lat <- seq(40, 43.5, 0.5)
-dim(lat) <- c(lat = length(lat))
-data <- list(data = exp, lon = lon, lat = lat)
+coords <- list(lon = lon, lat = lat)
+data <- list(data = exp, coords = coords)
+class(data) <- 's2dv_cube'
 # Create a test array of weights
 ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
-res <- CST_RainFARM(data, nf = nf, weights = ww, nens=3)
-str(res)
-#List of 3
-# $ data: num [1, 1:2, 1:3, 1:3, 1:4, 1:64, 1:64] 260 553 281 278 143 ...
-# $ lon : num [1:64] 9.78 9.84 9.91 9.97 10.03 ...
-# $ lat : num [1:64] 39.8 39.8 39.9 40 40 ...
-dim(res$data)
-# dataset member realization sdate ftime lat lon
-# 1 2 3 3 4 64 64
-
-# Example 2:
-slo <- array(c(0.1, 0.5, 0.7), c(sdate= 3))
-wei <- array(rnorm(8 * 8 * 3), c(lon = 8, lat = 8, sdate = 3))
-res <- CST_RainFARM(lonlat_prec,
-                    weights = wei, slope = slo, nf = 2)
+res <- CST_RainFARM(data, nf = nf, weights = ww, nens = 3, time_dim = 'ftime')
 }
 \references{
 Terzago, S. et al. (2018). NHESS 18(11), 2825-2840.
-http://doi.org/10.5194/nhess-18-2825-2018 ;
-D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al. (2006), JHM 7, 724.
+\doi{10.5194/nhess-18-2825-2018};
+D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et al.
+(2006), JHM 7, 724.
 }
 \author{
 Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
diff --git a/man/CST_RegimesAssign.Rd b/man/CST_RegimesAssign.Rd
index 82d06e0b03a18d55d768505b3aa3debfb317cc56..2fec7426ca2c259211199ea49c751b8ea5745ec8 100644
--- a/man/CST_RegimesAssign.Rd
+++ b/man/CST_RegimesAssign.Rd
@@ -15,43 +15,60 @@ CST_RegimesAssign(
 )
 }
 \arguments{
-\item{data}{a 's2dv_cube' object.}
+\item{data}{An 's2dv_cube' object.}
 
-\item{ref_maps}{a 's2dv_cube' object as the output of CST_WeatherRegimes.}
+\item{ref_maps}{An 's2dv_cube' object as the output of CST_WeatherRegimes.}
 
-\item{method}{whether the matching will be performed in terms of minimum distance (default = 'distance') or
-the maximum spatial correlation (method = 'ACC') between the maps.}
+\item{method}{Whether the matching will be performed in terms of minimum
+distance (default = 'distance') or the maximum spatial correlation
+(method = 'ACC') between the maps.}
 
-\item{composite}{a logical parameter indicating if the composite maps are computed or not (default = FALSE).}
+\item{composite}{A logical parameter indicating if the composite maps are
+computed or not (default = FALSE).}
 
-\item{memb}{a logical value indicating whether to compute composites for separate members (default FALSE) or as unique ensemble (TRUE).
-This option is only available for when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'.}
+\item{memb}{A logical value indicating whether to compute composites for
+separate members (default FALSE) or as a unique ensemble (TRUE). This option
+is only available when parameter 'composite' is set to TRUE and the data
+object has a dimension named 'member'.}
 
-\item{ncores}{the number of multicore threads to use for parallel computation.}
+\item{ncores}{The number of multicore threads to use for parallel computation.}
 }
 \value{
-A list with two elements \code{$data} (a 's2dv_cube' object containing the composites cluster=1,..,K for case (*1)
- \code{$pvalue} (array with the same structure as \code{$data} containing the pvalue of the composites obtained through a t-test
- that accounts for the serial dependence of the data with the same structure as Composite.)(only when composite = 'TRUE'),
- \code{$cluster} (array with the same dimensions as data (except latitude and longitude which are removed) indicating the ref_maps to which each point is allocated.) ,
- \code{$frequency} (A vector of integers (from k=1,...k n reference maps) indicating the percentage of assignations corresponding to each map.),
+A list with two elements: \code{$data} (a 's2dv_cube' object containing
+the composites cluster = 1, ..., K for case (*1) or only k = 1 for any
+specific cluster, i.e., case (*2)) (only when composite = 'TRUE'), and
+\code{$statistics}, which includes \code{$pvalue} (an array with the same
+structure as \code{$data} containing the p-value of the composites obtained
+through a t-test that accounts for the serial dependence of the data, with
+the same structure as Composite; only when composite = 'TRUE'),
+\code{$cluster} (an array with the same dimensions as data, except latitude
+and longitude which are removed, indicating the ref_maps to which each point
+is allocated) and \code{$frequency} (a vector of integers, from k = 1, ..., n
+reference maps, indicating the percentage of assignments corresponding to
+each map).
 }
 \description{
-This function performs the matching between a field of anomalies and a set
-of maps which will be used as a reference. The anomalies will be assigned to the reference map
-for which the minimum Eucledian distance (method=’distance’) or highest spatial correlation
-(method = 'ACC') is obtained.
+This function performs the matching between a field of anomalies
+and a set of maps which will be used as a reference. The anomalies will be
+assigned to the reference map for which the minimum Euclidean distance
+(method = 'distance') or the highest spatial correlation (method = 'ACC') is
+obtained.
 }
 \examples{
-\dontrun{
-regimes <- CST_WeatherRegimes(data = lonlat_temp$obs, EOFs = FALSE, ncenters = 4)
-res1 <- CST_RegimesAssign(data = lonlat_temp$exp, ref_maps = regimes, composite = FALSE)
-res2 <- CST_RegimesAssign(data = lonlat_temp$exp, ref_maps = regimes, composite = TRUE)
-}
+data <- array(abs(rnorm(1280, 282.7, 6.4)), dim = c(dataset = 2, member = 2,
+                                                    sdate = 3, ftime = 3,
+                                                    lat = 4, lon = 4))
+coords <- list(lon = seq(0, 3), lat = seq(47, 44))
+exp <- list(data = data, coords = coords)
+class(exp) <- 's2dv_cube'
+regimes <- CST_WeatherRegimes(data = exp, EOFs = FALSE,
+                              ncenters = 4)
+res1 <- CST_RegimesAssign(data = exp, ref_maps = regimes,
+                          composite = FALSE)
 }
 \references{
-Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools
-for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/}
+Torralba, V. (2019). Seasonal climate prediction for the wind
+energy sector: methods and tools for the development of a climate service.
+Thesis. Available online: \url{https://eprints.ucm.es/56841/}
 }
 \author{
 Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
diff --git a/man/CST_SaveExp.Rd b/man/CST_SaveExp.Rd
index 787d1bd77b21e317bf757af4262e7de7283ab822..032f01c4ca3b3ee0b6c46bd0c09ee9bd2652e5bd 100644
--- a/man/CST_SaveExp.Rd
+++ b/man/CST_SaveExp.Rd
@@ -2,38 +2,98 @@
 % Please edit documentation in R/CST_SaveExp.R
 \name{CST_SaveExp}
 \alias{CST_SaveExp}
-\title{Save CSTools objects of class 's2dv_cube' containing experiments or observed
-data in NetCDF format}
+\title{Save objects of class 's2dv_cube' to data in NetCDF format}
 \usage{
-CST_SaveExp(data, destination = "./CST_Data", extra_string = NULL)
+CST_SaveExp(
+  data,
+  destination = "./",
+  sdate_dim = "sdate",
+  ftime_dim = "time",
+  dat_dim = "dataset",
+  var_dim = "var",
+  memb_dim = "member",
+  single_file = FALSE,
+  extra_string = NULL
+)
 }
 \arguments{
-\item{data}{an object of class \code{s2dv_cube}.}
+\item{data}{An object of class \code{s2dv_cube}.}
 
-\item{destination}{a character string containing the directory name in which
+\item{destination}{A character string containing the directory name in which
 to save the data. NetCDF file for each starting date are saved into the
-folder tree: destination/experiment/variable/. By default the function
-creates and saves the data into the folder "CST_Data" in the working
-directory.}
+folder tree: \cr
+destination/Dataset/variable/. By default, the function
+creates and saves the data into the working directory.}
 
-\item{extra_string}{a character string to be include as part of the file name, for instance, to identify member or realization. It would be added to the file name between underscore characters.}
+\item{sdate_dim}{A character string indicating the name of the start date
+dimension. By default, it is set to 'sdate'. It can be NULL if there is no
+start date dimension.}
+
+\item{ftime_dim}{A character string indicating the name of the forecast time
+dimension. By default, it is set to 'time'. It can be NULL if there is no
+forecast time dimension.}
+
+\item{dat_dim}{A character string indicating the name of the dataset
+dimension. By default, it is set to 'dataset'. It can be NULL if there is no
+dataset dimension.}
+
+\item{var_dim}{A character string indicating the name of the variable
+dimension. By default, it is set to 'var'. It can be NULL if there is no
+variable dimension.}
+
+\item{memb_dim}{A character string indicating the name of the member dimension.
+By default, it is set to 'member'. It can be NULL if there is no member
+dimension.}
+
+\item{single_file}{A logical value indicating if the whole object is saved in
+a single file (TRUE) or in multiple files (FALSE). When it is FALSE,
+the array is separated for datasets, variables and start dates. It is FALSE
+by default.}
+
+\item{extra_string}{A character string to be included as part of the file
+name, for instance, to identify a member or realization. It would be added
+to the file name between underscore characters.}
+}
+\value{
+Multiple or single NetCDF files containing the data array.\cr
+\item{\code{single_file = TRUE}}{
+  All data is saved in a single file located in the specified destination
+  path with the following name:
+  ___.nc. Multiple
+  variables are saved separately in the same file. The forecast time units
+  are extracted from the frequency of the time steps (hours, days, months).
+  The first value of forecast time is 1. If no frequency is found, the units
If no frequency is found, the units + will be 'hours since' each start date and the time steps are assumed to be + equally spaced. +} +\item{\code{single_file = FALSE}}{ + The data array is subset and stored into multiple files. Each file + contains the data subset for each start date, variable and dataset. Files + with different variables and Datasets are stored in separated directories + within the following directory tree: destination/Dataset/variable/. + The name of each file will be: + __.nc. +} } \description{ This function allows to divide and save a object of class 's2dv_cube' into a NetCDF file, allowing to reload the saved data using -\code{CST_Load} function. +\code{Start} function from StartR package. If the original 's2dv_cube' object +has been created from \code{CST_Load()}, then it can be reloaded with +\code{Load()}. } \examples{ \dontrun{ -library(CSTools) data <- lonlat_temp$exp -destination <- "./path2/" -CST_SaveExp(data = data, destination = destination) +destination <- "./" +CST_SaveExp(data = data, destination = destination, ftime_dim = 'ftime', + var_dim = NULL, ftime_dim = 'ftime', var_dim = NULL) } } \seealso{ -\code{\link{CST_Load}}, \code{\link{as.s2dv_cube}} and \code{\link{s2dv_cube}} +\code{\link[startR]{Start}}, \code{\link{as.s2dv_cube}} and +\code{\link{s2dv_cube}} } \author{ Perez-Zanon Nuria, \email{nuria.perez@bsc.es} diff --git a/man/CST_SplitDim.Rd b/man/CST_SplitDim.Rd index 80a94da3c8735a4fe76ecb202af69d788732bf36..b07d9897ceac08db2b876ffa67d30322c87d34b5 100644 --- a/man/CST_SplitDim.Rd +++ b/man/CST_SplitDim.Rd @@ -14,26 +14,46 @@ CST_SplitDim( ) } \arguments{ -\item{data}{a 's2dv_cube' object} +\item{data}{A 's2dv_cube' object} -\item{split_dim}{a character string indicating the name of the dimension to split} +\item{split_dim}{A character string indicating the name of the dimension to +split.} -\item{indices}{a vector of numeric indices or dates. If left at NULL, the dates provided in the s2dv_cube object (element Dates) will be used.} +\item{indices}{A vector of numeric indices or dates. If left at NULL, the +dates provided in the s2dv_cube object (element Dates) will be used.} -\item{freq}{a character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differenciates months from different years.} +\item{freq}{A character string indicating the frequency: by 'day', 'month' and +'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 +independently of the year they belong to, while 'monthly' differenciates +months from different years.} -\item{new_dim_name}{a character string indicating the name of the new dimension.} +\item{new_dim_name}{A character string indicating the name of the new +dimension.} -\item{insert_ftime}{an integer indicating the number of time steps to add at the begining of the time series.} +\item{insert_ftime}{An integer indicating the number of time steps to add at +the begining of the time series.} } \description{ -This function split a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension or dates and the frequency expected (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division. +This function split a dimension in two. 
diff --git a/man/CST_SplitDim.Rd b/man/CST_SplitDim.Rd index 80a94da3c8735a4fe76ecb202af69d788732bf36..b07d9897ceac08db2b876ffa67d30322c87d34b5 100644 --- a/man/CST_SplitDim.Rd +++ b/man/CST_SplitDim.Rd @@ -14,26 +14,46 @@ CST_SplitDim( ) } \arguments{ -\item{data}{a 's2dv_cube' object} +\item{data}{A 's2dv_cube' object.} -\item{split_dim}{a character string indicating the name of the dimension to split} +\item{split_dim}{A character string indicating the name of the dimension to +split.} -\item{indices}{a vector of numeric indices or dates. If left at NULL, the dates provided in the s2dv_cube object (element Dates) will be used.} +\item{indices}{A vector of numeric indices or dates. If left at NULL, the +dates provided in the s2dv_cube object (element Dates) will be used.} -\item{freq}{a character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differenciates months from different years.} +\item{freq}{A character string indicating the frequency: by 'day', 'month' and +'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 +independently of the year they belong to, while 'monthly' differentiates +months from different years.} -\item{new_dim_name}{a character string indicating the name of the new dimension.} +\item{new_dim_name}{A character string indicating the name of the new +dimension.} -\item{insert_ftime}{an integer indicating the number of time steps to add at the begining of the time series.} +\item{insert_ftime}{An integer indicating the number of time steps to add at +the beginning of the time series.} } \description{ -This function split a dimension in two. The user can select the +dimension to split and provide indices indicating how to split that dimension +or dates and the frequency expected (monthly or by day, month and year). The +user can also provide a numeric frequency indicating the length of each +division. } \details{ -Parameter 'insert_ftime' has been included for the case of using daily data, requiring split the temporal dimensions by months (or similar) and the first lead time doesn't correspondt to the 1st day of the month. In this case, the insert_ftime could be used, to get a final output correctly organized. E.g.: leadtime 1 is the 2nd of November and the input time series extend to the 31st of December. When requiring split by month with \code{inset_ftime = 1}, the 'monthly' dimension of length two will indicate the month (position 1 for November and position 2 for December), dimension 'time' will be length 31. For November, the position 1 and 31 will be NAs, while from positon 2 to 30 will be filled with the data provided. This allows to select correctly days trhough time dimension. +Parameter 'insert_ftime' has been included for the case of using +daily data, requiring to split the temporal dimension by months (or similar) and +the first lead time doesn't correspond to the 1st day of the month. In this +case, insert_ftime can be used to get a final output correctly +organized. E.g.: leadtime 1 is the 2nd of November and the input time series +extends to the 31st of December. When requiring split by month with +\code{insert_ftime = 1}, the 'monthly' dimension of length two will indicate +the month (position 1 for November and position 2 for December), dimension +'time' will be length 31. For November, positions 1 and 31 will be NAs, +while positions 2 to 30 will be filled with the data provided. This allows +to correctly select days through the time dimension. } \examples{ - data <- 1 : 20 dim(data) <- c(time = 10, lat = 2) data <-list(data = data) @@ -46,13 +66,9 @@ time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"), data <- list(data = data$data, Dates = time) class(data) <- 's2dv_cube' new_data <- CST_SplitDim(data, indices = time) -dim(new_data$data) new_data <- CST_SplitDim(data, indices = time, freq = 'day') -dim(new_data$data) new_data <- CST_SplitDim(data, indices = time, freq = 'month') -dim(new_data$data) new_data <- CST_SplitDim(data, indices = time, freq = 'year') -dim(new_data$data) } \author{ Nuria Perez-Zanon, \email{nuria.perez@bsc.es}
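Editor's note: the 'insert_ftime' behaviour described in the details is easier to see in code. The following is a sketch with hypothetical toy data (not one of the package examples), assuming daily values whose first lead time is the 2nd of November, as in the scenario above.

# Sketch: 60 daily values starting 2 November 1903; insert_ftime = 1 pads
# one NA at the head so 'time' positions align with days of the month.
data <- array(rnorm(120), dim = c(time = 60, lat = 2))
time <- seq(ISOdate(1903, 11, 2), ISOdate(1903, 12, 31), "days")
cube <- list(data = data, Dates = time)
class(cube) <- 's2dv_cube'
new_cube <- CST_SplitDim(cube, indices = time, freq = 'monthly',
                         insert_ftime = 1)
dim(new_cube$data)  # gains a 'monthly' dimension of length 2 (Nov, Dec)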
diff --git a/man/CST_Subset.Rd b/man/CST_Subset.Rd new file mode 100644 index 0000000000000000000000000000000000000000..a9f480aa0cc48cd5eace57633d778fbd43a73769 --- /dev/null +++ b/man/CST_Subset.Rd @@ -0,0 +1,70 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CST_Subset.R +\name{CST_Subset} +\alias{CST_Subset} +\title{Subset an object of class s2dv_cube} +\usage{ +CST_Subset(x, along, indices, drop = FALSE, var_dim = NULL, dat_dim = NULL) +} +\arguments{ +\item{x}{An object of class \code{s2dv_cube} to be sliced.} + +\item{along}{A vector with references to the dimensions to take the subset +from: either integers or dimension names.} + +\item{indices}{A list of indices to take from each dimension specified in +'along'. If a single dimension is specified in 'along', it can be directly +provided as an integer or a vector.} + +\item{drop}{Whether to drop all the dimensions of length 1 in the resulting +array, none, only those that are specified in 'along', or only those that +are not specified in 'along'. The possible values are: 'all' or TRUE, 'none' +or FALSE, 'selected', and 'non-selected'. The default value is FALSE.} + +\item{var_dim}{A character string indicating the name of the variable +dimension. The default value is NULL.} + +\item{dat_dim}{A character string indicating the name of the dataset dimension. +The default value is NULL.} +} +\value{ +An object of class \code{s2dv_cube} with similar data, coordinates and + attributes as the \code{x} input, but with trimmed or dropped dimensions. +} +\description{ +This function allows subsetting (i.e. slicing, taking a chunk of) the data inside +an object of class \code{s2dv_cube} and modifying the dimensions, coordinates and +attributes accordingly, removing any variables, time steps and spatial +coordinates that are dropped when subsetting. It ensures that the information +inside the s2dv_cube remains coherent with the data it contains.\cr\cr +As in the function \code{Subset} from the ClimProjDiags package, the +dimensions to subset along can be specified via the parameter \code{along} +either with integer indices or by their name.\cr\cr +There are additional ways to adjust which dimensions are dropped in the +resulting object: either to drop all, to drop none, to drop only the ones that +have been sliced or to drop only the ones that have not been sliced.\cr\cr +The \code{load_parameters} and \code{when} attributes of the original cube +are preserved. The \code{source_files} attribute is subset along the +\code{var_dim} and \code{dat_dim} dimensions. +} +\examples{ +# Example with sample data: +# Check original dimensions and coordinates +lonlat_temp$exp$dims +names(lonlat_temp$exp$coords) +# Subset the s2dv_cube +exp_subset <- CST_Subset(lonlat_temp$exp, + along = c("lat", "lon"), + indices = list(1:10, 1:10), + drop = 'non-selected') +# Check new dimensions and coordinates +exp_subset$dims +names(exp_subset$coords) + +} +\seealso{ +\link[ClimProjDiags]{Subset} +} +\author{ +Agudetse Roures Victoria, \email{victoria.agudetse@bsc.es} +}
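Editor's note: the four 'drop' options read similarly, so a complementary sketch (same sample cube, behaviour as documented above) may be useful; it is an illustration, not a package example.

# Sketch: take member 1 only; with drop = 'selected' the sliced 'member'
# dimension (now of length 1) is removed, other length-1 dimensions stay.
memb1 <- CST_Subset(lonlat_temp$exp, along = "member", indices = 1,
                    drop = 'selected')
names(memb1$dims)  # 'member' no longer appears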
diff --git a/man/CST_WeatherRegimes.Rd b/man/CST_WeatherRegimes.Rd index 64b24eec9d1bc23648aef017bb369aa1c61e9d9f..bd4bad00ffee3b5bafdd8e7053ce55e67f82fa6b 100644 --- a/man/CST_WeatherRegimes.Rd +++ b/man/CST_WeatherRegimes.Rd @@ -17,55 +17,78 @@ CST_WeatherRegimes( ) } \arguments{ -\item{data}{a 's2dv_cube' object} +\item{data}{An 's2dv_cube' object.} -\item{ncenters}{Number of clusters to be calculated with the clustering function.} +\item{ncenters}{Number of clusters to be calculated with the clustering +function.} -\item{EOFs}{Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to filter the data.} +\item{EOFs}{Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to +filter the data.} -\item{neofs}{number of modes to be kept (default = 30).} +\item{neofs}{Number of modes to be kept (default = 30).} -\item{varThreshold}{Value with the percentage of variance to be explained by the PCs. -Only sufficient PCs to explain this much variance will be used in the clustering.} +\item{varThreshold}{Value with the percentage of variance to be explained by +the PCs. Only sufficient PCs to explain this much variance will be used in +the clustering.} -\item{method}{Different options to estimate the clusters. The most traditional approach is the k-means analysis (default=’kmeans’) -but the function also support the different methods included in the hclust . These methods are: -"ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). -For more details about these methods see the hclust function documentation included in the stats package.} +\item{method}{Different options to estimate the clusters. The most traditional +approach is the k-means analysis (default=’kmeans’) but the function also +supports the different methods included in the hclust function. These methods are: +"ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" +(= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details +about these methods see the hclust function documentation included in the +stats package.} -\item{iter.max}{Parameter to select the maximum number of iterations allowed (Only if method='kmeans' is selected).} +\item{iter.max}{Parameter to select the maximum number of iterations allowed +(Only if method='kmeans' is selected).} -\item{nstart}{Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected).} +\item{nstart}{Parameter for the cluster analysis determining how many random +sets to choose (Only if method='kmeans' is selected).} \item{ncores}{The number of multicore threads to use for parallel computation.} } \value{ -A list with two elements \code{$data} (a 's2dv_cube' object containing the composites cluster=1,..,K for case (*1) - \code{$pvalue} (array with the same structure as \code{$data} containing the pvalue of the composites obtained through a t-test that accounts for the serial dependence.), - \code{cluster} (A matrix or vector with integers (from 1:k) indicating the cluster to which each time step is allocated.), - \code{persistence} (Percentage of days in a month/season before a cluster is replaced for a new one (only if method=’kmeans’ has been selected.)), - \code{frequency} (Percentage of days in a month/season belonging to each cluster (only if method=’kmeans’ has been selected).), +A list with two elements \code{$data} (a 's2dv_cube' object containing +the composites cluster = 1,..,K for case (*1) or only k = 1 for any specific +cluster, i.e., case (*2)) and \code{$statistics} that includes \code{$pvalue} +(array with the same structure as \code{$data} containing the pvalue of the +composites obtained through a t-test that accounts for the serial dependence.), +\code{cluster} (A matrix or vector with integers (from 1:k) indicating the +cluster to which each time step is allocated.), \code{persistence} (Percentage +of days in a month/season before a cluster is replaced by a new one (only if +method=’kmeans’ has been selected.)), \code{frequency} (Percentage of days in +a month/season belonging to each cluster (only if method=’kmeans’ has been +selected).). } \description{ -This function computes the weather regimes from a cluster analysis. It is applied on the array \code{data} in a 's2dv_cube' object. The dimensionality of this object can be also reduced by using PCs obtained from the application of the #'EOFs analysis to filter the dataset. The cluster analysis can be performed with the traditional k-means or those methods +This function computes the weather regimes from a cluster +analysis. It is applied on the array \code{data} in a 's2dv_cube' object. The +dimensionality of this object can also be reduced by using PCs obtained from +the application of the EOFs analysis to filter the dataset. The cluster +analysis can be performed with the traditional k-means or those methods included in the hclust (stats package).
} \examples{ -\dontrun{ -res1 <- CST_WeatherRegimes(data = lonlat_temp$obs, EOFs = FALSE, ncenters = 4) -res2 <- CST_WeatherRegimes(data = lonlat_temp$obs, EOFs = TRUE, ncenters = 3) -} +data <- array(abs(rnorm(576, 283.7, 6)), dim = c(dataset = 2, member = 2, + sdate = 3, ftime = 3, + lat = 4, lon = 4)) +coords <- list(lon = seq(0, 3), lat = seq(47, 44)) +obs <- list(data = data, coords = coords) +class(obs) <- 's2dv_cube' + +res1 <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4) +res2 <- CST_WeatherRegimes(data = obs, EOFs = TRUE, ncenters = 3) + } \references{ -Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019). -Characterization of European wind speed variability using weather regimes. Climate Dynamics,53, -4961–4976, doi:10.1007/s00382-019-04839-5. +Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and +F.J., Doblas-Reyes (2019). Characterization of European wind speed variability +using weather regimes. Climate Dynamics, 53, 4961–4976, +\doi{10.1007/s00382-019-04839-5}. -Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools -for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/} +Torralba, V. (2019) Seasonal climate prediction for the wind +energy sector: methods and tools for the development of a climate service. +Thesis. Available online: \url{https://eprints.ucm.es/56841/}. } \author{ Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
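Editor's note: given the nested return structure documented above, a short sketch on accessing the diagnostics (continuing the res1 example, which uses the default method = 'kmeans') may save readers a trip to str(); element names follow the \value section.

# Sketch: composites live in res1$data, diagnostics in res1$statistics
str(res1$data)                 # composite fields, one slice per cluster
res1$statistics$frequency      # percentage of days per cluster (k-means only)
head(res1$statistics$cluster)  # cluster allocated to each time step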
diff --git a/man/Calibration.Rd b/man/Calibration.Rd index 8b0b5231a4abdf8666c8c19fe2d6c3540107689e..b907326aafb8c44a9e773096afb6c5434997d0d4 100644 --- a/man/Calibration.Rd +++ b/man/Calibration.Rd @@ -17,46 +17,111 @@ Calibration( alpha = NULL, memb_dim = "member", sdate_dim = "sdate", - ncores = 1 + dat_dim = NULL, + ncores = NULL ) } \arguments{ -\item{exp}{a multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal hindcast experiment data. The hindcast is used to calibrate the forecast in case the forecast is provided; if not, the same hindcast will be calibrated instead.} - -\item{obs}{a multidimensional array with named dimensions (at least 'sdate') containing the observed data.} - -\item{exp_cor}{an optional multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal forecast experiment data. If the forecast is provided, it will be calibrated using the hindcast and observations; if not, the hindcast will be calibrated instead.} - -\item{cal.method}{is the calibration method used, can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} or \code{rpc-based}. Default value is \code{mse_min}.} - -\item{eval.method}{is the sampling method used, can be either \code{in-sample} or \code{leave-one-out}. Default value is the \code{leave-one-out} cross validation. In case the forecast is provided, any chosen eval.method is over-ruled and a third option is used.} - -\item{multi.model}{is a boolean that is used only for the \code{mse_min} method. If multi-model ensembles or ensembles of different sizes are used, it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences between the two approaches are generally small but may become large when using small ensemble sizes. Using multi.model when the calibration method is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result.} - -\item{na.fill}{is a boolean that indicates what happens in case calibration is not possible or will yield unreliable results. This happens when three or less forecasts-observation pairs are available to perform the training phase of the calibration. By default \code{na.fill} is set to true such that NA values will be returned. If \code{na.fill} is set to false, the uncorrected data will be returned.} - -\item{na.rm}{is a boolean that indicates whether to remove the NA values or not. The default value is \code{TRUE}.} - -\item{apply_to}{is a character string that indicates whether to apply the calibration to all the forecast (\code{"all"}) or only to those where the correlation between the ensemble mean and the observations is statistically significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}.} - -\item{alpha}{is a numeric value indicating the significance level for the correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == "sign"}.} - -\item{memb_dim}{is a character string indicating the name of the member dimension. By default, it is set to 'member'.} - -\item{sdate_dim}{is a character string indicating the name of the start date dimension. By default, it is set to 'sdate'.} - -\item{ncores}{is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.} +\item{exp}{A multidimensional array with named dimensions (at least 'sdate' +and 'member') containing the seasonal hindcast experiment data. The hindcast +is used to calibrate the forecast in case the forecast is provided; if not, +the same hindcast will be calibrated instead.} + +\item{obs}{A multidimensional array with named dimensions (at least 'sdate') +containing the observed data.} + +\item{exp_cor}{An optional multidimensional array with named dimensions (at +least 'sdate' and 'member') containing the seasonal forecast experiment +data. If the forecast is provided, it will be calibrated using the hindcast +and observations; if not, the hindcast will be calibrated instead. If there +is only one corrected dataset, it should not have a dataset dimension. If there +is a corresponding corrected dataset for each 'exp' forecast, the dataset +dimension must have the same length as in 'exp'. The default value is NULL.} + +\item{cal.method}{A character string indicating the calibration method used. +It can be either \code{bias}, \code{evmos}, \code{mse_min}, \code{crps_min} +or \code{rpc-based}. Default value is \code{mse_min}.} + +\item{eval.method}{A character string indicating the sampling method used. +It can be either \code{in-sample} or \code{leave-one-out}. Default value is +the \code{leave-one-out} cross validation. In case the forecast is +provided, any chosen eval.method is over-ruled and a third option is +used.} + +\item{multi.model}{A boolean that is used only for the \code{mse_min} +method. If multi-model ensembles or ensembles of different sizes are used, +it must be set to \code{TRUE}. By default it is \code{FALSE}. Differences +between the two approaches are generally small but may become large when +using small ensemble sizes. Using multi.model when the calibration method +is \code{bias}, \code{evmos} or \code{crps_min} will not affect the result.} + +\item{na.fill}{A boolean that indicates what happens in case calibration is +not possible or will yield unreliable results.
This happens when three or +fewer forecast-observation pairs are available to perform the training phase +of the calibration. By default \code{na.fill} is set to true such that NA +values will be returned. If \code{na.fill} is set to false, the uncorrected +data will be returned.} + +\item{na.rm}{A boolean that indicates whether to remove the NA values or +not. The default value is \code{TRUE}.} + +\item{apply_to}{A character string that indicates whether to apply the +calibration to all the forecast (\code{"all"}) or only to those where the +correlation between the ensemble mean and the observations is statistically +significant (\code{"sign"}). Only useful if \code{cal.method == "rpc-based"}.} + +\item{alpha}{A numeric value indicating the significance level for the +correlation test. Only useful if \code{cal.method == "rpc-based" & apply_to == +"sign"}.} + +\item{memb_dim}{A character string indicating the name of the member +dimension. By default, it is set to 'member'.} + +\item{sdate_dim}{A character string indicating the name of the start date +dimension. By default, it is set to 'sdate'.} + +\item{dat_dim}{A character string indicating the name of the dataset dimension. +The length of this dimension can be different between 'exp' and 'obs'. +The default value is NULL.} + +\item{ncores}{An integer that indicates the number of cores for parallel +computation using the multiApply function. The default value is NULL (one core).} } \value{ -an array containing the calibrated forecasts with the same dimensions as the \code{exp} array. +An array containing the calibrated forecasts with the dimensions +nexp, nobs and the same dimensions as the 'exp' array. nexp is the number of +experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations +(i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. +If 'exp_cor' is provided, the returned array will have the same dimensions as +'exp_cor'. } \description{ -Five types of member-by-member bias correction can be performed. The \code{"bias"} method corrects the bias only, the \code{"evmos"} method applies a variance inflation technique to ensure the correction of the bias and the correspondence of variance between forecast and observation (Van Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods \code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast variance and the ensemble spread as described in Doblas-Reyes et al. (2005) and Van Schaeybroeck and Vannitsem (2015), respectively. While the \code{"mse_min"} method minimizes a constrained mean-squared error using three parameters, the \code{"crps_min"} method features four parameters and minimizes the Continuous Ranked Probability Score (CRPS). The \code{"rpc-based"} method adjusts the forecast variance ensuring that the ratio of predictable components (RPC) is equal to one, as in Eade et al. (2014). - -Both in-sample or our out-of-sample (leave-one-out cross validation) calibration are possible. +Five types of member-by-member bias correction can be performed. +The \code{"bias"} method corrects the bias only, the \code{"evmos"} method +applies a variance inflation technique to ensure the correction of the bias +and the correspondence of variance between forecast and observation (Van +Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods +\code{"mse_min"} and \code{"crps_min"} correct the bias, the overall forecast +variance and the ensemble spread as described in Doblas-Reyes et al.
(2005) +and Van Schaeybroeck and Vannitsem (2015), respectively. While the +\code{"mse_min"} method minimizes a constrained mean-squared error using three +parameters, the \code{"crps_min"} method features four parameters and +minimizes the Continuous Ranked Probability Score (CRPS). The +\code{"rpc-based"} method adjusts the forecast variance ensuring that the +ratio of predictable components (RPC) is equal to one, as in Eade et al. +(2014). Both in-sample and out-of-sample (leave-one-out cross +validation) calibration are possible. } \details{ -Both the \code{na.fill} and \code{na.rm} parameters can be used to indicate how the function has to handle the NA values. The \code{na.fill} parameter checks whether there are more than three forecast-observations pairs to perform the computation. In case there are three or less pairs, the computation is not carried out, and the value returned by the function depends on the value of this parameter (either NA if \code{na.fill == TRUE} or the uncorrected value if \code{na.fill == TRUE}). On the other hand, \code{na.rm} is used to indicate the function whether to remove the missing values during the computation of the parameters needed to perform the calibration. +Both the \code{na.fill} and \code{na.rm} parameters can be used to +indicate how the function has to handle the NA values. The \code{na.fill} +parameter checks whether there are more than three forecast-observation pairs +to perform the computation. In case there are three or fewer pairs, the +computation is not carried out, and the value returned by the function depends +on the value of this parameter (either NA if \code{na.fill == TRUE} or the +uncorrected value if \code{na.fill == FALSE}). On the other hand, \code{na.rm} +is used to indicate whether the function should remove the missing values during +the computation of the parameters needed to perform the calibration. } \examples{ mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7) @@ -64,16 +129,26 @@ dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7) dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) a <- Calibration(exp = mod1, obs = obs1) -str(a) + } \references{ -Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting-II calibration and combination. Tellus A. 2005;57:234-252. doi:10.1111/j.1600-0870.2005.00104.x - -Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate predictions underestimate the predictability of the read world? Geophysical Research Letters, 41(15), 5620-5628. doi: 10.1002/2014GL061146 - -Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing through linear regression. Nonlinear Processes in Geophysics, 18(2), 147. doi:10.5194/npg-18-147-2011 - -Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. doi:10.1002/qj.2397 +Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the +success of multi-model ensembles in seasonal forecasting-II calibration and +combination. Tellus A. 2005;57:234-252. \doi{10.1111/j.1600-0870.2005.00104.x} + +Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., +Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate +predictions underestimate the predictability of the real world?
Geophysical +Research Letters, 41(15), 5620-5628. \doi{10.1002/2014GL061146} + +Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing +through linear regression. Nonlinear Processes in Geophysics, 18(2), +147. \doi{10.5194/npg-18-147-2011} + +Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble +post-processing using member-by-member approaches: theoretical aspects. +Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. +\doi{10.1002/qj.2397} } \seealso{ \code{\link{CST_Load}} }
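Editor's note: since the rpc-based options interact ('apply_to' and 'alpha' only matter for that method), a hedged sketch extending the Rd example above may be useful; it reuses the mod1/obs1 toy arrays from the example and only parameters from the documented signature.

# Sketch: rpc-based calibration, applied only where the ensemble-mean
# correlation with the observations is significant at the 5% level
b <- Calibration(exp = mod1, obs = obs1, cal.method = "rpc-based",
                 apply_to = "sign", alpha = 0.05)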
diff --git a/man/CategoricalEnsCombination.Rd b/man/CategoricalEnsCombination.Rd index 2f5ad14d69ba9db0ea970da8bc340ec73f66e2a1..c3599d032e4b76643c844d01b2ebac2a9dd20869 100644 --- a/man/CategoricalEnsCombination.Rd +++ b/man/CategoricalEnsCombination.Rd @@ -2,41 +2,66 @@ % Please edit documentation in R/CST_CategoricalEnsCombination.R \name{CategoricalEnsCombination} \alias{CategoricalEnsCombination} -\title{Make categorical forecast based on a multi-model forecast with potential for calibrate} +\title{Make categorical forecast based on a multi-model forecast with potential for +calibration} \usage{ CategoricalEnsCombination(fc, obs, cat.method, eval.method, amt.cat, ...) } \arguments{ -\item{fc}{a multi-dimensional array with named dimensions containing the seasonal forecast experiment data in the element named \code{$data}. The amount of forecasting models is equal to the size of the \code{dataset} dimension of the data array. The amount of members per model may be different. The size of the \code{member} dimension of the data array is equal to the maximum of the ensemble members among the models. Models with smaller ensemble sizes have residual indices of \code{member} dimension in the data array filled with NA values.} +\item{fc}{A multi-dimensional array with named dimensions containing the +seasonal forecast experiment data in the element named \code{$data}. The +amount of forecasting models is equal to the size of the \code{dataset} +dimension of the data array. The amount of members per model may be +different. The size of the \code{member} dimension of the data array is +equal to the maximum of the ensemble members among the models. Models with +smaller ensemble sizes have residual indices of \code{member} dimension in +the data array filled with NA values.} -\item{obs}{a multidimensional array with named dimensions containing the observed data in the element named \code{$data}.} +\item{obs}{A multidimensional array with named dimensions containing the +observed data in the element named \code{$data}.} -\item{cat.method}{method used to produce the categorical forecast, can be either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool assumes equal weight for all ensemble members while the method comb assumes equal weight for each model. The weighting method is descirbed in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). Finally, the \code{obs} method classifies the observations into the different categories and therefore contains only 0 and 1 values.} +\item{cat.method}{Method used to produce the categorical forecast, can be +either \code{pool}, \code{comb}, \code{mmw} or \code{obs}. The method pool +assumes equal weight for all ensemble members while the method comb assumes +equal weight for each model. The weighting method is described in +Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and +Vannitsem (2019). Finally, the \code{obs} method classifies the observations +into the different categories and therefore contains only 0 and 1 values.} -\item{eval.method}{is the sampling method used, can be either \code{"in-sample"} or \code{"leave-one-out"}. Default value is the \code{"leave-one-out"} cross validation.} +\item{eval.method}{The sampling method used. It can be either +\code{"in-sample"} or \code{"leave-one-out"}. Default value is the +\code{"leave-one-out"} cross validation.} -\item{amt.cat}{is the amount of categories. Equally-sized quantiles will be calculated based on the amount of categories.} +\item{amt.cat}{The amount of categories. Equally-sized quantiles will be +calculated based on the amount of categories.} -\item{...}{other parameters to be passed on to the calibration procedure.} +\item{...}{Other parameters to be passed on to the calibration procedure.} } \value{ -an array containing the categorical forecasts in the element called \code{$data}. The first two dimensions of the returned object are named dataset and member and are both of size one. An additional dimension named category is introduced and is of size amt.cat. +An array containing the categorical forecasts in the element called +\code{$data}. The first two dimensions of the returned object are named +dataset and member and are both of size one. An additional dimension named +category is introduced and is of size amt.cat. } \description{ -This function converts a multi-model ensemble forecast -into a categorical forecast by giving the probability -for each category. Different methods are available to combine -the different ensemble forecasting models into -probabilistic categorical forecasts. +This function converts a multi-model ensemble forecast into a +categorical forecast by giving the probability for each category. Different +methods are available to combine the different ensemble forecasting models +into probabilistic categorical forecasts. See details in ?CST_CategoricalEnsCombination. } \references{ -Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811. +Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical +climate forecasts through regularization and optimal combination of multiple +GCM ensembles. Monthly Weather Review, 130(7), 1792-1811. -Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744. +Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). +Improved combination of multiple atmospheric GCM ensembles for seasonal +prediction. Monthly Weather Review, 132(12), 2732-2744. -Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290). +Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of +Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290). } \author{ Bert Van Schaeybroeck, \email{bertvs@meteo.be}
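Editor's note: the NA-padded multi-model layout described for 'fc' above is the part most readers trip on, so here is a sketch of building such an array. The shapes are illustrative only and the call is not guaranteed to satisfy every internal check of the function; the essential point is the NA padding of the unused member slot.

# Sketch: two models along 'dataset'; model 1 has 3 members, model 2 only 2,
# so model 2's third member slot is filled with NA.
fc <- array(rnorm(2 * 3 * 6 * 2 * 2),
            dim = c(dataset = 2, member = 3, sdate = 6, lat = 2, lon = 2))
fc[2, 3, , , ] <- NA  # model 2 contributes members 1-2 only
obs <- array(rnorm(6 * 2 * 2),
             dim = c(dataset = 1, member = 1, sdate = 6, lat = 2, lon = 2))
cat_fc <- CategoricalEnsCombination(fc = fc, obs = obs, cat.method = "pool",
                                    eval.method = "leave-one-out", amt.cat = 3)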
diff --git a/man/DynBiasCorrection.Rd b/man/DynBiasCorrection.Rd index e6de373c814712165465b15d2f7802167b52982a..b329ac920d4f86c2e01da4fd16f639ef8e942df5 100644 --- a/man/DynBiasCorrection.Rd +++ b/man/DynBiasCorrection.Rd @@ -16,32 +16,32 @@ DynBiasCorrection( ) } \arguments{ -\item{exp}{a multidimensional array with named dimensions with the -experiment data} +\item{exp}{A multidimensional array with named dimensions with the +experiment data.} -\item{obs}{a multidimensional array with named dimensions with the -observation data} +\item{obs}{A multidimensional array with named dimensions with the +observation data.} -\item{method}{a character string indicating the method to apply bias correction among these ones: -"PTF","RQUANT","QUANT","SSPLIN"} +\item{method}{A character string indicating the method to apply bias +correction among these ones: +"PTF", "RQUANT", "QUANT", "SSPLIN".} -\item{wetday}{logical indicating whether to perform wet day correction +\item{wetday}{Logical indicating whether to perform wet day correction or not OR a numeric threshold below which all values are set to zero (by default is set to 'FALSE').} -\item{proxy}{a character string indicating the proxy for local dimension +\item{proxy}{A character string indicating the proxy for local dimension 'dim' or inverse of persistence 'theta' to apply the dynamical conditioned bias correction method.} -\item{quanti}{a number lower than 1 indicating the quantile to perform the -computation of local dimension and theta} +\item{quanti}{A number lower than 1 indicating the quantile to perform the +computation of local dimension and theta.} -\item{ncores}{The number of cores to use in parallel computation} +\item{ncores}{The number of cores to use in parallel computation.} } \value{ -a multidimensional array with named dimensions with a bias correction -performed conditioned by local dimension 'dim' or inverse of persistence 'theta' +A multidimensional array with named dimensions with a bias correction +performed conditioned by local dimension 'dim' or inverse of persistence 'theta'. } \description{ This function performs a bias correction conditioned by the @@ -60,13 +60,13 @@ dim (expL) <- c(time =100,lat = 4, lon = 5) obsL <- c(rnorm(1:1980),expL[1,,]*1.2) dim (obsL) <- c(time = 100,lat = 4, lon = 5) dynbias <- DynBiasCorrection(exp = expL, obs = obsL, method='QUANT', - proxy= "dim", quanti = 0.6) + proxy= "dim", quanti = 0.6) } \references{ Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. -DOI = https://doi.org/10.1038/s41467-019-09305-8 " +\doi{10.1038/s41467-019-09305-8} Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes.
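Editor's note: since 'proxy' switches the conditioning variable, a one-line variant of the example above (same toy expL/obsL arrays) may be worth showing; it is a sketch using only documented parameter values.

# Sketch: condition the correction on the inverse of persistence instead
dynbias_theta <- DynBiasCorrection(exp = expL, obs = obsL, method = 'QUANT',
                                   proxy = "theta", quanti = 0.6)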
diff --git a/man/EnsClustering.Rd b/man/EnsClustering.Rd index e9409932e27021e890683f6806d9a487e5b0228f..17915b3f2b258637b08c6e25e941fc803076e5d2 100644 --- a/man/EnsClustering.Rd +++ b/man/EnsClustering.Rd @@ -21,57 +21,67 @@ EnsClustering( ) } \arguments{ -\item{data}{A matrix of dimensions 'dataset member sdate ftime lat lon' containing the variables to be analysed.} +\item{data}{A matrix of dimensions 'dataset member sdate ftime lat lon' +containing the variables to be analysed. Latitudinal dimension accepted +names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal +dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'.} \item{lat}{Vector of latitudes.} \item{lon}{Vector of longitudes.} -\item{time_moment}{Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean), -'sd' (standard deviation along time) or 'perc' (a selected percentile on time). -If 'perc' the keyword 'time_percentile' is also used.} +\item{time_moment}{Decides the moment to be applied to the time dimension. Can +be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' +(a selected percentile on time). If 'perc' the keyword 'time_percentile' is +also used.} -\item{numclus}{Number of clusters (scenarios) to be calculated. -If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8.} +\item{numclus}{Number of clusters (scenarios) to be calculated. If set to NULL +the number of ensemble members divided by 10 is used, with a minimum of 2 +and a maximum of 8.} \item{lon_lim}{List with the two longitude margins in `c(-180,180)` format.} \item{lat_lim}{List with the two latitude margins.} -\item{variance_explained}{variance (percentage) to be explained by the set of EOFs. -Defaults to 80. Not used if numpcs is specified.} +\item{variance_explained}{Variance (percentage) to be explained by the set of +EOFs. Defaults to 80. Not used if numpcs is specified.} \item{numpcs}{Number of EOFs retained in the analysis (optional).} -\item{time_percentile}{Set the percentile in time you want to analyse (used for `time_moment = "perc").} +\item{time_percentile}{Set the percentile in time you want to analyse (used +for `time_moment = "perc"`).} -\item{time_dim}{String or character array with name(s) of dimension(s) over which to compute statistics. -If omitted c("ftime", "sdate", "time") are searched in this order.} +\item{time_dim}{String or character array with name(s) of dimension(s) over +which to compute statistics. If omitted c("ftime", "sdate", "time") are +searched in this order.} -\item{cluster_dim}{Dimension along which to cluster. Typically "member" or "sdate". -This can also be a list like c("member", "sdate").} +\item{cluster_dim}{Dimension along which to cluster. Typically "member" or +"sdate". This can also be a list like c("member", "sdate").} \item{verbose}{Logical for verbose output} } \value{ A list with elements \code{$cluster} (cluster assigned for each member), - \code{$freq} (relative frequency of each cluster), \code{$closest_member} - (representative member for each cluster), \code{$repr_field} (list of fields - for each representative member), \code{composites} (list of mean fields for each cluster), - \code{$lon} (selected longitudes of output fields), - \code{$lat} (selected longitudes of output fields). +\code{$freq} (relative frequency of each cluster), \code{$closest_member} +(representative member for each cluster), \code{$repr_field} (list of fields for +each representative member), \code{composites} (list of mean fields for each +cluster), \code{$lon} (selected longitudes of output fields), \code{$lat} +(selected latitudes of output fields). } \description{ This function performs a clustering on members/starting dates -and returns a number of scenarios, with representative members for each of them. -The clustering is performed in a reduced EOF space. +and returns a number of scenarios, with representative members for each of +them. The clustering is performed in a reduced EOF space.
} \examples{ -\donttest{ -exp <- lonlat_temp$exp -res <- EnsClustering(exp$data, exp$lat, exp$lon, numclus = 3, - cluster_dim = c("member", "dataset", "sdate")) -} +exp <- array(abs(rnorm(1152))*275, dim = c(dataset = 1, member = 4, + sdate = 6, ftime = 3, + lat = 4, lon = 4)) +lon <- seq(0, 3) +lat <- seq(48, 45) +res <- EnsClustering(exp, lat = lat, lon = lon, numclus = 2, + cluster_dim = c("member", "dataset", "sdate")) + } \author{ Federico Fabiano - ISAC-CNR, \email{f.fabiano@isac.cnr.it} diff --git a/man/MergeDims.Rd b/man/MergeDims.Rd index 7539ef6ef08a128c11591c5a98f4654dfb7f1f4e..b6c70c5872d6f1d1b25292f9bfbf696c75c4e599 100644 --- a/man/MergeDims.Rd +++ b/man/MergeDims.Rd @@ -12,19 +12,23 @@ MergeDims( ) } \arguments{ -\item{data}{an n-dimensional array with named dimensions} +\item{data}{An n-dimensional array with named dimensions.} -\item{merge_dims}{a character vector indicating the names of the dimensions to merge} +\item{merge_dims}{A character vector indicating the names of the dimensions to +merge.} -\item{rename_dim}{a character string indicating the name of the output dimension. If left at NULL, the first dimension name provided in parameter \code{merge_dims} will be used.} +\item{rename_dim}{A character string indicating the name of the output +dimension. If left at NULL, the first dimension name provided in parameter +\code{merge_dims} will be used.} -\item{na.rm}{a logical indicating if the NA values should be removed or not.} +\item{na.rm}{A logical indicating if the NA values should be removed or not.} } \description{ -This function merges two dimensions of an array into one. The user can select the dimensions to merge and provide the final name of the dimension. The user can select to remove NA values or keep them. +This function merges two dimensions of an array into one. The +user can select the dimensions to merge and provide the final name of the +dimension. The user can select to remove NA values or keep them. } \examples{ - data <- 1 : 20 dim(data) <- c(time = 10, lat = 2) new_data <- MergeDims(data, merge_dims = c('time', 'lat')) diff --git a/man/MultiEOF.Rd b/man/MultiEOF.Rd index dd0fc7fe59e6a1c6606c2de2cbbceca6f0b4b15d..04963e1ac27d7d68a464dc0c8bb62291a53360e1 100644 --- a/man/MultiEOF.Rd +++ b/man/MultiEOF.Rd @@ -2,7 +2,8 @@ % Please edit documentation in R/CST_MultiEOF.R \name{MultiEOF} \alias{MultiEOF} -\title{EOF analysis of multiple variables starting from an array (reduced version)} +\title{EOF analysis of multiple variables starting from an array (reduced +version)} \usage{ MultiEOF( data, @@ -19,9 +20,12 @@ MultiEOF( ) } \arguments{ -\item{data}{A multidimensional array with dimension \code{"var"}, -containing the variables to be analysed. The other diemnsions follow the same structure as the -\code{"exp"} element of a 's2dv_cube' object.} +\item{data}{A multidimensional array with dimension \code{"var"}, containing +the variables to be analysed. The other dimensions follow the same structure +as the \code{"exp"} element of a 's2dv_cube' object. Latitudinal +dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. +Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', +'nav_lon'.} \item{lon}{Vector of longitudes.} @@ -29,27 +33,48 @@ containing the variables to be analysed.
The other diemnsions follow the same st \item{time}{Vector or matrix of dates in POSIXct format.} -\item{lon_dim}{String with dimension name of longitudinal coordinate} +\item{lon_dim}{String with dimension name of longitudinal coordinate.} -\item{lat_dim}{String with dimension name of latitudinal coordinate} +\item{lat_dim}{String with dimension name of latitudinal coordinate.} -\item{neof_max}{Maximum number of single eofs considered in the first decomposition} +\item{neof_max}{Maximum number of single EOFs considered in the first +decomposition.} -\item{neof_composed}{Number of composed eofs to return in output} +\item{neof_composed}{Number of composed EOFs to return in output.} -\item{minvar}{Minimum variance fraction to be explained in first decomposition} +\item{minvar}{Minimum variance fraction to be explained in first decomposition.} -\item{lon_lim}{Vector with longitudinal range limits for the calculation for all input variables} +\item{lon_lim}{Vector with longitudinal range limits for the calculation for +all input variables.} -\item{lat_lim}{Vector with latitudinal range limits for the calculation for all input variables} +\item{lat_lim}{Vector with latitudinal range limits for the calculation for +all input variables.} } \value{ -A list with elements \code{$coeff} (an array of time-varying principal component coefficients), - \code{$variance} (a matrix of explained variances), - \code{eof_pattern} (a matrix of EOF patterns obtained by regression for each variable). +A list with elements \code{$coeff} (an array of time-varying principal +component coefficients), \code{$variance} (a matrix of explained variances), +\code{eof_pattern} (a matrix of EOF patterns obtained by regression for each +variable). } \description{ -This function performs EOF analysis over multiple variables, accepting in input an array with a dimension \code{"var"} for each variable to analyse. Based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data for the latitude cosine square root. +This function performs EOF analysis over multiple variables, +accepting as input an array with a dimension \code{"var"} for each variable to +analyse. Based on Singular Value Decomposition. For each field the EOFs are +computed and the corresponding PCs are standardized (unit variance, zero mean); +the minimum number of principal components needed to reach the user-defined +variance is retained. The function weights the input data by the square root +of the cosine of latitude.
+} +\examples{ +exp <- array(runif(576)*280, dim = c(dataset = 2, member = 2, sdate = 3, + ftime = 3, lat = 4, lon = 4, var = 1)) +lon <- seq(0, 3) +lat <- seq(47, 44) +dates <- c("2000-11-01", "2000-12-01", "2001-01-01", "2001-11-01", + "2001-12-01", "2002-01-01", "2002-11-01", "2002-12-01", "2003-01-01") +Dates <- as.POSIXct(dates, format = "\%Y-\%m-\%d") +dim(Dates) <- c(ftime = 3, sdate = 3) +cal <- MultiEOF(data = exp, lon = lon, lat = lat, time = Dates) } \author{ Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
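Editor's note: a short follow-up to the new example above, inspecting the documented return elements (names as listed in the \value section); this is a sketch, not part of the Rd.

# Sketch: the analysis returns PCs, explained variances and EOF patterns
str(cal$coeff)        # time-varying principal component coefficients
str(cal$variance)     # explained variance per composed EOF
str(cal$eof_pattern)  # regression patterns, one set per variable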
diff --git a/man/MultiMetric.Rd b/man/MultiMetric.Rd index 252f735c475b17a765c26db830aebefcd8da0687..1fbba692f6140a5e0285933d864187d92f844293 100644 --- a/man/MultiMetric.Rd +++ b/man/MultiMetric.Rd @@ -15,34 +15,57 @@ MultiMetric( ) } \arguments{ -\item{exp}{a multidimensional array with named dimensions.} +\item{exp}{A multidimensional array with named dimensions.} -\item{obs}{a multidimensional array with named dimensions.} +\item{obs}{A multidimensional array with named dimensions.} -\item{metric}{a character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms' or 'rmsss.} +\item{metric}{A character string giving the metric for computing the maximum +skill. This must be one of the strings 'correlation', 'rms' or 'rmsss'.} -\item{multimodel}{a logical value indicating whether a Multi-Model Mean should be computed.} +\item{multimodel}{A logical value indicating whether a Multi-Model Mean should +be computed.} -\item{time_dim}{name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'.} +\item{time_dim}{Name of the temporal dimension where a mean will be applied. +It can be NULL, the default value is 'ftime'.} -\item{memb_dim}{name of the member dimension. It can be NULL, the default value is 'member'.} +\item{memb_dim}{Name of the member dimension. It can be NULL, the default +value is 'member'.} -\item{sdate_dim}{name of the start date dimension or a dimension name identifiying the different forecast. It can be NULL, the default value is 'sdate'.} +\item{sdate_dim}{Name of the start date dimension or a dimension name +identifying the different forecasts. It can be NULL, the default value is +'sdate'.} } \value{ -a list of arrays containing the statistics of the selected metric in the element \code{$data} which is a list of arrays: for the metric requested and others for statistics about its signeificance. The arrays have two dataset dimensions equal to the 'dataset' dimension in the \code{exp$data} and \code{obs$data} inputs. If \code{multimodel} is TRUE, the greatest position in the first dimension correspons to the Multi-Model Mean. +A list of arrays containing the statistics of the selected metric in +the element \code{$data} which is a list of arrays: for the metric requested +and others for statistics about its significance. The arrays have two dataset +dimensions equal to the 'dataset' dimension in the \code{exp$data} and +\code{obs$data} inputs. If \code{multimodel} is TRUE, the greatest position in +the first dimension corresponds to the Multi-Model Mean. } \description{ -This function calculates correlation (Anomaly Correlation Coefficient; ACC), root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and multi-models mean (if desired) with the observations on arrays with named dimensions. +This function calculates correlation (Anomaly Correlation +Coefficient; ACC), root mean square error (RMS) and the root mean square error +skill score (RMSSS) of individual anomaly models and multi-model mean (if +desired) with the observations on arrays with named dimensions. } \examples{ -res <- MultiMetric(lonlat_temp$exp$data, lonlat_temp$obs$data) +exp <- array(rnorm(2*2*4*5*2*2), + dim = c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, + lon = 2)) +obs <- array(rnorm(1*1*4*5*2*2), + dim = c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, + lon = 2)) +res <- MultiMetric(exp = exp, obs = obs) } \references{ -Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31.\url{https://link.springer.com/article/10.1007/s00382-018-4404-z} +Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill +Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, +29-31. \doi{10.1007/s00382-018-4404-z} } \seealso{ -\code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}}, \code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}} +\code{\link[s2dv]{Corr}}, \code{\link[s2dv]{RMS}}, +\code{\link[s2dv]{RMSSS}} and \code{\link{CST_Load}} } \author{ Mishra Niti, \email{niti.mishra@bsc.es} diff --git a/man/PDFIndexHind.Rd b/man/PDFIndexHind.Rd new file mode 100644 index 0000000000000000000000000000000000000000..79a6424d02a70b631c86dcae2ca35af48aa6e94b --- /dev/null +++ b/man/PDFIndexHind.Rd @@ -0,0 +1,74 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/BEI_PDFBest.R +\name{PDFIndexHind} +\alias{PDFIndexHind} +\title{Computing the Index PDFs for a dataset of SFSs for a hindcast period.} +\usage{ +PDFIndexHind( + index_hind, + index_obs, + method = "ME", + time_dim_name = "time", + na.rm = FALSE +) +} +\arguments{ +\item{index_hind}{Index (e.g. NAO index) array from SFSs +with at least two dimensions (time, member) or (time, statistic). +The temporal dimension, by default 'time', must be greater than 2. +The dimension 'member' must be greater than 1. +The dimension 'statistic' must be equal to 2, containing the two +parameters of a normal distribution (mean and sd) representing the ensemble +of a SFS. It is not possible to have the dimensions 'member' and 'statistic' +together.} + +\item{index_obs}{Index (e.g. NAO index) array from an observational database +or reanalysis with at least a temporal dimension (by default 'time'), +which must be greater than 2.} + +\item{method}{A character string indicating which methodology is applied +to compute the PDFs. One of "ME" (default) or "LMEV".} + +\item{time_dim_name}{A character string indicating the name of the temporal +dimension, by default 'time'.} + +\item{na.rm}{Logical (default = FALSE). Should missing values be removed?} +} +\value{ +An array with at least two dimensions (time, statistic = 4). The first +statistic is the parameter 'mean' of the PDF, not bias corrected. +The second statistic is the parameter 'standard deviation' of the PDF, not +bias corrected. The third statistic is the parameter 'mean' of the PDF, +bias corrected. The fourth statistic is the parameter 'standard deviation' of +the PDF, bias corrected. +} +\description{ +This function implements the computation to obtain the index PDFs +(e.g. NAO index) to improve the index estimate from SFSs for a hindcast period.
+} +\examples{ +# Example for the PDFIndexHind function +# Example 1 +index_obs <- 1 : (5 * 3) +dim(index_obs) <- c(time = 5, season = 3) +index_hind <- 1 : (5 * 4 * 3) +dim(index_hind) <- c(time = 5, member = 4, season = 3) +res <- PDFIndexHind(index_hind, index_obs) +dim(res) +# time statistic season +# 5 4 3 +# Example 2 +index_obs <- 1 : (5 * 3) +dim(index_obs) <- c(time = 5, season = 3) +index_hind <- 1 : (5 * 2 * 3) +dim(index_hind) <- c(time = 5, statistic = 2, season = 3) +res <- PDFIndexHind(index_hind, index_obs) +} +\references{ +Regionally improved seasonal forecast of precipitation through Best +estimation of winter NAO, Sanchez-Garcia, E. et al., +Adv. Sci. Res., 16, 165-174, 2019, \doi{10.5194/asr-16-165-2019} +} +\author{ +Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es} +} diff --git a/man/PlotForecastPDF.Rd b/man/PlotForecastPDF.Rd index c6442bbbc1aae7a3458d45db409cbea9d72fb9ee..31e3001fce60c33ce81730189eec426c55f5c5ce 100644 --- a/man/PlotForecastPDF.Rd +++ b/man/PlotForecastPDF.Rd @@ -14,7 +14,7 @@ PlotForecastPDF( var.name = "Varname (units)", fcst.names = NULL, add.ensmemb = c("above", "below", "no"), - color.set = c("ggplot", "s2s4e", "hydro"), + color.set = c("ggplot", "s2s4e", "hydro", "vitigeoss"), memb_dim = "member" ) } @@ -55,9 +55,9 @@ individual forecast.} or \code{'below'} the pdf, or not (\code{'no'}).} \item{color.set}{A selection of predefined color sets: use \code{'ggplot'} -(default) for blue/green/red, \code{'s2s4e'} for blue/grey/orange, or +(default) for blue/green/red, \code{'s2s4e'} for blue/grey/orange, \code{'hydro'} for yellow/gray/blue (suitable for precipitation and -inflows).} +inflows), or the \code{"vitigeoss"} color set.} \item{memb_dim}{A character string indicating the name of the member dimension.} @@ -76,14 +76,8 @@ included as hatched areas. Individual ensemble members can be plotted as jittered points. The observed value is optionally shown as a diamond. } \examples{ -fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2), - fcst3 = rnorm(10, -0.5, 0.9)) +fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2)) PlotForecastPDF(fcsts,c(-1,1)) -\donttest{ -fcsts2 <- array(rnorm(100), dim = c(member = 20, fcst = 5)) -PlotForecastPDF(fcsts2, c(-0.66, 0.66), extreme.limits = c(-1.2, 1.2), - fcst.names = paste0('random fcst ', 1 : 5), obs = 0.7) -} } \author{ Llorenç Lledó \email{llledo@bsc.es}
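Editor's note: since the user-visible change here is the extra color set, a one-line variant of the example above (reusing its fcsts data frame) illustrates the new option.

# Sketch: same toy forecasts, drawn with the newly added color set
PlotForecastPDF(fcsts, c(-1, 1), color.set = "vitigeoss")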
diff --git a/man/PlotMostLikelyQuantileMap.Rd b/man/PlotMostLikelyQuantileMap.Rd index cc98f8f5c64adcb0fe71b8ec4435a2d392cf7107..0dde63ff0480fd076769cc77b48a14ac9ec563d2 100644 --- a/man/PlotMostLikelyQuantileMap.Rd +++ b/man/PlotMostLikelyQuantileMap.Rd @@ -16,17 +16,30 @@ PlotMostLikelyQuantileMap( ) } \arguments{ -\item{probs}{a list of bi-dimensional arrays with the named dimensions 'latitude' (or 'lat') and 'longitude' (or 'lon'), with equal size and in the same order, or a single tri-dimensional array with an additional dimension (e.g. 'bin') for the different categories. The arrays must contain probability values between 0 and 1, and the probabilities for all categories of a grid cell should not exceed 1 when added.} +\item{probs}{A list of bi-dimensional arrays with the named dimensions +'latitude' (or 'lat') and 'longitude' (or 'lon'), with equal size and in the +same order, or a single tri-dimensional array with an additional dimension +(e.g. 'bin') for the different categories. The arrays must contain +probability values between 0 and 1, and the probabilities for all categories +of a grid cell should not exceed 1 when added.} -\item{lon}{a numeric vector with the longitudes of the map grid, in the same order as the values along the corresponding dimension in \code{probs}.} +\item{lon}{A numeric vector with the longitudes of the map grid, in the same +order as the values along the corresponding dimension in \code{probs}.} -\item{lat}{a numeric vector with the latitudes of the map grid, in the same order as the values along the corresponding dimension in \code{probs}.} +\item{lat}{A numeric vector with the latitudes of the map grid, in the same +order as the values along the corresponding dimension in \code{probs}.} -\item{cat_dim}{the name of the dimension along which the different categories are stored in \code{probs}. This only applies if \code{probs} is provided in the form of 3-dimensional array. The default expected name is 'bin'.} +\item{cat_dim}{The name of the dimension along which the different categories +are stored in \code{probs}. This only applies if \code{probs} is provided in +the form of a 3-dimensional array. The default expected name is 'bin'.} -\item{bar_titles}{vector of character strings with the names to be drawn on top of the color bar for each of the categories. As many titles as categories provided in \code{probs} must be provided.} +\item{bar_titles}{Vector of character strings with the names to be drawn on +top of the color bar for each of the categories. As many titles as +categories provided in \code{probs} must be provided.} -\item{col_unknown_cat}{character string with a colour representation of the colour to be used to paint the cells for which no category can be clearly assigned. Takes the value 'white' by default.} +\item{col_unknown_cat}{Character string with the colour to be used to paint +the cells for which no category can be clearly +assigned. Takes the value 'white' by default.} \item{drawleg}{Where to draw the common colour bar. Can take values TRUE, FALSE or:\cr @@ -35,10 +48,20 @@ FALSE or:\cr 'right', 'r', 'R', 'east', 'e', 'E'\cr 'left', 'l', 'L', 'west', 'w', 'W'} -\item{...}{additional parameters to be sent to \code{PlotCombinedMap} and \code{PlotEquiMap}.} +\item{...}{Additional parameters to be sent to \code{PlotCombinedMap} and +\code{PlotEquiMap}.} } \description{ -This function receives as main input (via the parameter \code{probs}) a collection of longitude-latitude maps, each containing the probabilities (from 0 to 1) of the different grid cells of belonging to a category. As many categories as maps provided as inputs are understood to exist. +This function receives as main input (via the parameter +\code{probs}) a collection of longitude-latitude maps, each containing the +probabilities (from 0 to 1) of the different grid cells belonging to a +category. As many categories as maps provided as inputs are understood to +exist.
The maps of probabilities must be provided on a common rectangular
+regular grid, and a vector with the longitudes and a vector with the latitudes
+of the grid must be provided. The input maps can be provided in two forms,
+either as a list of multiple two-dimensional arrays (one for each category) or
+as a three-dimensional array, where one of the dimensions corresponds to the
+different categories.
}
\examples{
@@ -127,11 +150,11 @@ PlotMostLikelyQuantileMap(bins, lons, lats,
                          mask = 1 - (w1 + w2 / max(c(w1, w2))),
                          brks = 20, width = 10, height = 8)
}
-
}
\seealso{
\code{PlotCombinedMap} and \code{PlotEquiMap}
}
\author{
-Veronica Torralba, \email{veronica.torralba@bsc.es}, Nicolau Manubens, \email{nicolau.manubens@bsc.es}
+Veronica Torralba, \email{veronica.torralba@bsc.es}, Nicolau Manubens,
+\email{nicolau.manubens@bsc.es}
}
diff --git a/man/PlotPDFsOLE.Rd b/man/PlotPDFsOLE.Rd
index ff3c568e638a93870726d36d58d29d236c1aa2aa..e2c6606eb20822b0b29d40c99411dbb0de451710 100644
--- a/man/PlotPDFsOLE.Rd
+++ b/man/PlotPDFsOLE.Rd
@@ -25,7 +25,7 @@ to combining.}

\item{pdf_2}{A numeric array with a dimension named 'statistic', containing
two parameters: 'mean' and 'standard deviation' of the second gaussian pdf
- to combining.}
+to combining.}

\item{nsigma}{(optional) A numeric value for setting the limits of X axis.
(Default nsigma = 3).}
@@ -68,20 +68,6 @@ attr(pdf_2, "name") <- "NAO2"
dim(pdf_2) <- c(statistic = 2)
PlotPDFsOLE(pdf_1, pdf_2)
-
-# Example 2
-Glosea5PDF <- c(2.25, 0.67)
-attr(Glosea5PDF, "name") <- "Glosea5"
-dim(Glosea5PDF) <- c(statistic = 2)
-ECMWFPDF <- c(2.38, 0.61)
-attr(ECMWFPDF, "name") <- "ECMWF"
-dim(ECMWFPDF) <- c(statistic = 2)
-MFPDF <- c(4.52, 0.34)
-attr(MFPDF, "name") <- "MF"
-dim(MFPDF) <- c(statistic = 2)
-PlotPDFsOLE(pdf_1 = Glosea5PDF, pdf_2 = ECMWFPDF, legendPos = 'left')
-PlotPDFsOLE(pdf_1 = Glosea5PDF, pdf_2 = MFPDF, legendPos = 'top')
-PlotPDFsOLE(pdf_1 = ECMWFPDF, pdf_2 = MFPDF, legendSize = 1.2)
}
\author{
Eroteida Sanchez-Garcia - AEMET, \email{esanchezg@aemet.es}
diff --git a/man/PlotTriangles4Categories.Rd b/man/PlotTriangles4Categories.Rd
index 7329553c235056df4cd65cdd8794540943832f7e..8356da995e7813acb4de0bb3ac2392080b2054c9 100644
--- a/man/PlotTriangles4Categories.Rd
+++ b/man/PlotTriangles4Categories.Rd
@@ -33,7 +33,7 @@ PlotTriangles4Categories(
)
}
\arguments{
-\item{data}{array with three named dimensions: 'dimx', 'dimy', 'dimcat',
+\item{data}{Array with three named dimensions: 'dimx', 'dimy', 'dimcat',
containing the values to be displayed in a coloured image with triangles.}

\item{brks}{A vector of the color bar intervals. The length must be one more
@@ -45,17 +45,16 @@ default values.}

\item{toptitle}{A string of the title of the grid. Set NULL as default.}

-\item{sig_data}{logical array with the same dimensions as 'data' to add layers
+\item{sig_data}{Logical array with the same dimensions as 'data' to add layers
to the plot. A value of TRUE at a grid cell will draw a dot/symbol on the
corresponding triangle of the plot. Set NULL as default.}

-\item{pch_sig}{symbol to be used to represent sig_data. Takes 18
-(diamond) by default. See 'pch' in par() for additional
-accepted options.}
+\item{pch_sig}{Symbol to be used to represent sig_data. Takes 18
+(diamond) by default.
See 'pch' in par() for additional accepted options.}

-\item{col_sig}{colour of the symbol to represent sig_data.}
+\item{col_sig}{Colour of the symbol to represent sig_data.}

-\item{cex_sig}{parameter to increase/reduce the size of the symbols used
+\item{cex_sig}{Parameter to increase/reduce the size of the symbols used
to represent sig_data.}

\item{xlab}{A logical value (TRUE) indicating if xlabels should be plotted}
@@ -81,14 +80,16 @@ Set TRUE as default.}
category (i.e. triangle). Set the sequence from 1 to the length of the
categories (2 or 4).}

-\item{cex_leg}{a number to indicate the increase/reductuion of the lab_legend used
-to represent sig_data.}
+\item{cex_leg}{A number to indicate the increase/reduction of the lab_legend
+used to represent sig_data.}

-\item{col_leg}{color of the legend (triangles).}
+\item{col_leg}{Color of the legend (triangles).}

-\item{cex_axis}{a number to indicate the increase/reduction of the axis labels.}
+\item{cex_axis}{A number to indicate the increase/reduction of the axis labels.}

-\item{mar}{A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot.}
+\item{mar}{A numerical vector of the form c(bottom, left, top, right) which
+gives the number of lines of margin to be specified on the four sides of the
+plot.}

\item{fileout}{A string of full directory path and file name indicating where
to save the plot. If not specified (default), a graphics device will pop up.}
@@ -97,8 +98,9 @@ to save the plot. If not specified (default), a graphics device will pop up.}
(file or window) to plot in. Set 'px' as default. See ?Devices and the
creator function of the corresponding device.}

-\item{res}{A positive number indicating resolution of the device (file or window)
-to plot in. See ?Devices and the creator function of the corresponding device.}
+\item{res}{A positive number indicating resolution of the device (file or
+window) to plot in. See ?Devices and the creator function of the
+corresponding device.}

\item{figure.width}{a numeric value to control the width of the plot.}
@@ -110,24 +112,23 @@ A figure in popup window by default, or saved to the specified path.
}
\description{
This function converts a 3-d numerical data array into a coloured
-grid with triangles. It is useful for a slide or article to present tabular results as
-colors instead of numbers. This can be used to compare the outputs of two or four categories (
-e.g. modes of variability, clusters, or forecast systems).
+grid with triangles. It is useful for a slide or article to present tabular
+results as colors instead of numbers. This can be used to compare the outputs
+of two or four categories (e.g. modes of variability, clusters, or forecast
+systems).
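+A minimal sketch of a call, assuming an array \code{arr1} with the three
+named dimensions 'dimx', 'dimy' and 'dimcat', is
+\code{PlotTriangles4Categories(data = arr1)}, relying on the default colours
+and breaks; see the example below for a fuller call.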
}
\examples{
-#Example with random data
-arr1<- arr1<- array(runif(n = 12 * 7 * 4, min=-1, max=1),dim = c(12,7,4))
-names(dim(arr1)) <- c('dimx','dimy','dimcat')
-arr2<- array(TRUE,dim = dim(arr1))
-arr2[which(arr1 < 0.3)] = FALSE
+# Example with random data
+arr1 <- array(runif(n = 4 * 5 * 4, min = -1, max = 1), dim = c(4,5,4))
+names(dim(arr1)) <- c('dimx', 'dimy', 'dimcat')
+arr2 <- array(TRUE, dim = dim(arr1))
+arr2[which(arr1 < 0.3)] <- FALSE
PlotTriangles4Categories(data = arr1,
-                        cols = c('white','#fef0d9','#fdd49e','#fdbb84','#fc8d59',
-                                 '#e34a33','#b30000', '#7f0000'),
-                        brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1),
-                        lab_legend = c('NAO+', 'BL','AR','NAO-'),
-                        xtitle = "Target month", ytitle = "Lead time",
-                        xlabels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul",
-                                    "Aug", "Sep", "Oct", "Nov", "Dec"))
+                        cols = c('white','#fef0d9','#fdd49e','#fdbb84','#fc8d59'),
+                        brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4),
+                        lab_legend = c('NAO+', 'BL','AR','NAO-'),
+                        xtitle = "Target month", ytitle = "Lead time",
+                        xlabels = c("Jan", "Feb", "Mar", "Apr"))
}
\author{
History:\cr
diff --git a/man/PlotWeeklyClim.Rd b/man/PlotWeeklyClim.Rd
new file mode 100644
index 0000000000000000000000000000000000000000..746c641ea4f411cfc36d5d52249f87b45ebbb5d2
--- /dev/null
+++ b/man/PlotWeeklyClim.Rd
@@ -0,0 +1,85 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/PlotWeeklyClim.R
+\name{PlotWeeklyClim}
+\alias{PlotWeeklyClim}
+\title{Plots the observed weekly means and climatology of timeseries data}
+\usage{
+PlotWeeklyClim(
+  data,
+  first_date,
+  ref_period_ini,
+  ref_period_end,
+  time_dim = "time",
+  sdate_dim = "sdate",
+  title = "Observed weekly means and climatology",
+  palette = "Blues",
+  fileout = NULL,
+  device = NULL,
+  width = 8,
+  height = 6,
+  units = "in",
+  dpi = 300
+)
+}
+\arguments{
+\item{data}{A multidimensional array with named dimensions with at least sdate
+and time dimensions containing observed daily data. It can also be a
+dataframe with computed percentiles as input for ggplot. The target year
+must be included in the input data.}
+
+\item{first_date}{The first date of the target period of the timeseries. It
+can be of class 'Date', 'POSIXct' or a character string in the format
+'yyyy-mm-dd'. It must be a date included in the reference period.}
+
+\item{ref_period_ini}{A numeric value indicating the first year of the
+reference period.}
+
+\item{ref_period_end}{A numeric value indicating the last year of the
+reference period.}
+
+\item{time_dim}{A character string indicating the daily time dimension name.
+The default value is 'time'.}
+
+\item{sdate_dim}{A character string indicating the start year dimension name.
+The default value is 'sdate'.}
+
+\item{title}{The text for the top title of the plot.}
+
+\item{palette}{A palette name from the RColorBrewer package. The default
+value is 'Blues'.}
+
+\item{fileout}{A character string indicating the file name where to save the
+plot. If not specified (default) a graphics device will pop up.}
+
+\item{device}{A character string indicating the device to use. Can either be
+a device function (e.g. png), or one of "eps", "ps", "tex" (pictex),
+"pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only).}
+
+\item{width}{A numeric value of the plot width in units ("in", "cm", "mm", or
+"px"). It is set to 8 by default.}
+
+\item{height}{A numeric value of the plot height in units ("in", "cm", "mm",
+or "px").
It is set to 6 by default.}
+
+\item{units}{Units of the size of the device (file or window) to plot in.
+Inches ('in') by default.}
+
+\item{dpi}{A numeric value of the plot resolution. It is set to 300 by
+default.}
+}
+\value{
+A ggplot object containing the plot.
+}
+\description{
+This function plots the observed weekly means and climatology of
+a timeseries data using ggplot package. It compares the weekly climatology in
+a specified period (reference period) to the observed conditions during the
+target period analyzed in the case study (included in the reference period).
+}
+\examples{
+data <- array(rnorm(49*20, 274, 7), dim = c(time = 49, sdate = 20))
+PlotWeeklyClim(data = data, first_date = '2010-08-09',
+               ref_period_ini = 1998,
+               ref_period_end = 2020)
+
+}
diff --git a/man/Predictability.Rd b/man/Predictability.Rd
index d37efcdcc4bf45c38ba5b4c175d919531e1e895f..04f7204eea53f35f0dad6e1ffcb75a0b86232ffe 100644
--- a/man/Predictability.Rd
+++ b/man/Predictability.Rd
@@ -12,9 +12,10 @@ Predictability(dim, theta, ncores = NULL)
the output of CST_ProxiesAttractor or ProxiesAttractor.}

\item{theta}{An array of N named dimensions containing the inverse of the
-persistence 'theta' as the output of CST_ProxiesAttractor or ProxiesAttractor.}
+persistence 'theta' as the output of CST_ProxiesAttractor or
+ProxiesAttractor.}

-\item{ncores}{The number of cores to use in parallel computation}
+\item{ncores}{The number of cores to use in parallel computation.}
}
\value{
A list of length 2:
@@ -59,7 +60,7 @@ predyn <- Predictability(dim = attractor$dim, theta = attractor$theta)
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez,
D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large
scale atmospheric predictability. Nature Communications, 10(1), 1316.
-DOI = https://doi.org/10.1038/s41467-019-09305-8 "
+\doi{10.1038/s41467-019-09305-8}

Faranda, D., Gabriele Messori and Pascal Yiou. (2017).
Dynamical proxies of North Atlantic predictability and extremes.
diff --git a/man/ProxiesAttractor.Rd b/man/ProxiesAttractor.Rd
index 768ba7366482ed16390158a991bce22998cf0385..ffa1b36b18654fd83119cbfec3079abd6f0dde7d 100644
--- a/man/ProxiesAttractor.Rd
+++ b/man/ProxiesAttractor.Rd
@@ -7,9 +7,12 @@ ProxiesAttractor(data, quanti, ncores = NULL)
}
\arguments{
-\item{data}{a multidimensional array with named dimensions to create the attractor. It requires a temporal dimension named 'time' and spatial dimensions called 'lat' and 'lon', or 'latitude' and 'longitude' or 'grid'.}
+\item{data}{A multidimensional array with named dimensions to create the
+attractor. It requires a temporal dimension named 'time' and spatial
+dimensions called 'lat' and 'lon', or 'latitude' and 'longitude' or 'grid'.}

-\item{quanti}{a number lower than 1 indicating the quantile to perform the computation of local dimension and theta}
+\item{quanti}{A number lower than 1 indicating the quantile to perform the
+computation of local dimension and theta.}

\item{ncores}{The number of cores to use in parallel computation.}
}
@@ -20,7 +23,7 @@ dim and theta
This function computes two dynamical proxies of the attractor:
The local dimension (d) and the inverse of the persistence (theta).
These two parameters will be used as a condition for the computation of dynamical
-scores to measure predictability and to compute bias correction conditioned by
+scores to measure predictability and to compute bias correction conditioned by
Funtion based on the matlab code (davide.faranda@lsce.ipsl.fr) used in: } @@ -32,19 +35,14 @@ qm <- 0.90 # imposing a threshold Attractor <- ProxiesAttractor(data = mat, quanti = qm) # to plot the result time = c(1:length(Attractor$theta)) -layout(matrix(c(1, 3, 2, 3), 2, 2)) plot(time, Attractor$dim, xlab = 'time', ylab = 'd', main = 'local dimension', type = 'l') -plot(time, Attractor$theta, xlab = 'time', ylab = 'theta', main = 'theta') -plot(Attractor$dim, Attractor$theta, col = 'blue', - main = "Proxies of the Attractor", - xlab = "local dimension", ylab = "theta", lwd = 8, 'p') - } \references{ -Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). -The hammam effect or how a warm ocean enhances large scale atmospheric predictability. -Nature Communications, 10(1), 1316. DOI = https://doi.org/10.1038/s41467-019-09305-8 " +Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and +Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale +atmospheric predictability. Nature Communications, 10(1), 1316. +\doi{10.1038/s41467-019-09305-8}" Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. diff --git a/man/QuantileMapping.Rd b/man/QuantileMapping.Rd index c251f33bff4f824a25d1a5457c2f5cef4534be61..a1ff178de2c0170e6b2a675bf7a9747beb0b4ff2 100644 --- a/man/QuantileMapping.Rd +++ b/man/QuantileMapping.Rd @@ -33,16 +33,16 @@ cross-validation would be applied when exp_cor is not provided. 'sdate' by default.} \item{memb_dim}{A character string indicating the dimension name where -ensemble members are stored in the experimental arrays. 'member' by -default.} +ensemble members are stored in the experimental arrays. It can be NULL if +there is no ensemble member dimension. It is set as 'member' by default.} \item{window_dim}{A character string indicating the dimension name where samples have been stored. It can be NULL (default) in case all samples are used.} \item{method}{A character string indicating the method to be used: 'PTF', -'DIST','RQUANT','QUANT','SSPLIN'. By default, the empirical quantile mapping -'QUANT' is used.} +'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile +mapping 'QUANT' is used.} \item{na.rm}{A logical value indicating if missing values should be removed (FALSE by default).} @@ -73,20 +73,6 @@ dim(obs) <- c(dataset = 1, member = 1, sdate = 5, ftime = 4, lat = 3, lon = 2) res <- QuantileMapping(exp, obs) -# Use data in package -\donttest{ -exp <- lonlat_temp$exp$data[, , 1:4, , 1:2, 1:3] -dim(exp) <- c(dataset = 1, member = 15, sdate = 4, ftime = 3, - lat = 2, lon = 3) -obs <- lonlat_temp$obs$data[, , 1:4, , 1:2, 1:3] -dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, - lat = 2, lon = 3) -exp_cor <- lonlat_temp$exp$data[, 1, 5:6, , 1:2, 1:3] -dim(exp_cor) <- c(dataset = 1, member = 1, sdate = 2, ftime = 3, - lat = 2, lon = 3) -res <- QuantileMapping(exp, obs, exp_cor, window_dim = 'ftime') -} - } \seealso{ \code{\link[qmap]{fitQmap}} and \code{\link[qmap]{doQmap}} diff --git a/man/RFSlope.Rd b/man/RFSlope.Rd index 5c0c168955e9aef866a0e44c31b302e9e66756ca..ecbb52d115f7e9688085ab8d5a6e77b3ae3f9501 100644 --- a/man/RFSlope.Rd +++ b/man/RFSlope.Rd @@ -10,15 +10,15 @@ RFSlope( time_dim = NULL, lon_dim = "lon", lat_dim = "lat", - ncores = 1 + ncores = NULL ) } \arguments{ \item{data}{Array containing the spatial precipitation fields to downscale. 
-The input array is expected to have at least two dimensions named "lon" and "lat" by default
-(these default names can be changed with the \code{lon_dim} and \code{lat_dim} parameters)
-and one or more dimensions over which to average the slopes,
-which can be specified by parameter \code{time_dim}.}
+The input array is expected to have at least two dimensions named "lon" and
+"lat" by default (these default names can be changed with the \code{lon_dim}
+and \code{lat_dim} parameters) and one or more dimensions over which to
+average the slopes, which can be specified by parameter \code{time_dim}.}

\item{kmin}{First wavenumber for spectral slope (default \code{kmin=1}).}
@@ -33,7 +33,8 @@ with more than one element is chosen.}

\item{lat_dim}{Name of lat dimension ("lat" by default).}

-\item{ncores}{is an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.}
+\item{ncores}{An integer that indicates the number of cores for parallel
+computations using the multiApply function. The default value is NULL.}
}
\value{
RFSlope() returns spectral slopes using the RainFARM convention
@@ -51,19 +52,9 @@ to be used for RainFARM stochastic precipitation downscaling method.
# 3 starting dates and 20 ensemble members.
pr <- 1:(4*3*8*8*20)
dim(pr) <- c(ensemble = 4, sdate = 3, lon = 8, lat = 8, ftime = 20)
-
# Compute the spectral slopes ignoring the wavenumber
# corresponding to the largest scale (the box)
-slopes <- RFSlope(pr, kmin=2)
-dim(slopes)
-# ensemble sdate
-#        4     3
-slopes
-#          [,1]     [,2]     [,3]
-#[1,] 1.893503 1.893503 1.893503
-#[2,] 1.893503 1.893503 1.893503
-#[3,] 1.893503 1.893503 1.893503
-#[4,] 1.893503 1.893503 1.893503
+slopes <- RFSlope(pr, kmin = 2, time_dim = 'ftime')
}
\author{
Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
}
diff --git a/man/RFTemp.Rd b/man/RFTemp.Rd
index 106ae6e218f2037ac5987d16e49641ac90c044d8..957ccc918d42b5bbdc160b21d7b2cd3f86ad8b8f 100644
--- a/man/RFTemp.Rd
+++ b/man/RFTemp.Rd
@@ -26,57 +26,57 @@ RFTemp(
)
}
\arguments{
-\item{data}{Temperature array to downscale.
-The input array is expected to have at least two dimensions named
-"lon" and "lat" by default
-(these default names can be changed with the \code{lon_dim} and
-\code{lat_dim} parameters)}
+\item{data}{Temperature array to downscale. The input array is expected to
+have at least two dimensions named "lon" and "lat" by default (these default
+names can be changed with the \code{lon_dim} and \code{lat_dim} parameters).}

\item{lon}{Vector or array of longitudes.}

\item{lat}{Vector or array of latitudes.}

-\item{oro}{Array containing fine-scale orography (in m)
-The destination downscaling area must be contained in the orography field.}
+\item{oro}{Array containing fine-scale orography (in m).
The destination
+downscaling area must be contained in the orography field.}

\item{lonoro}{Vector or array of longitudes corresponding to the fine orography.}

\item{latoro}{Vector or array of latitudes corresponding to the fine orography.}

-\item{xlim}{vector with longitude bounds for downscaling;
-the full input field is downscaled if `xlim` and `ylim` are not specified.}
+\item{xlim}{Vector with longitude bounds for downscaling; the full input field
+is downscaled if `xlim` and `ylim` are not specified.}

-\item{ylim}{vector with latitude bounds for downscaling}
+\item{ylim}{Vector with latitude bounds for downscaling.}

-\item{lapse}{float with environmental lapse rate}
+\item{lapse}{Float with environmental lapse rate.}

-\item{lon_dim}{string with name of longitude dimension}
+\item{lon_dim}{String with name of longitude dimension.}

-\item{lat_dim}{string with name of latitude dimension}
+\item{lat_dim}{String with name of latitude dimension.}

-\item{time_dim}{a vector of character string indicating the name of temporal dimension. By default, it is set to NULL and it considers "ftime", "sdate" and "time" as temporal dimensions.}
+\item{time_dim}{A vector of character strings indicating the name of the
+temporal dimension. By default, it is set to NULL and it considers "ftime",
+"sdate" and "time" as temporal dimensions.}

-\item{nolapse}{logical, if true `oro` is interpreted as a
-fine-scale climatology and used directly for bias correction}
+\item{nolapse}{Logical, if true `oro` is interpreted as a fine-scale
+climatology and used directly for bias correction.}

-\item{verbose}{logical if to print diagnostic output}
+\item{verbose}{Logical, whether to print diagnostic output.}

-\item{compute_delta}{logical if true returns only a delta to be used for
+\item{compute_delta}{Logical, if true returns only a delta to be used for
out-of-sample forecasts.}

-\item{method}{string indicating the method used for interpolation:
+\item{method}{String indicating the method used for interpolation:
"nearest" (nearest neighbours followed by smoothing with a circular
uniform weights kernel), "bilinear" (bilinear interpolation).
The two methods provide similar results, but nearest is slightly better
provided that the fine-scale grid is correctly centered as a subdivision
-of the large-scale grid}
+of the large-scale grid.}

-\item{delta}{matrix containing a delta to be applied to the downscaled
+\item{delta}{Matrix containing a delta to be applied to the downscaled
input data. The grid of this matrix is supposed to be same as that of
-the required output field}
+the required output field.}
}
\value{
-CST_RFTemp() returns a downscaled CSTools object
+CST_RFTemp() returns a downscaled CSTools object.

RFTemp() returns a list containing the fine-scale longitudes, latitudes
and the downscaled fields.
@@ -99,15 +99,15 @@ dim(o) <- c(lat = 29, lon = 29)
lono <- seq(3, 10, 0.25)
lato <- seq(41, 48, 0.25)
res <- RFTemp(t, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
-              lapse = 6.5)
+              lapse = 6.5, time_dim = 'ftime')
}
\references{
Method described in ERA4CS MEDSCOPE milestone M3.2:
-High-quality climate prediction data available to WP4
-[https://www.medscope-project.eu/the-project/deliverables-reports/]([https://www.medscope-project.eu/the-project/deliverables-reports/)
+High-quality climate prediction data available to WP4 here:
+\url{https://www.medscope-project.eu/the-project/deliverables-reports/}
and in H2020 ECOPOTENTIAL Deliverable No.
8.1:
-High resolution (1-10 km) climate, land use and ocean change scenarios
-[https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf](https://www.ecopotential-project.eu/images/ecopotential/documents/D8.1.pdf)
+High resolution (1-10 km) climate, land use and ocean change scenarios here:
+\url{https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS}.
}
\author{
Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
}
diff --git a/man/RF_Weights.Rd b/man/RF_Weights.Rd
index 66e1ac5153c2a41605867e6cbbcd35c8b3120bd7..836106369843db7d917a3ca4426444524c1fb7f7 100644
--- a/man/RF_Weights.Rd
+++ b/man/RF_Weights.Rd
@@ -18,45 +18,55 @@ RF_Weights(
)
}
\arguments{
-\item{zclim}{a multi-dimensional array with named dimension containing at least one precipiation field with spatial dimensions.}
+\item{zclim}{A multi-dimensional array with named dimensions containing at
+least one precipitation field with spatial dimensions.}

-\item{latin}{a vector indicating the latitudinal coordinates corresponding to the \code{zclim} parameter.}
+\item{latin}{A vector indicating the latitudinal coordinates corresponding to
+the \code{zclim} parameter.}

-\item{lonin}{a vector indicating the longitudinal coordinates corresponding to the \code{zclim} parameter.}
+\item{lonin}{A vector indicating the longitudinal coordinates corresponding to
+the \code{zclim} parameter.}

-\item{nf}{Refinement factor for downscaling (the output resolution is increased by this factor).}
+\item{nf}{Refinement factor for downscaling (the output resolution is
+increased by this factor).}

-\item{lat}{Vector of latitudes.
-The number of longitudes and latitudes is expected to be even and the same. If not
-the function will perform a subsetting to ensure this condition.}
+\item{lat}{Vector of latitudes. The number of longitudes and latitudes is
+expected to be even and the same. If not, the function will perform a
+subsetting to ensure this condition.}

\item{lon}{Vector of longitudes.}

-\item{fsmooth}{Logical to use smooth conservation (default) or large-scale box-average conservation.}
+\item{fsmooth}{Logical to use smooth conservation (default) or large-scale
+box-average conservation.}

-\item{lonname}{a character string indicating the name of the longitudinal dimension set as 'lon' by default.}
+\item{lonname}{A character string indicating the name of the longitudinal
+dimension, set as 'lon' by default.}

-\item{latname}{a character string indicating the name of the latitudinal dimension set as 'lat' by default.}
+\item{latname}{A character string indicating the name of the latitudinal
+dimension, set as 'lat' by default.}

-\item{ncores}{an integer that indicates the number of cores for parallel computations using multiApply function. The default value is one.}
+\item{ncores}{An integer that indicates the number of cores for parallel
+computations using the multiApply function. The default value is one.}
}
\value{
-An object of class 's2dv_cube' containing in matrix \code{data} the weights with dimensions (lon, lat).
+An object of class 's2dv_cube' containing in matrix \code{data} the
+weights with dimensions (lon, lat).
}
\description{
-Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
+Compute climatological ("orographic") weights from a fine-scale
+precipitation climatology file.
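+A minimal sketch of a call, following the usage above, is
+\code{RF_Weights(zclim, latin, lonin, nf = 5, lat, lon)}, where \code{zclim}
+is the fine-scale climatology, \code{nf} the refinement factor and
+\code{lat}/\code{lon} the coordinates of the field to downscale; see the
+example below.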
}
\examples{
a <- array(1:2500, c(lat = 50, lon = 50))
res <- RF_Weights(a, seq(0.1, 5, 0.1), seq(0.1, 5, 0.1),
-                nf = 5, lat = 1:5, lon = 1:5)
+                  nf = 5, lat = 1:5, lon = 1:5)
}
\references{
Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). Stochastic
downscaling of precipitation in complex orography: A simple method to
reproduce a realistic fine-scale climatology. Natural Hazards and Earth
System Sciences, 18(11),
-2825-2840. http://doi.org/10.5194/nhess-18-2825-2018 .
+2825-2840. \doi{10.5194/nhess-18-2825-2018}.
}
\author{
Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
}
diff --git a/man/RainFARM.Rd b/man/RainFARM.Rd
index ef4485c94b1c290236206f97e2b0dcbc1ff06153..6a6e2739f1fb0d781686880d90f2206e68bd2c1b 100644
--- a/man/RainFARM.Rd
+++ b/man/RainFARM.Rd
@@ -24,66 +24,77 @@ RainFARM(
)
}
\arguments{
-\item{data}{Precipitation array to downscale.
-The input array is expected to have at least two dimensions named "lon" and "lat" by default
-(these default names can be changed with the \code{lon_dim} and \code{lat_dim} parameters)
-and one or more dimensions over which to average these slopes,
-which can be specified by parameter \code{time_dim}.
-The number of longitudes and latitudes in the input data is expected to be even and the same. If not
+\item{data}{Precipitation array to downscale. The input array is expected to
+have at least two dimensions named "lon" and "lat" by default (these default
+names can be changed with the \code{lon_dim} and \code{lat_dim} parameters)
+and one or more dimensions over which to average these slopes, which can be
+specified by parameter \code{time_dim}. The number of longitudes and
+latitudes in the input data is expected to be even and the same. If not,
the function will perform a subsetting to ensure this condition.}

\item{lon}{Vector or array of longitudes.}

\item{lat}{Vector or array of latitudes.}

-\item{nf}{Refinement factor for downscaling (the output resolution is increased by this factor).}
+\item{nf}{Refinement factor for downscaling (the output resolution is
+increased by this factor).}

-\item{weights}{multi-dimensional array with climatological weights which can be obtained using
-the \code{CST_RFWeights} function. If \code{weights=1.} (default) no weights are used.
-The names of these dimensions must be at least 'lon' and 'lat'.}
+\item{weights}{Multi-dimensional array with climatological weights which can
+be obtained using the \code{CST_RFWeights} function. If \code{weights = 1.}
+(default) no weights are used. The names of these dimensions must include at
+least the same longitudinal and latitudinal dimension names as the data.}

-\item{nens}{Number of ensemble members to produce (default: \code{nens=1}).}
+\item{nens}{Number of ensemble members to produce (default: \code{nens = 1}).}

-\item{slope}{Prescribed spectral slope. The default is \code{slope=0.}
-meaning that the slope is determined automatically over the dimensions specified by \code{time_dim}. A 1D array with named dimension can be provided (see details and examples)}
+\item{slope}{Prescribed spectral slope. The default is \code{slope = 0.}
+meaning that the slope is determined automatically over the dimensions
+specified by \code{time_dim}.
A 1D array with named dimension can be
+provided (see details and examples).}

-\item{kmin}{First wavenumber for spectral slope (default: \code{kmin=1}).}
+\item{kmin}{First wavenumber for spectral slope (default: \code{kmin = 1}).}

-\item{fglob}{Logical to conseve global precipitation over the domain (default: FALSE)}
+\item{fglob}{Logical to conserve global precipitation over the domain
+(default: FALSE).}

-\item{fsmooth}{Logical to conserve precipitation with a smoothing kernel (default: TRUE)}
+\item{fsmooth}{Logical to conserve precipitation with a smoothing kernel
+(default: TRUE).}

-\item{nprocs}{The number of parallel processes to spawn for the use for parallel computation in multiple cores. (default: 1)}
+\item{nprocs}{The number of parallel processes to spawn for parallel
+computation on multiple cores (default: 1).}

\item{time_dim}{String or character array with name(s) of time dimension(s)
(e.g. "ftime", "sdate", "time" ...) over which to compute spectral slopes.
If a character array of dimension names is provided, the spectral slopes
will be computed over all elements belonging to those dimensions.
-If omitted one of c("ftime", "sdate", "time")
-is searched and the first one with more than one element is chosen.}
+If omitted one of c("ftime", "sdate", "time") is searched and the first one
+with more than one element is chosen.}

\item{lon_dim}{Name of lon dimension ("lon" by default).}

\item{lat_dim}{Name of lat dimension ("lat" by default).}

-\item{drop_realization_dim}{Logical to remove the "realization" stochastic ensemble dimension (default: FALSE)
- with the following behaviour if set to TRUE:
-
-1) if \code{nens==1}: the dimension is dropped;
-
-2) if \code{nens>1} and a "member" dimension exists:
-   the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member";
-
-3) if \code{nens>1} and a "member" dimension does not exist: the "realization" dimension is renamed to "member".}
+\item{drop_realization_dim}{Logical to remove the "realization" stochastic
+ensemble dimension (default: FALSE) with the following behaviour if set to
+TRUE:
+\enumerate{
+  \item{if \code{nens == 1}: the dimension is dropped;}
+  \item{if \code{nens > 1} and a "member" dimension exists: the "realization"
+  and "member" dimensions are compacted (multiplied) and the resulting
+  dimension is named "member";}
+  \item{if \code{nens > 1} and a "member" dimension does not exist: the
+  "realization" dimension is renamed to "member".}
+}}

\item{verbose}{logical for verbose output (default: FALSE).}
}
\value{
-RainFARM() returns a list containing the fine-scale longitudes, latitudes
-and the sequence of \code{nens} downscaled fields.
-If \code{nens>1} an additional dimension named "realization" is added to the output array
-after the "member" dimension (if it exists and unless \code{drop_realization_dim=TRUE} is specified).
-The ordering of the remaining dimensions in the \code{exp} element of the input object is maintained.
+RainFARM() returns a list containing the fine-scale longitudes,
+  latitudes and the sequence of \code{nens} downscaled fields. If
+  \code{nens > 1} an additional dimension named "realization" is added to the
+  output array after the "member" dimension (if it exists and unless
+  \code{drop_realization_dim = TRUE} is specified). The ordering of the
+  remaining dimensions in the \code{exp} element of the input object is
+  maintained.
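+  For instance, with \code{nf = 8} and \code{nens = 3}, an input defined on
+  8 x 8 longitudes and latitudes (as in the example below) yields downscaled
+  fields with 64 longitudes and latitudes and a "realization" dimension of
+  length 3.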
}
\description{
This function implements the RainFARM stochastic precipitation downscaling method
@@ -92,47 +103,27 @@ and one or more dimension (such as "ftime", "sdate" or "time")
over which to average automatically determined spectral slopes.
Adapted for climate downscaling and including orographic correction.
References:
-Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. http://doi.org/10.5194/nhess-18-2825-2018,
-D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al. (2006), JHM 7, 724.
+Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. \doi{10.5194/nhess-18-2825-2018},
+D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al.
+(2006), JHM 7, 724.
}
\details{
-Wether parameter 'slope' and 'weights' presents seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See example 2 below where weights and slope vary with 'sdate' dimension.
+If parameters 'slope' and 'weights' present a seasonality
+dependency, a dimension name should match between these parameters and the
+input data in parameter 'data' (e.g. 'weights' and 'slope' varying with the
+'sdate' dimension).
}
\examples{
# Example for the 'reduced' RainFARM function
nf <- 8   # Choose a downscaling by factor 8
-nens <- 3 # Number of ensemble members
-# create a test array with dimension 8x8 and 20 timesteps
-# or provide your own read from a netcdf file
-pr <- rnorm(8 * 8 * 20)
-dim(pr) <- c(lon = 8, lat = 8, ftime = 20)
-lon_mat <- seq(10, 13.5, 0.5) # could also be a 2d matrix
-lat_mat <- seq(40, 43.5, 0.5)
+exp <- 1 : (2 * 3 * 4 * 8 * 8)
+dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
+lon <- seq(10, 13.5, 0.5)
+lat <- seq(40, 43.5, 0.5)
# Create a test array of weights
ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
-# or create proper weights using an external fine-scale climatology file
-# Specify a weightsfn filename if you wish to save the weights
-\dontrun{
-ww <- CST_RFWeights("./worldclim.nc", nf, lon = lon_mat, lat = lat_mat,
-                    fsmooth = TRUE)
-}
-# downscale using weights (ww=1. means do not use weights)
-res <- RainFARM(pr, lon_mat, lat_mat, nf,
-                fsmooth = TRUE, fglob = FALSE,
-                weights = ww, nens = 2, verbose = TRUE)
-str(res)
-#List of 3
-# $ data: num [1:3, 1:20, 1:64, 1:64] 0.186 0.212 0.138 3.748 0.679 ...
-# $ lon : num [1:64] 9.78 9.84 9.91 9.97 10.03 ...
-# $ lat : num [1:64] 39.8 39.8 39.9 40 40 ...
-dim(res$data)
-# lon lat ftime realization
-# 64 64 20 2
-# Example 2:
-slo <- array(c(0.1, 0.5, 0.7), c(sdate= 3))
-wei <- array(rnorm(8*8*3), c(lon = 8, lat = 8, sdate = 3))
-res <- RainFARM(lonlat_prec$data, lon = lonlat_prec$lon,
-                lat = lonlat_prec$lat, weights = wei, slope = slo, nf = 2)
+res <- RainFARM(data = exp, lon = lon, lat = lat, nf = nf,
+                weights = ww, nens = 3, time_dim = 'ftime')
}
\author{
Jost von Hardenberg - ISAC-CNR, \email{j.vonhardenberg@isac.cnr.it}
}
diff --git a/man/RegimesAssign.Rd b/man/RegimesAssign.Rd
index f66cfed15cc81928e0bb3209bbc3794e39b099dd..797367c8edcbed5933f8f2f12fd04b16ac5e2c97 100644
--- a/man/RegimesAssign.Rd
+++ b/man/RegimesAssign.Rd
@@ -16,45 +16,62 @@ RegimesAssign(
)
}
\arguments{
-\item{data}{an array containing anomalies with named dimensions: dataset, member, sdate, ftime, lat and lon.}
+\item{data}{An array containing anomalies with named dimensions: dataset,
+member, sdate, ftime, lat and lon.}

-\item{ref_maps}{array with 3-dimensions ('lon', 'lat', 'cluster') containing the maps/clusters that will be used as a reference for the matching.}
+\item{ref_maps}{Array with 3-dimensions ('lon', 'lat', 'cluster') containing
+the maps/clusters that will be used as a reference for the matching.}

-\item{lat}{a vector of latitudes corresponding to the positions provided in data and ref_maps.}
+\item{lat}{A vector of latitudes corresponding to the positions provided in
+data and ref_maps.}

-\item{method}{whether the matching will be performed in terms of minimum distance (default = 'distance') or
-the maximum spatial correlation (method = 'ACC') between the maps.}
+\item{method}{Whether the matching will be performed in terms of minimum
+distance (default = 'distance') or the maximum spatial correlation
+(method = 'ACC') between the maps.}

-\item{composite}{a logical parameter indicating if the composite maps are computed or not (default = FALSE).}
+\item{composite}{A logical parameter indicating if the composite maps are
+computed or not (default = FALSE).}

-\item{memb}{a logical value indicating whether to compute composites for separate members (default FALSE) or as unique ensemble (TRUE).
-This option is only available for when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'.}
+\item{memb}{A logical value indicating whether to compute composites for
+separate members (default FALSE) or as unique ensemble (TRUE). This option
+is only available when parameter 'composite' is set to TRUE and the data
+object has a dimension named 'member'.}

-\item{ncores}{the number of multicore threads to use for parallel computation.}
+\item{ncores}{The number of multicore threads to use for parallel computation.}
}
\value{
-A list with elements \code{$composite} (3-d array (lon, lat, k) containing the composites k=1,..,K for case (*1)
- \code{$pvalue} ( array with the same structure as \code{$composite} containing the pvalue of the composites obtained through a t-test
- that accounts for the serial dependence of the data with the same structure as Composite.) (only if composite='TRUE'),
- \code{$cluster} (array with the same dimensions as data (except latitude and longitude which are removed) indicating the ref_maps to which each point is allocated.) ,
- \code{$frequency} (A vector of integers (from k = 1, ...
k n reference maps) indicating the percentage of assignations corresponding to each map.),
+A list with elements \code{$composite} (3-d array (lon, lat, k)
+containing the composites k = 1, ..., K for case (*1) or only k = 1 for any
+specific cluster, i.e., case (*2)) (only if composite = 'TRUE'), \code{$pvalue}
+(array with the same structure as \code{$composite} containing the pvalue of
+the composites obtained through a t-test that accounts for the serial
+dependence of the data with the same structure as Composite.) (only if
+composite = 'TRUE'), \code{$cluster} (array with the same dimensions as data
+(except latitude and longitude which are removed) indicating the ref_maps to
+which each point is allocated.), \code{$frequency} (A vector of integers (from
+k = 1 to the number of reference maps) indicating the percentage of
+assignations corresponding to each map.),
}
\description{
-This function performs the matching between a field of anomalies and a set
-of maps which will be used as a reference. The anomalies will be assigned to the reference map
-for which the minimum Eucledian distance (method = 'distance') or highest spatial correlation
-(method = 'ACC') is obtained.
+This function performs the matching between a field of anomalies
+and a set of maps which will be used as a reference. The anomalies will be
+assigned to the reference map for which the minimum Euclidean distance
+(method = 'distance') or highest spatial correlation (method = 'ACC') is
+obtained.
}
\examples{
-\dontrun{
-regimes <- WeatherRegime(data = lonlat_temp$obs$data, lat = lonlat_temp$obs$lat,
+data <- array(abs(rnorm(1280, 282.7, 6.4)), dim = c(dataset = 2, member = 2,
+                                                    sdate = 3, ftime = 3,
+                                                    lat = 4, lon = 4))
+regimes <- WeatherRegime(data = data, lat = seq(47, 44),
                        EOFs = FALSE, ncenters = 4)$composite
-res1 <- RegimesAssign(data = lonlat_temp$exp$data, ref_maps = drop(regimes),
-                      lat = lonlat_temp$exp$lat, composite = FALSE)
-}
+res1 <- RegimesAssign(data = data, ref_maps = drop(regimes),
+                      lat = seq(47, 44), composite = FALSE)
}
\references{
-Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/}
+Torralba, V. (2019) Seasonal climate prediction for the wind
+energy sector: methods and tools for the development of a climate service.
+Thesis.
Available online: \url{https://eprints.ucm.es/56841/}
}
\author{
Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
}
diff --git a/man/SaveExp.Rd b/man/SaveExp.Rd
index f1ace22c6e7b3a200883686b1531b5923e766ecb..d0418131c288acd2626fb90b3b76c58cbca8a0ec 100644
--- a/man/SaveExp.Rd
+++ b/man/SaveExp.Rd
@@ -2,74 +2,127 @@
% Please edit documentation in R/CST_SaveExp.R
\name{SaveExp}
\alias{SaveExp}
-\title{Save an experiment in a format compatible with CST_Load}
+\title{Save a multidimensional array with metadata to data in NetCDF format}
\usage{
SaveExp(
  data,
-  lon,
-  lat,
-  Dataset,
-  var_name,
-  units,
-  startdates,
-  Dates,
-  cdo_grid_name,
-  projection,
-  destination,
+  destination = "./",
+  Dates = NULL,
+  coords = NULL,
+  varname = NULL,
+  metadata = NULL,
+  Datasets = NULL,
+  startdates = NULL,
+  dat_dim = "dataset",
+  sdate_dim = "sdate",
+  ftime_dim = "time",
+  var_dim = "var",
+  memb_dim = "member",
+  single_file = FALSE,
  extra_string = NULL
)
}
\arguments{
-\item{data}{an multi-dimensional array with named dimensions (longitude, latitude, time, member, sdate)}
+\item{data}{A multi-dimensional array with named dimensions.}

-\item{lon}{vector of logitud corresponding to the longitudinal dimension in data}
+\item{destination}{A character string indicating the path where to store the
+NetCDF files.}

-\item{lat}{vector of latitud corresponding to the latitudinal dimension in data}
+\item{Dates}{A named array of dates with the corresponding sdate and forecast
+time dimensions.}

-\item{Dataset}{a vector of character string indicating the names of the datasets}
+\item{coords}{A named list with elements of the coordinates corresponding to
+the dimensions of the data parameter. The names and length of each element
+must correspond to the names of the dimensions. If any coordinate is not
+provided, it is set as an index vector with the values from 1 to the length
+of the corresponding dimension.}

-\item{var_name}{a character string indicating the name of the variable to be saved}
+\item{varname}{A character string indicating the name of the variable to be
+saved.}

-\item{units}{a character string indicating the units of the variable}
+\item{metadata}{A named list where each element is a variable containing the
+corresponding information. The information must be contained in a list of
+lists for each variable.}

-\item{startdates}{a vector of dates indicating the initialization date of each simulations}
+\item{Datasets}{A vector of character strings indicating the names of the
+datasets.}

-\item{Dates}{a matrix of dates with two dimension 'time' and 'sdate'.}
+\item{startdates}{A vector of dates indicating the initialization date of each
+simulation.}

-\item{cdo_grid_name}{a character string indicating the name of the grid e.g.: 'r360x181'}
+\item{dat_dim}{A character string indicating the name of dataset dimension.
+By default, it is set to 'dataset'. It can be NULL if there is no dataset
+dimension.}

-\item{projection}{a character string indicating the projection name}
+\item{sdate_dim}{A character string indicating the name of the start date
+dimension. By default, it is set to 'sdate'. It can be NULL if there is no
+start date dimension.}

-\item{destination}{a character string indicating the path where to store the NetCDF files}
+\item{ftime_dim}{A character string indicating the name of the forecast time
+dimension. By default, it is set to 'time'.
It can be NULL if there is no
+forecast time dimension.}

-\item{extra_string}{a character string to be include as part of the file name, for instance, to identify member or realization.}
+\item{var_dim}{A character string indicating the name of variable dimension.
+By default, it is set to 'var'. It can be NULL if there is no variable
+dimension.}
+
+\item{memb_dim}{A character string indicating the name of the member dimension.
+By default, it is set to 'member'. It can be NULL if there is no member
+dimension.}
+
+\item{single_file}{A logical value indicating whether the whole object is
+saved in a single file (TRUE) or in separate directories (FALSE). When it is
+FALSE, the array is split by dataset, variable and start date. It is FALSE
+by default.}
+
+\item{extra_string}{A character string to be included as part of the file name,
+for instance, to identify member or realization. It would be added to the
+file name between underscore characters.}
}
\value{
-the function creates as many files as sdates per dataset. Each file could contain multiple members. It would be added to the file name between underscore characters.
-The path will be created with the name of the variable and each Datasets.
+Multiple or single NetCDF files containing the data array.\cr
+\item{\code{single_file = TRUE}}{
+  All data is saved in a single file located in the specified destination
+  path with the following name:
+  ___.nc. Multiple
+  variables are saved separately in the same file. The forecast time unit
+  is extracted from the frequency of the time steps (hours, days, months).
+  The first value of forecast time is 1. If no frequency is found, the units
+  will be 'hours since' each start date and the time steps are assumed to be
+  equally spaced.
+}
+\item{\code{single_file = FALSE}}{
+  The data array is subset and stored into multiple files. Each file
+  contains the data subset for each start date, variable and dataset. Files
+  with different variables and Datasets are stored in separate directories
+  within the following directory tree: destination/Dataset/variable/.
+  The name of each file will be:
+  __.nc.
+}
}
\description{
-This function is created for compatibility with CST_Load/Load for saving post-processed datasets such as those calibrated of downscaled with CSTools functions
+This function allows one to save a data array with metadata into a
+NetCDF file, so that the saved data can be reloaded using the \code{Start}
+function from the startR package. If the original 's2dv_cube' object has been
+created from \code{CST_Load()}, then it can be reloaded with \code{Load()}.
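+As a sketch of the round trip, a file saved with \code{single_file = TRUE}
+under \code{destination} (as in the example below) could afterwards be read
+back with a \code{startR::Start()} call pointing to that path; the exact call
+depends on the dimension and variable names of the saved data.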
}
\examples{
\dontrun{
data <- lonlat_temp$exp$data
-lon <- lonlat_temp$exp$lon
-lat <- lonlat_temp$exp$lat
-Dataset <- 'XXX'
-var_name <- 'tas'
-units <- 'k'
-startdates <- lapply(1:length(lonlat_temp$exp$Datasets),
-                     function(x) {
-                         lonlat_temp$exp$Datasets[[x]]$InitializationDates[[1]]})[[1]]
-Dates <- lonlat_temp$exp$Dates$start
-dim(Dates) <- c(time = length(Dates)/length(startdates), sdate = length(startdates))
-cdo_grid_name = attr(lonlat_temp$exp$lon, 'cdo_grid_name')
-projection = attr(lonlat_temp$exp$lon, 'projection')
-destination = './path/'
-SaveExp(data, lon, lat, Dataset, var_name, units, startdates, Dates,
-        cdo_grid_name, projection, destination)
+lon <- lonlat_temp$exp$coords$lon
+lat <- lonlat_temp$exp$coords$lat
+coords <- list(lon = lon, lat = lat)
+Datasets <- lonlat_temp$exp$attrs$Datasets
+varname <- 'tas'
+Dates <- lonlat_temp$exp$attrs$Dates
+destination = './'
+metadata <- lonlat_temp$exp$attrs$Variable$metadata
+SaveExp(data = data, destination = destination, coords = coords,
+        Datasets = Datasets, varname = varname, Dates = Dates,
+        metadata = metadata, single_file = TRUE, ftime_dim = 'ftime',
+        var_dim = NULL)
}
+
}
\author{
Perez-Zanon Nuria, \email{nuria.perez@bsc.es}
diff --git a/man/SplitDim.Rd b/man/SplitDim.Rd
index a49043062c8e2dea743b9f391f53f9929bd41530..a0dc8bc69bc65b0e729491a0e27c132f2854ee76 100644
--- a/man/SplitDim.Rd
+++ b/man/SplitDim.Rd
@@ -13,21 +13,29 @@ SplitDim(
)
}
\arguments{
-\item{data}{an n-dimensional array with named dimensions}
+\item{data}{An n-dimensional array with named dimensions.}

-\item{split_dim}{a character string indicating the name of the dimension to split}
+\item{split_dim}{A character string indicating the name of the dimension to
+split.}

-\item{indices}{a vector of numeric indices or dates}
+\item{indices}{A vector of numeric indices or dates.}

-\item{freq}{a character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independetly of the year they belong to, while 'monthly' differenciates months from different years. Parameter 'freq' can also be numeric indicating the length in which to subset the dimension.}
+\item{freq}{A character string indicating the frequency: by 'day', 'month' and
+'year' or 'monthly' (by default). 'month' identifies months between 1 and 12
+independently of the year they belong to, while 'monthly' differentiates
+months from different years. Parameter 'freq' can also be numeric indicating
+the length in which to subset the dimension.}

-\item{new_dim_name}{a character string indicating the name of the new dimension.}
+\item{new_dim_name}{A character string indicating the name of the new
+dimension.}
}
\description{
-This function split a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension or dates and the frequency expected (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division.
+This function splits a dimension in two. The user can select the
+dimension to split and provide indices indicating how to split that dimension
+or dates and the frequency expected (monthly or by day, month and year). The
+user can also provide a numeric frequency indicating the length of each
+division.
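+For instance, a 'time' dimension of length 10 can be halved with
+\code{SplitDim(data, split_dim = 'time', indices = c(rep(1, 5), rep(2, 5)))},
+as in the example below.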
}
\examples{
-
data <- 1 : 20
dim(data) <- c(time = 10, lat = 2)
indices <- c(rep(1,5), rep(2,5))
diff --git a/man/WeatherRegimes.Rd b/man/WeatherRegimes.Rd
index 2f59b191133aede7e7da14153a942743e56907b4..8164d705140c309a347bc41917eea696dc08105e 100644
--- a/man/WeatherRegimes.Rd
+++ b/man/WeatherRegimes.Rd
@@ -19,60 +19,80 @@ WeatherRegime(
)
}
\arguments{
-\item{data}{an array containing anomalies with named dimensions with at least start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude 'lon'.}
+\item{data}{An array containing anomalies with named dimensions with at least
+start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude
+'lon'.}

-\item{ncenters}{Number of clusters to be calculated with the clustering function.}
+\item{ncenters}{Number of clusters to be calculated with the clustering
+function.}

-\item{EOFs}{Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to filter the data.}
+\item{EOFs}{Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to
+filter the data.}

-\item{neofs}{number of modes to be kept only if EOFs = TRUE has been selected. (default = 30).}
+\item{neofs}{Number of modes to be kept only if EOFs = TRUE has been selected
+(default = 30).}

-\item{varThreshold}{Value with the percentage of variance to be explained by the PCs.
-Only sufficient PCs to explain this much variance will be used in the clustering.}
+\item{varThreshold}{Value with the percentage of variance to be explained by
+the PCs. Only sufficient PCs to explain this much variance will be used in
+the clustering.}

\item{lon}{Vector of longitudes.}

\item{lat}{Vector of latitudes.}

-\item{method}{Different options to estimate the clusters. The most traditional approach is the k-means analysis (default=’kmeans’)
-but the function also support the different methods included in the hclust . These methods are:
-"ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
-For more details about these methods see the hclust function documentation included in the stats package.}
+\item{method}{Different options to estimate the clusters. The most traditional
+approach is the k-means analysis (default = 'kmeans') but the function also
+supports the different methods included in hclust. These methods are:
+"ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty"
+(= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
For more details
+about these methods see the hclust function documentation included in the
+stats package.}

-\item{iter.max}{Parameter to select the maximum number of iterations allowed (Only if method='kmeans' is selected).}
+\item{iter.max}{Parameter to select the maximum number of iterations allowed
+(Only if method = 'kmeans' is selected).}

-\item{nstart}{Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected).}
+\item{nstart}{Parameter for the cluster analysis determining how many random
+sets to choose (Only if method = 'kmeans' is selected).}

\item{ncores}{The number of multicore threads to use for parallel computation.}
}
\value{
-A list with elements \code{$composite} (array with at least 3-d ('lat', 'lon', 'cluster') containing the composites k=1,..,K for case (*1)
- \code{pvalue} (array with at least 3-d ('lat','lon','cluster') with the pvalue of the composites obtained through a t-test that accounts for the serial
- \code{cluster} (A matrix or vector with integers (from 1:k) indicating the cluster to which each time step is allocated.),
- \code{persistence} (Percentage of days in a month/season before a cluster is replaced for a new one (only if method=’kmeans’ has been selected.)),
- \code{frequency} (Percentage of days in a month/season belonging to each cluster (only if method=’kmeans’ has been selected).),
+A list with elements \code{$composite} (array with at least 3-d ('lat',
+'lon', 'cluster') containing the composites k = 1, ..., K for case (*1) or only
+k = 1 for any specific cluster, i.e., case (*2)), \code{pvalue} (array with at
+least 3-d ('lat','lon','cluster') with the pvalue of the composites obtained
+through a t-test that accounts for the serial dependence of the data with the
+same structure as Composite.), \code{cluster} (A matrix or vector with integers
+(from 1:k) indicating the cluster to which each time step is allocated.),
+\code{persistence} (Percentage of days in a month/season before a cluster is
+replaced for a new one (only if method = 'kmeans' has been selected).),
+\code{frequency} (Percentage of days in a month/season belonging to each
+cluster (only if method = 'kmeans' has been selected).),
}
\description{
This function computes the weather regimes from a cluster analysis.
-It can be applied over the dataset with dimensions
-c(year/month, month/day, lon, lat), or by using PCs obtained from the application of the
-EOFs analysis to filter the dataset.
-The cluster analysis can be performed with the traditional k-means or those methods
-included in the hclust (stats package).
+It can be applied over the dataset with dimensions c(year/month, month/day,
+lon, lat), or by using PCs obtained from the application of the EOFs analysis
+to filter the dataset. The cluster analysis can be performed with the
+traditional k-means or those methods included in the hclust (stats package).
}
\examples{
-\dontrun{
-res <- WeatherRegime(data = lonlat_temp$obs$data, lat = lonlat_temp$obs$lat,
+data <- array(abs(rnorm(1280, 283.7, 6)), dim = c(dataset = 2, member = 2,
+                                                  sdate = 3, ftime = 3,
+                                                  lat = 4, lon = 4))
+lat <- seq(47, 44)
+res <- WeatherRegime(data = data, lat = lat,
                    EOFs = FALSE, ncenters = 4)
}
-}
\references{
-Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019).
-Characterization of European wind speed variability using weather regimes. Climate Dynamics,53,
-4961–4976, doi:10.1007/s00382-019-04839-5.
+Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and
+F.J., Doblas-Reyes (2019). Characterization of European wind speed variability
+using weather regimes. Climate Dynamics, 53, 4961–4976,
+\doi{10.1007/s00382-019-04839-5}.

-Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools
-for the development of a climate service. Thesis. Available online: \url{https://eprints.ucm.es/56841/}
+Torralba, V. (2019) Seasonal climate prediction for the wind
+energy sector: methods and tools for the development of a climate service.
+Thesis. Available online: \url{https://eprints.ucm.es/56841/}
}
\author{
Verónica Torralba - BSC, \email{veronica.torralba@bsc.es}
}
diff --git a/man/as.s2dv_cube.Rd b/man/as.s2dv_cube.Rd
index 23a761e60ca53e01070bd39c7814bc152fd1409b..e80044f5dcd8455e91a1527b8ddbe57934df90e1 100644
--- a/man/as.s2dv_cube.Rd
+++ b/man/as.s2dv_cube.Rd
@@ -4,44 +4,92 @@
\alias{as.s2dv_cube}
\title{Conversion of 'startR_array' or 'list' objects to 's2dv_cube'}
\usage{
-as.s2dv_cube(object)
+as.s2dv_cube(object, remove_attrs_coords = FALSE, remove_null = FALSE)
}
\arguments{
-\item{object}{an object of class 'startR_array' generated from function \code{Start} from startR package (version 0.1.3 from earth.bsc.es/gitlab/es/startR) or a list output from function \code{Load} from s2dv package.}
+\item{object}{An object of class 'startR_array' generated from function
+\code{Start} from startR package or a list output from function \code{Load}
+from s2dv package. Any other object class will not be accepted.}
+
+\item{remove_attrs_coords}{A logical value indicating whether to remove the
+attributes of the coordinates (TRUE) or not (FALSE). The default value is
+FALSE.}
+
+\item{remove_null}{Optional. A logical value indicating whether to remove from
+the output object the elements that are NULL (TRUE) or not (FALSE). It is
+only used when the object is an output from function \code{Load}. The
+default value is FALSE.}
}
\value{
-The function returns a 's2dv_cube' object to be easily used with functions \code{CST} from CSTools package.
+The function returns an 's2dv_cube' object to be easily used with
+functions with the prefix \code{CST} from CSTools and CSIndicators packages.
+The object is mainly a list with the following elements:\cr
+\itemize{
+  \item{'data', array with named dimensions.}
+  \item{'dims', named vector of the data dimensions.}
+  \item{'coords', named list with elements of the coordinates corresponding to
+  the dimensions of the data parameter. If any coordinate is not provided, it
+  is set as an index vector with the values from 1 to the length of the
+  corresponding dimension. The attribute 'indices' indicates whether the
+  coordinate is an index vector (TRUE) or not (FALSE).}
+  \item{'attrs', named list with elements:
+  \itemize{
+    \item{'Dates', array with named temporal dimensions of class 'POSIXct'
+    from time values in the data.}
+    \item{'Variable', has the following components:
+    \itemize{
+      \item{'varName', character vector of the short variable name. It is
+      usually specified in the parameter 'var' from the functions
+      Start and Load.}
+      \item{'metadata', named list of elements with variable metadata.
+      They can be from coordinates variables (e.g. longitude) or
+      main variables (e.g.
'var').} + } + } + \item{'Datasets', character strings indicating the names of the + datasets.} + \item{'source_files', a vector of character strings with complete paths + to all the found files involved in loading the data.} + \item{'when', a time stamp of the date issued by the Start() or Load() + call to obtain the data.} + \item{'load_parameters', the components used in the + arguments to load the data with the Start() or Load() functions.} + } + } +} } \description{ -This function converts data loaded using startR package or s2dv Load function into a 's2dv_cube' object. +This function converts data loaded using the Start function from the startR +package, or the Load function from s2dv, into an 's2dv_cube' object. } \examples{ \dontrun{ +# Example 1: convert an object from startR::Start function to 's2dv_cube' library(startR) repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc' data <- Start(dat = repos, var = 'tas', sdate = c('20170101', '20180101'), - ensemble = indices(1:20), + ensemble = indices(1:5), time = 'all', - latitude = 'all', - longitude = indices(1:40), + latitude = indices(1:5), + longitude = indices(1:5), return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'), retrieve = TRUE) data <- as.s2dv_cube(data) -class(data) +# Example 2: convert an object from s2dv::Load function to 's2dv_cube' startDates <- c('20001101', '20011101', '20021101', - '20031101', '20041101', '20051101') + '20031101', '20041101', '20051101') data <- Load(var = 'tas', exp = 'system5c3s', - nmember = 15, sdates = startDates, - leadtimemax = 3, latmin = 27, latmax = 48, - lonmin = -12, lonmax = 40, output = 'lonlat') + nmember = 2, sdates = startDates, + leadtimemax = 3, latmin = 10, latmax = 30, + lonmin = -10, lonmax = 10, output = 'lonlat') data <- as.s2dv_cube(data) -class(data) } } \seealso{ -\code{\link{s2dv_cube}}, \code{\link[s2dv]{Load}}, \code{\link[startR]{Start}} and \code{\link{CST_Load}} +\code{\link{s2dv_cube}}, \code{\link[s2dv]{Load}}, +\code{\link[startR]{Start}} and \code{\link{CST_Load}} } \author{ Perez-Zanon Nuria, \email{nuria.perez@bsc.es} diff --git a/man/s2dv_cube.Rd b/man/s2dv_cube.Rd index 327f3d17c0d95c2e10fe88281dff15dd0ced2c5d..ff302ccd93252143b387eb102991c3d1edb29caa 100644 --- a/man/s2dv_cube.Rd +++ b/man/s2dv_cube.Rd @@ -6,60 +6,88 @@ \usage{ s2dv_cube( data, - lon = NULL, - lat = NULL, - Variable = NULL, + coords = NULL, + varName = NULL, + metadata = NULL, Datasets = NULL, Dates = NULL, - time_dims = NULL, when = NULL, - source_files = NULL + source_files = NULL, + ... ) } \arguments{ -\item{data}{an array with any number of named dimensions, typically an object -output from CST_Load, with the following dimensions: dataset, member, sdate, -ftime, lat and lon.} +\item{data}{A multidimensional array with named dimensions, typically with +dimensions: dataset, member, sdate, ftime, lat and lon.} -\item{lon}{an array with one dimension containing the longitudes and -attributes: dim, cdo_grid_name, data_across_gw, array_across_gw, first_lon, -last_lon and projection.} +\item{coords}{A named list with elements of the coordinates corresponding to +the dimensions of the data parameter. The names and length of each element +must correspond to the names of the dimensions. 
If any coordinate is not +provided, it is set as an index vector with the values from 1 to the length +of the corresponding dimension.} -\item{lat}{an array with one dimension containing the latitudes and -attributes: dim, cdo_grid_name, first_lat, last_lat and projection.} +\item{varName}{A character string indicating the abbreviation of the variable +name.} -\item{Variable}{a list of two elements: \code{varName} a character string -indicating the abbreviation of a variable name and \code{level} a character -string indicating the level (e.g., "2m"), if it is not required it could be -set as NULL.} +\item{metadata}{A named list where each element is a variable containing the +corresponding information. The information can be contained in a list of +lists for each variable.} -\item{Datasets}{a named list with the dataset model with two elements: -\code{InitiatlizationDates}, containing a list of the start dates for each -member named with the names of each member, and \code{Members} containing a -vector with the member names (e.g., "Member_1")} +\item{Datasets}{Character strings indicating the names of the dataset. If +there are multiple datasets, it can be a vector of their names or a list of +lists with additional information.} -\item{Dates}{a named list of one to two elements: The first element, -\code{start}, is an array of dimensions (sdate, time) with the POSIX initial -date of each forecast time of each starting date. The second element, -\code{end} (optional), is an array of dimensions (sdate, time) with the POSIX} +\item{Dates}{A 'POSIXct' array with temporal dimensions containing the dates.} -\item{time_dims}{a vector of strings containing the names of the temporal -dimensions found in \code{data}.} +\item{when}{A time stamp of the date when the data has been loaded. This +parameter is also found in the output of the Load() and Start() functions.} -\item{when}{a time stamp of the date issued by the Load() call to obtain the -data.} +\item{source_files}{A vector of character strings with complete paths to all +the found files involved in loading the data.} -\item{source_files}{a vector of character strings with complete paths to all -the found files involved in the Load() call.} +\item{\dots}{Additional elements to be added to the object. They will be +stored at the end of the 'attrs' element. Multiple elements are accepted.} } \value{ -The function returns an object of class 's2dv_cube'. +The function returns an object of class 's2dv_cube' with the following +elements in the structure:\cr +\itemize{ + \item{'data', array with named dimensions.} + \item{'dims', named vector of the data dimensions.} + \item{'coords', named list with elements of the coordinates corresponding to + the dimensions of the data parameter. If any coordinate is not provided, it + is set as an index vector with the values from 1 to the length of the + corresponding dimension. The attribute 'indices' indicates whether the + coordinate is an index vector (TRUE) or not (FALSE).} + \item{'attrs', named list with elements: + \itemize{ + \item{'Dates', array with named temporal dimensions of class 'POSIXct' from + time values in the data.} + \item{'Variable', has the following components: + \itemize{ + \item{'varName', with the short name of the loaded variable as specified + in the parameter 'var'.} + \item{'metadata', named list of elements with variable metadata. + They can be from coordinate variables (e.g. longitude) or + main variables (e.g. 
'var').} + } + } + \item{'Datasets', character strings indicating the names of the datasets.} + \item{'source_files', a vector of character strings with complete paths to + all the found files involved in loading the data.} + \item{'when', a time stamp of the date issued by the Start() or Load() call to + obtain the data.} + \item{'load_parameters', the components used in the arguments to + load the data with the Start() or Load() functions.} + } + } +} } \description{ -This function allows to create a 's2dv_cube' object by passing +This function allows one to create an 's2dv_cube' object by passing information through its parameters. This function will be needed if the data hasn't been loaded using CST_Load or has been transformed with other methods. -A 's2dv_cube' object has many different components including metadata. This +An 's2dv_cube' object has many different components including metadata. This function will allow to create 's2dv_cube' objects even if not all elements are defined and for each expected missed parameter a warning message will be returned. @@ -69,44 +97,42 @@ exp_original <- 1:100 dim(exp_original) <- c(lat = 2, time = 10, lon = 5) exp1 <- s2dv_cube(data = exp_original) class(exp1) -exp2 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50)) +coords <- list(lon = seq(-10, 10, 5), lat = c(45, 50)) +exp2 <- s2dv_cube(data = exp_original, coords = coords) class(exp2) -exp3 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), - Variable = list(varName = 'tas', level = '2m')) +metadata <- list(tas = list(level = '2m')) +exp3 <- s2dv_cube(data = exp_original, coords = coords, + varName = 'tas', metadata = metadata) class(exp3) -exp4 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), - Variable = list(varName = 'tas', level = '2m'), - Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), - end = paste0(rep("31", 10), rep("01", 10), 1990:1999))) +Dates <- as.POSIXct(paste0(rep("01", 10), rep("01", 10), 1990:1999), format = "\%d\%m\%Y") +dim(Dates) <- c(time = 10) +exp4 <- s2dv_cube(data = exp_original, coords = coords, + varName = 'tas', metadata = metadata, + Dates = Dates) class(exp4) -exp5 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), - Variable = list(varName = 'tas', level = '2m'), - Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), - end = paste0(rep("31", 10), rep("01", 10), 1990:1999)), - when = "2019-10-23 19:15:29 CET") +exp5 <- s2dv_cube(data = exp_original, coords = coords, + varName = 'tas', metadata = metadata, + Dates = Dates, when = "2019-10-23 19:15:29 CET") class(exp5) -exp6 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), - Variable = list(varName = 'tas', level = '2m'), - Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), - end = paste0(rep("31", 10), rep("01", 10), 1990:1999)), +exp6 <- s2dv_cube(data = exp_original, coords = coords, + varName = 'tas', metadata = metadata, + Dates = Dates, when = "2019-10-23 19:15:29 CET", - source_files = c("/path/to/file1.nc", "/path/to/file2.nc")) + source_files = c("/path/to/file1.nc", "/path/to/file2.nc")) class(exp6) -exp7 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), - Variable = list(varName = 'tas', level = '2m'), - Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), - end = paste0(rep("31", 10), rep("01", 10), 1990:1999)), +exp7 <- s2dv_cube(data = exp_original, coords = coords, + varName = 
'tas', metadata = metadata, + Dates = Dates, when = "2019-10-23 19:15:29 CET", source_files = c("/path/to/file1.nc", "/path/to/file2.nc"), Datasets = list( exp1 = list(InitializationsDates = list(Member_1 = "01011990", Members = "Member_1")))) class(exp7) -dim(exp_original) <- c(dataset = 1, member = 1, sdate = 2, ftime = 5, lat = 2, lon = 5) -exp8 <- s2dv_cube(data = exp_original, lon = seq(-10, 10, 5), lat = c(45, 50), - Variable = list(varName = 'tas', level = '2m'), - Dates = list(start = paste0(rep("01", 10), rep("01", 10), 1990:1999), - end = paste0(rep("31", 10), rep("01", 10), 1990:1999))) +dim(exp_original) <- c(dataset = 1, member = 1, time = 10, lat = 2, lon = 5) +exp8 <- s2dv_cube(data = exp_original, coords = coords, + varName = 'tas', metadata = metadata, + Dates = Dates, original_dates = Dates) class(exp8) } \seealso{ diff --git a/man/training_analogs.Rd b/man/training_analogs.Rd index 447f8b0c27ea0f3a2b1c5feef79c75165a3dd1a2..2cf32e7b60be11f30ce4c5814cec9d4567146293 100644 --- a/man/training_analogs.Rd +++ b/man/training_analogs.Rd @@ -19,22 +19,25 @@ training_analogs( ) } \arguments{ -\item{pred}{List of matrix reanalysis data in a synoptic domain. The list -has to contain reanalysis atmospheric variables (instantaneous 12h data) -that must be indentify by parenthesis name. -For precipitation: -- u component of wind at 500 hPa (u500) in m/s -- v component of wind at 500 hPa (v500) in m/s -- temperature at 500 hPa (t500) in K -- temperature at 850 hPa (t850) in K -- temperature at 1000 hPa (t1000) in K -- geopotential height at 500 hPa (z500) in m -- geopotential height at 1000 hPa (z1000) in m -- sea level pressure (slp) in hPa -- specific humidity at 700 hPa (q700) in g/kg +\item{pred}{List of matrix reanalysis data in a synoptic domain. The list has +to contain reanalysis atmospheric variables (instantaneous 12h data) that +must be identified by the name in parentheses. For precipitation: +\itemize{ + \item{u component of wind at 500 hPa (u500) in m/s} + \item{v component of wind at 500 hPa (v500) in m/s} + \item{temperature at 500 hPa (t500) in K} + \item{temperature at 850 hPa (t850) in K} + \item{temperature at 1000 hPa (t1000) in K} + \item{geopotential height at 500 hPa (z500) in m} + \item{geopotential height at 1000 hPa (z1000) in m} + \item{sea level pressure (slp) in hPa} + \item{specific humidity at 700 hPa (q700) in g/kg} +} For maximum and minimum temperature: -- temperature at 1000 hPa (t1000) in K -- sea level pressure (slp) in hPa +\itemize{ + \item{temperature at 1000 hPa (t1000) in K} + \item{sea level pressure (slp) in hPa} +} All matrix must have [time,gridpoint] dimensions. (time = number of training days, gridpoint = number of synoptic gridpoints).} @@ -42,88 +45,111 @@ All matrix must have [time,gridpoint] dimensions. (instantaneous 12h data)(hPa). It has the same resolution as 'pred' parameter but with an extended domain. This domain contains extra degrees (most in the north and west part) compare to synoptic domain. The matrix must have -[time,gridpoint] dimensions. -(time = number of training days, gridpoint = number of extended gridpoints).} +[time,gridpoint] dimensions. (time = number of training days, +gridpoint = number of extended gridpoints).} \item{lon}{Vector of the synoptic longitude (from (-180º) to 180º), The vector must go from west to east.} -\item{lat}{Vector of the synoptic latitude. The vector must go from north to south.} +\item{lat}{Vector of the synoptic latitude. 
The vector must go from north to +south.} -\item{slp_lon}{Vector of the extended longitude (from (-180º) to 180º) +\item{slp_lon}{Vector of the extended longitude (from (-180º) to 180º). The vector must go from west to east.} -\item{slp_lat}{Vector of the extended latitude. The vector must go from north to south.} +\item{slp_lat}{Vector of the extended latitude. The vector must go from north +to south.} \item{var}{Variable name to downscale. There are two options: 'prec' for precipitation and 'temp' for maximum and minimum temperature.} \item{HR_path}{Local path of HR observational files (maestro and pcp/tmx-tmn). -For precipitation can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/v2/Serie_AEMET_v2_pcp_1951a202006_txt.tar.gz -For maximum and minimum temperature can be downloaded from http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmax/Serie_AEMET_v1_tmax_1951a202006_txt.tar.gz and http://www.aemet.es/documentos/es/serviciosclimaticos/cambio_climat/datos_diarios/dato_observacional/rejilla_5km/temperatura/v1/tmin/Serie_AEMET_v1_tmin_1951a202006_txt.tar.gz respetively. -Maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), longitude (lon), latitude (lat) and -altitude (alt) in columns (vector structure). -Data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5km resolution spanish daily data -(precipitation or maximum and minimum temperature from january 1951 to june 2020. See README -file for more information. -IMPORTANT!: HR observational period must be the same as for reanalysis variables. -It is assumed that the training period is smaller than the HR original one (1951-2020), so it is -needed to make a new ascii file with the new period and the same structure as original, -specifying the training dates ('tdates' parameter) in the name -(e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for '19810101-19961231' period).} +The precipitation and temperature files can be downloaded from the following link: +\url{https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2}. +The maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), +longitude (lon), latitude (lat) and altitude (alt) in columns (vector +structure). The data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5 km +resolution Spanish daily data (precipitation or maximum and minimum +temperature) from January 1951 to June 2020. See the README file for more +information. IMPORTANT!: the HR observational period must be the same as for the +reanalysis variables. It is assumed that the training period is smaller than +the HR original one (1951-2020), so a new ASCII file must be created +with the new period and the same structure as the original, specifying the +training dates ('tdates' parameter) in the name (e.g. +'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period).} -\item{tdates}{Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-19961231).} +\item{tdates}{Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) +(e.g. 19810101-19961231).} } \value{ -matrix list (e.g. restrain) as a result of characterize the past synoptic -situations and the significant predictors needed to downscale seasonal forecast variables. 
-For precipitation the output includes: -um: u component of geostrophic wind in all period (numeric matrix with [time,gridpoint] dimensions) -vm: v component of geostrophic wind in all period (numeric matrix with [time,gridpoint] dimensions) -nger: number of synoptic situations (integer) -gu92: u component of geostrophic wind for each synoptic situation (numeric matrix with - [nger,gridpoint] dimensions) -gv92: v component of geostrophic wind for each synoptic situation (numeric matrix with - [nger,gridpoint] dimensions) -gu52: u component of wind at 500 hPa for each synotic situation (numeric matrix with - [nger,gridpoint] dimensions) -gv52: v component of wind at 500 hPa for each synotic situation (numeric matrix with - [nger,gridpoint] dimensions) -neni: number of reference centers where predictors are calculated (integer) -vdmin: minimum distances between each HR gridpoint and the four nearest synoptic - gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints) -vref: four nearest synoptic gridpoints to each HR gridpoint (integer matrix with - [nptos,4] dimensions) -ccm: multiple correlation coeficients (numeric matrix with [nger,nptos] dimensions) -indices: - - lab_pred: numeric labels of selected predictors (integer matrix - with [nger,nptos,11,1] dimensions) - - cor_pred: partial correlation of selected predictors (numeric matrix with - [nger,nptos,11,2] dimensions) +A matrix list (e.g. restrain) resulting from the characterization of the past +synoptic situations and the significant predictors needed to downscale +seasonal forecast variables. For precipitation the output includes: +\itemize{ + \item{'um': u component of geostrophic wind over the whole period (numeric matrix + with [time, gridpoint] dimensions).} + \item{'vm': v component of geostrophic wind over the whole period (numeric matrix + with [time,gridpoint] dimensions).} + \item{'nger': number of synoptic situations (integer).} + \item{'gu92': u component of geostrophic wind for each synoptic situation + (numeric matrix with [nger,gridpoint] dimensions).} + \item{'gv92': v component of geostrophic wind for each synoptic situation + (numeric matrix with [nger, gridpoint] dimensions).} + \item{'gu52': u component of wind at 500 hPa for each synoptic situation + (numeric matrix with [nger, gridpoint] dimensions).} + \item{'gv52': v component of wind at 500 hPa for each synoptic situation + (numeric matrix with [nger, gridpoint] dimensions).} + \item{'neni': number of reference centers where predictors are calculated + (integer).} + \item{'vdmin': minimum distances between each HR gridpoint and the four + nearest synoptic gridpoints (numeric matrix with [nptos,4] dimensions) + (nptos = number of HR gridpoints).} + \item{'vref': four nearest synoptic gridpoints to each HR gridpoint (integer + matrix with [nptos, 4] dimensions).} + \item{'ccm': multiple correlation coefficients (numeric matrix with [nger, nptos] + dimensions). Indices: + \itemize{ + \item{'lab_pred': numeric labels of selected predictors (integer matrix + with [nger,nptos,11,1] dimensions).} + \item{'cor_pred': partial correlation of selected predictors (numeric + matrix with [nger,nptos,11,2] dimensions).} + } + } +} For maximum and minimum temperature the output includes: -um: u component of geostrophic wind in all training period (numeric matrix with [time,gridpoint] dimensions) -vm: v component of geostrophic wind in all training period (numeric matrix with [time,gridpoint] dimensions) -insol: insolation in all training period (numeric vector with [time] 
dimension) -neni: number of reference centers where predictors are calculated (integer) -vdmin: minimum distances between each HR gridpoint and the four nearest synoptic - gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints) -vref: four nearest synoptic gridpoints to each HR gridpoint (integer matrix with - [nptos,4] dimensions) - +\itemize{ + \item{'um': u component of geostrophic wind over the whole training period (numeric + matrix with [time,gridpoint] dimensions).} + \item{'vm': v component of geostrophic wind over the whole training period (numeric + matrix with [time,gridpoint] dimensions).} + \item{'insol': insolation over the whole training period (numeric vector with [time] + dimension).} + \item{'neni': number of reference centers where predictors are calculated + (integer).} + \item{'vdmin': minimum distances between each HR gridpoint and the four + nearest synoptic gridpoints (numeric matrix with [nptos,4] dimensions) + (nptos = number of HR gridpoints).} + \item{'vref': four nearest synoptic gridpoints to each HR gridpoint (integer + matrix with [nptos,4] dimensions).} +} The output can be directly used as an argument to the 'CST_AnalogsPredictors' function -(e.g. resdowns <- CST_AnalogsPredictors(...,restrain)) +(e.g. resdowns <- CST_AnalogsPredictors(...,restrain)). } \description{ -This function caracterizes the synoptic situations in a past period based on -low resolution reanalysis data (e.g, ERAInterim 1.5º x 1.5º) and an observational high -resolution (HR) dataset (AEMET 5 km gridded daily precipitation and maximum and -minimum temperature) (Peral et al., 2017)). -The method uses three domains: -- peninsular Spain and Balearic Islands domain (5 km resolution): HR domain -- synoptic domain (low resolution): it should be centered over Iberian Peninsula and - cover enough extension to detect as much synoptic situations as possible. -- extended domain (low resolution): it is an extension of the synoptic - domain. It is used for 'slp_ext' parameter (see 'slp_lon' and 'slp_lat' below). +This function characterizes the synoptic situations in a past +period based on low resolution reanalysis data (e.g., ERA-Interim 1.5º x 1.5º) +and an observational high resolution (HR) dataset (AEMET 5 km gridded daily +precipitation and maximum and minimum temperature) (Peral et al., 2017). +The method uses three domains: +\itemize{ + \item{peninsular Spain and Balearic Islands domain (5 km resolution): HR domain} + \item{synoptic domain (low resolution): it should be centered over the Iberian + Peninsula and cover enough extension to detect as many synoptic + situations as possible.} + \item{extended domain (low resolution): it is an extension of the synoptic + domain. 
It is used for 'slp_ext' parameter (see 'slp_lon' and 'slp_lat' + below).} +} } \author{ Marta Dominguez Alonso - AEMET, \email{mdomingueza@aemet.es} diff --git a/tests/testthat.R b/tests/testthat.R index c069d7531dda417fa398a6bb43fd87b9fb96cf9b..557efdc65b526f5579109f103393d9f03529bcd2 100644 --- a/tests/testthat.R +++ b/tests/testthat.R @@ -1,5 +1,6 @@ library(testthat) library(CSTools) +library(startR) test_check("CSTools") diff --git a/tests/testthat/test-CST_Analogs.R b/tests/testthat/test-CST_Analogs.R new file mode 100644 index 0000000000000000000000000000000000000000..66e1c88a8026878285ee3a705014184e0319f88a --- /dev/null +++ b/tests/testthat/test-CST_Analogs.R @@ -0,0 +1,167 @@ +context("CSTools::CST_Analogs tests") + +############################################## + +# dat1 +exp1 <- rnorm(1:20) +dim(exp1) <- c(lat = 4, lon = 5) + +obs1 <- c(rnorm(1:180), exp1 * 1.2) +dim(obs1) <- c(time = 10, lat = 4, lon = 5) + +time_obsL1 <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-") +time_expL1 <- "01-01-1994" +lon1 <- seq(0, 20, 5) +lat1 <- seq(0, 15, 4) +coords = list(lat = lat1, lon = lon1) +attrs_expL <- list(Dates = time_expL1) +attrs_obsL <- list(Dates = time_obsL1) +exp <- list(data = exp1, coords = coords, attrs = attrs_expL) +obs <- list(data = obs1, coords = coords, attrs = attrs_obsL) +attr(exp, 'class') <- 's2dv_cube' +attr(obs, 'class') <- 's2dv_cube' + +# dat2 +obs2 <- obs +obs2$coords <- NULL +obs2_2 <- obs2 +obs2_2$coords <- list(long = seq(1:4), lati = seq(1:4)) + +############################################## +test_that("1. Input checks: CST_Analogs", { + # Check 's2dv_cube' + expect_error( + CST_Analogs(expL = 1, obsL = 1), + paste0("Parameter 'expL' and 'obsL' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + expect_error( + CST_Analogs(expL = exp, obsL = obs, expVar = 1), + paste0("Parameter 'expVar' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + expect_error( + CST_Analogs(expL = exp, obsL = obs, obsVar = 1), + paste0("Parameter 'obsVar' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + + # Check 'obsL' object structure + expect_error( + CST_Analogs(expL = exp, obsL = obs2), + paste0("Parameter 'obsL' must have 'data', 'coords' and 'attrs' elements ", + "within the 's2dv_cube' structure.") + ) + expect_error( + CST_Analogs(expL = exp, obsL = obs2_2), + paste0("Spatial coordinate names of parameter 'obsL' do not match any ", + "of the names accepted by the package.") + ) + # Check 'obsVar' object structure + expect_error( + CST_Analogs(expL = exp, obsL = obs, expVar = exp, obsVar = obs2), + paste0("Parameter 'obsVar' must have 'data', 'coords' and 'attrs' elements ", + "within the 's2dv_cube' structure.") + ) + expect_error( + CST_Analogs(expL = exp, obsL = obs, expVar = exp, obsVar = obs2_2), + paste0("Spatial coordinate names of parameter 'obsVar' do not match any ", + "of the names accepted by the package.") + ) +}) + +############################################## +test_that("2. Input checks: Analogs", { + # expL, obsL + expect_error( + Analogs(expL = 1), + "Parameter 'expL' must be a numeric array." + ) + expect_error( + Analogs(expL = exp1, obsL = 1), + "Parameter 'obsL' must be a numeric array." + ) + expect_error( + Analogs(expL = array(1:10), obsL = obs1), + "Parameter 'expL' must have dimension names." + ) + expect_error( + Analogs(expL = exp1, obsL = array(1:10)), + "Parameter 'obsL' must have dimension names." 
+ ) + exp$data[1, 2] <- NA + expect_warning( + Analogs(expL = exp$data, obsL = obs1, time_obsL = time_obsL1, + time_expL = time_expL1), + "Parameter 'expL' contains NA values." + ) + expect_error( + Analogs(expL = array(1:10, dim = c(time = 10)), obsL = obs1), + "Parameter 'expL' and 'obsL' must have longitudinal dimension." + ) + expect_error( + Analogs(expL = exp1, obsL = array(1:10, dim = c(lon = 10))), + "Parameter 'expL' and 'obsL' must have latitudinal dimension." + ) + # criteria + expect_error( + Analogs(expL = exp1, obsL = obs1, criteria = 1), + "Parameter 'criteria' can only be: 'Large_dist', 'Local_dist' or 'Local_cor'." + ) + # lonL, latL, lonVar, latVar + expect_error( + Analogs(expL = exp1, obsL = obs1, criteria = "Local_dist"), + "Parameters 'lonL' and 'latL' cannot be NULL." + ) + expect_error( + Analogs(expL = exp1, obsL = obs1, criteria = "Local_dist", lonL = 'a', latL = 'b'), + "Parameters 'lonL' and 'latL' must be numeric." + ) + expect_error( + Analogs(expL = exp1, obsL = obs1, criteria = "Local_cor", + lonL = array(1:10, dim = c(2,5)), latL = lat1), + "Parameters 'lonL' and 'latL' need to be a vector." + ) + suppressWarnings( + expect_error( + Analogs(expL = exp1, obsL = obs1, criteria = "Local_cor", lonL = lon1, + time_expL = time_expL1, time_obsL = NULL, latL = lat1), + "Parameters 'lonVar' and 'latVar' cannot be NULL." + ) + ) + suppressWarnings( + expect_error( + Analogs(expL = exp1, obsL = obs1, criteria = "Local_cor", lonL = lon1, + time_obsL = time_obsL1, latL = lat1, lonVar = lon1, latVar = lat1), + "Parameter 'time_expL' cannot be NULL." + ) + ) +}) + +############################################## + +test_that("3. Output checks" , { + suppressWarnings( + res <- CST_Analogs(expL = exp, obsL = obs) + ) + expect_equal( + names(res), + c('data', 'coords', 'attrs') + ) + expect_equal( + dim(res$data), + c(nAnalogs = 1, lat = 4, lon = 5) + ) + suppressWarnings( + res1 <- CST_Analogs(expL = exp, obsL = obs, expVar = exp, obsVar = obs, + AnalogsInfo = TRUE) + ) + expect_equal( + names(res1$data), + c('fields', 'analogs', 'metric', 'dates') + ) + expect_equal( + dim(res1$data$fields), + c(nAnalogs = 1, lat = 4, lon = 5) + ) +}) diff --git a/tests/testthat/test-CST_Anomaly.R b/tests/testthat/test-CST_Anomaly.R index b4137015bd494d3cf1b911e5f214e87b38d5db97..a4c7f3f57778c5cfde40bb4b25a1a1490143d757 100644 --- a/tests/testthat/test-CST_Anomaly.R +++ b/tests/testthat/test-CST_Anomaly.R @@ -63,10 +63,9 @@ test_that("1. Input checks", { "Parameter 'exp' and 'obs' must have same dimension names in element 'data'." ) # dim_anom - expect_warning( + expect_error( CST_Anomaly(exp = exp, obs = obs, dim_anom = 3), - paste0("Parameter 'dim_anom' must be a character string and a numeric value will not be ", - "accepted in the next release. The corresponding dimension name is assigned.") + paste0("Parameter 'dim_anom' must be a character string.") ) expect_error( CST_Anomaly(exp = exp3, obs = obs3), @@ -113,7 +112,6 @@ test_that("1. Input checks", { ############################################## test_that("2. 
Output checks: dat", { - expect_equal( names(CST_Anomaly(exp, obs)), c("exp", "obs") diff --git a/tests/testthat/test-CST_BEI_Weighting.R b/tests/testthat/test-CST_BEI_Weighting.R index e3686fead7a00b03e722b73b1a505272195a5603..3a70f38aa20452884e95d338711c0daea432fb62 100644 --- a/tests/testthat/test-CST_BEI_Weighting.R +++ b/tests/testthat/test-CST_BEI_Weighting.R @@ -1,166 +1,196 @@ -context("Generic tests") -test_that("basic use case", { - - var_exp <- 1 : (2 * 4 * 3 * 2) - dim(var_exp) <- c(time = 2, member = 4, lat = 3, lon = 2) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, 0.1, 0.2, 0.4, 0.4, 0.1, 0.2, 0.4, 0.2) - dim(aweights) <- c(time = 2, member = 4, dataset = 2) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - result <- array(c(4.4, 5.4, 12.4, 13.4, 20.4, 21.4, - 28.4, 29.4, 36.4, 37.4, 44.4, 45.4, - 4.6, 4.8, 12.6, 12.8, 20.6, 20.8, - 28.6, 28.8, 36.6, 36.8, 44.6, 44.8), - dim = c(time = 2, lat = 3, lon = 2, dataset =2)) - expect_equal(CST_BEI_Weighting(var_exp, aweights, type = 'ensembleMean', - time_dim_name = 'time')$data, result, tolerance=1e-4) - - var_exp <- 1 : (2 * 3 * 1 * 2) - dim(var_exp) <- c(time = 2, member = 3, lat = 1, lon = 2) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(time = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - result <- array(c(0.5, 0.1, 0.0, 0.4, 0.5, 0.5, 0.5, 0.1, 0.0, 0.4, 0.5, 0.5), - dim = c(time = 2, tercil = 3, lat = 1, lon = 2)) - expect_equal(CST_BEI_Weighting(var_exp, aweights, type = 'probs', - time_dim_name = 'time')$data, result, tolerance=1e-4) - - var_exp <- 1 : (2 * 3 * 1 * 2) - dim(var_exp) <- c(sdate = 2, member = 3, lat = 1, lon = 2) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(sdate = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - result <- array(c(0.5, 0.1, 0.0, 0.4, 0.5, 0.5, - 0.5, 0.1, 0.0, 0.4, 0.5, 0.5), - dim = c(sdate = 2, tercil = 3, lat = 1, lon = 2)) - - expect_equal(CST_BEI_Weighting(var_exp, aweights, type = 'probs', - time_dim_name = 'sdate')$data, result, tolerance=1e-4) - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(sdate = 2, member = 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(sdate = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - result <- array(c(0.5, 0.1, 0.0, 0.4, 0.5, 0.5), - dim = c(sdate = 2, tercil = 3)) - - expect_equal(CST_BEI_Weighting(var_exp, aweights, type = 'probs', - time_dim_name = 'sdate')$data, result, tolerance=1e-4) -}) +context("CSTools::CST_CST_BEI_Weighting tests") + +# dat +var_exp <- array(1:6, dim = c(sdate = 2, member = 3)) +var_exp <- list(data = var_exp) +class(var_exp) <- 's2dv_cube' +aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) +dim(aweights) <- c(sdate = 2, member = 3) +terciles <- rep(c(35, 45)) +dim(terciles) <- c(tercil = 2) + +# dat1 +var_exp1 <- array(1:6, dim = c(time = 2, member = 3)) +aweights1 <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) +dim(aweights1) <- c(time = 2, member = 3) + +# dat2 +var_exp2 <- array(1:(2*4*3*2), dim = c(time = 2, member = 4, lat = 3, lon = 2)) +aweights2 <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, + 0.1, 0.2, 0.4, 0.4, 0.1, 0.2, 0.4, 0.2) +dim(aweights2) <- c(time = 2, member = 4, dataset = 2) +var_exp2 <- list(data = var_exp2) +class(var_exp2) <- 's2dv_cube' +terciles2 <- rep(c(35, 45), 3*2) +dim(terciles2) <- c(tercil = 2, lat = 3, lon = 2) -test_that("Sanity checks", { +result2 <- array(c(4.4, 5.4, 12.4, 13.4, 20.4, 21.4, + 28.4, 29.4, 36.4, 
37.4, 44.4, 45.4, + 4.6, 4.8, 12.6, 12.8, 20.6, 20.8, + 28.6, 28.8, 36.6, 36.8, 44.6, 44.8), + dim = c(time = 2, lat = 3, lon = 2, dataset = 2)) + +############################################## + +test_that("1. Input checks", { + # s2dv_cube expect_error( - CST_BEI_Weighting(var_exp, aweights, type = 'probs', time_dim_name = 1), - paste0("Parameter 'time_dim_name' must be a character string indicating", - " the name of the temporal dimension.")) - + CST_BEI_Weighting(var_exp = 1, aweights = 1, type = 'probs', time_dim_name = 1), + paste0("Parameter 'var_exp' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # type expect_error( - CST_BEI_Weighting(var_exp, aweights, type = 2), + CST_BEI_Weighting(var_exp = var_exp, aweights = aweights, type = 1, + time_dim_name = 1), paste0("Parameter 'type' must be a character string, 'probs' or ", - "'ensembleMean', indicating the type of output.")) - + "'ensembleMean', indicating the type of output.") + ) expect_error( - CST_BEI_Weighting(var_exp = 1, aweights), - paste0("Parameter 'var_exp' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.")) - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(sdate = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - expect_error( - CST_BEI_Weighting(var_exp, aweights = 2), - "Parameter 'aweights' must be an array.") - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(2, 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(2, 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - paste0("Element 'data' from parameter 'var_exp' and parameter 'aweights'", - " should have dimmension names.")) - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(sdate = 2, member = 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(sdate = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - paste0("Element 'data' from parameter 'var_exp' must have ", - "temporal dimension.")) - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(time = 2, member = 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(sdate = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - "Parameter 'aweights' must have temporal dimension.") - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(time = 2, season = 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(time = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - paste0("Element 'data' from parameter 'var_exp' must have ", - "dimension 'member'.")) - - var_exp <- 1 : (2 * 3) - dim(var_exp) <- c(time = 2, member = 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(time = 2, season = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - paste0("Parameter 'aweights' must have ", - "dimension 'member'.")) - - var_exp <- 1 : (3 * 3) - dim(var_exp) <- c(time = 3, member = 3) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(time = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - paste0("Length of temporal dimensions ", - "of element 'data' from parameter 
'var_exp' and parameter ", - "'aweights' must be equals.")) - - var_exp <- 1 : (3 * 4) - dim(var_exp) <- c(time = 3, member = 4) - aweights <- c(0.2, 0.1, 0.3, 0.4, 0.5, 0.5) - dim(aweights) <- c(time = 2, member = 3) - var_exp <- list(data = var_exp) - class(var_exp) <- 's2dv_cube' - - expect_error( - CST_BEI_Weighting(var_exp, aweights), - paste0("Length of temporal dimensions of element 'data' from ", - "parameter 'var_exp' and parameter 'aweights' must be equals.")) - + CST_BEI_Weighting(var_exp = var_exp, aweights = aweights, type = c('a'), + time_dim_name = 1), + paste0("Parameter 'type' must be a character string ('probs' or ", + "'ensembleMean'), indicating the type of output."), + fixed = TRUE + ) + # var_exp + expect_error( + BEI_EMWeighting(var_exp = 1, aweights = 2), + "Parameter 'var_exp' must be an array." + ) + expect_error( + BEI_ProbsWeighting(var_exp = 1, aweights = 2), + "Parameter 'var_exp' must be an array." + ) + # aweights + expect_error( + BEI_EMWeighting(var_exp = var_exp$data, aweights = 2), + "Parameter 'aweights' must be an array." + ) + # aweights + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = 2), + "Parameter 'aweights' must be an array." + ) + # terciles + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, terciles = NULL), + "Parameter 'terciles' cannot be null." + ) + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, + terciles = 1), + "Parameter 'terciles' must be an array." + ) + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, + terciles = array(1:10, c(10))), + "Parameter 'terciles' should have dimension names." + ) + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, + terciles = array(1:10, c(time = 10))), + "Parameter 'terciles' must have dimension 'tercil'." + ) + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, + terciles = array(1:10, c(tercil = 10))), + paste0("Length of dimension 'tercil' ", + "of parameter 'terciles' must be equal to 2.") + ) + # time_dim_name + expect_error( + BEI_EMWeighting(var_exp = var_exp$data, aweights = aweights, time_dim_name = 1), + paste0("Parameter 'time_dim_name' must be a character string indicating", + " the name of the temporal dimension.") + ) + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, + terciles = terciles, time_dim_name = 1), + paste0("Parameter 'time_dim_name' must be a character string indicating", + " the name of the temporal dimension.") + ) + # memb_dim + expect_error( + BEI_EMWeighting(var_exp = var_exp$data, aweights = aweights, memb_dim = 1), + paste0("Parameter 'memb_dim' must be a character string indicating", + " the name of the member dimension.") + ) + expect_error( + BEI_ProbsWeighting(var_exp = var_exp$data, aweights = aweights, + terciles = terciles, memb_dim = 1), + paste0("Parameter 'memb_dim' must be a character string indicating", + " the name of the member dimension.") + ) + # var_exp, aweights (2) + expect_error( + BEI_EMWeighting(var_exp = array(10), aweights = array(10)), + "Parameters 'var_exp' and 'aweights' should have dimension names." + ) + expect_error( + BEI_EMWeighting(var_exp = var_exp$data, aweights = aweights), + "Parameter 'var_exp' must have temporal dimension." + ) + expect_error( + BEI_EMWeighting(var_exp = var_exp1, aweights = aweights), + "Parameter 'aweights' must have temporal dimension." 
+ ) + expect_error( + BEI_EMWeighting(var_exp = array(1:6, dim = c(time = 2, members = 3)), + aweights = aweights1), + "Parameter 'var_exp' must have member dimension." + ) + expect_error( + BEI_EMWeighting(var_exp = var_exp1, + aweights = array(1:6, dim = c(time = 2, members = 3))), + "Parameter 'aweights' must have member dimension." + ) + expect_error( + BEI_EMWeighting(var_exp = array(1:6, dim = c(time = 1, member = 3)), + aweights = array(1:6, dim = c(time = 2, member = 3))), + paste0("Length of temporal dimension ", + "of parameter 'var_exp' and 'aweights' must be equal.") + ) + expect_error( + BEI_EMWeighting(var_exp = array(1:6, dim = c(time = 2, member = 2)), + aweights = array(1:6, dim = c(time = 2, member = 3))), + paste0("Length of member dimension ", + "of parameter 'var_exp' and 'aweights' must be equal.") + ) }) + +############################################## + +test_that("basic use case", { + expect_equal( + CST_BEI_Weighting(var_exp2, aweights2, type = 'ensembleMean')$data, + result2, + tolerance = 1e-4 + ) + expect_equal( + as.vector(CST_BEI_Weighting(var_exp2, aweights2)$data[, , 2, 2]), + c(28.6, 28.8, 36.6, 36.8, 44.6, 44.8), + tolerance = 1e-4 + ) + res <- CST_BEI_Weighting(var_exp2, aweights2, type = 'probs', + terciles = terciles2)$data + expect_equal( + dim(res), + c(time = 2, tercil = 3, lat = 3, lon = 2, dataset = 2), + tolerance = 1e-4 + ) + expect_equal( + res[, , 2, 2, 2], + array(c(0.5, 0.2, 0.5, 0.8, 0, 0), dim = c(time = 2, tercil = 3)), + tolerance = 1e-4 + ) + expect_equal( + BEI_EMWeighting(var_exp = array(1:6, dim = c(times = 2, members = 3)), + aweights = array(1:6, dim = c(times = 2, members = 3)), + time_dim_name = 'times', memb_dim = 'members'), + array(c(35, 56), dim = c(times = 2)), + tolerance = 1e-4 + ) +}) + +############################################## \ No newline at end of file diff --git a/tests/testthat/test-CST_BiasCorrection.R b/tests/testthat/test-CST_BiasCorrection.R index 826fcf127806fb77413e7d5cd7e78c49d7c09359..46f57ace7e5e861925901f05657a328fca5ed17b 100644 --- a/tests/testthat/test-CST_BiasCorrection.R +++ b/tests/testthat/test-CST_BiasCorrection.R @@ -11,23 +11,20 @@ dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) lon <- seq(0, 30, 5) lat <- seq(0, 25, 5) -exp <- list(data = mod, lat = lat, lon = lon) -obs <- list(data = obs, lat = lat, lon = lon) +coords <- list(lat = lat, lon = lon) +exp <- list(data = mod, coords = coords) +obs <- list(data = obs, coords = coords) attr(exp, 'class') <- 's2dv_cube' attr(obs, 'class') <- 's2dv_cube' exp1 <- list(data = array(1:20, dim = c(time = 20))) class(exp1) <- 's2dv_cube' - obs1 <- list(data = array(1:20, dim = c(time = 20))) class(obs1) <- 's2dv_cube' - exp1_2 <- list(data = array(1:20, dim = c(20))) class(exp1_2) <- 's2dv_cube' - obs1_2 <- list(data = array(1:20, dim = c(20))) class(obs1_2) <- 's2dv_cube' - exp_cor1 <- list(data = array(1:20, dim = c(20))) class(exp_cor1) <- 's2dv_cube' @@ -49,19 +46,42 @@ obs4 <- array(1:200, dim = c(time = 5, members = 1, lat = 2, lon = 5)) obs4_1 <- obs4 obs4_1[1,1,1,1] <- NA +# dat5 +set.seed(1) +exp5 <- array(rnorm(80), dim = c(member = 2, sdate = 10, lat = 2, dataset = 2)) +set.seed(2) +obs5 <- array(rnorm(60), dim = c(sdate = 10, lat = 2, dataset = 3)) +set.seed(3) +exp_cor5 <- array(rnorm(20), dim = c(member = 2, sdate = 10, lat = 2)) + +# dat6 +set.seed(1) +exp6 <- array(rnorm(20), dim = c(member = 2, sdate = 10)) +exp6_1 <- array(exp6, dim = c(member = 2, sdate = 10, dataset = 1)) +exp6_2 <- exp6_1 +exp6_2[1] 
<- NA +set.seed(2) +obs6 <- array(rnorm(10), dim = c(member = 1, sdate = 10)) +obs6_1 <- array(obs6, dim = c(member = 1, sdate = 10, dataset = 1)) +obs6_2 <- obs6_1 +obs6_2[c(1, 3)] <- NA +set.seed(3) +exp_cor6 <- array(rnorm(20), dim = c(member = 2, sdate = 10)) + +# dat7 +exp_cor7 <- array(rnorm(400), dim = c(member = 10, sdate = 10, lat = 2, dataset = 2)) + ############################################## -test_that("1. Inpput checks", { +test_that("1. Input checks", { # s2dv_cube expect_error( CST_BiasCorrection(exp = 1), - paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") + paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube'.") ) expect_error( CST_BiasCorrection(exp = exp, obs = 1), - paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.") + paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube'.") ) expect_error( CST_BiasCorrection(exp = exp1), @@ -69,8 +89,7 @@ test_that("1. Inpput checks", { ) expect_error( CST_BiasCorrection(exp = exp1, obs = obs1, exp_cor = 1), - paste0("Parameter 'exp_cor' must be of the class 's2dv_cube', as output ", - "by CSTools::CST_Load.") + paste0("Parameter 'exp_cor' must be of the class 's2dv_cube'.") ) # exp and obs expect_error( @@ -121,6 +140,40 @@ test_that("1. Inpput checks", { BiasCorrection(exp = exp3, obs = obs3_3), paste0("If parameter 'obs' has dimension 'memb_dim' its length must be equal to 1.") ) + ## dat_dim + expect_error( + BiasCorrection(exp = exp3, obs = obs3, dat_dim = 1), + paste0("Parameter 'dat_dim' must be a character string.") + ) + expect_error( + BiasCorrection(exp = exp3, obs = obs3, dat_dim = 'dataset'), + paste0("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + " Set it as NULL if there is no dataset dimension.") + ) + ## exp, obs, and exp_cor (2) + expect_error( + BiasCorrection(exp = array(1:6, c(sdate = 3, member = 2, dataset = 2, lon = 1)), + obs = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 2)), + dat_dim = 'dataset'), + paste0("Parameter 'exp' and 'obs' must have same length of all dimensions", + " except 'memb_dim' and 'dat_dim'.") + ) + expect_error( + BiasCorrection(exp = array(1:6, c(sdate = 3, member = 2, dataset = 2, lon = 1)), + obs = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 1)), + exp_cor = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 1)), + dat_dim = 'dataset'), + paste0("If parameter 'exp_cor' has dataset dimension, it must be", + " equal to dataset dimension of 'exp'.") + ) + expect_error( + BiasCorrection(exp = array(1:6, c(sdate = 3, member = 2, dataset = 2, lon = 1)), + obs = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 1)), + exp_cor = array(1:6, c(sdate = 3, member = 1, lon = 3)), + dat_dim = 'dataset'), + paste0("Parameter 'exp' and 'exp_cor' must have the same length of all common dimensions", + " except 'dat_dim', 'sdate_dim' and 'memb_dim'.") + ) ## na.rm expect_warning( CST_BiasCorrection(exp = exp, obs = obs, na.rm = 1), @@ -143,31 +196,33 @@ test_that("2. 
Output checks: dat1", { bc <- CST_BiasCorrection(exp = exp, obs = obs) expect_equal( length(bc), - 3 + 2 ) expect_equal( dim(bc$data), c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) ) expect_equal( - bc$lat, + bc$coords$lat, lat ) expect_equal( - bc$lon, + bc$coords$lon, lon ) expect_equal( - round(BiasCorrection(exp = exp3, obs = obs3, exp_cor = exp3), 2), - array(c(2.66, 4.27, 3.2, 4.8, 3.73, 5.34), c(member = 2, sdate = 3)) + as.vector(BiasCorrection(exp = exp3, obs = obs3, exp_cor = exp3)), + c(2.663694, 3.198216, 3.732739, 4.267261, 4.801784, 5.336306), + tolerance = 1e-6 ) expect_equal( - round(BiasCorrection(exp = exp3, obs = obs3_2, exp_cor = exp3), 2), - array(c(2.66, 4.27, 3.2, 4.8, 3.73, 5.34), c(member = 2, sdate = 3)) + as.vector(BiasCorrection(exp = exp3, obs = obs3_2, exp_cor = exp3)), + c(2.663694, 3.198216, 3.732739, 4.267261, 4.801784, 5.336306), + tolerance = 1e-6 ) expect_equal( dim(BiasCorrection(exp = exp4, obs = obs4, sdate_dim = 'time', memb_dim = 'members')), - c(members = 5, time = 5, lat = 2, lon = 5) + c(time = 5, members = 5, lat = 2, lon = 5) ) suppressWarnings( expect_equal( @@ -182,3 +237,99 @@ test_that("2. Output checks: dat1", { ) ) }) + +############################################## + +test_that("3. Output checks: dat5", { + expect_equal( + dim(BiasCorrection(exp5, obs5, memb_dim = 'member', dat_dim = 'dataset')), + c(member = 2, sdate = 10, lat = 2, nexp = 2, nobs = 3) + ) + expect_equal( + dim(BiasCorrection(exp5, obs5, exp_cor5, memb_dim = 'member', dat_dim = 'dataset')), + c(member = 2, sdate = 10, lat = 2, nexp = 2, nobs = 3) + ) + expect_equal( + as.vector(BiasCorrection(exp5, obs5, memb_dim = 'member', dat_dim = 'dataset'))[5:10], + c(0.1466060, -0.9764600, 0.6914021, 0.9330733, 0.6567210, -0.3036642), + tolerance = 0.0001 + ) + expect_equal( + as.vector(BiasCorrection(exp5, obs5, exp_cor5, memb_dim = 'member', dat_dim = 'dataset'))[5:10], + c(0.21682367, 0.03815268, 0.09778966, 1.20997987, -1.30893321, 1.37258011), + tolerance = 0.0001 + ) + expect_equal( + as.vector(BiasCorrection(exp5[, , , 1], obs5[, , 1], memb_dim = 'member'))[1:5], + as.vector(BiasCorrection(exp5, obs5, memb_dim = 'member', dat_dim = 'dataset')[, , , 1, 1][1:5]) + ) + expect_equal( + as.vector(BiasCorrection(exp5[, , , 1], obs5[, , 1], exp_cor5, memb_dim = 'member'))[1:5], + as.vector(BiasCorrection(exp5, obs5, exp_cor5, memb_dim = 'member', dat_dim = 'dataset')[, , , 1, 1][1:5]) + ) + expect_equal( + as.vector(BiasCorrection(exp5, obs5, exp_cor5, memb_dim = 'member', dat_dim = 'dataset', na.rm = TRUE))[1:5], + c(-1.0318284, -0.3098404, 0.2847780, -1.2369666, 0.2168237), + tolerance = 0.0001 + ) +}) + +############################################## +test_that("4. 
Output checks: dat6", { + expect_equal( + dim(BiasCorrection(exp6, obs6)), + c(member = 2, sdate = 10) + ) + expect_equal( + as.vector(BiasCorrection(exp6, obs6))[1:5], + c(-0.5430181, 0.2807323, -0.9954539, 1.9298249, 0.1466060), + tolerance = 0.0001 + ) + expect_equal( + as.vector(BiasCorrection(exp6, obs6, exp_cor6))[1:5], + c(-1.0318284, -0.3098404, 0.2847780, -1.2369666, 0.2168237), + tolerance = 0.0001 + ) + expect_equal( + dim(BiasCorrection(exp6_1, obs6_1, dat_dim = 'dataset')), + c(member = 2, sdate = 10, nexp = 1, nobs = 1) + ) + expect_equal( + as.vector(BiasCorrection(exp6_1, obs6_1, dat_dim = 'dataset')), + as.vector(BiasCorrection(exp6, obs6)), + tolerance = 0.0001 + ) + expect_equal( + as.vector(BiasCorrection(exp6_1, obs6_1, exp_cor6, dat_dim = 'dataset')), + as.vector(BiasCorrection(exp6, obs6, exp_cor6)), + tolerance = 0.0001 + ) + expect_equal( + suppressWarnings( + as.vector(BiasCorrection(exp6_1, obs6_2, dat_dim = 'dataset')) + ), + rep(as.numeric(NA), 20) + ) + expect_equal( + suppressWarnings( + as.vector(BiasCorrection(exp6_1, obs6_2, dat_dim = 'dataset', na.rm = T))[5:10] + ), + c(0.2644706, -0.8392515, 0.6458045, 0.8511290, 0.5959483, -0.2908764), + tolerance = 0.0001 + ) + expect_equal( + suppressWarnings( + as.vector(BiasCorrection(exp6_2, obs6_2, exp_cor6, dat_dim = 'dataset', na.rm = T))[5:10] + ), + c(0.14077312, -0.02076059, 0.03315629, 1.03867041, -1.23864029, 1.18567478), + tolerance = 0.0001 + ) +}) + +############################################## +test_that("6. Output checks: dat4", { + expect_equal( + dim(BiasCorrection(exp5, obs5, exp_cor7, dat_dim = 'dataset')), + c(member = 10, sdate = 10, lat = 2, nexp = 2, nobs = 3) + ) +}) diff --git a/tests/testthat/test-CST_Calibration.R b/tests/testthat/test-CST_Calibration.R index e832a539e3d26a3f2ee936ab4f0aed5f50bd3b1e..e6aef6f641559e7a35a379bb0bd24d294a6110fc 100644 --- a/tests/testthat/test-CST_Calibration.R +++ b/tests/testthat/test-CST_Calibration.R @@ -1,43 +1,525 @@ -context("Generic tests") -test_that("Sanity checks", { +context("CSTools::CST_Calibration tests") + +############################################## + +# dat +data_exp = array(1:20, dim = c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 5, lon = 5)) +lon <- seq(0, 4) +lat <- seq(1, 5) +coords <- list(lon = lon, lat = lat) +attrs_exp = list(Datasets = 'exp_sample_data') +exp <- list(data = data_exp, coords = coords, attrs = attrs_exp) +class(exp) <- 's2dv_cube' +data_obs <- array(1:20, dim = c(dataset = 1, member = 1, sdate = 6, ftime = 3, lat = 5, lon = 5)) +attrs_obs = list(Datasets = 'obs_sample_data') +obs <- list(data = data_obs, coords = coords, attrs = attrs_obs) +class(obs) <- 's2dv_cube' + +# dat1 +exp1 <- list(data = array(1:20, dim = c(time = 20))) +class(exp1) <- 's2dv_cube' +obs1 <- list(data = array(1:20, dim = c(time = 20))) +class(obs1) <- 's2dv_cube' +exp1_2 <- list(data = array(1:20, dim = c(20))) +class(exp1_2) <- 's2dv_cube' +obs1_2 <- list(data = array(1:20, dim = c(20))) +class(obs1_2) <- 's2dv_cube' +exp_cor1 <- list(data = array(1:20, dim = c(20))) +class(exp_cor1) <- 's2dv_cube' + +# dat2 +exp2 <- exp +exp2$data[1, 2, 1, 1, 1, 1] <- NA +obs2 <- obs +obs2$data[1, 1, 2, 1, 1, 1] <- NA + +# dat3 +set.seed(1) +exp3 <- array(rnorm(400), dim = c(member = 10, sdate = 10, lat = 2, dataset = 2)) +set.seed(2) +obs3 <- array(rnorm(60), dim = c(sdate = 10, lat = 2, dataset = 3)) +set.seed(3) +exp_cor3 <- array(rnorm(200), dim = c(member = 10, sdate = 10, lat = 2)) + +# dat4 +set.seed(1) +exp4 <- array(rnorm(200), dim = 
c(member = 10, sdate = 20)) +exp4_1 <- array(exp4, dim = c(member = 10, sdate = 20, dataset = 1)) +exp4_2 <- exp4_1 +exp4_2[1] <- NA +set.seed(2) +obs4 <- array(rnorm(20), dim = c(sdate = 20)) +obs4_1 <- array(obs4, dim = c(sdate = 20, dataset = 1)) +obs4_2 <- obs4_1 +obs4_2[c(1, 3)] <- NA +set.seed(3) +exp_cor4 <- array(rnorm(200), dim = c(member = 10, sdate = 20)) + +# dat5 +exp_cor5 <- array(rnorm(400), dim = c(member = 10, sdate = 10, lat = 2, dataset = 2)) + +############################################## + +test_that("1. Input checks", { + # s2dv_cube expect_error( CST_Calibration(exp = 1), - "Parameter 'exp' and 'obs' must be of the class 's2dv_cube', " + paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube'.") + ) + expect_error( + CST_Calibration(exp = exp, obs = 1), + paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube'.") + ) + expect_error( + CST_Calibration(exp = exp1), + 'argument "obs" is missing, with no default' + ) + expect_error( + CST_Calibration(exp = exp1, obs = obs1, exp_cor = 1), + paste0("Parameter 'exp_cor' must be of the class 's2dv_cube'.") + ) + # exp and obs + expect_error( + Calibration(exp = 1, obs = obs1), + "Parameter 'exp' must be a numeric array." + ) + expect_error( + CST_Calibration(exp = exp1_2, obs = obs1), + "Parameter 'exp' must have dimension names." + ) + expect_error( + CST_Calibration(exp = exp1, obs = obs1_2), + "Parameter 'obs' must have dimension names." + ) + expect_warning( + CST_Calibration(exp = exp2, obs = obs2, exp_cor = exp2), + "Parameter 'obs' contains NA values", + "Parameter 'exp' contains NA values.", + "Parameter 'exp_cor' contains NA values." + ) + # exp_cor + expect_error( + CST_Calibration(exp = exp1, obs = obs1, exp_cor = exp_cor1, sdate_dim = 'time'), + "Parameter 'exp_cor' must have dimension names." 
+ ) + # dat_dim + expect_error( + Calibration(exp = exp3, obs = obs3, dat_dim = 1), + paste0("Parameter 'dat_dim' must be a character string.") + ) + expect_error( + CST_Calibration(exp = exp, obs = obs, dat_dim = 'dat'), + paste0("Parameter 'dat_dim' is not found in 'exp' or 'obs' dimension.", + " Set it as NULL if there is no dataset dimension.") + ) + # sdate_dim, memb_dim + expect_error( + CST_Calibration(exp = exp1, obs = obs1, sdate_dim = 1), + paste0("Parameter 'sdate_dim' should be a character string indicating the", + "name of the dimension where start dates are stored in 'exp'.") + ) + expect_warning( + CST_Calibration(exp = exp, obs = obs, sdate_dim = c('sdate', 'time')), + paste0("Parameter 'sdate_dim' has length greater than 1 and only", + " the first element will be used.") + ) + expect_error( + CST_Calibration(exp = exp1, obs = obs1, memb_dim = 1), + paste0("Parameter 'memb_dim' should be a character string indicating the", + "name of the dimension where members are stored in 'exp'.") + ) + expect_error( + CST_Calibration(exp = exp, obs = obs, sdate_dim = 'time'), + paste0("Parameter 'exp' requires 'sdate_dim' and 'memb_dim' dimensions.") + ) + expect_warning( + CST_Calibration(exp = exp, obs = obs, memb_dim = c('member', 'memb')), + paste0("Parameter 'memb_dim' has length greater than 1 and only", + " the first element will be used.") + ) + expect_error( + CST_Calibration(exp = exp1, obs = obs1, sdate_dim = 'time'), + paste0("Parameter 'exp' requires 'sdate_dim' and 'memb_dim' dimensions.") + ) + expect_error( + Calibration(exp = array(1:20, dim = c(time = 1, member = 1)), + obs = array(1:20, dim = c(member = 1)), sdate_dim = 'time'), + paste0("Parameter 'obs' must have the dimension defined in sdate_dim ", + "parameter.") + ) + expect_warning( + Calibration(exp = array(abs(rnorm(400)), dim = c(sdate = 4, member = 10)), + obs = array(abs(rnorm(8)), dim = c(sdate = 4, member = 2))), + paste0("Parameter 'obs' has dimension 'memb_dim' with length larger", + " than 1. 
Only the first member dimension will be used.") + ) + # exp, obs, and exp_cor (2) + expect_error( + Calibration(exp = array(1:20, dim = c(time = 1, member = 1)), + obs = array(1:20, dim = c(time = 2, member = 1)), sdate_dim = 'time'), + paste0("Parameter 'exp' and 'obs' must have same length of all dimensions", + " except 'memb_dim' and 'dat_dim'.") + ) + expect_error( + Calibration(exp = array(1:6, c(sdate = 3, member = 2, dataset = 2, lon = 1)), + obs = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 2)), + dat_dim = 'dataset'), + paste0("Parameter 'exp' and 'obs' must have same length of all dimensions", + " except 'memb_dim' and 'dat_dim'.") + ) + expect_error( + Calibration(exp = array(1:6, c(sdate = 3, member = 2, dataset = 2, lon = 1)), + obs = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 1)), + exp_cor = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 1)), + dat_dim = 'dataset'), + paste0("If parameter 'exp_cor' has dataset dimension, it must be", + " equal to dataset dimension of 'exp'.") + ) + expect_error( + Calibration(exp = array(1:6, c(sdate = 3, member = 2, dataset = 2, lon = 1)), + obs = array(1:6, c(sdate = 3, member = 1, dataset = 3, lon = 1)), + exp_cor = array(1:6, c(sdate = 3, member = 2, lon = 2)), dat_dim = 'dataset'), + paste0("Parameter 'exp' and 'exp_cor' must have the same length of ", + "all common dimensions except 'dat_dim', 'sdate_dim' and 'memb_dim'.") + ) + # ncores + expect_error( + CST_Calibration(exp = exp, obs = obs, ncores = TRUE), + "Parameter 'ncores' must be either NULL or a positive integer." + ) + # na.rm + expect_error( + CST_Calibration(exp = exp, obs = obs, na.rm = 1), + "Parameter 'na.rm' must be a logical value." + ) + expect_warning( + CST_Calibration(exp = exp, obs = obs, na.rm = c(T,F)), + "Paramter 'na.rm' has length greater than 1, and only the fist element is used." + ) + # cal.method + expect_error( + Calibration(exp4, obs4, cal.method = 'biass'), + "Parameter 'cal.method' must be a character string indicating the calibration method used." + ) + expect_warning( + Calibration(exp4, obs4, cal.method = 'rpc-based'), + paste0("Parameter 'apply_to' cannot be NULL for 'rpc-based' method so it ", + "has been set to 'sign', as in Eade et al. (2014)."), + fixed = TRUE + ) + expect_warning( + Calibration(exp4, obs4, cal.method = 'rpc-based', apply_to = 'sign'), + paste0("Parameter 'alpha' cannot be NULL for 'rpc-based' method so it ", + "has been set to 0.1, as in Eade et al. (2014)."), + fixed = TRUE ) expect_error( - CST_Calibration(obs = 1), - c("argument \"exp\" is missing, with no default") + Calibration(exp4, obs4, cal.method = 'rpc-based', apply_to = 'sign', alpha = 'a'), + "Parameter 'alpha' must be a number between 0 and 1." ) - library(zeallot) - c(exp, obs) %<-% lonlat_temp + # eval.method + expect_error( + Calibration(exp4, obs4, eval.method = 'biass'), + paste0("Parameter 'eval.method' must be a character string indicating ", + "the sampling method used ('in-sample', 'leave-one-out' or ", + "'hindcast-vs-forecast')."), + fixed = TRUE + ) + # multi.model + expect_error( + Calibration(exp4, obs4, multi.model = 'biass'), + "Parameter 'multi.model' must be a logical value." + ) + expect_warning( + Calibration(exp4, obs4, multi.model = TRUE, cal.method = 'bias'), + paste0("The 'multi.model' parameter is ignored when using the ", + "calibration method 'bias'.") + ) +}) + +############################################## + +test_that("2. 
Output checks: dat1", { + # dat_dim = NULL cal <- CST_Calibration(exp = exp, obs = obs) - expect_equal(length(cal), 9) - expect_equal(as.numeric(dim(cal$data)), as.numeric(dim(exp$data))) - expect_equal(cal$lat, exp$lat) - expect_equal(cal$lat, obs$lat) - expect_equal(cal$lon, exp$lon) - expect_equal(cal$lon, obs$lon) - # expect_error( - # CST_Calibration(exp = exp, obs = exp), - # "The length of the dimension 'member' in the component 'data' " - # ) + expect_equal( + length(cal), + 3 + ) + expect_equal( + as.numeric(dim(cal$data)), + as.numeric(dim(exp$data)) + ) + expect_equal( + cal$coords$lat, + exp$coords$lat + ) + expect_equal( + cal$coords$lat, + obs$coords$lat + ) + expect_equal( + cal$coords$lon, + exp$coords$lon + ) + expect_equal( + cal$coords$lon, + obs$coords$lon + ) + expect_equal( + dim(cal$data), + c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 5, lon = 5) + ) + expect_equal( + as.vector(cal$data)[1:5], + c(1.594311, 1.861058, 2.127805, 2.394551, 2.661298), + tolerance = 0.0001 + ) +}) - exp2 <- exp - exp2$data[1, 2, 1, 1, 1, 1] <- NA - expect_warning( - CST_Calibration(exp = exp2, obs = obs), - "Parameter 'exp' contains NA values." +############################################## + +test_that("3. Output checks: dat3", { + # dat_dim = 'dataset' + # exp_cor = NULL + expect_equal( + dim(Calibration(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset')), + c(member = 10, sdate = 10, lat = 2, nexp = 2, nobs = 3) + ) + expect_equal( + dim(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset')), + c(member = 10, sdate = 10, lat = 2, nexp = 2, nobs = 3) + ) + expect_equal( + as.vector(Calibration(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset'))[5:10], + c(0.5462052, -0.6882557, 0.7157284, 0.9850565, 0.8105716, -0.1353346), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset'))[5:10], + c(0.4908992, 0.3054365, 0.3673404, 1.5218074, -1.0928551, 1.6905885), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3[, , , 1], obs3[, , 1], memb_dim = 'member'))[1:5], + c(-0.4799875, 0.3896246, -0.7045297, 1.9049701, 0.5462052), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3[, , , 1], obs3[, , 1], memb_dim = 'member'))[1:5], + as.vector(Calibration(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset')[, , , 1, 1][1:5]), + tolerance = 0.0001 ) + expect_equal( + as.vector(Calibration(exp3[, , , 1], obs3[, , 1], exp_cor3, eval.method = "hindcast-vs-forecast"))[1:5], + c(-0.80521700, -0.05578454, 0.56143657, -1.01815286, 0.49089916), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3[, , , 1], obs3[, , 1], exp_cor3, memb_dim = 'member'))[1:5], + as.vector(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset')[, , , 1, 1][1:5]) + ) + expect_equal( + as.vector(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset', na.rm = TRUE))[1:5], + c(-0.80521700, -0.05578454, 0.56143657, -1.01815286, 0.49089916), + tolerance = 0.0001 + ) +}) - obs2 <- obs - obs2$data[1, 1, 2, 1, 1, 1] <- NA - expect_warning( - CST_Calibration(exp = exp, obs = obs2), - "Parameter 'obs' contains NA values." +############################################## +
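+# Test 4 checks each cal.method against fixed reference values; a minimal +# sketch of one such call (arguments as used in the expectations below): +# res <- Calibration(exp3, obs3, memb_dim = 'member', +# dat_dim = 'dataset', cal.method = 'bias') +test_that("4. 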
Output checks: dat3", { + # Check cal.method + expect_equal( + as.vector(Calibration(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset', cal.method = 'bias'))[1:7], + c(-0.3984805, 0.4116167, -0.6076553, 1.8232541, 0.5574811, -0.5924950, 0.7154024), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset', cal.method = 'evmos'))[1:7], + c(-0.9631369, -0.2290479, 0.3755366, -1.1717133, 0.3064434, 0.1247777, 0.1854143), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3, obs3, memb_dim = 'member', dat_dim = 'dataset', cal.method = 'mse_min', multi.model = TRUE))[5:10], + c(0.5364620, -0.6412113, 0.6981868, 0.9551252, 0.7886670, -0.1137257), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset', cal.method = 'crps_min'))[5:10], + c(0.4975732, 0.3078346, 0.3717602, 1.5607014, -1.1259947, 1.7316233), + tolerance = 0.0001 ) + expect_equal( + as.vector(Calibration(exp3, obs3, exp_cor3, memb_dim = 'member', dat_dim = 'dataset', cal.method = 'rpc-based', alpha = 0.05, apply_to = 'all'))[5:10], + c(0.4178384, 0.2323757, 0.2942796, 1.4487467, -1.1659159, 1.6175277), + tolerance = 0.0001 + ) +}) + +############################################## + +test_that("5. Output checks: dat4", { + expect_equal( + dim(Calibration(exp4, obs4)), + c(member = 10, sdate = 20) + ) + expect_equal( + as.vector(Calibration(exp4, obs4))[1:5], + c(-0.6372505, 0.3261267, -0.8860036, 2.0048627, 0.4995904), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, exp_cor4, eval.method = "hindcast-vs-forecast"))[1:5], + c(-0.88657225, -0.08128503, 0.58193721, -1.11537810, 0.50614269), + tolerance = 0.0001 + ) + + expect_equal( + as.vector(Calibration(exp4_1, obs4_1))[1:6], + c(-0.6372505, 0.3261267, -0.8860036, 2.0048627, 0.4995904, -0.8679749), + tolerance = 0.0001 + ) + # exp_cor4 doesn't have dat_dim + expect_equal( + as.vector(Calibration(exp4_1, obs4_1, exp_cor4, dat_dim = 'dataset')), + as.vector(Calibration(exp4, obs4, exp_cor4)), + tolerance = 0.0001 + ) + expect_equal( + suppressWarnings( + as.vector(Calibration(exp4_1, obs4_2, dat_dim = 'dataset', na.rm = FALSE)) + ), + rep(as.numeric(NA), 200) + ) + expect_equal( + suppressWarnings( + as.vector(Calibration(exp4_1, obs4_2, dat_dim = 'dataset', na.rm = T))[5:10] + ), + c(0.4343443, -0.9018916, 0.6178439, 0.9093767, 0.7205064, -0.3033850), + tolerance = 0.0001 + ) + expect_equal( + suppressWarnings( + as.vector(Calibration(exp4_2, obs4_2, exp_cor4, dat_dim = 'dataset', na.rm = T))[5:10] + ), + c(0.4583975, 0.2645130, 0.3292279, 1.5361189, -1.1972745, 1.7125642), + tolerance = 0.0001 + ) + expect_equal( + dim(Calibration(exp4_1, obs4_1, dat_dim = 'dataset')), + c(member = 10, sdate = 20, nexp = 1, nobs = 1) + ) + # exp_cor5 with dat_dim + expect_equal( + dim(Calibration(exp3, obs3, exp_cor5, dat_dim = 'dataset')), + c(member = 10, sdate = 10, lat = 2, nexp = 2, nobs = 3) + ) +}) + +############################################## +
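+# Test 6 repeats the cal.method comparison with dat_dim = NULL on the plain +# dat4 arrays; a minimal sketch (fixtures as defined above): +# cal <- Calibration(exp4, obs4) +# dim(cal) # c(member = 10, sdate = 20) +test_that("6. 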
Output checks: dat4", { + # dat_dim = NULL + expect_equal( + as.vector(Calibration(exp4, obs4, cal.method = 'bias'))[1:7], + c(-0.4039514, 0.4061457, -0.6131262, 1.8177832, 0.5520101, -0.5979660, 0.7099314), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, cal.method = 'evmos'))[1:7], + c(-0.4860875, 0.4252999, -0.7214164, 2.0134410, 0.5894025, -0.7043606, 0.7670694), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, cal.method = 'mse_min', multi.model = TRUE))[1:7], + c(-0.5932137, 0.3231407, -0.8298251, 1.9199372, 0.4881377, -0.8126764, 0.6667729), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, cal.method = 'crps_min'))[1:7], + c(-0.6381684, 0.3261889, -0.8871746, 2.0066329, 0.4998291, -0.8691275, 0.6878220), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, cal.method = 'rpc-based', alpha = 0.5, apply_to = 'sign'))[1:7], + c(-0.8597528, 0.1036243, -1.1085060, 1.7823603, 0.2770880, -1.0904773, 0.4648899), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, exp_cor4, cal.method = 'rpc-based', alpha = 0.5, apply_to = 'sign', eval.method = "hindcast-vs-forecast"))[1:7], + c(-1.0464936, -0.2412064, 0.4220158, -1.2752995, 0.3462213, 0.1469362, 0.2134538), + tolerance = 0.0001 + ) + expect_equal( + as.vector(Calibration(exp4, obs4, eval.method = "in-sample"))[1:5], + c(-0.7119142, 0.2626203, -0.9635483, 1.9607986, 0.4380930), + tolerance = 0.0001 + ) +}) +############################################## + +test_that("7. Output checks: dat4", { expect_warning( - CST_Calibration(exp = exp2, obs = obs2), - "Parameter 'obs' contains NA values", "Parameter 'exp' contains NA values." + Calibration(exp = array(1, dim = c(member = 1, sdate = 1)), + obs = array(1, dim = c(sdate = 1))), + "Some forecast data could not be corrected due to data lack and is replaced with NA values." ) -}) + expect_warning( + Calibration(exp = array(1, dim = c(member = 1, sdate = 1)), + obs = array(1, dim = c(sdate = 1)), na.fill = FALSE), + "Some forecast data could not be corrected due to data lack and is replaced with uncorrected values." 
+ ) + suppressWarnings( + expect_equal( + dim(Calibration(exp = array(1, dim = c(dataset = 1, member = 1, sdate = 1)), + obs = array(1, dim = c(dataset = 1, sdate = 1)), + dat_dim = 'dataset', na.fill = FALSE)), + c(member = 1, sdate = 1, nexp = 1, nobs = 1) + ) + ) + suppressWarnings( + expect_equal( + dim(Calibration(exp = array(1, dim = c(dataset = 3, member = 1, sdate = 1)), + obs = array(1, dim = c(dataset = 2, sdate = 1)), + exp_cor = array(1, dim = c(dataset = 3, member = 1, sdate = 1)), + dat_dim = 'dataset', na.fill = FALSE)), + c(member = 1, sdate = 1, nexp = 3, nobs = 2) + ) + ) + suppressWarnings( + expect_equal( + dim(Calibration(exp = array(1, dim = c(dataset = 1, member = 1, sdate = 1)), + obs = array(1, dim = c(dataset = 2, sdate = 1)), + exp_cor = array(1, dim = c(member = 1, sdate = 1)), + dat_dim = 'dataset', na.fill = FALSE)), + c(member = 1, sdate = 1, nexp = 1, nobs = 2) + ) + ) + # Check values + suppressWarnings( + expect_equal( + Calibration(exp = array(1:6, dim = c(member = 2, sdate = 3)), + obs = array(1:3, dim = c(sdate = 3)), + na.fill = FALSE), + array(1:6, dim = c(member = 2, sdate = 3)) + ) + ) + suppressWarnings( + expect_equal( + Calibration(exp = array(1:6, dim = c(member = 2, sdate = 3)), + obs = array(1:3, dim = c(sdate = 3))), + array(dim = c(member = 2, sdate = 3)) + ) + ) + suppressWarnings( + expect_equal( + Calibration(exp = array(1:6, dim = c(member = 2, sdate = 3)), + obs = array(1:3, dim = c(sdate = 3)), + exp_cor = array(6:12, dim = c(member = 2, sdate = 3)), + na.fill = FALSE), + array(1:6, dim = c(member = 2, sdate = 3)) + ) + ) +}) diff --git a/tests/testthat/test-CST_CategoricalEnsCombination.R b/tests/testthat/test-CST_CategoricalEnsCombination.R index 1e366bf33cc9858c3efc482ba70e5fe24f9c25bd..2dfc4a036026d002f628018a140637f51466f2b5 100644 --- a/tests/testthat/test-CST_CategoricalEnsCombination.R +++ b/tests/testthat/test-CST_CategoricalEnsCombination.R @@ -1,4 +1,29 @@ -context("Generic tests") +context("CSTools::CST_CategoricalEnsCombination tests") + +############################################## +# dat1 +dat_exp <- array(abs(rnorm(4*6*3*4*4))*275, dim = c(dataset = 1, member = 4, sdate = 6, + ftime = 3, lat = 4, lon = 4)) +dat_obs <- array(abs(rnorm(6*3*4*4))*275, dim = c(dataset = 1, member = 1, sdate = 6, + ftime = 3, lat = 4, lon = 4)) +lon <- seq(0, 3) +lat <- seq(48, 27) + +coords <- list(lon = lon, lat = lat) + +exp <- list(data = dat_exp, coords = coords) +obs <- list(data = dat_obs, coords = coords) +attr(exp, 'class') <- 's2dv_cube' +attr(obs, 'class') <- 's2dv_cube' + +# dat2 +exp2 <- exp +exp2$data[1, 2, 1, 1, 1, 1] <- NA +obs2 <- obs +obs2$data[1, 1, 2, 1, 1, 1] <- NA + +############################################## + test_that("Sanity checks", { expect_error( CST_CategoricalEnsCombination(exp = 1), @@ -8,42 +33,57 @@ test_that("Sanity checks", { CST_CategoricalEnsCombination(obs = 1), c("argument \"exp\" is missing, with no default") ) - library(zeallot) - c(exp, obs) %<-% lonlat_temp cal <- CST_CategoricalEnsCombination(exp = exp, obs = obs) - expect_equal(length(cal), 9) - expect_equal(as.numeric(dim(cal$data)[c(1, 2)]), c(1, 1)) - expect_equal(as.numeric(dim(cal$data)[c(-1, -2, -3)]), - as.numeric(dim(exp$data)[c(-1, -2)])) - expect_equal(names(dim(cal$data))[c(-1, -2, -3)], - names(dim(exp$data))[c(-1, -2)]) - expect_equal(names(dim(cal$data))[c(1, 2, 3)], - c("dataset", "member", "category")) - expect_equal(cal$lat, exp$lat) - expect_equal(cal$lat, obs$lat) - expect_equal(cal$lon, exp$lon) - expect_equal(cal$lon, 
obs$lon) + expect_equal( + length(cal), + 2 + ) + expect_equal( + as.numeric(dim(cal$data)[c(1, 2)]), + c(1, 1) + ) + expect_equal( + as.numeric(dim(cal$data)[c(-1, -2, -3)]), + as.numeric(dim(exp$data)[c(-1, -2)]) + ) + expect_equal( + names(dim(cal$data))[c(-1, -2, -3)], + names(dim(exp$data))[c(-1, -2)] + ) + expect_equal( + names(dim(cal$data))[c(1, 2, 3)], + c("dataset", "member", "category") + ) + expect_equal( + cal$lat, + exp$lat + ) + expect_equal( + cal$lat, + obs$lat + ) + expect_equal( + cal$lon, + exp$lon + ) + expect_equal( + cal$lon, + obs$lon + ) expect_error( CST_CategoricalEnsCombination(exp = exp, obs = exp), "The length of the dimension 'member' in the component 'data' " ) - - exp2 <- exp - exp2$data[1, 2, 1, 1, 1, 1] <- NA expect_warning( CST_CategoricalEnsCombination(exp = exp2, obs = obs), "Parameter 'exp' contains NA values." ) - - obs2 <- obs - obs2$data[1, 1, 2, 1, 1, 1] <- NA expect_warning( CST_CategoricalEnsCombination(exp = exp, obs = obs2), "Parameter 'obs' contains NA values." ) - expect_warning( CST_CategoricalEnsCombination(exp = exp2, obs = obs2), "Parameter 'obs' contains NA values", "Parameter 'exp' contains NA values." ) -}) +}) \ No newline at end of file diff --git a/tests/testthat/test-CST_EnsClustering.R b/tests/testthat/test-CST_EnsClustering.R index 2e7364d9b7e5c7e44af6b53eacf60210f5193e11..afbe1598d740bc649fc6579abd9effb048044afd 100644 --- a/tests/testthat/test-CST_EnsClustering.R +++ b/tests/testthat/test-CST_EnsClustering.R @@ -1,52 +1,112 @@ -context("Generic tests") -test_that("Sanity and Functionality tests", { - data <- rnorm(2 * 15 * 4 * 5 * 6 * 7) - dim(data) <- c(dataset = 2, member = 15, - sdate = 4, ftime = 5, lat = 6, lon = 7) - lon <- seq(0, 12, 2) - lat <- seq(10, 15, 1) - exp <- list(data = data, lat = lat, lon = lon) +context("CSTools::CST_EnsClustering tests") - # Check error messages +############################################## +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 7) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lat = 6, lon = 7) +lon <- seq(0, 12, 2) +lat <- seq(10, 15, 1) +coords <- list(lon = lon, lat = lat) +exp <- list(data = data, coords = coords) +attr(exp, "class") <- "s2dv_cube" + +# dat2 +exp2 <- exp +exp2$attrs <- list(source_files = 'exp') +exp2$coords <- NULL +exp2_2 <- exp2 +exp2_2$coords <- list(long = seq(1:4), lati = seq(1:4)) +############################################## +test_that("1. Input checks", { + # Check 's2dv_cube' expect_error( - CST_EnsClustering(exp), - "Parameter 'exp' must be of the class 's2dv_cube'" + CST_EnsClustering(rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'exp' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # Check 'exp' object structure + expect_error( + CST_EnsClustering(exp2), + paste0("Parameter 'exp' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + ) + expect_error( + CST_EnsClustering(exp2_2), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package. Latitudes accepted names: 'lat', 'lats', 'latitude',", + " 'y', 'j', 'nav_lat'. 
Longitudes accepted names: 'lon', 'lons',", + " 'longitude', 'x', 'i', 'nav_lon'.") ) - attr(exp, "class") <- "s2dv_cube" - + expect_error( + EnsClustering(array(rnorm(8400), dim = c(member = 10, sdate = 4, ftime = 5, + lati = 6, loni = 7)), + lat = seq(1:5), lon = seq(1:6)), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + ) + # Checks in Analogs function expect_error( CST_EnsClustering(exp, time_moment = "invalid"), "Invalid time_moment" ) + exp$coords$lat <- 1 + expect_error( + CST_EnsClustering(exp), + "Incorrect lat length" + ) + exp$coords$lon <- 1 + exp$coords$lat <- lat + expect_error( + CST_EnsClustering(exp), + "Incorrect lon length" + ) + exp$coords$lon <- lon +}) - exp$lat <- 1 - expect_error(CST_EnsClustering(exp), "Incorrect lat length") - exp$lon <- 1 - exp$lat <- lat - expect_error(CST_EnsClustering(exp), "Incorrect lon length") - exp$lon <- lon - +test_that("2. Output checks", { # Sanity checks on dimensions res <- CST_EnsClustering(exp, numclus = 3) - expect_equivalent(dim(res$cluster), dim(exp$data)[c(2, 1, 3)]) + expect_equal( + dim(res$cluster), + dim(exp$data)[c(2, 1, 3)] + ) res <- CST_EnsClustering(exp, numclus = 3, cluster_dim = "sdate") - expect_equivalent(dim(res$cluster), dim(exp$data)[c(3, 1, 2)]) + expect_equal( + dim(res$cluster), + dim(exp$data)[c(3, 1, 2)] + ) res <- CST_EnsClustering(exp, numclus = 3, - cluster_dim = c("member", "dataset", "sdate")) - expect_equivalent(dim(res$cluster), dim(exp$data)[c(2, 1, 3)]) + cluster_dim = c("member", "dataset", "sdate")) + expect_equal( + dim(res$cluster), + dim(exp$data)[c(2, 1, 3)] + ) res <- CST_EnsClustering(exp, numclus = 3, cluster_dim = c("member", "sdate")) - expect_equivalent(dim(res$cluster), dim(exp$data)[c(2, 3, 1)]) - expect_equivalent(dim(res$freq), c(cluster = 3, dim(exp$data)[1])) - expect_equivalent(dim(res$closest_member$sdate), c(cluster = 3, - dim(exp$data)[1])) - expect_equivalent(dim(res$repr_field), c(cluster = 3, - dim(exp$data)[c(5, 6)], dim(exp$data)[1])) - expect_equivalent(dim(res$composites), c(cluster = 3, - dim(exp$data)[c(5, 6)], dim(exp$data)[1])) + expect_equal( + dim(res$cluster), + dim(exp$data)[c(2, 3, 1)] + ) + expect_equal( + dim(res$freq), + c(cluster = 3, dim(exp$data)[1]) + ) + expect_equal( + dim(res$closest_member$sdate), + c(cluster = 3, dim(exp$data)[1]) + ) + expect_equal( + dim(res$repr_field), + c(cluster = 3, dim(exp$data)[c(5, 6)], dim(exp$data)[1]) + ) + expect_equal( + dim(res$composites), + c(cluster = 3, dim(exp$data)[c(5, 6)], dim(exp$data)[1]) + ) # Functionality tests res <- CST_EnsClustering(exp, numclus = 3, variance_explained = 80, - cluster_dim = "member") + cluster_dim = "member") # The closest member of each cluster should be member of that cluster for (i in 1:3) { expect_equivalent(res$cluster[res$closest_member$member[i, 1, 1], 1, 1], i) diff --git a/tests/testthat/test-CST_InsertDim.R b/tests/testthat/test-CST_InsertDim.R new file mode 100644 index 0000000000000000000000000000000000000000..39e5ade86562f5aaf1f00c61a10d54f479bd34c2 --- /dev/null +++ b/tests/testthat/test-CST_InsertDim.R @@ -0,0 +1,47 @@ +context("CSTools::CST_InsertDim tests") + +############################################## +exp <- lonlat_temp$exp +############################################## + +test_that("1. Input checks", { + expect_error( + CST_InsertDim(1), + "Parameter 'data' must be of the class 's2dv_cube'." 
+ ) + expect_error( + CST_InsertDim(exp, name = 1), + "Parameter 'name' must be a character string" + ) + expect_warning( + CST_InsertDim(exp, posdim = 1, lendim = 1, name = "variable"), + paste0("Parameter 'values' is not provided. Adding a sequence of ", + "integers from 1 to 'lendim' as the values for the new dimension.") + ) + expect_error( + CST_InsertDim(exp, posdim = 1, lendim = 1, name = "variable", values = 1:2), + paste0("The length of the parameter 'values' must be consistent", + "with the parameter 'lendim'.") + ) +}) + +############################################## + +test_that("2. Output checks", { + exp <- CST_InsertDim(exp, posdim = 2, lendim = 1, name = "variable", + values = c("tas")) + expect_equal( + dim(exp$data), + c(dataset = 1, variable = 1, member = 15, sdate = 6, ftime = 3, lat = 22, + lon = 53) + ) + expect_equal( + exp$dims, + c(dataset = 1, variable = 1, member = 15, sdate = 6, ftime = 3, lat = 22, + lon = 53) + ) + expect_equal( + exp$coords$variable, + c("tas") + ) +}) diff --git a/tests/testthat/test-CST_MultiEOF.R b/tests/testthat/test-CST_MultiEOF.R index d9c46ab1ec06169593c39e72e20c83b4fa503ca6..f4843231cc664f85c586e49862c123f3afcc1005 100644 --- a/tests/testthat/test-CST_MultiEOF.R +++ b/tests/testthat/test-CST_MultiEOF.R @@ -1,77 +1,160 @@ -context("Generic tests") -test_that("Sanity checks and simple use case", { - library(abind) - # Generate simple synthetic data - seq <- 1 : (2 * 3 * 4 * 5 * 6 * 8) - seq3 <- 1 : (2 * 3 * 4 * 4 * 6 * 8) - mod1 <- sin( 0.7 + seq )^2 + cos( seq ^ 2 * 1.22 ) - dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) - mod2 <- sin( seq * 2 ) ^ 3 + cos( seq ^ 2 ) - dim(mod2) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) - mod3 <- cos( 0.5 + seq3 ) + sin ( seq3 ^ 2 * 0.2 ) - dim(mod3) <- c(dataset = 2, member = 3, sdate = 4, ftime = 4, lat = 6, lon = 8) - lon <- seq(0, 35, 5) - lat <- seq(0, 25, 5) - exp1 <- list(data = mod1, lat = lat, lon = lon) - exp2 <- list(data = mod2, lat = lat, lon = lon) - exp3 <- list(data = mod3, lat = lat, lon = lon) - attr(exp1, 'class') <- 's2dv_cube' - attr(exp2, 'class') <- 's2dv_cube' - attr(exp3, 'class') <- 's2dv_cube' - d=as.POSIXct(c("2017/01/01", "2017/01/02", "2017/01/03", "2017/01/04", "2017/01/05", +context("CSTools::CST_MultiEOF tests") + + +############################################## + +# exp1, exp2, exp03 +seq <- 1 : (2 * 3 * 4 * 5 * 6 * 8) +seq3 <- 1 : (2 * 3 * 4 * 4 * 6 * 8) +mod1 <- sin( 0.7 + seq )^2 + cos( seq ^ 2 * 1.22 ) +dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) +mod2 <- sin( seq * 2 ) ^ 3 + cos( seq ^ 2 ) +dim(mod2) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) +mod3 <- cos( 0.5 + seq3 ) + sin ( seq3 ^ 2 * 0.2 ) +dim(mod3) <- c(dataset = 2, member = 3, sdate = 4, ftime = 4, lat = 6, lon = 8) +lon <- seq(0, 35, 5) +lat <- seq(0, 25, 5) +exp1 <- list(data = mod1, coords = list(lat = lat, lon = lon)) +exp2 <- list(data = mod2, coords = list(lat = lat, lon = lon)) +exp03 <- list(data = mod3, coords = list(lat = lat, lon = lon)) +attr(exp1, 'class') <- 's2dv_cube' +attr(exp2, 'class') <- 's2dv_cube' +attr(exp03, 'class') <- 's2dv_cube' +d = as.POSIXct(c("2017/01/01", "2017/01/02", "2017/01/03", "2017/01/04", "2017/01/05", "2018/01/01", "2018/01/02", "2018/01/03", "2018/01/04", "2018/01/05", "2019/01/01", "2019/01/02", "2019/01/03", "2019/01/04", "2019/01/05", "2020/01/01", "2020/01/02", "2020/01/03", "2020/01/04", "2020/01/05")) - - exp1$Dates$start=d - 
exp2$Dates$start=d - exp3$Dates$start=d + +exp1$attrs$Dates = d +exp2$attrs$Dates = d +exp03$attrs$Dates = d + +# dat3 +exp3 <- exp03 +mod3 <- cos( 0.5 + seq ) + sin( seq ^ 2 * 0.2 ) +dim(mod3) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) +exp3$data <- mod3 +# dat0 +dat0 <- exp1 +dat01 <- exp2 +dat0$coords <- NULL +dat01$coords <- NULL +dat02 <- dat0 +dat03 <- dat01 +dat02$coords <- list(long = seq(1:4), lati = seq(1:4)) +dat03$coords <- list(long = seq(1:4), lati = seq(1:4)) + +############################################## +test_that("1. Input checks", { expect_error( CST_MultiEOF(datalist = 1), - "Elements of the list in parameter 'datalist' must be of the class 's2dv_cube', as output by CSTools::CST_Load." + paste0("Elements of the list in parameter 'datalist' must be of the class ", + "'s2dv_cube', as output by CSTools::CST_Load.") ) + # Check if all dims equal expect_error( - CST_MultiEOF(list(exp1, exp3)), + CST_MultiEOF(list(exp1, exp03)), "Input data fields must all have the same dimensions." ) - mod3 <- cos( 0.5 + seq ) + sin( seq ^ 2 * 0.2 ) - dim(mod3) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) - exp3$data <- mod3 - + # Known spatial coordinate names + expect_error( + CST_MultiEOF(list(dat0, dat01)), + paste0("Parameter 'datalist' must have 'data', 'coords' and 'attrs' elements ", + "within the 's2dv_cube' structure.") + ) expect_error( - CST_MultiEOF(list(exp1, exp2, exp3), lon_lim=c(-250, -245), lat_lim=c(10, 25)), + CST_MultiEOF(list(dat02, dat03)), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package. Latitudes accepted names: 'lat', 'lats', 'latitude',", + " 'y', 'j', 'nav_lat'. Longitudes accepted names: 'lon', 'lons',", + " 'longitude', 'x', 'i', 'nav_lon'.") + ) + expect_error( + MultiEOF(data = array(rnorm(96), dim = c(var = 2, lonss = 8, latss = 6)), + lon = seq(1:7), lat = seq(1:5), lon_dim = 'lonss', lat_dim = 'latss'), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + ) + expect_error( + CST_MultiEOF(list(exp1, exp2, exp3), lon_lim = c(-250, -245), lat_lim = c(10, 25)), "No intersection between longitude bounds and data domain.") +}) - cal <- CST_MultiEOF(list(exp1, exp2, exp3), neof_composed=2) - expect_equal(length(cal), 5) - dimexp=dim(exp1$data) - expect_equal(dim(cal$coeff), c(dimexp["ftime"], dimexp["sdate"], - eof=2, dimexp["dataset"], dimexp["member"])) - expect_equal(dim(cal$variance), c(eof=2, dimexp["dataset"], dimexp["member"])) - expect_equal(dim(cal$eof_pattern), c(var=3, dimexp["lon"], dimexp["lat"], - eof=2, dimexp["dataset"], - dimexp["member"])) - expect_equal(cal$variance[1, 1, 1], 0.2909419, tolerance = .00001) - expect_equal(cal$coeff[2, 1, 1, 1, 1], 0.5414261, tolerance = .00001) - expect_equal(cal$eof_pattern[1, 2, 2, 2, 1, 1], 0.3932484, tolerance = .00001) - - cal <- CST_MultiEOF(list(exp1, exp2, exp3), neof_max=5, neof_composed=2, minvar=0.2) - expect_equal(cal$coeff[2, 1, 1, 1, 1], -0.6117927, tolerance = .00001) - - cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim=c(5, 30), lat_lim=c(10, 25)) - expect_equal(cal$coeff[2, 1, 1, 1, 1], 0.8539488, tolerance = .00001) - expect_equivalent(cal$lon, seq(5, 30, 5)) - expect_equivalent(cal$lat, seq(10, 25, 5)) - cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim=c(350, 15), lat_lim=c(10, 25)) - expect_equivalent(cal$lon, seq(0, 15, 5)) - expect_equivalent(cal$lat, seq(10, 25, 5)) - cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim=c(-355, -345)) -
expect_equivalent(cal$lon, seq(5, 15, 5)) +############################################## - exp3$data[1, 1, 1, 1, 1, 1]=NaN +test_that("2. Output checks", { + cal <- CST_MultiEOF(datalist = list(exp1, exp2, exp3), neof_composed=2) + expect_equal( + length(cal), + 5 + ) + dimexp = dim(exp1$data) + expect_equal( + dim(cal$coeff), + c(dimexp["ftime"], dimexp["sdate"], eof=2, dimexp["dataset"], dimexp["member"]) + ) + expect_equal( + dim(cal$variance), + c(eof = 2, dimexp["dataset"], dimexp["member"]) + ) + expect_equal( + dim(cal$eof_pattern), + c(var = 3, dimexp["lon"], dimexp["lat"], eof = 2, + dimexp["dataset"], dimexp["member"]) + ) + expect_equal( + cal$variance[1, 1, 1], + 0.2909419, + tolerance = .00001 + ) + expect_equal( + cal$coeff[2, 1, 1, 1, 1], + 0.5414261, + tolerance = .00001 + ) + expect_equal( + cal$eof_pattern[1, 2, 2, 2, 1, 1], + 0.3932484, + tolerance = .00001 + ) + cal <- CST_MultiEOF(list(exp1, exp2, exp3), neof_max = 5, + neof_composed = 2, minvar = 0.2) + expect_equal( + cal$coeff[2, 1, 1, 1, 1], + -0.6117927, + tolerance = .00001 + ) + cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim = c(5, 30), lat_lim = c(10, 25)) + expect_equal( + cal$coeff[2, 1, 1, 1, 1], + 0.8539488, + tolerance = .00001 + ) + expect_equivalent( + cal$lon, + seq(5, 30, 5) + ) + expect_equivalent( + cal$lat, + seq(10, 25, 5) + ) + cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim = c(350, 15), lat_lim = c(10, 25)) + expect_equivalent( + cal$lon, seq(0, 15, 5) + ) + expect_equivalent( + cal$lat, + seq(10, 25, 5) + ) + cal <- CST_MultiEOF(list(exp1, exp2, exp3), lon_lim = c(-355, -345)) + expect_equivalent( + cal$lon, + seq(5, 15, 5) + ) + exp3$data[1, 1, 1, 1, 1, 1] = NaN expect_error( - CST_MultiEOF(list(exp1, exp3), neof_max=8, neof_composed=2), + CST_MultiEOF(list(exp1, exp3), neof_max = 8, neof_composed=2), "Input data contain NA values." 
) }) diff --git a/tests/testthat/test-CST_MultiMetric.R b/tests/testthat/test-CST_MultiMetric.R index 6914058581485b63d1ccb59fa7d271ab6f4d9bdd..5c45e6a5da6590f7304920621e9efb67579809d8 100644 --- a/tests/testthat/test-CST_MultiMetric.R +++ b/tests/testthat/test-CST_MultiMetric.R @@ -1,91 +1,107 @@ -context("Generic tests") -test_that("basic use case", { - mod <- 1 : (2 * 3 * 4 * 5 * 6 * 8) - dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) - obs <- 1 : (1 * 1 * 4 * 5 * 6 * 8) - dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 8) - lon <- seq(0, 30, 5) - lat <- seq(0, 30, 5) - exp <- list(data = mod, lat = lat, lon = lon) - obs <- list(data = obs, lat = lat, lon = lon) - attr(exp, 'class') <- 's2dv_cube' - attr(obs, 'class') <- 's2dv_cube' - - result <- list(data = list(corr = array(rep(1, 3* 48), - dim = c(nexp = 3, nobs = 1, - lat = 6, lon = 8)), - p.val = array(rep(0, 3 * 48), dim = c(nexp = 3, nobs = 1, - lat = 6, lon = 8)), - conf.lower = array(rep(1, 3* 48), - dim = c(nexp = 3, nobs = 1, - lat = 6, lon = 8)), - conf.upper = array(rep(1, 3* 48), - dim = c(nexp = 3, nobs = 1, - lat = 6, lon = 8))), - lat = lat, lon = lon) - attr(result, 'class') <- 's2dv_cube' - expect_equal(CST_MultiMetric(exp = exp, obs = obs), result) - - exp2 <- exp - exp2$data[1, 1, 1, 2, 1, 1] = NA - res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rms') - expect_equal(length(res), 3) - expect_equal(dim(res$data$rms), - c(nexp = 3, nobs = 1, lat = 6, lon = 8)) - res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rms', - multimodel = FALSE) - expect_equal(dim(res$data$rms), - c(nexp = 2, nobs = 1, lat = 6, lon = 8)) - expect_equal(length(res$data), 3) - res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rmsss') - expect_equal(dim(res$data$rmsss), - c(nexp = 3, nobs = 1, lat = 6, lon = 8)) - res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rmsss', multimodel = FALSE) - expect_equal(dim(res$data$rmsss), - c(nexp = 2, nobs = 1, lat = 6, lon = 8)) - }) +context("CSTools::CST_MultiMetric") + +################################################################################ + +# dat +mod <- 1 : (2 * 3 * 4 * 5 * 6 * 8) +dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) +obs <- 1 : (1 * 1 * 4 * 5 * 6 * 8) +dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 8) +lon <- seq(0, 30, 5) +lat <- seq(0, 30, 5) +coords <- list(lon = lon, lat = lat) +exp <- list(data = mod, coords = coords) +obs <- list(data = obs, coords = coords) +attr(exp, 'class') <- 's2dv_cube' +attr(obs, 'class') <- 's2dv_cube' +################################################################################ -test_that("Sanity checks", { +test_that("1. 
Sanity checks", { expect_error( CST_MultiMetric(exp = 1), paste0("Parameter 'exp' and 'obs' must be of the class 's2dv_cube', ", - "as output by CSTools::CST_Load.")) - mod <- 1 : (2 * 3 * 4 * 5 * 6 * 8) - dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8) - obs <- 1 : (1 * 1 * 4 * 5 * 6 * 8) - dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 8) - lon <- seq(0, 30, 5) - lat <- seq(0, 30, 5) - exp <- list(data = mod, lat = lat, lon = lon) - obs <- list(data = obs, lat = lat, lon = lon) - attr(exp, 'class') <- 's2dv_cube' - attr(obs, 'class') <- 's2dv_cube' - - + "as output by CSTools::CST_Load.") + ) expect_error( CST_MultiMetric(exp = exp, obs = obs, metric = 1), paste0("Parameter 'metric' must be a character string indicating one ", - "of the options: 'correlation', 'rms', 'rmsss' or 'rpss'")) + "of the options: 'correlation', 'rms', 'rmsss' or 'rpss'") + ) expect_error( CST_MultiMetric(exp = exp, obs = obs, metric = NA), - "missing value where TRUE/FALSE needed") + "missing value where TRUE/FALSE needed" + ) expect_error( CST_MultiMetric(exp = exp, obs = obs, metric = NULL), - "argument is of length zero") + "argument is of length zero" + ) expect_error( - CST_MultiMetric(exp = exp, obs = obs, metric = "correlation", - multimodel = NULL), - "Parameter 'multimodel' must be a logical value.") + CST_MultiMetric(exp = exp, obs = obs, metric = "correlation", multimodel = NULL), + "Parameter 'multimodel' must be a logical value." + ) expect_error( - MultiMetric(exp = lonlat_temp$exp, obs = lonlat_temp$obs, metric = "rpss", - multimodel = TRUE), - "Element 'data' from parameters 'exp' and 'obs' should have dimmension names.") -exp <- lonlat_temp$exp$data[1,,,,,] -obs <- lonlat_temp$obs$data[1,,,,,] + MultiMetric(exp = array(rnorm(10)), obs = array(rnorm(10)), metric = "rpss", + multimodel = TRUE), + "Element 'data' from parameters 'exp' and 'obs' should have dimension names." + ) expect_error( - MultiMetric(exp = exp, obs = obs, metric = "rpss", - multimodel = TRUE), - paste0("Dimension names of element 'data' from parameters 'exp' and ", - "'obs' should have the same name dimmension.")) + MultiMetric(exp = array(rnorm(10), dim = c(sdate = 10)), + obs = array(rnorm(10), dim = c(time = 10)), + metric = "rpss", multimodel = TRUE), + paste0("Dimension names of element 'data' from parameters 'exp' and ", + "'obs' should have the same name dimmension.") + ) }) + +################################################################################ + +test_that("2. 
Basic use case", { + + corr <- array(rep(1, 3* 48), dim = c(nexp = 3, nobs = 1, lat = 6, lon = 8)) + p.val <- array(rep(0, 3 * 48), dim = c(nexp = 3, nobs = 1, lat = 6, lon = 8)) + conf.lower <- array(rep(1, 3* 48), dim = c(nexp = 3, nobs = 1, lat = 6, lon = 8)) + conf.upper = array(rep(1, 3* 48), dim = c(nexp = 3, nobs = 1, lat = 6, lon = 8)) + + data <- list(corr = corr, p.val = p.val, conf.lower = conf.lower, conf.upper = conf.upper) + result <- list(data = data, coords = coords) + attr(result, 'class') <- 's2dv_cube' + + expect_equal( + CST_MultiMetric(exp = exp, obs = obs), + result + ) + + exp2 <- exp + exp2$data[1, 1, 1, 2, 1, 1] <- NA + res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rms') + + expect_equal( + length(res), + 2 + ) + expect_equal( + dim(res$data$rms), + c(nexp = 3, nobs = 1, lat = 6, lon = 8) + ) + res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rms', + multimodel = FALSE) + expect_equal( + dim(res$data$rms), + c(nexp = 2, nobs = 1, lat = 6, lon = 8) + ) + expect_equal( + length(res$data), + 3 + ) + res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rmsss') + expect_equal( + dim(res$data$rmsss), + c(nexp = 3, nobs = 1, lat = 6, lon = 8) + ) + res <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rmsss', multimodel = FALSE) + expect_equal( + dim(res$data$rmsss), + c(nexp = 2, nobs = 1, lat = 6, lon = 8) + ) +}) \ No newline at end of file diff --git a/tests/testthat/test-CST_MultivarRMSE.R b/tests/testthat/test-CST_MultivarRMSE.R new file mode 100644 index 0000000000000000000000000000000000000000..b9f93f6862a42efe4c956608f17b8b33c761cd9e --- /dev/null +++ b/tests/testthat/test-CST_MultivarRMSE.R @@ -0,0 +1,192 @@ +context("CSTools::CST_MultivarRMSE tests") + +# dat1 +set.seed(1) +mod1 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7)) +set.seed(2) +mod2 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7)) +dim(mod1) <- c(datasets = 1, members = 3, sdates = 4, ftimes = 5, lat = 6, lon = 7) +dim(mod2) <- c(datasets = 1, members = 3, sdates = 4, ftimes = 5, lat = 6, lon = 7) +set.seed(1) +obs1 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7)) +set.seed(2) +obs2 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7)) +dim(obs1) <- c(datasets = 1, members= 1, sdates = 4, ftimes = 5, lat = 6, lon = 7) +dim(obs2) <- c(datasets = 1, members = 1, sdates = 4, ftimes = 5, lat = 6, lon = 7) +lon <- seq(0, 30, 5) +lat <- seq(0, 25, 5) +coords <- list(lat = lat, lon = lon) + +exp1 <- list(data = mod1, coords = coords, + attrs = list(Datasets = "EXP1", source_files = "file1", + Variable = list(varName = 'pre'))) +exp2 <- list(data = mod2, coords = coords, + attrs = list(Datasets = "EXP2", source_files = "file2", + Variable = list(varName = 'tas'))) +obs1_1 <- list(data = obs1, coords = coords, + attrs = list(Datasets = "OBS1", source_files = "file1", + Variable = list(varName = 'pre'))) +obs2_1 <- list(data = obs2, coords = coords, + attrs = list(Datasets = "OBS2", source_files = "file2", + Variable = list(varName = 'tas'))) + +attr(exp1, 'class') <- 's2dv_cube' +attr(exp2, 'class') <- 's2dv_cube' +attr(obs1_1, 'class') <- 's2dv_cube' +attr(obs2_1, 'class') <- 's2dv_cube' + +anom1 <- CST_Anomaly(exp1, obs1_1, cross = TRUE, memb = TRUE, dim_anom = 'sdates', memb_dim = 'members',dat_dim = c('datasets', 'members')) +anom2 <- CST_Anomaly(exp2, obs2_1, cross = TRUE, memb = TRUE, dim_anom = 'sdates', memb_dim = 'members', dat_dim = c('datasets', 'members')) + +ano_exp <- list(anom1$exp, anom2$exp) +ano_obs <- list(anom1$obs, anom2$obs) + +# dat2 + +dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) 
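+# (dat2 reuses the dat1 arrays under the default singular dimension names, so +# test 2 can call CST_MultivarRMSE below without explicit *_dim arguments)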
+dim(mod2) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7) +dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) +dim(obs2) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7) + +exp1 <- list(data = mod1, coords = coords, + attrs = list(Datasets = "EXP1", source_files = "file1", + Variable = list(varName = 'pre'))) +exp2 <- list(data = mod2, coords = coords, + attrs = list(Datasets = "EXP2", source_files = "file2", + Variable = list(varName = 'tas'))) +obs1 <- list(data = obs1, coords = coords, + attrs = list(Datasets = "OBS1", source_files = "file1", + Variable = list(varName = 'pre'))) +obs2 <- list(data = obs2, coords = coords, + attrs = list(Datasets = "OBS2", source_files = "file2", + Variable = list(varName = 'tas'))) + +attr(exp1, 'class') <- 's2dv_cube' +attr(exp2, 'class') <- 's2dv_cube' +attr(obs1, 'class') <- 's2dv_cube' +attr(obs2, 'class') <- 's2dv_cube' + +anom1 <- CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE) +anom2 <- CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE) + +ano_exp2 <- list(anom1$exp, anom2$exp) +ano_obs2 <- list(anom1$obs, anom2$obs) + +############################################## +test_that("1. Input checks", { + # s2dv_cube + expect_error( + CST_MultivarRMSE(exp = 1, obs = 1), + "Parameters 'exp' and 'obs' must be lists of 's2dv_cube' objects" + ) + # exp and obs + expect_error( + CST_MultivarRMSE(exp = exp1, obs = exp1), + paste0("Elements of the list in parameter 'exp' must be of the class ", + "'s2dv_cube', as output by CSTools::CST_Load.") + ) + # exp and obs + expect_error( + CST_MultivarRMSE(exp = c(ano_exp, ano_exp), obs = ano_obs), + "Parameters 'exp' and 'obs' must be of the same length." + ) + # memb_dim + expect_error( + CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, memb_dim = NULL), + "Parameter 'memb_dim' cannot be NULL." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, memb_dim = 1), + "Parameter 'memb_dim' must be a character string." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs), + "Dimension names of element 'data' from parameters 'exp' and 'obs' should be equal." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, memb_dim = 'memb'), + "Parameter 'memb_dim' is not found in 'exp' or in 'obs' dimension." + ) + # dat_dim + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, dat_dim = 1), + "Parameter 'dat_dim' must be a character string." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, dat_dim = 'dats'), + "Parameter 'dat_dim' is not found in 'exp' or in 'obs' dimension." + ) + # ftime_dim + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, ftime_dim = 1), + "Parameter 'ftime_dim' must be a character string." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, ftime_dim = 'ftimes'), + "Parameter 'ftime_dim' is not found in 'exp' or in 'obs' dimension." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, ftime_dim = NULL), + "Parameter 'ftime_dim' cannot be NULL." + ) + # sdate_dim + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, sdate_dim = 1), + "Parameter 'sdate_dim' must be a character string." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, sdate_dim = 'sdates'), + "Parameter 'sdate_dim' is not found in 'exp' or in 'obs' dimension." + ) + expect_error( + CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, sdate_dim = NULL), + "Parameter 'sdate_dim' cannot be NULL." 
+ ) +}) + +############################################## + +test_that("2. Output checks", { + res1 <- CST_MultivarRMSE(exp = ano_exp2, obs = ano_obs2, weight = c(1, 2)) + res2 <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = c(1, 2), + dat_dim = 'datasets', ftime_dim = 'ftimes', + memb_dim = 'members', sdate_dim = 'sdates') + res3 <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = c(1, 2), + dat_dim = NULL, ftime_dim = 'ftimes', + memb_dim = 'members', sdate_dim = 'sdates') + expect_equal( + names(res1), + c('data', 'coords', 'attrs') + ) + expect_equal( + dim(res1$data), + dim(res2$data) + ) + expect_equal( + dim(res1$data), + c(nexp = 1, nobs = 1, lat = 6, lon = 7) + ) + expect_equal( + res1$data, + res2$data + ) + expect_equal( + as.vector(res1$data)[1:5], + c(0.9184747, 1.0452328, 1.7559577, 0.7936543, 0.9163216), + tolerance = 0.0001 + ) + expect_equal( + as.vector(res2$data)[1:5], + c(0.9184747, 1.0452328, 1.7559577, 0.7936543, 0.9163216), + tolerance = 0.0001 + ) + expect_equal( + dim(res3$data), + c(datasets = 1, lat = 6, lon = 7) + ) + expect_equal( + as.vector(res3$data)[1:5], + c(0.9184747, 1.0452328, 1.7559577, 0.7936543, 0.9163216), + tolerance = 0.0001 + ) +}) diff --git a/tests/testthat/test-CST_ProxiesAttractor.R b/tests/testthat/test-CST_ProxiesAttractor.R new file mode 100644 index 0000000000000000000000000000000000000000..65831b24a9fdd4e7beef7c625ed4b45577c44676 --- /dev/null +++ b/tests/testthat/test-CST_ProxiesAttractor.R @@ -0,0 +1,29 @@ +context("CSTools::CST_ProxiesAttractor tests") + +############################################## + +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 7) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lat = 6, lon = 7) +lon <- seq(0, 12, 2) +lat <- seq(10, 15, 1) +coords <- list(lon = lon, lat = lat) +exp1 <- list(data = data, coords = coords) +attr(exp1, "class") <- "s2dv_cube" + +############################################## +test_that("1. 
Input checks", { + # Check 's2dv_cube' + expect_error( + CST_ProxiesAttractor(rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'data' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # Check quanti + expect_error( + CST_ProxiesAttractor(data = exp1, quanti = NULL), + paste0("Parameter 'quanti' cannot be NULL.") + ) +}) +############################################## \ No newline at end of file diff --git a/tests/testthat/test-CST_QuantileMapping.R b/tests/testthat/test-CST_QuantileMapping.R index 70f1f749237c9fa27d6c01390de1535bdedb623f..a086ab93d3de7f2f6a4187bec54b0c5e95596e9b 100644 --- a/tests/testthat/test-CST_QuantileMapping.R +++ b/tests/testthat/test-CST_QuantileMapping.R @@ -50,36 +50,42 @@ obs3_2$data[1] <- NA res3_2$data[1] <- 0 # dat4 -exp4 <- lonlat_temp$exp -exp4$data <- exp4$data[,,1:4,,,] -dim(exp4$data) <- c(dataset = 1, member = 15, sdate = 4, ftime = 3, - lat = 22, lon = 53) -obs4 <- lonlat_temp$obs -obs4$data <- obs4$data[,,1:4, ,,] -dim(obs4$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, - lat = 22, lon = 53) -exp_cor4 <- lonlat_temp$exp -exp_cor4$data <- exp_cor4$data[,,5:6,,,] -dim(exp_cor4$data) <- c(dataset = 1, member = 15, sdate = 2, ftime = 3, - lat = 22, lon = 53) +lon <- seq(0, 3) +lat <- seq(48, 45) +set.seed(1) +exp4 <- NULL +exp4$data <- array(runif(1152)*280, dim = c(dataset = 1, member = 4, sdate = 6, + ftime = 3, lat = 4, lon = 4)) +exp4$coords <- list(lon = lon, lat = lat) +class(exp4) <- 's2dv_cube' +set.seed(2) +obs4 <- NULL +obs4$data <- array(runif(288)*280, dim = c(dataset = 1, member = 1, sdate = 6, + ftime = 3, lat = 4, lon = 4)) +obs4$coords <- list(lon = lon, lat = lat) +class(obs4) <- 's2dv_cube' +exp_cor4 <- exp4 +exp_cor4$data <- exp_cor4$data[, , 5:6, , , ] +class(exp_cor4) <- 's2dv_cube' # dat5 -exp5 <- lonlat_temp$exp -obs5 <- lonlat_temp$obs +exp5 <- exp4 +obs5 <- obs4 + set.seed(1) res5 <- NULL -res5$data <- array(rnorm(length(exp5)), dim = c(member = 15, sdate = 6, - dataset = 1, ftime = 3, lat = 22, lon = 53)) +res5$data <- array(rnorm(length(exp5)), dim = c(member = 4, sdate = 6, + dataset = 1, ftime = 3, lat = 4, lon = 4)) class(res5) <- "s2dv_cube" res5_1 <- NULL -res5_1$data <- array(rnorm(length(res5_1)), dim = c(member = 15, ftime = 3, - dataset = 1, sdate = 6, lat = 22, lon = 53)) +res5_1$data <- array(rnorm(length(res5_1)), dim = c(member = 4, ftime = 3, + dataset = 1, sdate = 6, lat = 4, lon = 4)) class(res5_1) <- "s2dv_cube" # dat6 -exp6 <- lonlat_temp$exp -obs6 <- lonlat_temp$obs +exp6 <- exp4 +obs6 <- obs4 obs6$data <- s2dv::InsertDim(obs6$data, pos = 1, len = 4, name = 'window') obs6_1 <- obs6 @@ -87,16 +93,25 @@ obs6_1$data[2] <- NA exp6_1 <- exp6 -exp6_1$data[1,,,1,1,1] <- NA +exp6_1$data[1, , , 1, 1, 1] <- NA exp_cor6_1 <- exp6_1 exp_cor6_1$data <- ClimProjDiags::Subset(exp_cor6_1$data, 'sdate', 1) exp_cor6_2 <- exp6 -exp_cor6_2$data <- ClimProjDiags::Subset(exp_cor6_2$data, 'member', 1:5) +exp_cor6_2$data <- ClimProjDiags::Subset(exp_cor6_2$data, 'member', 1:2) + +# dat7 +exp7 <- 1 : c(1 * 1 * 6 * 3 * 4 * 4 * 2) +dim(exp7) <- c(dataset = 1, sdate = 6, ftime = 3, + lat = 4, lon = 4, member = 2) + +obs7 <- 101 : c(100 + 1 * 1 * 6 * 3 * 4 * 4 * 2) +dim(obs7) <- c(dataset = 1, sdate = 6, ftime = 3, + lat = 4, lon = 4, window = 2) +exp_cor7 <- exp7 + 1 ############################################## test_that("1. Sanity checks", { - # s2dv_cube expect_error( CST_QuantileMapping(exp = 1), @@ -115,7 +130,7 @@ test_that("1. 
Sanity checks", { expect_error( CST_QuantileMapping(exp = exp1, obs = obs1, exp_cor = 1), paste0("Parameter 'exp_cor' must be of the class 's2dv_cube', as output ", - "by CSTools::CST_Load.") + "by CSTools::CST_Load.") ) # exp and obs expect_error( @@ -170,7 +185,7 @@ test_that("2. dat2, dat3 and dat4", { ) expect_equal( length(CST_QuantileMapping(exp4, obs4, exp_cor4)), - 9 + 2 ) }) @@ -178,16 +193,16 @@ test_that("2. dat2, dat3 and dat4", { test_that("3. dat5", { expect_equal( - dim(CST_QuantileMapping(exp5, obs5)$data), - dim(res5$data) + dim(CST_QuantileMapping(exp5, obs5)$data), + dim(res5$data) ) expect_equal( - dim(CST_QuantileMapping(exp5, obs5)$data), - dim(res5$data) + dim(CST_QuantileMapping(exp5, obs5)$data), + dim(res5$data) ) expect_equal( - dim(CST_QuantileMapping(exp5, obs5, sdate_dim = "ftime")$data), - dim(res5_1$data) + dim(CST_QuantileMapping(exp5, obs5, sdate_dim = "ftime")$data), + dim(res5_1$data) ) }) @@ -195,23 +210,50 @@ test_that("3. dat5", { test_that("4. dat6", { expect_equal( - CST_QuantileMapping(exp6, obs6, window_dim = 'window'), - CST_QuantileMapping(exp6, obs6, window_dim = 'window', na.rm = TRUE) + CST_QuantileMapping(exp6, obs6, window_dim = 'window'), + CST_QuantileMapping(exp6, obs6, window_dim = 'window', na.rm = TRUE) ) expect_equal( - dim(CST_QuantileMapping(exp6, obs6_1, window_dim = 'window')$data), - c(member = 15, sdate = 6, dataset = 1, ftime = 3, lat = 22, lon = 53) + dim(CST_QuantileMapping(exp6, obs6_1, window_dim = 'window')$data), + c(member = 4, sdate = 6, dataset = 1, ftime = 3, lat = 4, lon = 4) ) expect_equal( - sum(is.na(CST_QuantileMapping(exp6_1, obs6_1, exp_cor = exp_cor6_1, window_dim = 'window', na.rm = TRUE)$data)), + sum(is.na(CST_QuantileMapping(exp6_1, obs6_1, exp_cor = exp_cor6_1, + window_dim = 'window', na.rm = TRUE)$data)), sum(is.na(exp_cor6_1$data)) ) expect_equal( - dim(CST_QuantileMapping(exp6, obs6_1, exp_cor6_1, window_dim = 'window', na.rm = T)$data), - c(member = 15, sdate = 1, dataset = 1, ftime = 3, lat = 22, lon = 53) + dim(CST_QuantileMapping(exp6, obs6_1, exp_cor6_1, window_dim = 'window', + na.rm = T)$data), + c(member = 4, sdate = 1, dataset = 1, ftime = 3, lat = 4, lon = 4) + ) + expect_equal( + dim(CST_QuantileMapping(exp6, obs6_1, exp_cor6_2, + window_dim = 'window')$data), + c(member = 2, sdate = 6, dataset = 1, ftime = 3, lat = 4, lon = 4) + ) +}) + +############################################## + +test_that("5. 
dat7", { + expect_equal( + dim(QuantileMapping(exp7, obs7, memb_dim = NULL)), + c(sdate = 6, dataset = 1, ftime = 3, lat = 4, lon = 4, member = 2, window = 2) + ) + # window_dim + expect_equal( + dim(QuantileMapping(exp7, obs7, memb_dim = NULL, window_dim = 'window')), + c(sdate = 6, dataset = 1, ftime = 3, lat = 4, lon = 4, member = 2) + ) + # exp_cor + expect_equal( + dim(QuantileMapping(exp7, obs7, exp_cor7, memb_dim = NULL)), + c(sdate = 6, dataset = 1, ftime = 3, lat = 4, lon = 4, member = 2, window = 2) ) expect_equal( - dim(CST_QuantileMapping(exp6, obs6_1, exp_cor6_2, window_dim = 'window')$data), - c(member = 5, sdate = 6, dataset = 1, ftime = 3, lat = 22, lon = 53) + dim(QuantileMapping(exp7, obs7, exp_cor7, memb_dim = NULL, + window_dim = 'window')), + c(sdate = 6, dataset = 1, ftime = 3, lat = 4, lon = 4, member = 2) ) }) diff --git a/tests/testthat/test-CST_RFSlope.R b/tests/testthat/test-CST_RFSlope.R new file mode 100644 index 0000000000000000000000000000000000000000..f08d3a591443fb5a3a79cd2bd89d8058b2441263 --- /dev/null +++ b/tests/testthat/test-CST_RFSlope.R @@ -0,0 +1,32 @@ +context("CSTools::CST_RFSlope tests") + +############################################## +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 7) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lat = 6, lon = 7) +lon <- seq(0, 12, 2) +lat <- seq(10, 15, 1) +coords <- list(lon = lon, lat = lat) +exp <- list(data = data, coords = coords) +attr(exp, "class") <- "s2dv_cube" + +# dat2 +exp2_3 <- exp +names(dim(exp2_3$data)) <- c("dataset", "member", "sdate", "ftime", "lati", "loni") + +############################################## +test_that("1. Input checks", { + # Check 's2dv_cube' + expect_error( + CST_RFSlope(rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'data' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # Check dimensions + expect_error( + CST_RFSlope(exp2_3), + paste0("Spatial dimension names do not match any of the names accepted by ", + "the package.") + ) +}) diff --git a/tests/testthat/test-CST_RFTemp.R b/tests/testthat/test-CST_RFTemp.R index 536e44f0e0092d0266cf253271c5df6138c7666e..325b260becf999915a952fca004c4674335d1201 100644 --- a/tests/testthat/test-CST_RFTemp.R +++ b/tests/testthat/test-CST_RFTemp.R @@ -1,37 +1,97 @@ -context("Generic tests") -test_that("Sanity checks and simple use cases", { - # Generate simple synthetic data - t <- rnorm(2 * 6 * 6 * 2 * 3 * 4) * 10 + 273.15 + 10 - dim(t) <- c(dataset = 2, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 6) - lon <- seq(4, 9, 1) - lat <- seq(42, 47, 1) - exp <- list(data = t, lat = lat, lon = lon) - o <- runif(29 * 29) * 3000 - dim(o) <- c(lat = 29, lon = 29) - lon <- seq(3.125, 10.125, 0.25) - 100 - lat <- seq(41.125, 48.125, 0.25) - 60 - oro <- list(data = o, lat = lat, lon = lon) - attr(oro, "class") <- "s2dv_cube" +context("CSTools::CST_RFTemp tests") - expect_error( - res <- CST_RFTemp(exp, oro, xlim = c(1, 3), ylim = c(1, 3)), - paste("Parameter 'data' must be of the class", - "'s2dv_cube', as output by CSTools::CST_Load.")) - attr(exp, "class") <- "s2dv_cube" +############################################## + +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 7) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lat = 6, lon = 7) +lon <- seq(0, 12, 2) +lat <- seq(10, 15, 1) +coords <- list(lon = lon, lat = lat) +exp1 <- list(data = data, coords = coords) +attr(exp1, "class") <- "s2dv_cube" + +# dat2 +exp2 <- exp1 +exp2$attrs <- list(source_files = 'exp') +exp2$coords <- 
NULL +exp2_2 <- exp2 +exp2_2$coords <- list(long = seq(1:4), lati = seq(1:4)) +# dat +t <- rnorm(2 * 6 * 6 * 2 * 3 * 4) * 10 + 273.15 + 10 +dim(t) <- c(dataset = 2, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 6) +lon <- seq(4, 9, 1) +lat <- seq(42, 47, 1) +coords <- list(lon = lon, lat = lat) +exp <- list(data = t, coords = coords) +attr(exp, "class") <- "s2dv_cube" +o <- runif(29 * 29) * 3000 +dim(o) <- c(lat = 29, lon = 29) +lon <- seq(3.125, 10.125, 0.25) - 100 +lat <- seq(41.125, 48.125, 0.25) - 60 +coords <- list(lon = lon, lat = lat) +oro1 <- list(data = o, coords = coords) +attr(oro1, "class") <- "s2dv_cube" + +oro <- oro1 +oro$coords$lon <- oro$coords$lon + 100 +oro$coords$lat <- oro$coords$lat + 60 + +############################################## +test_that("1. Input checks", { + # Check 's2dv_cube' + expect_error( + CST_RFTemp(rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'data' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + expect_error( + CST_RFTemp(data = exp1, oro = rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'oro' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) expect_error( - res <- CST_RFTemp(exp, oro, xlim = c(1, 3), ylim = c(1, 3)), + CST_RFTemp(data = exp1, oro = exp1, delta = rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'delta' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # Check 's2dv_cube' structure + expect_error( + CST_RFTemp(exp2, oro = exp1), + paste0("Parameter 'data' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + ) + expect_error( + CST_RFTemp(oro = exp2, data = exp1), + paste0("Parameter 'oro' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + ) + # Check coordinates + expect_error( + CST_RFTemp(exp2_2, oro = exp1), + paste0("Spatial coordinate names of 'data' do not match any of the names ", + "accepted by the package.") + ) + expect_error( + CST_RFTemp(exp1, oro = exp2_2), + paste0("Spatial coordinate names of 'oro' do not match any of the names ", + "accepted by the package.") + ) + expect_error( + res <- CST_RFTemp(exp, oro1, xlim = c(1, 3), ylim = c(1, 3), time_dim = 'ftime'), "Orography not available for selected area" ) - - oro$lon <- oro$lon + 100 - oro$lat <- oro$lat + 60 - expect_error( - res <- CST_RFTemp(exp, oro, xlim = c(3, 8), ylim = c(43, 46)), + res <- CST_RFTemp(exp, oro, xlim = c(3, 8), ylim = c(43, 46), time_dim = 'ftime'), "Downscaling area not contained in input data" ) +}) + +############################################## +test_that("2. 
Output checks", { expect_warning( resl <- CST_RFTemp(exp, oro, lapse = 6.5), "Selected time dim: ftime" diff --git a/tests/testthat/test-CST_RFWeights.R b/tests/testthat/test-CST_RFWeights.R new file mode 100644 index 0000000000000000000000000000000000000000..68b45830fd751f7d1c820ef83e7563e1c8f8d40d --- /dev/null +++ b/tests/testthat/test-CST_RFWeights.R @@ -0,0 +1,55 @@ +context("CSTools::CST_RFTWeights tests") + +############################################## +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 6) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lats = 6, lons = 6) +lon <- seq(0, 11, 2) +lat <- seq(10, 15, 1) +coords <- list(longitude = lon, latitude = lat) +exp <- list(data = data, coords = coords) +attr(exp, "class") <- "s2dv_cube" + +# dat2 +exp2_2 <- exp +exp2_2$coords <- NULL +exp2_3 <- exp +names(exp2_3$coords) <- c("lati", "loni") + +############################################## +test_that("1. Input checks", { + # Check 's2dv_cube' + expect_error( + CST_RFWeights(rnorm(2 * 15 * 4 * 5 * 6 * 7), lon = lon, lat = lat), + paste0("Parameter 'climfile' is expected to be a character string indicating", + " the path to the files or an object of class 's2dv_cube'.") + ) + # Check object structure + expect_error( + CST_RFWeights(exp2_2, lon = lon, lat = lat, nf = 3), + paste0("Parameter 'climfile' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + ) + # Check coordinates + expect_error( + CST_RFWeights(exp2_3, lon = lon, lat = lat, nf = 3), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + ) +}) + +############################################## + +test_that("2. Output checks", { + res <- CST_RFWeights(climfile = exp, nf = 3, lon, lat, lonname = 'lons', + latname = 'lats', fsmooth = TRUE) + expect_equal( + names(res$coords), + c("longitude", "latitude") + ) + expect_equal( + names(dim(res$data)), + c("lons", "lats", "dataset", "member", "sdate", "ftime" ) + ) +}) \ No newline at end of file diff --git a/tests/testthat/test-CST_RainFARM.R b/tests/testthat/test-CST_RainFARM.R index d9414925dc4a4d1a48060022e02c93e9a9f430ed..c014cfdc68e2fdd046efe7221f95264ff74e1795 100644 --- a/tests/testthat/test-CST_RainFARM.R +++ b/tests/testthat/test-CST_RainFARM.R @@ -1,12 +1,58 @@ -context("Generic tests") -test_that("Sanity checks and simple use cases", { +context("CSTools::CST_RFSlope tests") + +############################################## +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 7) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lat = 6, lon = 7) +lon <- seq(0, 12, 2) +lat <- seq(10, 15, 1) +coords <- list(lon = lon, lat = lat) +exp <- list(data = data, coords = coords) +attr(exp, "class") <- "s2dv_cube" + +# dat2 +exp2 <- exp +exp2$attrs <- list(source_files = 'exp') +exp2$coords <- NULL +exp2_2 <- exp2 +exp2_2$coords <- list(long = seq(1:4), lati = seq(1:4)) +exp2_3 <- exp +names(dim(exp2_3$data)) <- c("dataset", "member", "sdate", "ftime", "lati", "loni") + +############################################## +test_that("1. 
Input checks", { + # Check 's2dv_cube' + expect_error( + CST_RainFARM(rnorm(2 * 15 * 4 * 5 * 6 * 7)), + paste0("Parameter 'data' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # Check 'exp' object structure + expect_error( + CST_RainFARM(exp2), + paste0("Parameter 'data' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + ) + # Check coordinates + expect_error( + CST_RainFARM(exp2_2), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + ) + # Check dimensions + expect_error( + CST_RainFARM(exp2_3), + paste0("Spatial dimension names do not match any of the names accepted by ", + "the package.") + ) # Generate simple synthetic data # 4x5 in space, 2 members, 3 sdates, 6 ftime r <- exp(rnorm(1 * 2 * 3 * 6 * 5 * 4)) dim(r) <- c(dataset = 1, member = 2, sdate = 3, ftime = 6, lat = 5, lon = 4) lon <- seq(0, 6, 2) lat <- seq(10, 18, 2) - exp <- list(data = r, lat = lat, lon = lon) + exp <- list(data = r, coords = list(lat = lat, lon = lon)) attr(exp, 'class') <- 's2dv_cube' expect_warning( @@ -17,7 +63,7 @@ test_that("Sanity checks and simple use cases", { r <- exp(rnorm(1 * 2 * 3 * 6 * 4 * 4)) dim(r) <- c(dataset = 1, member = 2, sdate = 3, ftime = 6, lat = 4, lon = 4) lat <- seq(10, 16, 2) - exp <- list(data = r, lat = lat, lon = lon) + exp <- list(data = r, coords = list(lat = lat, lon = lon)) attr(exp, 'class') <- 's2dv_cube' expect_warning( @@ -26,29 +72,42 @@ test_that("Sanity checks and simple use cases", { ) expect_error( res <- CST_RainFARM(exp, nf=8, weights=array(0,dim=c(2,2))), - "Parameter 'weights' must have dimension names when it is not a scalar." + paste0("Parameters 'lon_dim' and 'lat_dim' do not match with 'weights' ", + "dimension names.") ) +}) + +############################################## + +test_that("2. 
Simple use case", { + r <- exp(rnorm(1 * 2 * 3 * 6 * 4 * 4)) + dim(r) <- c(dataset = 1, member = 2, sdate = 3, ftime = 6, lat = 4, lon = 4) + lat <- seq(10, 16, 2) + lon <- seq(0, 6, 2) + exp <- list(data = r, coords = list(lat = lat, lon = lon)) + attr(exp, 'class') <- 's2dv_cube' - dimexp=dim(exp$data) + dimexp = dim(exp$data) - res <- CST_RainFARM(exp, nf=8, time_dim=c("ftime", "sdate"), slope=1.7, nens=2) + res <- CST_RainFARM(exp, nf = 8, time_dim = c("ftime", "sdate"), + slope = 1.7, nens = 2) expect_equal(dim(res$data), c(dimexp["dataset"], dimexp["member"], realization = 2, dimexp["sdate"], dimexp["ftime"], dimexp["lat"] * 8, dimexp["lon"] * 8)) - expect_equivalent(length(res$lon), dimexp["lon"] * 8) - expect_equivalent(length(res$lat), dimexp["lat"] * 8) + expect_equivalent(length(res$coords$lon), dimexp["lon"] * 8) + expect_equivalent(length(res$coords$lat), dimexp["lat"] * 8) - res <- CST_RainFARM(exp, nf=8, time_dim=c("ftime", "sdate"), - nens=2, drop_realization_dim=TRUE) + res <- CST_RainFARM(exp, nf = 8, time_dim = c("ftime", "sdate"), + nens = 2, drop_realization_dim = TRUE) expect_equal(dim(res$data), c(dimexp["dataset"], dimexp["member"] * 2, dimexp["sdate"], dimexp["ftime"], dimexp["lat"] * 8, dimexp["lon"] * 8)) - res <- CST_RainFARM(exp, nf=8, time_dim=c("ftime", "sdate"), slope=1.7, - nens=2, nproc=2, fsmooth=FALSE) + res <- CST_RainFARM(exp, nf = 8, time_dim = c("ftime", "sdate"), slope = 1.7, + nens = 2, nproc = 2, fsmooth = FALSE) expect_equal(dim(res$data), c(dimexp["dataset"], dimexp["member"], realization = 2, dimexp["sdate"], dimexp["ftime"], dimexp["lat"] * 8, @@ -60,11 +119,11 @@ test_that("Sanity checks and simple use cases", { expect_equivalent(agg(res$data[1,1,1,1,1,,], 4), exp$data[1,1,1,1,,]) - res <- CST_RainFARM(exp, nf=8, time_dim=c("ftime", "sdate"), - nens=2, nproc=2, fglob=TRUE) + res <- CST_RainFARM(exp, nf = 8, time_dim = c("ftime", "sdate"), + nens = 2, nproc = 2, fglob = TRUE) expect_equal(mean(agg(res$data[1,1,1,1,1,,], 4)), - mean(exp$data[1,1,1,1,,])) + mean(exp$data[1,1,1,1,,])) # Create a more realistic perfect-model precipitation z <- 1 : (32 * 32) @@ -91,12 +150,13 @@ test_that("Sanity checks and simple use cases", { rpfm=agg(apply(rpf, c(5, 6), mean),32) # Use climatological mean of PF precipitation to generate sythetic weights - w <- rfweights(rpfm, res$lon, res$lat, exp$lon, exp$lat, 8, fsmooth=FALSE ) + w <- rfweights(rpfm, res$coords$lon, res$coords$lat, exp$coords$lon, + exp$coords$lat, 8, fsmooth = FALSE ) names(dim(w)) <- c('lon', 'lat') - res <- CST_RainFARM(exppf, nf=8, time_dim=c("ftime", "sdate", "member"), - nens=2, nproc=2, fsmooth=FALSE) - resw <- CST_RainFARM(exppf, nf=8, time_dim=c("ftime", "sdate", "member"), - nens=2, nproc=2, fsmooth=FALSE, weights=w) + res <- CST_RainFARM(exppf, nf = 8, time_dim = c("ftime", "sdate", "member"), + nens = 2, nproc = 2, fsmooth = FALSE) + resw <- CST_RainFARM(exppf, nf = 8, time_dim = c("ftime", "sdate", "member"), + nens = 2, nproc = 2, fsmooth = FALSE, weights = w) resm <- agg(apply(res$data, c(6,7), mean),32) reswm <- agg(apply(resw$data, c(6,7), mean),32) @@ -114,8 +174,10 @@ test_that("Sanity checks and simple use cases", { dim(r) <- c(dataset = 1, member = 1, sdate = 1, ftime = 1, lat = 256, lon = 256) expfine <- exp expfine$data <- r - res <- CST_RainFARM(expcoarse, nf=32, time_dim=c("ftime", "sdate"), - slope=1.7, fsmooth=FALSE, drop_realization_dim=TRUE) + res <- CST_RainFARM(expcoarse, nf = 32, time_dim=c("ftime", "sdate"), + slope = 1.7, fsmooth=FALSE, 
drop_realization_dim=TRUE) + # TO DO: Develop within the new s2dv_cube + sres= CST_RFSlope(res, time_dim = c("ftime", "sdate")) sexp= CST_RFSlope(expfine, time_dim = c("ftime", "sdate")) expect_equal(sres, sexp, tolerance=0.25) diff --git a/tests/testthat/test-CST_RegimesAssign.R b/tests/testthat/test-CST_RegimesAssign.R index b822f759d3abeaa303564e67195684b12969446b..456aaf8172200d8e4a603019d023007994483e44 100644 --- a/tests/testthat/test-CST_RegimesAssign.R +++ b/tests/testthat/test-CST_RegimesAssign.R @@ -1,104 +1,116 @@ -context("Generic tests") -test_that("Sanity checks", { +context("CSTools::CST_RegimesAssign") + +############################################## + +test_that("1. Input checks", { + # Check 's2dv_cube' expect_error( CST_RegimesAssign(data = 1), paste0("Parameter 'data' must be of the class 's2dv_cube', as output by ", - "CSTools::CST_Load.")) - + "CSTools::CST_Load.") + ) data1 <- 1 : 20 data1 <- list(data = data1) class(data1) <- 's2dv_cube' expect_error( - CST_RegimesAssign(data = data1,ref_maps=1), + CST_RegimesAssign(data = data1, ref_maps = 1), paste0("Parameter 'ref_maps' must be of the class 's2dv_cube', as output by ", - "CSTools::CST_Load.")) - + "CSTools::CST_Load.") + ) + # data regimes <- 1:20 - dim(regimes) <- c(lat = 5, lon=2, cluster=2) - regimes <- list(data=regimes) - class(regimes) <- 's2dv_cube' + dim(regimes) <- c(lat = 5, lon = 2, cluster = 2) + regimes <- list(data = regimes) expect_error( - CST_RegimesAssign(data = data1,ref_maps = regimes), - paste0("Parameter 'data' must be an array with named dimensions.")) - + RegimesAssign(data = data1$data, ref_maps = regimes$data), + paste0("Parameter 'data' must be an array with named dimensions.") + ) + # Temporal dimensions data1 <- 1 : 20 - dim(data1) <- c(lat = 5, lon=4) - data1 <- list(data = data1 , lat=1:5) + dim(data1) <- c(lat = 5, lon = 4) + data1 <- list(data = data1 , coords = list(lat = 1:5)) class(data1) <- 's2dv_cube' expect_error( - CST_RegimesAssign(data = data1,ref_maps = regimes), - paste0("Parameter 'data' must have temporal dimensions.")) - + RegimesAssign(data = data1$data, ref_maps = regimes$data, + lat = data1$coords$lat), + paste0("Parameter 'data' must have temporal dimensions.") + ) data1 <- 1 : 20 - dim(data1) <- c(time=20) + dim(data1) <- c(time = 20) data1 <- list(data = data1) class(data1) <- 's2dv_cube' expect_error( - CST_RegimesAssign(data = data1,ref_maps = regimes), - paste0("Parameter 'lat' must be specified.")) - - + RegimesAssign(data = data1$data, ref_maps = regimes$data, + lat = data1$coords$lat), + paste0("Parameter 'lat' must be specified.") + ) data1 <- 1 : 20 - dim(data1) <- c(time=20) - data1 <- list(data = data1,lat=1:5) - class(data1) <- 's2dv_cube' - - expect_error( - CST_RegimesAssign(data = data1,ref_maps = regimes), - paste0("Parameter 'data' must contain the named dimensions 'lat' and 'lon'.")) - - data1 <- 1: 20 - dim(data1) <- c(lat = 2, lon=5, time=2) - data1 <- list(data = data1, lat=1:5) + dim(data1) <- c(time = 20) + data1 <- list(data = data1, coords = list(lat = 1:5)) class(data1) <- 's2dv_cube' - expect_error( - CST_RegimesAssign(data = data1,ref_maps = regimes), - " Parameter 'lat' does not match with the dimension 'lat' in the - parameter 'data' or in the parameter 'ref_maps'.") + RegimesAssign(data = data1$data, ref_maps = regimes$data, + lat = data1$coords$lat), + paste0("Spatial coordinate dimension names do not match any of the names ", + "accepted by the package.") + ) +}) - +############################################## + +test_that("2. 
Output checks", { data1 <- 1: 20 - dim(data1) <- c(lat = 5, lon=2, time=2) - data1 <- list(data = data1, lat=1:5) + dim(data1) <- c(lat = 5, lon = 2, time = 2) + data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:2)) class(data1) <- 's2dv_cube' - - expect_equal(names(CST_RegimesAssign(data = data1, ref_maps = regimes)$statistics), - c('cluster', 'frequency')) - - expect_equal(names( - suppressWarnings( - CST_RegimesAssign( - data = data1, - ref_maps = regimes, - composite = TRUE))$statistics), c('pvalue', 'cluster', 'frequency')) - - expect_equal(names(dim( + regimes <- 1:20 + dim(regimes) <- c(lat = 5, lon = 2, cluster = 2) + regimes <- list(data = regimes, coords = list(lat = 1:5, lon = 1:2)) + class(regimes) <- 's2dv_cube' + expect_equal( + names(CST_RegimesAssign(data = data1, ref_maps = regimes)$statistics), + c('cluster', 'frequency') + ) + expect_equal( + names(suppressWarnings( + CST_RegimesAssign( + data = data1, + ref_maps = regimes, + composite = TRUE))$statistics), + c('pvalue', 'cluster', 'frequency') + ) + expect_equal( + names(dim( suppressWarnings( CST_RegimesAssign( data = data1, ref_maps = regimes, - composite = TRUE))$data)), c('lon', 'lat', 'composite.cluster')) + composite = TRUE))$data)), c('lon', 'lat', 'composite.cluster') + ) data1 <- 1: 160 dim(data1) <- c(lat = 5, lon=2, time=2, member=8) - data1 <- list(data = data1, lat=1:5) + data1 <- list(data = data1, coords = list(lat = 1:5)) class(data1) <- 's2dv_cube' - expect_equal(names(dim( + expect_equal( + names(dim( suppressWarnings( CST_RegimesAssign( data = data1, ref_maps = regimes, - composite = TRUE))$data)), c('lon', 'lat', 'composite.cluster', 'member')) + composite = TRUE))$data)), c('lon', 'lat', 'composite.cluster', 'member') + ) - expect_equal(names(dim( + expect_equal( + names(dim( suppressWarnings( CST_RegimesAssign( data = data1, ref_maps = regimes, - composite = TRUE))$statistics$cluster)), c('time', 'member')) + composite = TRUE))$statistics$cluster)), c('time', 'member') + ) regimes <- 1:60 dim(regimes) <- c(lat = 5, lon=2, cluster=6) @@ -109,17 +121,14 @@ test_that("Sanity checks", { unname(dim(regimes$data)['cluster'])) - regimes <- 1:60 - dim(regimes) <- c(lat = 5, lon=2, cluster=3, member=2) + regimes <- 1:240 + dim(regimes) <- c(lat = 5, lon=2, cluster=3, member=8) regimes <- list(data=regimes) class(regimes) <- 's2dv_cube' - expect_equal(names(dim(CST_RegimesAssign(data = data1, ref_maps = regimes, - composite = FALSE)$statistics$cluster)),c('time','member','member')) - - + expect_equal( + names(dim(CST_RegimesAssign(data = data1, ref_maps = regimes, + composite = FALSE)$statistics$cluster)), + c('member', 'time') + ) - - - - }) diff --git a/tests/testthat/test-CST_SaveExp.R b/tests/testthat/test-CST_SaveExp.R new file mode 100644 index 0000000000000000000000000000000000000000..cb028d6a42ad8803cf8bce745ad1ae8cd7eb1cf1 --- /dev/null +++ b/tests/testthat/test-CST_SaveExp.R @@ -0,0 +1,216 @@ +context("CSTools::CST_SaveExp tests") +############################################## + +# cube0 +cube0 <- array(1:5, dim = c(sdate = 5, lon = 4, lat = 4, ftime = 1)) +class(cube0) <- 's2dv_cube' + +# cube1 +cube1 <- NULL +cube1$data <- array(1:5, dim = c(sdate = 5, lon = 4, lat = 4, ftime = 1)) +coords2 <- list(sdate = c('20000101', '20010102', '20020103', '20030104', '20040105'), + var = 'tas', + lon = 1.:4., + lat = 1.:4.) 
+cube1$coords <- coords2 +dates2 <- c('20000101', '20010102', '20020103', '20030104', '20040105') +dates2 <- as.Date(dates2, format = "%Y%m%d", tz = "UTC") +dim(dates2) <- c(sdate = 5, ftime = 1) +cube1$attrs$Dates <- dates2 +class(cube1) <- 's2dv_cube' + +# cube2 +cube2 <- cube1 +cube2$data <- array(1:5, dim = c(sdate = 5, lon = 4, lat = 4, ftime = 1, + test = 2, test2 = 3)) +dim(cube2$data) <- c(sdate = 5, lon = 4, lat = 4, ftime = 1, member = 1, + ensemble = 1, test = 2, test2 = 3) + +# cube3 +cube3 <- cube1 + +# dat0 +dates0 <- as.Date('2022-02-01', format = "%Y-%m-%d") +dim(dates0) <- c(sdate = 1) +# dat1 +dat1 <- array(1, dim = c(test = 1)) +# dat2 +dat2 <- array(1:5, dim = c(sdate = 5, lon = 4, lat = 4, ftime = 1)) +coords2 <- list(sdate = c('20000101', '20010102', '20020103', '20030104', '20040105'), + var = 'tas', + lon = 1.:4., + lat = 1.:4.) +dates2 <- c('20000101', '20010102', '20020103', '20030104', '20040105') +dates2 <- as.Date(dates2, format = "%Y%m%d", tz = "UTC") +dim(dates2) <- c(sdate = 5, ftime = 1) + +############################################## + +test_that("1. Input checks: CST_SaveExp", { + # s2dv_cube + expect_error( + CST_SaveExp(data = 1), + paste0("Parameter 'data' must be of the class 's2dv_cube', ", + "as output by CSTools::CST_Load.") + ) + # structure + expect_error( + CST_SaveExp(data = cube0), + paste0("Parameter 'data' must have at least 'data' and 'attrs' elements ", + "within the 's2dv_cube' structure.") + ) + cube0 <- list(data = cube0, attrs = 1) + class(cube0) <- 's2dv_cube' + expect_error( + CST_SaveExp(data = cube0), + paste0("Level 'attrs' must be a list with at least 'Dates' element.") + ) + # cube0$attrs <- NULL + # cube0$attrs$Dates <- dates2 + # expect_warning( + # CST_SaveExp(data = cube0, sdate_dim = c('sdate', 'sweek'), + # ftime_dim = 'ftime', memb_dim = NULL, dat_dim = NULL, + # var_dim = NULL, single_file = FALSE), + # paste0("Element 'coords' not found. 
No coordinates will be used.") + # ) + + # sdate_dim + suppressWarnings( + expect_error( + CST_SaveExp(data = cube1, sdate_dim = 1), + paste0("Parameter 'sdate_dim' must be a character string.") + ) + ) + # expect_warning( + # CST_SaveExp(data = cube1, sdate_dim = c('sdate', 'sweek'), + # ftime_dim = 'ftime', memb_dim = NULL, dat_dim = NULL, + # var_dim = NULL), + # paste0("Parameter 'sdate_dim' has length greater than 1 and ", + # "only the first element will be used.") + # ) + suppressWarnings( + expect_error( + CST_SaveExp(data = cube1, sdate_dim = 'a', ftime_dim = 'ftime'), + paste0("Parameter 'sdate_dim' is not found in 'data' dimension.") + ) + ) + # # metadata + # expect_warning( + # CST_SaveExp(data = cube1, ftime_dim = 'ftime', memb_dim = NULL, + # dat_dim = NULL, var_dim = NULL), + # paste0("No metadata found in element Variable from attrs.") + # ) + cube1$attrs$Variable$metadata <- 'metadata' + expect_error( + CST_SaveExp(data = cube1, ftime_dim = 'ftime', memb_dim = NULL, + dat_dim = NULL, var_dim = NULL), + paste0("Element metadata from Variable element in attrs must be a list.") + ) + cube1$attrs$Variable$metadata <- list(test = 'var') + # expect_warning( + # CST_SaveExp(data = cube1, ftime_dim = 'ftime', memb_dim = NULL, + # dat_dim = NULL, var_dim = NULL), + # paste0("Metadata is not found for any coordinate.") + # ) + cube1$attrs$Variable$metadata <- list(var = 'var') + # expect_warning( + # CST_SaveExp(data = cube1, ftime_dim = 'ftime', memb_dim = NULL, + # dat_dim = NULL, var_dim = NULL), + # paste0("Metadata is not found for any variable.") + # ) + # memb_dim + suppressWarnings( + expect_error( + CST_SaveExp(data = cube1, memb_dim = 1, ftime_dim = 'ftime'), + paste0("Parameter 'memb_dim' must be a character string.") + ) + ) + suppressWarnings( + expect_error( + CST_SaveExp(data = cube1, memb_dim = 'member', ftime_dim = 'ftime'), + paste0("Parameter 'memb_dim' is not found in 'data' dimension. Set it ", + "as NULL if there is no member dimension.") + ) + ) + # expect_warning( + # CST_SaveExp(data = cube2, memb_dim = c('member', 'ensemble'), + # ftime_dim = 'ftime', dat_dim = NULL, var_dim = NULL), + # paste0("Detected unknown dimension: test, test2") + # ) +}) + +############################################## + +test_that("1. Input checks", { + # data + expect_error( + SaveExp(data = NULL), + "Parameter 'data' cannot be NULL." + ) + expect_error( + SaveExp(data = 1:10), + "Parameter 'data' must be an array with named dimensions." + ) + # destination + expect_error( + SaveExp(data = array(1, dim = c(a = 1)), destination = NULL), + paste0("Parameter 'destination' must be a character string of one element ", + "indicating the name of the file (including the folder if needed) ", + "where the data will be saved."), + fixed = TRUE + ) + # Dates + expect_error( + SaveExp(data = array(1, dim = c(a = 1)), Dates = 'a'), + paste0("Parameter 'Dates' must be of 'POSIXct' or 'Dates' class.") + ) + expect_error( + SaveExp(data = array(1, dim = c(a = 1)), + Dates = as.Date('2022-02-01', format = "%Y-%m-%d")), + paste0("Parameter 'Dates' must have dimension names.") + ) + # # varname + # expect_warning( + # SaveExp(data = dat2, coords = coords2, + # metadata = list(tas = list(level = '2m')), + # Dates = dates2, ftime_dim = 'ftime', memb_dim = NULL, + # dat_dim = NULL, var_dim = NULL), + # paste0("Parameter 'varname' is NULL. 
It will be assigned to 'X'.") + # ) + suppressWarnings( + expect_error( + SaveExp(data = dat2, coords = coords2, varname = 1, + metadata = list(tas = list(level = '2m')), + Dates = dates2), + "Parameter 'varname' must be a character." + ) + ) + # # coords + # expect_warning( + # SaveExp(data = dat2, coords = list(sdate = coords2[[1]]), + # varname = 'tas', metadata = list(tas = list(level = '2m')), + # Dates = dates2, ftime_dim = 'ftime', memb_dim = NULL, + # dat_dim = NULL, var_dim = NULL), + # "Coordinate 'lon' is not provided and it will be set as index in element coords.", + # "Coordinate 'lat' is not provided and it will be set as index in element coords.", + # "Coordinate 'ftime' is not provided and it will be set as index in element coords." + # ) + # # varname, metadata, spatial coords, unknown dim + # expect_warning( + # SaveExp(data = dat1, ftime_dim = NULL, sdate_dim = NULL, memb_dim = NULL, + # dat_dim = NULL, var_dim = NULL, single_file = TRUE), + # "Parameter 'varname' is NULL. It will be assigned to 'X'.", + # "Parameter 'metadata' is not provided so the metadata saved will be incomplete.", + # paste0("Spatial coordinate names do not match any of the names accepted by ", + # "the package."), + # "Detected unknown dimension: test" + # ) + expect_error( + SaveExp(data = dat1, varname = 1, ftime_dim = NULL, sdate_dim = NULL, + memb_dim = NULL, dat_dim = NULL, var_dim = NULL), + paste0("Parameter 'varname' must be a character string with the ", + "variable names.") + ) +}) + +############################################## diff --git a/tests/testthat/test-CST_SplitDim.R b/tests/testthat/test-CST_SplitDim.R index 59f88597261fc0b3524d32570fef30cc2e49e78d..c2652477c71fc52616aa41e3fd799f5c02be1638 100644 --- a/tests/testthat/test-CST_SplitDim.R +++ b/tests/testthat/test-CST_SplitDim.R @@ -1,92 +1,133 @@ -context("Generic tests") -test_that("Sanity checks", { +context("CSTools::CST_SplitDim tests") + + +############################################## + +# dat1 +data1 <- 1 : 20 +dim(data1) <- c(time = 20) +data1 <- list(data = data1) +class(data1) <- 's2dv_cube' + +indices1 <- c(rep(1,5), rep(2,5), rep (3, 5), rep(4, 5)) +output1 <- matrix(data1$data, nrow = 5, ncol = 4) +names(dim(output1)) <- c('time', 'monthly') +output1 <- list(data = output1) +class(output1) <- 's2dv_cube' + +exp_cor <- 1 : 20 +dim(exp_cor) <- 20 +exp_cor <- list(data = exp_cor) +class(exp_cor) <- 's2dv_cube' + +# dat2 +output2 <- matrix(data1$data, nrow = 5, ncol = 4) +names(dim(output2)) <- c('time', 'index') +output2 <- list(data = output2) +class(output2) <- 's2dv_cube' + +time2 <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"), + seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 4), "days"), + seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 2), "days")) +attrs <- list(Dates = time2) +data2 <- list(data = data1$data, attrs = attrs) +class(data2) <- 's2dv_cube' + +# dat3 +time3 <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 8), "days"), + seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 8), "days"), + seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 4), "days")) +attrs <- list(Dates = time3) +data3 <- list(data = data1$data, attrs = attrs) +class(data3) <- 's2dv_cube' +output3 <- c(data3$data, rep(NA, 4)) +dim(output3) <- c(time = 8, monthly = 3) +result3 <- data3 +result3$data <- output3 + +# dat4 +data4 <- list(data = array(rnorm(10), dim = c(sdate = 2, lon = 5))) +class(data4) <- 's2dv_cube' + +############################################## + +test_that("1. 
Input checks", { expect_error( CST_SplitDim(data = 1), paste0("Parameter 'data' must be of the class 's2dv_cube', as output by ", - "CSTools::CST_Load.")) - - data <- 1 : 20 - dim(data) <- c(time = 20) - data <- list(data = data) - class(data) <- 's2dv_cube' + "CSTools::CST_Load.") + ) expect_error( - CST_SplitDim(data = data), + CST_SplitDim(data = data1), paste0("Parameter 'freq' must be a integer number indicating ", - " the length of each chunk.")) -indices <- c(rep(1,5), rep(2,5), rep (3, 5), rep(4, 5)) -output = matrix(data$data, nrow = 5, ncol = 4) -names(dim(output)) <- c('time', 'monthly') -output <- list(data = output) -class(output) <- 's2dv_cube' - expect_equal( - CST_SplitDim(data = data, indices = indices), output) -output = matrix(data$data, nrow = 5, ncol = 4) -names(dim(output)) <- c('time', 'index') -output <- list(data = output) -class(output) <- 's2dv_cube' - expect_equal( - CST_SplitDim(data = data, freq = 5), output) - -time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"), - seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 4), "days"), - seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 2), "days")) -data <- list(data = data$data, Dates = time) -class(data) <- 's2dv_cube' + " the length of each chunk.") + ) expect_error( - CST_SplitDim(data = data), + CST_SplitDim(data = data2), paste0("Parameter 'indices' has different length of parameter data ", - "in the dimension supplied in 'split_dim'.")) -time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 8), "days"), - seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 8), "days"), - seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 4), "days")) -data <- list(data = data$data, Dates = time) -class(data) <- 's2dv_cube' -output <- c(data$data, rep(NA, 4)) -dim(output) <- c(time = 8, monthly = 3) -result <- data -result$data <- output - - expect_equal( - CST_SplitDim(data = data), result) - - exp_cor <- 1 : 20 - dim(exp_cor) <- 20 - exp_cor <- list(data = exp_cor) - class(exp_cor) <- 's2dv_cube' + "in the dimension supplied in 'split_dim'.") + ) expect_error( CST_SplitDim(data = exp_cor, freq = 5), - "Parameter 'data' must have dimension names.") - # expect_error( - # CST_SplitDim(data, freq = 'x'), - # paste0("Parameter 'freq' must be numeric or a character: by 'day', ", - # "'month', 'year' or 'monthly' (for distinguishable month).")) - -library(CSTools) + "Parameter 'data' must have dimension names." 
+ ) expect_error( - CST_SplitDim(data = lonlat_temp$exp), - "Parameter 'split_dims' must be one of the dimension names in parameter 'data'.") - output <- lonlat_temp$exp$data - output <- abind(output[, , , 1, ,], output[, , , 2, ,], output[, , , 3, ,], along = 5) - dim(output) <- c(dataset = 1, member = 15, sdate = 6, ftime = 1, - lat = 22, lon = 53, monthly = 3) - result <- lonlat_temp$exp - result$data <- output - expect_equal(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'ftime'), - result) - - expect_equal(dim(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', - freq = 5)$data), - c(dataset = 1, member = 5, sdate = 6, ftime = 3, - lat = 22, lon = 53, index = 3)) - expect_warning(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', - freq = 5, new_dim_name = c('a', 'b')), - paste0("Parameter 'new_dim_name' has length greater than 1 ", - "and only the first elemenst is used.")) - expect_error(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', - freq = 5, new_dim_name = 3), - "Parameter 'new_dim_name' must be character string") - expect_equal(dim(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', - freq = 5, new_dim_name = 'wt')$data), - c(dataset = 1, member = 5, sdate = 6, ftime = 3, - lat = 22, lon = 53, wt = 3)) + CST_SplitDim(data = data4), + "Parameter 'split_dims' must be one of the dimension names in parameter 'data'." + ) }) + +############################################## + +test_that("2. Output checks", { + expect_equal( + CST_SplitDim(data = data1, indices = indices1), + output1 + ) + expect_equal( + CST_SplitDim(data = data1, freq = 5), + output2 + ) + expect_equal( + CST_SplitDim(data = data3), + result3 + ) +}) + +############################################## + +# test_that("3. Output checks: sample data", { +# output <- lonlat_temp$exp$data +# output <- abind(output[, , , 1, ,], output[, , , 2, ,], output[, , , 3, ,], along = 5) +# dim(output) <- c(dataset = 1, member = 15, sdate = 6, ftime = 1, +# lat = 22, lon = 53, monthly = 3) +# result <- lonlat_temp$exp +# result$data <- output +# expect_equal( +# CST_SplitDim(data = lonlat_temp$exp, split_dim = 'ftime'), +# result +# ) +# expect_equal( +# dim(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', +# freq = 5)$data), +# c(dataset = 1, member = 5, sdate = 6, ftime = 3, lat = 22, +# lon = 53, index = 3) +# ) +# expect_warning( +# CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', freq = 5, +# new_dim_name = c('a', 'b')), +# paste0("Parameter 'new_dim_name' has length greater than 1 ", +# "and only the first elemenst is used.") +# ) +# expect_error( +# CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', freq = 5, +# new_dim_name = 3), +# "Parameter 'new_dim_name' must be character string" +# ) +# expect_equal( +# dim(CST_SplitDim(data = lonlat_temp$exp, split_dim = 'member', +# freq = 5, new_dim_name = 'wt')$data), +# c(dataset = 1, member = 5, sdate = 6, ftime = 3, lat = 22, +# lon = 53, wt = 3) +# ) +# }) diff --git a/tests/testthat/test-CST_Subset.R b/tests/testthat/test-CST_Subset.R new file mode 100644 index 0000000000000000000000000000000000000000..cb567279f715e40de2d3b5116c2a04961fbc94c1 --- /dev/null +++ b/tests/testthat/test-CST_Subset.R @@ -0,0 +1,254 @@ +context("CSTools::CST_Subset tests") + +############################################## + +library(startR) + +############################################## + +test_that("1. 
Input checks: CST_Subset", { + # Check that x is s2dv_cube + expect_error( + CST_Subset(array(10)), + "Parameter 'x' must be of the class 's2dv_cube'." + ) + # Check var_dim + expect_error( + CST_Subset(lonlat_prec, var_dim = 1), + "Parameter 'var_dim' must be a character string." + ) + expect_error( + CST_Subset(lonlat_prec, var_dim = c('tas', 'psl')), + "Parameter 'var_dim' must be a character string." + ) + # Check dat_dim + expect_error( + CST_Subset(lonlat_prec, dat_dim = 1), + "Parameter 'dat_dim' must be a character string." + ) + expect_error( + CST_Subset(lonlat_prec, dat_dim = c('dat1', 'dat2')), + "Parameter 'dat_dim' must be a character string." + ) +}) + +############################################## + +test_that("2. Output checks: CST_Subset", { + res1 <- CST_Subset(lonlat_prec, along = c('lat', 'lon', 'sdate', 'member'), + indices = list(1, 1:2, 1, 1:2), + drop = 'all') + # Check dimensions + expect_equal( + dim(res1$data), + res1$dims + ) + expect_equal( + dim(res1$data), + c(member = 2, ftime = 31, lon = 2) + ) + # Check coordinates + expect_equal( + names(res1$coords), + c("member", "ftime", "lon") + ) + # Check attrs + expect_equal( + names(res1$attrs), + names(lonlat_prec$attrs) + ) + expect_equal( + names(res1$attrs$Variable$metadata), + c("lon", "prlr") + ) + expect_equal( + res1$attrs$Datasets, + c("exp1") + ) + # Check 'dat_dim' + res2 <- CST_Subset(lonlat_prec, along = c('lat'), indices = list(1), + drop = 'all', dat_dim = 'dataset') + res3 <- CST_Subset(lonlat_prec, along = c('lat'), indices = list(1), + drop = 'selected', dat_dim = 'dataset') + res4 <- CST_Subset(lonlat_prec, along = c('dataset'), indices = list(1), + drop = 'all', dat_dim = 'dataset') + res5 <- CST_Subset(lonlat_prec, along = c('dataset'), indices = list(1), + drop = 'selected', dat_dim = 'dataset') + expect_equal( + res2$attrs$Datasets, + res3$attrs$Datasets + ) + expect_equal( + length(res4$attrs$Datasets), + length(res5$attrs$Datasets) + ) + # Check 'Dates' + res6 <- CST_Subset(lonlat_prec, along = c('sdate', 'ftime'), + indices = list(1, 1:10), drop = 'selected') + res7 <- CST_Subset(lonlat_prec, along = c('sdate', 'ftime'), + indices = list(1, 1:10), drop = 'none') + # Dates dimensions + expect_equal( + dim(res6$attrs$Dates), + res6$dims[which(names(dim(res6$data)) %in% c('sdate', 'ftime'))] + ) + expect_equal( + dim(res7$attrs$Dates), + c(ftime = 10, sdate = 1) + ) + # sdates coordinates + expect_equal( + names(res6$coords), + c("dataset", "member", "ftime", "lat", "lon") + ) + expect_equal( + as.vector(res7$coords$sdate), + c("20101101") + ) +}) + +############################################## + +repos1 <- "/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc" +repos2 <- "/esarchive/exp/ecmwf/system4_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc" + +suppressWarnings( + data <- Start(dat = list(list(name = 'system4_m1', path = repos2), + list(name = 'system5_m1', path = repos1)), + var = c('tas', 'sfcWind'), + sdate = c('20170101'), + ensemble = indices(1), + time = indices(1:3), + lat = indices(1:10), + lon = indices(1:10), + synonims = list(lat = c('lat', 'latitude'), + lon = c('lon', 'longitude')), + return_vars = list(time = 'sdate', + longitude = 'dat', + latitude = 'dat'), + metadata_dims = c('dat', 'var'), + retrieve = T) +) +suppressWarnings( + exp_start <- as.s2dv_cube(data) +) + +############################################## + +test_that("3. 
Output checks with Start", { + res8 <- CST_Subset(exp_start, along = c('lon', 'sdate', 'ensemble', 'var', 'dat'), + indices = list(1:2, 1, 1, 1, 1), + drop = 'none') + res9 <- CST_Subset(exp_start, along = c('lon', 'sdate', 'ensemble', 'var', 'dat'), + indices = list(1:2, 1, 1, 1, 1), + drop = FALSE, var_dim = 'var', dat_dim = 'dat') + res10 <- CST_Subset(exp_start, along = c('lon', 'sdate', 'ensemble', 'var', 'dat'), + indices = list(1:2, 1, 1, 1, 1), + drop = 'selected', var_dim = 'var', dat_dim = 'dat') + # Check dimensions + expect_equal( + dim(res8$data), + c(dat = 1, var = 1, sdate = 1, ensemble = 1, time = 3, lat = 10, lon = 2) + ) + expect_equal( + dim(res8$data), + dim(res9$data) + ) + expect_equal( + dim(res10$data), + c(time = 3, lat = 10, lon = 2) + ) + # Check coordinates + expect_equal( + names(res8$coords), + names(res8$dims) + ) + expect_equal( + names(res9$coords), + names(res9$dims) + ) + # varName + expect_equal( + res8$attrs$Variable$varName, + c("tas", "sfcWind") + ) + expect_equal( + res9$attrs$Variable$varName, + c("tas") + ) + expect_equal( + res10$attrs$Variable$varName, + NULL + ) + # metadata + expect_equal( + names(res8$attrs$Variable$metadata), + c("time", "lat", "lon", "tas", "sfcWind") + ) + expect_equal( + names(res9$attrs$Variable$metadata), + c("time", "lat", "lon", "tas") + ) + expect_equal( + names(res10$attrs$Variable$metadata), + c("time", "lat", "lon") + ) + # Datasets + expect_equal( + res8$attrs$Datasets, + c("system4_m1", "system5_m1") + ) + expect_equal( + res9$attrs$Datasets, + c("system4_m1") + ) + expect_equal( + length(res10$attrs$Datasets), + 0 + ) + # Check source_files + expect_equal( + dim(res8$attrs$source_files), + c(dat = 1, var = 1, sdate = 1) + ) + expect_equal( + dim(res9$attrs$source_files), + c(dat = 1, var = 1, sdate = 1) + ) + expect_equal( + dim(res10$attrs$source_files), + c(1) + ) +}) + +############################################## + +test_that("3. 
Output checks with Start", { + res11 <- CST_Subset(exp_start, along = c("dat", "lon", 'time', 'var'), + indices = list(1, 1:2, 1:2, 1), dat_dim = 'dat', + var_dim = 'var', drop = 'non-selected') + expect_equal( + dim(res11$data), + c(dat = 1, var = 1, time = 2, lat = 10, lon = 2) + ) + expect_equal( + names(res11$coords), + names(res11$dims) + ) + expect_equal( + dim(res11$attrs$Dates), + c(time = 2) + ) + expect_equal( + dim(res11$coords$time), + c(sdate = 1, time = 2) + ) + expect_equal( + dim(res11$attrs$source_files), + c(var = 1) + ) + expect_equal( + names(res11$attrs$Variable$metadata), + c("time", "lat", "lon", "tas") + ) +}) diff --git a/tests/testthat/test-CST_WeatherRegimes.R b/tests/testthat/test-CST_WeatherRegimes.R index 5f2967a172f393659dea3c93f8f1cc06986a9099..ebf8730de07460b2d92cde998ca211c6a9212503 100644 --- a/tests/testthat/test-CST_WeatherRegimes.R +++ b/tests/testthat/test-CST_WeatherRegimes.R @@ -1,72 +1,121 @@ context("Generic tests") -test_that("Sanity checks", { - expect_error( + +############################################## +# dat1 +data <- rnorm(2 * 10 * 4 * 5 * 6 * 7) +dim(data) <- c(dataset = 2, member = 10, + sdate = 4, ftime = 5, lat = 6, lon = 7) +lon <- seq(0, 12, 2) +lat <- seq(10, 15, 1) +coords <- list(lon = lon, lat = lat) +exp <- list(data = data, coords = coords) +attr(exp, "class") <- "s2dv_cube" + +# dat2 +exp2 <- exp +exp2$attrs <- list(source_files = 'exp') +exp2$coords <- NULL +exp2_2 <- exp2 +exp2_2$coords <- list(long = seq(1:4), lati = seq(1:4)) + +# data1 +data1 <- 1 : 400 +dim(data1) <- c(time = 20, lat = 5, lon = 4) +data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:4)) +class(data1) <- 's2dv_cube' + +############################################## +test_that("1. Input checks", { + expect_error( CST_WeatherRegimes(data = 1), paste0("Parameter 'data' must be of the class 's2dv_cube', as output by ", - "CSTools::CST_Load.")) - + "CSTools::CST_Load.") + ) + # Check 'exp' object structure + expect_error( + CST_WeatherRegimes(exp2), + paste0("Parameter 'data' must have 'data' and 'coords' elements ", + "within the 's2dv_cube' structure.") + ) + expect_error( + CST_WeatherRegimes(exp2_2, ncenters = 3), + paste0("Spatial coordinate names do not match any of the names accepted ", + "the package.") + ) + expect_error( + WeatherRegime(array(rnorm(8400), dim = c(member = 10, sdate = 4, ftime = 5, + lati = 6, loni = 7)), + lat = seq(1:5), lon = seq(1:6), ncenters = 3), + paste0("Spatial coordinate names do not match any of the names accepted by ", + "the package.") + ) data1 <- 1 : 20 - data1 <- list(data = data1) + data1 <- list(data = data1, coords = list(lat = 1, lon = 1)) class(data1) <- 's2dv_cube' expect_error( CST_WeatherRegimes(data = data1), - paste0("Parameter 'data' must be an array with named dimensions.")) - + paste0("Parameter 'data' must be an array with named dimensions.") + ) data1 <- 1 : 20 dim(data1) <- c(lat = 5, lon = 4) - data1 <- list(data = data1 , lat = 1:5) + data1 <- list(data = data1 , coords = list(lat = 1:5, lon = 1:4)) class(data1) <- 's2dv_cube' expect_error( CST_WeatherRegimes(data = data1), - paste0("Parameter 'data' must have temporal dimensions.")) - - data1 <- 1 : 20 - dim(data1) <- c(time = 20) - data1 <- list(data = data1) - class(data1) <- 's2dv_cube' - expect_error( - CST_WeatherRegimes(data = data1) , - paste0("Parameter 'lat' must be specified.")) - + paste0("Parameter 'data' must have temporal dimensions.") + ) data1 <- 1 : 400 dim(data1) <- c(time = 20, lat = 5, lon = 4) - data1 <- list(data = 
data1, lat = 1:5) + data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:4)) class(data1) <- 's2dv_cube' expect_error( CST_WeatherRegimes(data = data1), - paste0("Parameter 'ncenters' must be specified.")) - + paste0("Parameter 'ncenters' must be specified.") + ) expect_error( - CST_WeatherRegimes(data = data1, ncenters = 3), - paste0("Parameter 'lon' must be specified.")) - + WeatherRegime(data = data1$data, ncenters = 3), + paste0("Parameter 'lon' must be specified.") + ) + expect_error( + WeatherRegime(data = data1$data, lon = data1$coords$lon, ncenters = 3), + paste0("Parameter 'lat' must be specified.") + ) +}) + +############################################## + +test_that("2. Output checks", { expect_equal( names(dim(CST_WeatherRegimes(data = data1, ncenters = 3, EOFs = FALSE)$data)), - c('lat', 'lon', 'cluster')) - + c('lat', 'lon', 'cluster') + ) data1 <- 1 : 400 dim(data1) <- c(sdate = 2, ftime = 10, lat = 5, lon = 4) - data1 <- list(data = data1, lat = 1:5) + data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:4)) class(data1) <- 's2dv_cube' - nclusters <- 3 - - expect_equal( - dim(CST_WeatherRegimes(data = data1 , - ncenters = nclusters, - EOFs = FALSE)$statistics$frequency), c(2, nclusters)) - expect_equal( - names(dim(CST_WeatherRegimes(data = data1, nclusters, EOFs = FALSE)$data)), - c('lat', 'lon', 'cluster')) - + nclusters <- 3 + suppressWarnings( + expect_equal( + dim(CST_WeatherRegimes(data = data1 , + ncenters = nclusters, + EOFs = FALSE)$statistics$frequency), + c(2, nclusters) + ) + ) + suppressWarnings( + expect_equal( + names(dim(CST_WeatherRegimes(data = data1, nclusters, EOFs = FALSE)$data)), + c('lat', 'lon', 'cluster') + ) + ) data1 <- 1 : 400 dim(data1) <- c(sdate = 2, ftime = 10, lat = 5, lon = 4) - data1 <- list(data = data1, lat = 1:5 ,lon = 1:4) + data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:4)) class(data1) <- 's2dv_cube' - expect_equal( names(CST_WeatherRegimes(data = data1 , ncenters = 4)$statistics), - c('pvalue', 'cluster', 'frequency', 'persistence')) - + c('pvalue', 'cluster', 'frequency', 'persistence') + ) expect_equal( names(CST_WeatherRegimes(data = data1 , ncenters = 4, method = 'ward.D')$statistics), c('pvalue', 'cluster')) @@ -77,27 +126,29 @@ test_that("Sanity checks", { data1 <- 1 : 400 dim(data1) <- c(time = 20, lat = 5, lon = 4) - data1[4,,] <- NA - data1 <- list(data = data1, lat = 1:5 ,lon = 1:4) + data1[4, , ] <- NA + data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:4)) class(data1) <- 's2dv_cube' expect_error( CST_WeatherRegimes(data = data1, ncenters = 3, EOFs = FALSE), - paste0("Parameter 'data' contains NAs in the 'time' dimensions.")) - + paste0("Parameter 'data' contains NAs in the 'time' dimensions.") + ) data1 <- 1 : 400 dim(data1) <- c(time = 20, lat = 5, lon = 4) - data1[,2,3] <- NA - data1 <- list(data = data1, lat = 1:5 ,lon = 1:4) + data1[, 2, 3] <- NA + data1 <- list(data = data1, coords = list(lat = 1:5, lon = 1:4)) class(data1) <- 's2dv_cube' expect_equal( any(is.na(CST_WeatherRegimes(data = data1, ncenters = 3, EOFs = FALSE)$data)), - TRUE) + TRUE + ) expect_equal( names(dim(CST_WeatherRegimes(data = data1, ncenters = 3, EOFs = FALSE)$data)), - c('lat', 'lon', 'cluster')) + c('lat', 'lon', 'cluster') + ) }) - +############################################## diff --git a/tests/testthat/test-PlotWeeklyClim.R b/tests/testthat/test-PlotWeeklyClim.R new file mode 100644 index 0000000000000000000000000000000000000000..e7886fbc6836b809ca039dc577a37a65fd42037b --- /dev/null +++ 
b/tests/testthat/test-PlotWeeklyClim.R @@ -0,0 +1,90 @@ +context("CSTools::PlotWeeklyClim tests") + +############################################## + +# dat1 +dat1 <- array(rnorm(1*7), dim = c(dat = 1, var = 1, sdate = 1, time = 7)) + +############################################## + +test_that("1. Input checks", { + # data + expect_error( + PlotWeeklyClim(data = array(1:92), first_date = '2020-03-01', + ref_period_ini = 1993, ref_period_end = 2021), + "Parameter 'data' must have named dimensions." + ) + expect_error( + PlotWeeklyClim(data = data.frame(week = 1:92), first_date = '2020-03-01', + ref_period_ini = 1993, ref_period_end = 2021), + paste0("If parameter 'data' is a data frame, it must contain the ", + "following column names: 'week', 'clim', 'p10', 'p90', 'p33', ", + "'p66', 'week_mean', 'day' and 'data'.") + ) + expect_error( + PlotWeeklyClim(data = 1:92, first_date = '2020-03-01', + ref_period_ini = 1993, ref_period_end = 2021), + "Parameter 'data' must be an array or a data frame." + ) + # time_dim + expect_error( + PlotWeeklyClim(data = dat1, first_date = '2020-03-01', + ref_period_ini = 2020, ref_period_end = 2020, time_dim = 1), + "Parameter 'time_dim' must be a character string." + ) + expect_error( + PlotWeeklyClim(data = array(rnorm(1), dim = c(dat = 1)), + first_date = '2020-03-01', ref_period_ini = 2020, + ref_period_end = 2020), + "Parameter 'time_dim' is not found in 'data' dimension." + ) + expect_error( + PlotWeeklyClim(data = array(rnorm(1*7), dim = c(time = 6)), + first_date = '2020-03-01', ref_period_ini = 2020, + ref_period_end = 2020), + paste0("Parameter 'data' must have the dimension 'time_dim' of length ", + "equal or grater than 7 to compute the weekly means.") + ) + # sdate_dim + expect_error( + PlotWeeklyClim(data = dat1, first_date = '2020-03-01', + ref_period_ini = 2020, ref_period_end = 2020, + sdate_dim = 1), + "Parameter 'sdate_dim' must be a character string." + ) + expect_warning( + PlotWeeklyClim(data = array(rnorm(7), dim = c(time = 7)), + first_date = '2020-03-01', ref_period_ini = 2020, + ref_period_end = 2020), + paste0("Parameter 'sdate_dim' is not found in 'data' dimension. ", + "A dimension of length 1 has been added.") + ) + # ref_period_ini + expect_error( + PlotWeeklyClim(data = dat1, first_date = '2020-03-01', + ref_period_ini = "2020", ref_period_end = 2020), + "Parameters 'ref_period_ini' and 'ref_period_end' must be numeric." + ) + # first_date + expect_error( + PlotWeeklyClim(data = dat1, first_date = 2020-03-01, + ref_period_ini = 2020, ref_period_end = 2020), + paste0("Parameter 'first_date' must be a character string ", + "indicating the date in the format 'yyyy-mm-dd', 'POSIXct' ", + "or 'Dates' class.") + ) + expect_error( + PlotWeeklyClim(data = dat1, first_date = 'a', + ref_period_ini = 2020, ref_period_end = 2020), + paste0("Parameter 'first_date' must be a character string ", + "indicating the date in the format 'yyyy-mm-dd', 'POSIXct' ", + "or 'Dates' class.") + ) + expect_error( + PlotWeeklyClim(data = dat1, first_date = '2020-03-01', ref_period_ini = 2021, + ref_period_end = 2022), + "Parameter 'first_date' must be a date included in the reference period." 
+ ) +}) + +############################################## \ No newline at end of file diff --git a/tests/testthat/test-as.s2dv_cube.R b/tests/testthat/test-as.s2dv_cube.R new file mode 100644 index 0000000000000000000000000000000000000000..5d6303d583b53f61435a57d57917cc8eccbbf92a --- /dev/null +++ b/tests/testthat/test-as.s2dv_cube.R @@ -0,0 +1,341 @@ +context("CSTools::as.s2dv_cube tests") + +############################################## +library(startR) +library(s2dv) +############################################## + +test_that("1. Input checks", { + expect_error( + as.s2dv_cube(object = array(1:10, dim = c(sdate = 2, lat = 5))), + paste0("The class of parameter 'object' is not implemented", + " to be converted into 's2dv_cube' class yet.") + ) + expect_error( + as.s2dv_cube(object = as.list(1:11)), + paste0("The s2dv::Load call did not return any data.") + ) +}) + +############################################## + +test_that("2. Tests from Load()", { + startDates <- c('20001101', '20011101') + suppressWarnings( + ob1 <- Load(var = 'tas', exp = 'system5c3s', + nmember = 2, sdates = startDates, + leadtimemax = 3, latmin = 30, latmax = 35, + lonmin = 10, lonmax = 20, output = 'lonlat') + ) + res1 <- as.s2dv_cube(ob1) + + # dimensions + expect_equal( + dim(res1$data), + c(dataset = 1, member = 2, sdate = 2, ftime = 3, lat = 6, lon = 11) + ) + # elements + expect_equal( + names(res1), + c("data", "dims", "coords", "attrs") + ) + expect_equal( + names(res1$attrs), + c("Variable", "Datasets", "Dates", "when", "source_files", + "not_found_files", "load_parameters") + ) + # coordinates + expect_equal( + attributes(res1$coords$sdate), + list(indices = FALSE) + ) + expect_equal( + attributes(res1$coords$ftime), + list(indices = TRUE) + ) + # Dates + expect_equal( + dim(res1$attrs$Dates), + c(ftime = 3, sdate = 2) + ) +}) + +############################################## + +test_that("3. Tests from Load()", { + obs_path <- list(name = "ERA5", + path = "/esarchive/recon/ecmwf/era5/$STORE_FREQ$_mean/$VAR_NAME$_f1h/$VAR_NAME$_$YEAR$$MONTH$.nc") + ob2 <- Load(var = 'windagl100', obs = list(obs_path), + sdates = '20180301', nmember = 1, + leadtimemin = 1, leadtimemax = 1, + storefreq = "monthly", sampleperiod = 1, + latmin = 36, latmax = 38, lonmin = 0, lonmax = 4, + output = 'lonlat', nprocs = 1, grid = 'r360x181') + + res2 <- as.s2dv_cube(ob2) + + # dimensions + expect_equal( + dim(res2$data), + c(dataset = 1, member = 1, sdate = 1, ftime = 1, lat = 3, lon = 5) + ) + # elements + expect_equal( + names(res2$attrs), + c("Variable", "Datasets", "Dates", "when", "source_files", + "not_found_files", "load_parameters") + ) + # coordinates + expect_equal( + attributes(res2$coords$sdate), + list(indices = FALSE) + ) + expect_equal( + unlist(res2$coords)[1:4], + c(dataset = "ERA5", member = "1", sdate = "20180301", ftime = "1") + ) + # Dates + expect_equal( + dim(res2$attrs$Dates), + c(ftime = 1, sdate = 1) + ) +}) + +############################################## + +test_that("4. 
Tests from Load()", { + exp <- list(name = 'ecmwfS5', + path = "/esarchive/exp/ecmwf/system5c3s/$STORE_FREQ$_mean/$VAR_NAME$_s0-24h/$VAR_NAME$_$START_DATE$.nc") + obs <- list(name = 'era5', + path = '/esarchive/recon/ecmwf/era5/$STORE_FREQ$_mean/$VAR_NAME$_f1h-r1440x721cds/$VAR_NAME$_$YEAR$$MONTH$.nc') + suppressWarnings( + ob3 <- Load(var = 'prlr', exp = list(exp), obs = list(obs), + sdates = paste0(1993:1995, '1101'), nmember = 1, + storefreq = "monthly", sampleperiod = 1, + latmin = 42, latmax = 45, lonmin = 4, lonmax = 6, + output = 'lonlat', nprocs = 1) + ) + expect_warning( + as.s2dv_cube(ob3), + "The output is a list of two 's2dv_cube' objects corresponding to 'exp' and 'obs'." + ) + suppressWarnings( + res3 <- as.s2dv_cube(ob3) + ) + + # dimensions + expect_equal( + dim(res3[[1]]$data), + c(dataset = 1, member = 1, sdate = 3, ftime = 8, lat = 4, lon = 3) + ) + expect_equal( + unlist(res3[[1]]$coords)[1:4], + c(dataset = "ecmwfS5", member = "1", sdate1 = "19931101", sdate2 = "19941101") + ) + # Dates + expect_equal( + dim(res3[[1]]$attrs$Dates), + dim(res3[[2]]$attrs$Dates) + ) +}) + +############################################## + +test_that("5. Tests from Start()", { + repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc' + suppressWarnings( + data1 <- Start(dat = repos, + var = 'tas', + sdate = c('20170101', '20180101'), + ensemble = indices(1:3), + time = 'all', + latitude = indices(1:10), + longitude = indices(1:10), + return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'), + retrieve = TRUE) + ) + + res4 <- as.s2dv_cube(data1) + + # dimensions + expect_equal( + dim(res4$data), + c(dat = 1, var = 1, sdate = 2, ensemble = 3, time = 7, latitude = 10, longitude = 10) + ) + # elements + expect_equal( + names(res4), + c("data", "dims", "coords", "attrs") + ) + expect_equal( + names(res4$attrs), + c("Dates", "Variable", "Datasets", "when", "source_files", "load_parameters") + ) + # coordinates + expect_equal( + names(res4$coords), + c("dat", "var", "sdate", "ensemble", "time", "latitude", "longitude") + ) + # Dates + expect_equal( + dim(res4$attrs$Dates), + c(sdate = 2, time = 7) + ) +}) + +############################################## + +test_that("6. 
Tests from Start()", { + vari <- "rsds" + anlgs <- paste0("/esarchive/oper/VITIGEOSS","/output/cfsv2/weekly_mean/", + "$var$/$var$-vitigeoss-cat","_1999-2018_", "$file_date$.nc") + + file_date_array <- array(dim = c(sweek = 2, sday = 3)) + file_date_array[, 1] <- c(paste0('04', c('04', '07'))) + file_date_array[, 2] <- c(paste0('04', c('07', '11'))) + file_date_array[, 3] <- c(paste0('04', c('11', '14'))) + + suppressWarnings( + hcst <- Start(dat = anlgs, + var = vari, + latitude = indices(1:4), #'all', + longitude= indices(1:4), #'all', + member= indices(1), #'all', + time = 'all', + syear = indices(1:4), + file_date = file_date_array, + split_multiselected_dims = TRUE, + retrieve = T, + return_vars = list(leadtimes = 'file_date', longitude = 'dat', latitude = 'dat'), + synonims = list(longitude = c('lon', 'longitude'), + latitude = c('lat', 'latitude'), + syear = c('sdate','syear'), + member = c('ensemble','member'))) + ) + + res5 <- as.s2dv_cube(hcst) + + # dimensions + expect_equal( + dim(res5$data), + c(dat = 1, var = 1, latitude = 4, longitude = 4, member = 1, time = 4, + syear = 4, sweek = 2, sday = 3) + ) + # elements + expect_equal( + names(res5), + c("data", "dims", "coords", "attrs") + ) + expect_equal( + names(res5$attrs), + c("Variable", "Dates", "Datasets", "when", "source_files", "load_parameters") + ) + # coordinates + expect_equal( + names(res5$coords), + c('dat', 'var', 'latitude', 'longitude', 'member', 'time', 'syear', 'sweek', 'sday') + ) + # Dates + expect_equal( + dim(res5$attrs$Dates), + c(sweek = 2, sday = 3, syear = 20, time = 4) + ) +}) + +############################################## + +test_that("7. Tests from Start()", { + repos <- "/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc" + repos2 <- "/esarchive/exp/ecmwf/system4_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc" + + suppressWarnings( + data6 <- Start(dat = list(list(name = 'system4_m1', path = repos2), + list(name = 'system5_m1', path = repos)), + var = c('tas', 'sfcWind'), + sdate = '20170101', + ensemble = indices(1), + time = indices(1), + lat = indices(1:5), + lon = indices(1:5), + synonims = list(lat = c('lat', 'latitude'), + lon = c('lon', 'longitude')), + return_vars = list(time = 'sdate', + longitude = 'dat', + latitude = 'dat'), + metadata_dims = c('dat', 'var'), + retrieve = T) + ) + + suppressWarnings( + res6 <- as.s2dv_cube(data6) + ) + + # dimensions + expect_equal( + dim(res6$data), + c(dat = 2, var = 2, sdate = 1, ensemble = 1, time = 1, lat = 5, lon = 5) + ) + # elements + expect_equal( + names(res6), + c("data", "dims", "coords", "attrs") + ) + expect_equal( + res6$attrs$Variable$varName, + c('tas', 'sfcWind') + ) + # coordinates + expect_equal( + names(res6$coords), + c('dat', 'var', 'sdate', 'ensemble', 'time', 'lat', 'lon') + ) + # Dates + expect_equal( + dim(res6$attrs$Dates), + c(sdate = 1, time = 1) + ) +}) + +############################################## + +test_that("8. 
Tests from Start()", { + path <- paste0('/esarchive/exp/ecearth/a3t4/diags/CMIP/EC-Earth-Consortium/EC-Earth3-LR/piControl/$memb$/Omon/$var$/gn/', + 'v*/$var$_Omon_EC-Earth3-LR_piControl_$memb$_gn_$chunk$.nc') + suppressWarnings( + data7 <- Start(dat = list(list(name = 'a3t4', path = path)), + var = 'tosmean', + memb = paste0('r', 1:5, 'i1p1f1'), + region = c("ATL3", "Global_Ocean", "Nino3.4"), + time = indices(1:10), + chunk = 'all', + time_across = 'chunk', + merge_across_dims = TRUE, + return_vars = list(time = 'chunk', region = NULL), + num_procs = 8, + retrieve = T) + ) + + res7 <- as.s2dv_cube(data7) + + # dimensions + expect_equal( + dim(res7$data), + c(dat = 1, var = 1, memb = 5, region = 3, time = 10) + ) + # elements + expect_equal( + names(res7), + c("data", "dims", "coords", "attrs") + ) + expect_equal( + res7$attrs$Variable$varName, + c('tosmean') + ) + # Dates + expect_equal( + dim(res7$attrs$Dates), + c(time = 10) + ) +}) + +############################################## \ No newline at end of file diff --git a/tests/testthat/test-s2dv_cube.R b/tests/testthat/test-s2dv_cube.R new file mode 100644 index 0000000000000000000000000000000000000000..5737486ce4637670d829241c1a19a2e6114d414e --- /dev/null +++ b/tests/testthat/test-s2dv_cube.R @@ -0,0 +1,135 @@ +context("CSTools::s2dv_cube tests") + +############################################## + +# dat1 +dat1 <- array(1:5, dim = c(var = 1, sdate = 5)) +coords1 <- list(sdate = c('20000101', '20010102', '20020103', '20030104', '20040105'), + var = 'tas') +dates1 <- as.POSIXct(coords1[[1]], format = "%Y%m%d", tz = "UTC") +dim(dates1) <- c(sdate = 5) + +############################################## + +test_that("1. Input checks", { + # data + expect_error( + s2dv_cube(data = 1:10), + "Parameter 'data' must be an array with named dimensions." + ) + # coords + expect_warning( + s2dv_cube(data = dat1, coords = list(sdate = c('20000102', '20010202'), var = 'tas'), + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = dates1), + "Coordinate 'sdate' has different lenght as its dimension and it will not be used." + ) + expect_warning( + s2dv_cube(data = dat1, coords = list(sdate = coords1[[1]]), + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = dates1), + "Coordinate 'var' is not provided and it will be set as index in element coords." 
+ ) + # Dates + expect_warning( + s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m'))), + paste0("Parameter 'Dates' is not provided so the metadata ", + "of 's2dv_cube' object will be incomplete.") + ) + expect_warning( + s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = c('20000101', '20010102', '20020103', '20030104', '20040105')), + paste0("Parameter 'Dates' must be an array with named time dimensions.") + ) + expect_warning( + s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = array(c('20000101', '20010102', '20020103', '20030104', '20040105'))), + paste0("Parameter 'Dates' must have dimension names.") + ) + expect_warning( + s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = array(c('20000101', '20010102', '20020103', '20030104', '20040105'), + dim = c(time = 5))), + paste0("Parameter 'Dates' must have the corresponding time dimension names in 'data'.") + ) + expect_warning( + s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = array(c('20000101', '20010102', '20020103', '20030104', '20040105'), + dim = c(sdate = 5))), + paste0("Parameter 'Dates' must be of class 'POSIXct'.") + ) + # varName + expect_warning( + s2dv_cube(data = dat1, coords = coords1, + metadata = list(tas = list(level = '2m')), + Dates = dates1), + paste0("Parameter 'varName' is not provided so the metadata ", + "of 's2dv_cube' object will be incomplete.") + ) + expect_warning( + s2dv_cube(data = dat1, coords = coords1, varName = 1, + metadata = list(tas = list(level = '2m')), + Dates = dates1), + "Parameter 'varName' must be a character." + ) + # metadata + expect_warning( + s2dv_cube(data = dat1, coords = coords1, varName = 'tas', + Dates = dates1), + "Parameter 'metadata' is not provided so the metadata of 's2dv_cube' object will be incomplete." + ) +}) + +############################################## + +test_that("2. 
Output checks", { + expect_equal( + names(s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = dates1)), + c("data", "dims", "coords", "attrs") + ) + expect_equal( + names(s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = dates1)$coords), + c("sdate", "var") + ) + expect_equal( + names(s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = dates1)$attrs), + c("Dates", "Variable") + ) + expect_equal( + names(s2dv_cube(data = dat1, coords = coords1, + varName = 'tas', metadata = list(tas = list(level = '2m')), + Dates = dates1, Datasets = 'dat1', + when = as.POSIXct("2022-12-21 17:13"), + source_files = "/home/R")$attrs), + c("Dates", "Variable", "Datasets", "when", "source_files") + ) + suppressWarnings( + object <- s2dv_cube(data = dat1, coords = list(sdate = coords1[[1]]), varName = 'tas', + metadata = list(tas = list(level = '2m')), + Dates = dates1) + ) + expect_equal( + attributes(object$coords$sdate), + list(indices = FALSE) + ) + expect_equal( + attributes(object$coords$var), + list(indices = TRUE) + ) + +}) + +############################################## + + diff --git a/vignettes/Analogs_vignette.Rmd b/vignettes/Analogs_vignette.Rmd index d7527657280fb9a11c1dd9b49493ce30fba81a61..674dccac5e42d414b1b611b83c705fa06d4c39ab 100644 --- a/vignettes/Analogs_vignette.Rmd +++ b/vignettes/Analogs_vignette.Rmd @@ -2,6 +2,8 @@ title: "Analogs based on large scale for downscaling" author: "M. Carmen Alvarez-Castro and M. del Mar Chaves-Montero (CMCC, Italy)" date: "November 2020" +revisor: "Eva Rifà" +revision date: "January 2023" output: rmarkdown::html_vignette vignette: > %\VignetteEngine{knitr::knitr} @@ -15,50 +17,22 @@ knitr::opts_chunk$set(eval = FALSE) --> ## Downscaling seasonal forecast data using Analogs -In this example, the seasonal temperature forecasts, initialized in october, -will be used to perform a downscaling in the Balearic Islands temperature using -the cmcc system 3 seasonal forecasting system from the Euro-Mediterranean Center -of Climate Change (CMCC), by computing Analogs in Sea level pressure data (SLP) -in a larger region (North Atlantic). The first step will be to load the data we -want to downscale (i.e. cmcc) in the large region (i.e North Atlantic) for -temperature (predictand) and SLP (predictor) and same variables and region for a -higher resolution data (ERA5). In a second step we will interpolate the model to -the resolution of ERA5. In a third step we will find the analogs using one of -the three criterias. In a four step we will get the downscaled dataset in the -region selected (local scale, in this case Balearic Islands) +In this example, the seasonal temperature forecasts, initialized in october, will be used to perform a downscaling in the Balearic Islands temperature using the cmcc system 3 seasonal forecasting system from the Euro-Mediterranean Center of Climate Change (CMCC), by computing Analogs in Sea level pressure data (SLP) in a larger region (North Atlantic). The first step will be to load the data we want to downscale (i.e. cmcc) in the large region (i.e North Atlantic) for temperature (predictand) and SLP (predictor) and same variables and region for a higher resolution data (ERA5). In a second step we will interpolate the model to the resolution of ERA5. In a third step we will find the analogs using one of the three criterias. 
In a four step we will get the downscaled dataset in the region selected (local scale, in this case Balearic Islands). ## 1. Introduction of the function -For instance if we want to perform a temperature donwscaling in Balearic Island -for October we will get a daily series of temperature with 1 analog per day, -the best analog. How we define the best analog for a certain day? This function -offers three options for that: +For instance if we want to perform a temperature donwscaling in Balearic Island for October we will get a daily series of temperature with 1 analog per day, +the best analog. How we define the best analog for a certain day? This function offers three options for that: -(1) The day with the minimum Euclidean distance in a large scale field: using -i.e. pressure or geopotencial height as variables and North Atlantic region as -large scale region. The Atmospheric circulation pattern in the North Atlantic -(LargeScale) has an important role in the climate in Spain (LocalScale). -The function will find the day with the most similar pattern in atmospheric -circulation in the database (obs, slp in ERA5) to the day of interest -(exp,slp in model). Once the date of the best analog is found, the function -takes the associated temperature to that day (obsVar, tas in ERA5), with a -subset of the region of interest (Balearic Island) +(1) The day with the minimum Euclidean distance in a large scale field: using i.e. pressure or geopotencial height as variables and North Atlantic region as large scale region. The Atmospheric circulation pattern in the North Atlantic (LargeScale) has an important role in the climate in Spain (LocalScale). The function will find the day with the most similar pattern in atmospheric circulation in the database (obs, slp in ERA5) to the day of interest (exp, slp in model). Once the date of the best analog is found, the function takes the associated temperature to that day (obsVar, tas in ERA5), with a subset of the region of interest (Balearic Island). -(2) Same that (1) but in this case we will search for analogs in the local -scale (Balearic Island) instead of in the large scale (North Atlantic). -Once the date of the best analog is found, the function takes the associated -temperature to that day (obsVar, t2m in ERA5), with a subset of the region of -interest (Balearic Island) +(2) Same that (1) but in this case we will search for analogs in the local scale (Balearic Island) instead of in the large scale (North Atlantic). Once the date of the best analog is found, the function takes the associated temperature to that day (obsVar, t2m in ERA5), with a subset of the region of interest (Balearic Island). -(3) Same that (2) but here we will search for analogs with higher correlation -at local scale (Balearic Island) and instead of using SLP we will use t2m. +(3) Same that (2) but here we will search for analogs with higher correlation at local scale (Balearic Island) and instead of using SLP we will use t2m. +In particular the _Analogs Method_ uses a nonlinear approach that follows (**Analogs**; Yiou et al. 2013). -In particular the _Analogs Method_ uses a nonlinear approach that follows -(**Analogs**; Yiou et al. 2013) - -An efficient implementation of Analogs is provided for CSTools by the -`CST_Analogs()` function. +An efficient implementation of Analogs is provided for CSTools by the `CST_Analogs()` function. Two datasets are used to illustrate how to use the function. 
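For orientation, the three options above map directly onto the `criteria` argument of `CST_Analogs()`. A schematic, not-run sketch, using the object names that are introduced in the second example of this vignette:

```
# Schematic only (not run): 'Large_dist' is criterion (1), 'Local_dist' is (2)
# and 'Local_cor' is (3); expL/obsL carry the predictor (SLP) and
# expVar/obsVar the predictand (temperature), as in the examples below.
down_large <- CST_Analogs(expL = expPSL, obsL = obsPSL, criteria = "Large_dist",
                          obsVar = obsTAS, expVar = expTAS)
down_local <- CST_Analogs(expL = expPSL, obsL = obsPSL, criteria = "Local_dist",
                          obsVar = obsTAS, expVar = expTAS, region = region)
down_cor <- CST_Analogs(expL = expPSL, obsL = obsPSL, criteria = "Local_cor",
                        obsVar = obsTAS, expVar = expTAS, region = region)
```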
Two datasets are used to illustrate how to use the function. The first one could be entirely run by the users since it uses data samples provided along with the package. The second one uses data that needs to be downloaded or requested. @@ -80,28 +54,48 @@ class(lonlat_temp$exp) names(lonlat_temp$obs) dim(lonlat_temp$obs$data) dim(lonlat_temp$exp$data) -head(lonlat_temp$exp$Dates$start) +head(lonlat_temp$exp$attrs$Dates) ``` -There are 15 ensemble members available in the data set, 6 starting dates and 3 -forecast times, which refer to daily values in the month of November following -starting dates on November 1st in the years 2010, 2011, 2012. +There are 15 ensemble members available in the `exp` data set, 6 starting dates and 3 forecast times, which refer to monthly values during 3 months following starting dates on November 1st in the years 2000, 2001, 2002, 2003, 2004 and 2005. ``` -down <- CST_Analogs(expL = lonlat_temp$exp, obsL = lonlat_temp$obs) -``` +exp1 <- lonlat_temp$exp +exp1$data <- exp1$data[, , 1, 1, , , drop = FALSE] +exp1$attrs$Dates <- exp1$attrs$Dates[1] + +down_1 <- CST_Analogs(expL = exp1, obsL = lonlat_temp$obs) + +exp2 <- lonlat_temp$exp +exp2$data <- exp2$data[, , 1, 2, , , drop = FALSE] +exp2$attrs$Dates <- exp2$attrs$Dates[2] + +down_2 <- CST_Analogs(expL = exp2, obsL = lonlat_temp$obs) + +exp3 <- lonlat_temp$exp +exp3$data <- exp3$data[, , 1, 3, , , drop = FALSE] +exp3$attrs$Dates <- exp3$attrs$Dates[3] +down_3 <- CST_Analogs(expL = exp3, obsL = lonlat_temp$obs) +```
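The three blocks above differ only in the `ftime` index that is kept; as a compact sketch, the same three calls can be written as a loop (same subsetting, same inputs), so that `down_list[[1]]` corresponds to `down_1`, and so on:

```
# A compact equivalent of the three calls above (a sketch)
down_list <- lapply(1:3, function(i) {
  exp_i <- lonlat_temp$exp
  exp_i$data <- exp_i$data[, , 1, i, , , drop = FALSE]
  exp_i$attrs$Dates <- exp_i$attrs$Dates[i]
  CST_Analogs(expL = exp_i, obsL = lonlat_temp$obs)
})
```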
The visualization of the first three time steps for the ensemble mean of the forecast initialized on the 1st of November 2000 can be done using the package **s2dv**: ``` library(s2dv) + +var <- list(MeanDims(down_1$data, 'member'), + MeanDims(down_2$data, 'member'), + MeanDims(down_3$data, 'member')) + PlotLayout(PlotEquiMap, c('lat', 'lon'), - var = Reorder(MeanDims(down$data, 'member')[1,,,1,], - c('time_exp', 'lat', 'lon')), - nrow = 1, ncol = 3, - lon = down$lon, lat = down$lat, filled.continents = FALSE, - titles = c("2000-11-01", "2000-12-01", "2001-01-01"), units = 'T(K)', - toptitle = 'Analogs sdate November 2000', - width = 10, height = 4, fileout = './Figures/Analogs1.png') + var = var, + nrow = 1, ncol = 3, + lon = down_1$coords$lon, + lat = down_1$coords$lat, + filled.continents = FALSE, + titles = c("2000-11-01", "2000-12-01", "2001-01-01"), units = 'T(K)', + toptitle = 'Analogs sdate November 2000', + width = 10, height = 4) + ``` ![](./Figures/Analogs1.png) @@ -109,8 +103,8 @@ The user can also request extra Analogs and the information: ``` -down <- CST_Analogs(expL = lonlat_temp$exp, obsL = lonlat_temp$obs, - nAnalogs = 2, AnalogsInfo = TRUE) +down <- CST_Analogs(expL = exp1, obsL = lonlat_temp$obs, + nAnalogs = 2, AnalogsInfo = TRUE) ``` Again, the user can explore the object `down`, which is of class 's2dv_cube'. The element 'data' in this case contains the metrics and the dates corresponding to the observed field: @@ -121,17 +115,16 @@ names(down$data) dim(down$data$fields) dim(down$data$metric) dim(down$data$dates) -down$data$dates[1,15,1,1] +down$data$dates[1,15] ``` - The last command concludes that the best analog for ensemble member 15 corresponding to the 1st of November 2000 is the 1st of November 2004: ``` -PlotLayout(PlotEquiMap, c('lat', 'lon'), var = list(down$data$fields[1,,,15,1,1], - lonlat_temp$obs$data[1,1,5,1,,]), nrow = 1, ncol = 2, - lon = down$lon, lat = down$lat, filled.continents = FALSE, +PlotLayout(PlotEquiMap, c('lat', 'lon'), var = list(down$data$fields[1, , , 15], + lonlat_temp$obs$data[1, 1, 5, 1, , ]), nrow = 1, ncol = 2, + lon = down$coords$lon, lat = down$coords$lat, filled.continents = FALSE, titles = c("Downscaled 2000-11-01", "Observed 2004-11-01"), units = 'T(K)', - width = 7, height = 4, fileout = './Figures/Analogs2.png') + width = 7, height = 4) ``` ![](./Figures/Analogs2.png) @@ -142,10 +135,7 @@ As expected, they are exactly the same. In this case, the spatial field of a single forecast day will be downscaled using Analogs. This will allow illustrating how to use CST_Load to retrieve observations separated from simulations. To explore other options, see other CSTools vignettes as well as the `CST_Load` documentation. -The simulations available for the desired model cover the period 1993-2016. Here, the 15th of October 2000 (for the simulation initialized in the 1st of October 2000), will be downscaled. -For ERA5 from 1979 to the present days. For this example we will just use October days from 2000 to 2006, so, the starting dates can be defined by running the -following lines: - +The simulations available for the desired model cover the period 1993-2016. Here, the 15th of October 2000 (for the simulation initialized on the 1st of October 2000) will be downscaled. ERA5 is available from 1979 to the present day. For this example we will just use October days from 2000 to 2006, so the starting dates can be defined by running the following lines: ``` start <- as.Date(paste(2000, 10, "01", sep = ""), "%Y%m%d") @@ -153,10 +143,7 @@ end <- as.Date(paste(2006, 10, "01", sep = ""), "%Y%m%d") dateseq <- format(seq(start, end, by = "year"), "%Y%m%d") ```
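`dateseq` should then contain one start date per year:

```
dateseq
# [1] "20001001" "20011001" "20021001" "20031001" "20041001" "20051001" "20061001"
```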
-Using the `CST_Load` function from **CSTool package**, the data available in our -data store can be loaded. The following lines show how this function can be -used. The experimental datasets are interpolated to the ERA5 grid by specifying the 'grid' parameter while ERA5 doesn't need to be interpolated. While parameter leadtimemax is set to 1 for the experimental dataset, it is set to 31 for the observations, returning the daily observations for October for the years requested in 'sdate' (2000-2006). -Download the data to run the recipe under the HTTPS: downloads.cmcc.bo.it/d_chaves/ANALOGS/data_for_Analogs.Rdat or ask carmen.alvarez-castro at cmcc.it or nuria.perez at bsc.es. +Using the `CST_Load` function from the **CSTools** package, the data available in our data store can be loaded. The following lines show how this function can be used. The experimental datasets are interpolated to the ERA5 grid by specifying the 'grid' parameter, while ERA5 doesn't need to be interpolated. While parameter leadtimemax is set to 1 for the experimental dataset, it is set to 31 for the observations, returning the daily observations for October for the years requested in 'sdate' (2000-2006). Download the data to run the recipe under the HTTPS: downloads.cmcc.bo.it/d_chaves/ANALOGS/data_for_Analogs.Rdat or ask carmen.alvarez-castro at cmcc.it or nuria.perez at bsc.es. ``` exp <- list(name = 'ECMWF_system4_m1', @@ -164,11 +151,11 @@ exp <- list(name = 'ECMWF_system4_m1', "$STORE_FREQ$_mean/$VAR_NAME$_*/$VAR_NAME$_$START_DATE$.nc")) obs <- list(name = 'ERA5', path = file.path("/esarchive/recon/ecmwf/era5/$STORE_FREQ$_mean/", - "$VAR_NAME$_f1h-r1440x721cds/$VAR_NAME$_$YEAR$$MONTH$.nc")) + "$VAR_NAME$_f1h-r1440x721cds/$VAR_NAME$_$YEAR$$MONTH$.nc")) expTAS <- CST_Load(var = 'tas', exp = list(exp), obs = NULL, sdates = '20001001', latmin = 22, latmax = 70, - lonmin = -80, lonmax = 50, output ='lonlat', + lonmin = -80, lonmax = 50, output = 'lonlat', storefreq = 'daily', nmember = 15, leadtimemin = 15, leadtimemax = 15, method = "bilinear", grid = 'r1440x721', nprocs = 1) @@ -180,7 +167,7 @@ obsTAS <- CST_Load(var = 'tas', exp = NULL, obs = list(obs), expPSL <- CST_Load(var = 'psl', exp = list(exp), obs = NULL, sdates = '20001001', latmin = 22, latmax = 70, - lonmin = -80, lonmax = 50, output ='lonlat', + lonmin = -80, lonmax = 50, output = 'lonlat', storefreq = 'daily', nmember = 15, leadtimemin = 15, leadtimemax = 15, method = "bilinear", grid = 'r1440x721', nprocs = 1) @@ -190,30 +177,22 @@ obsPSL <- CST_Load(var = 'psl', exp = NULL, obs = list(obs), lonmin = -80, lonmax = 50, output = 'lonlat', nprocs = 1, storefreq = "daily", nmember = 1) -save(expTAS, obsTAS, expPSL, obsPSL, - file = "../../data_for_Analogs.Rdat", - version = 2) - -#load(file = "./data_for_Analogs.Rdat") ``` *Note: `CST_Load` allows to load the data simultaneously for 'exp' and 'obs' already formatted to have the same dimensions as in this example. However, it is possible to request separated 'obs' and 'exp'. In this second case, the observations could be returned in a continuous time series instead of being split in start dates and forecast time.* - The s2dv_cube objects `expTAS`,`obsTAS`, `expPSL` and `obsPSL` are now loaded in the R environment. The first two elements correspond to the experimental and observed data for temperature and the other two are the equivalent for the SLP data. -Loading the data using `CST_Load` allows to obtain two lists, one for the -experimental data and another for the observe data, with the same elements and -compatible dimensions of the data element: +Loading the data using `CST_Load` allows to obtain two lists, one for the experimental data and another for the observed data, with the same elements and compatible dimensions of the data element: ``` dim(expTAS$data) -dataset member sdate ftime lat lon - 1 15 1 1 193 521 +# dataset member sdate ftime lat lon +# 1 15 1 1 193 521 dim(obsTAS$data) -dataset member sdate ftime lat lon - 1 1 7 31 193 521 +# dataset member sdate ftime lat lon +# 1 1 7 31 193 521 ``` @@ -226,45 +205,36 @@ down1 <- CST_Analogs(expL = expPSL, obsL = obsPSL, AnalogsInfo = TRUE, criteria = "Large_dist", nAnalogs = 3, obsVar = obsTAS, expVar = expTAS) ``` - Some warnings may appear giving information about undefined parameters.
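These warnings are informative only. If preferred, they can be turned into messages (or silenced) with base R; a minimal sketch wrapping the same call as above:

```
# Turn the informative warnings into messages (optional sketch)
down1 <- withCallingHandlers(
  CST_Analogs(expL = expPSL, obsL = obsPSL, AnalogsInfo = TRUE,
              criteria = "Large_dist", nAnalogs = 3,
              obsVar = obsTAS, expVar = expTAS),
  warning = function(w) {
    message("CST_Analogs: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```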
It is possible to explore the information in the object `down1` by running: ``` names(down1$data) dim(down1$data$field) -#nAnalogs lat lon member time -# 3 193 521 15 1 +# nAnalogs lat lon member time +# 3 193 521 15 1 dim(down1$data$dates) -#nAnalogs member time +# nAnalogs member time # 3 15 1 -down1$data$dates[1,1,1] -#"2005-10-07 UTC" +down1$data$dates[1,1] +# "07-10-2005" ``` - -Now, we can visualize the output and save it using library ragg (not mandatory): +Now, we can visualize the output: ``` -library(ragg) -agg_png("/esarchive/scratch/nperez/git/cstools/vignettes/Figures/Analogs3.png", - width = 1100, height = 500, units = 'px',res = 144) PlotLayout(PlotEquiMap, c('lat', 'lon'), - var = list(expPSL$data[1,1,1,1,,], obsPSL$data[1,1,1,15,,], - obsPSL$data[1,1,6,7,,]), - lon = obsPSL$lon, lat = obsPSL$lat, filled.continents = FALSE, + var = list(expPSL$data[1, 1, 1, 1, , ], obsPSL$data[1, 1, 1, 15, , ], + obsPSL$data[1, 1, 6, 7, , ]), lon = obsPSL$coords$lon, + lat = obsPSL$coords$lat, filled.continents = FALSE, titles = c('Exp PSL 15-10-2000','Obs PSL 15-10-2000', 'Obs PSL 7-10-2005'), - toptitle = 'First member', ncol = 3, nrow = 1) -dev.off() -agg_png("/esarchive/scratch/nperez/git/cstools/vignettes/Figures/Analogs4.png", - width = 800, height = 800, units = 'px',res = 144) + toptitle = 'First member', ncol = 3, nrow = 1, width = 10, height = 4) PlotLayout(PlotEquiMap, c('lat', 'lon'), var = list( - expTAS$data[1,1,1,1,,], obsTAS$data[1,1,1,15,,], - down1$data$field[1,,,1,1], obsTAS$data[1,1,6,7,,]), - lon = obsTAS$lon, lat = obsTAS$lat, filled.continents = FALSE, - titles = c('Exp TAS 15-10-2000', 'Obs TAS 15-10-2000', - 'Analog TAS 15-10-2000', 'Obs TAS 7-10-2005'), - ncol = 2, nrow = 2) -dev.off() + expTAS$data[1, 1, 1, 1, , ], obsTAS$data[1, 1, 1, 15, , ], + down1$data$field[1, , , 1], obsTAS$data[1, 1, 6, 7, , ]), + lon = obsTAS$coords$lon, lat = obsTAS$coords$lat, filled.continents = FALSE, + titles = c('Exp TAS 15-10-2000', 'Obs TAS 15-10-2000', + 'Analog TAS 15-10-2000', 'Obs TAS 7-10-2005'), + ncol = 2, nrow = 2) ``` ![](./Figures/Analogs3.png) @@ -282,10 +252,10 @@ The aim is to downscale the temperature simulation of the 15th of October 2000, ``` region <- c(lonmin = 0, lonmax = 5, latmin = 38.5, latmax = 40.5) -expPSL$data <- expPSL$data[1,1,1,1,,] -expTAS$data <- expTAS$data[1,1,1,1,,] +expPSL$data <- expPSL$data[1, 1, 1, 1, , ] +expTAS$data <- expTAS$data[1, 1, 1, 1, , ] down2 <- CST_Analogs(expL = expPSL, obsL = obsPSL, AnalogsInfo = TRUE, - criteria = "Local_dist", nAnalogs = 50, + criteria = "Local_dist", # nAnalogs = 50, obsVar = obsTAS, expVar = expTAS, region = region) ``` @@ -295,28 +265,28 @@ The parameter 'nAnalogs' doesn't correspond to the number of Analogs returned, b In this case, when looking at the large-scale pattern and also at the local-scale pattern, the best analog for the first member is the 13th of October 2001: ``` -down2$data$dates[1,1] -[1] "2001-10-13 UTC" +down2$data$dates[2] +# [1] "13-10-2001" ``` ``` library(ClimProjDiags) -agg_png("/esarchive/scratch/nperez/git/cstools/vignettes/Figures/Analogs5.png", - width = 800, height = 800, units = 'px',res = 144) -PlotLayout(PlotEquiMap, c('lat', 'lon'), var = list( - expTAS$data, obsTAS$data[1,1,1,15,,], - down2$data$field[1,,,1], SelBox(obsTAS$data[1,1,2,13,,], - lon = as.vector(obsTAS$lon), lat = as.vector(obsTAS$lat), - region)$data), - special_args = list(list(lon = expTAS$lon, lat = expTAS$lat), - list(lon = obsTAS$lon, lat = obsTAS$lat), - list(lon = down2$lon, down2$lat), - list(lon = down2$lon,
down2$lat)), + +var <- list(expTAS$data, obsTAS$data[1, 1, 1, 15, , ], + down2$data$field[1, , ], SelBox(obsTAS$data[1, 1, 2, 13, , ], + lon = as.vector(obsTAS$coords$lon), + lat = as.vector(obsTAS$coords$lat), + region)$data) + +PlotLayout(PlotEquiMap, c('lat', 'lon'), var = var, + special_args = list(list(lon = expTAS$coords$lon, lat = expTAS$coords$lat), + list(lon = obsTAS$coords$lon, lat = obsTAS$coords$lat), + list(lon = down2$coords$lon, lat = down2$coords$lat), + list(lon = down2$coords$lon, lat = down2$coords$lat)), filled.continents = FALSE, titles = c('Exp TAS 15-10-2000', 'Obs TAS 15-10-2000', - 'Analog TAS 15-10-2000', 'Obs TAS 13-10-2001'), - ncol = 2, nrow = 2) -dev.off() + 'Analog TAS 15-10-2000', 'Obs TAS 13-10-2001'), + ncol = 2, nrow = 2) ``` ![](./Figures/Analogs5.png) @@ -328,7 +298,7 @@ Previous figure shows that the best Analog field correspond to the observed field ``` down3 <- CST_Analogs(expL = expPSL, obsL = obsPSL, AnalogsInfo = TRUE, - criteria = "Local_cor", nAnalogs = 50, + criteria = "Local_cor", # nAnalogs = 50, obsVar = obsTAS, expVar = expTAS, region = region) ``` @@ -336,21 +306,20 @@ down3 <- CST_Analogs(expL = expPSL, obsL = obsPSL, AnalogsInfo = TRUE, In this case, when looking at the large-scale pattern and also at the local-scale correlation, the best analog for the first member is the 10th of October 2001: ``` -down3$data$dates[1,1] -[1] "2001-10-10 UTC" +down3$data$dates[3] +# [1] "10-10-2001" ``` ``` -agg_png("/esarchive/scratch/nperez/git/cstools/vignettes/Figures/Analogs6.png", - width = 800, height = 400, units = 'px',res = 144) -PlotLayout(PlotEquiMap, c('lat', 'lon'), var = list( - down3$data$field[1,,,1], SelBox(obsTAS$data[1,1,2,10,,], - lon = as.vector(obsTAS$lon), lat = as.vector(obsTAS$lat), - region)$data), lon = down3$lon, lat = down3$lat, +var <- list(down3$data$field[1, , ], SelBox(obsTAS$data[1, 1, 2, 10, , ], + lon = as.vector(obsTAS$coords$lon), lat = as.vector(obsTAS$coords$lat), + region)$data) + +PlotLayout(PlotEquiMap, c('lat', 'lon'), var = var, + lon = down3$coords$lon, lat = down3$coords$lat, filled.continents = FALSE, titles = c('Analog TAS 15-10-2000', 'Obs TAS 10-10-2001'), - ncol = 2, nrow = 1) -dev.off() + ncol = 2, nrow = 1) ``` ![](./Figures/Analogs6.png) @@ -359,18 +328,15 @@ Previous figure shows that the best Analog field correspond to the observed field #### Downscaling exp$data using the excludeTime parameter -`ExludeTime` is set by default to Time_expL in order to find the same analog than -the day of interest. If there is some interest in excluding other dates should -be included in the argument 'excludeTime'. +`excludeTime` is set by default to `Time_expL` so that the day of interest is not returned as its own analog. If other dates should be excluded, they can be passed in the argument 'excludeTime'. ``` down4 <- CST_Analogs(expL = expPSL, obsL = obsPSL, AnalogsInfo = TRUE, criteria = "Large_dist", nAnalogs = 20, obsVar = obsTAS, expVar = expTAS, - region = region, excludeTime = obsPSL$Dates$start[10:20]) + region = region, excludeTime = obsPSL$attrs$Dates[10:20]) ``` In this case, the best analog is still the 7th of October 2005.
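Since analog dates are returned as 'dd-mm-yyyy' strings in this vignette, a quick, not-run sanity check that none of the excluded dates was selected could be:

```
# None of the excluded dates should appear among the returned analog dates
excluded <- format(obsPSL$attrs$Dates[10:20], "%d-%m-%Y")
any(down4$data$dates %in% excluded)
# [1] FALSE
```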
- *Note: You can compute the anomaly values before applying the criteria (as in Yiou et al., 2013) using `CST_Anomaly` from the CSTools package* diff --git a/vignettes/BestEstimateIndex_vignette.Rmd b/vignettes/BestEstimateIndex_vignette.Rmd index 213499ee67e67ffba713f63e7e918e7e24508ac9..b7dfce685db029532898931300d4e82109f80a61 100644 --- a/vignettes/BestEstimateIndex_vignette.Rmd +++ b/vignettes/BestEstimateIndex_vignette.Rmd @@ -38,24 +38,24 @@ The synthetic data is created by running the following lines: ``` # observations -NAO_obs <- rnorm(20, sd=3) +NAO_obs <- rnorm(20, sd = 3) dim(NAO_obs) <- c(time = 20) # hindcast and forecast of a dynamical SFS 1 -NAO_hind1 <- rnorm(20 * 2 * 25, sd=2.5) +NAO_hind1 <- rnorm(20 * 2 * 25, sd = 2.5) dim(NAO_hind1) <- c(time = 20, member = 50) -NAO_fcst1 <- rnorm(2*51, sd=2.5) +NAO_fcst1 <- rnorm(2*51, sd = 2.5) dim(NAO_fcst1) <- c(time = 1, member = 102) # hindcast and forecast of an empirical SFS 2 -NAO_hind2_mean <- rnorm(20, sd=3) -NAO_hind2_sd <- rnorm(20, mean=5, sd=1) +NAO_hind2_mean <- rnorm(20, sd = 3) +NAO_hind2_sd <- rnorm(20, mean = 5, sd = 1) NAO_hind2 <- cbind(NAO_hind2_mean, NAO_hind2_sd) -dim(NAO_hind2) <- c(time=20, statistic=2) -NAO_fcst2_mean <- rnorm(1, sd=3) -NAO_fcst2_sd <- rnorm(1, mean=5, sd=1) +dim(NAO_hind2) <- c(time = 20, statistic = 2) +NAO_fcst2_mean <- rnorm(1, sd = 3) +NAO_fcst2_sd <- rnorm(1, mean = 5, sd = 1) NAO_fcst2 <- cbind(NAO_fcst2_mean, NAO_fcst2_sd) -dim(NAO_fcst2) <- c(time=1, statistic=2) +dim(NAO_fcst2) <- c(time = 1, statistic = 2) ``` @@ -63,17 +63,17 @@ The winter index NAO and the accumulated precipitation field from the dynamical SF ``` # NAO index of a SFS to compute weights for each ensemble's member -NAO_hind <- rnorm(20 * 25, sd=2.5) +NAO_hind <- rnorm(20 * 25, sd = 2.5) dim(NAO_hind) <- c(time = 20, member = 25) -NAO_fcst <- rnorm(51, sd=2.5) +NAO_fcst <- rnorm(51, sd = 2.5) dim(NAO_fcst) <- c(time = 1, member = 51) # The accumulated precipitation field -prec_hind <- rnorm(20 * 25 * 21 * 31, mean=30, sd=10) +prec_hind <- rnorm(20 * 25 * 21 * 31, mean = 30, sd = 10) dim(prec_hind) <- c(time = 20, member = 25, lat = 21, lon = 31) prec_hind <- list(data = prec_hind) class(prec_hind) <- 's2dv_cube' -prec_fcst <- rnorm(51 * 21 * 31, mean=25,sd=8) +prec_fcst <- rnorm(51 * 21 * 31, mean = 25, sd = 8) dim(prec_fcst) <- c(time = 1, member = 51, lat = 21, lon = 31) prec_fcst <- list(data = prec_fcst) class(prec_fcst) <- 's2dv_cube' @@ -95,7 +95,7 @@ pdf_hind_best <- BEI_PDFBest(NAO_obs, NAO_hind1, NAO_hind2, index_fcst1 = NULL, index_fcst2 = NULL, method_BC = 'none', time_dim_name = 'time', na.rm = FALSE) # for forecast -pdf_fcst_best <- BEI_PDFBest (NAO_obs, NAO_hind1, NAO_hind2, index_fcst1 = NAO_fcst1, +pdf_fcst_best <- BEI_PDFBest(NAO_obs, NAO_hind1, NAO_hind2, index_fcst1 = NAO_fcst1, index_fcst2 = NAO_fcst2, method_BC = 'none', time_dim_name = 'time', na.rm = FALSE) ``` diff --git a/vignettes/Data_Considerations.Rmd b/vignettes/Data_Considerations.Rmd index bd51b7d2f8f3acb1cfbef89dc887d4c5be7ff4ed..979e1c751727edfab63b9c767ca098170986f926 100644 --- a/vignettes/Data_Considerations.Rmd +++ b/vignettes/Data_Considerations.Rmd @@ -49,9 +49,11 @@ All CSTools functions have been developed following the same guidelines. The mai A reasonably important doubt that a new user may have at this point is: what is an 's2dv_cube' object?
's2dv_cube' is a class of object storing data and metadata in several elements: + $data element is an N-dimensional array with named dimensions containing the data (e.g.: temperature values), - + $lat($lon) element is a vector indicating the latitudinal(longitudinal) values of the region in $data, - + $Variable describes the variable name and its units - + other elements for extra metadata information + + $coords is a named list whose elements are the coordinate vectors corresponding to the dimensions of $data, + + $attrs is a named list with elements corresponding to attributes of the object. It has the following elements: + + $Variable is a list with the variable name in element $varName and with the metadata of all the variables in the $metadata element, + + $Dates is an array of dates of the $data element, + + other elements for extra metadata information It is possible to visualize an example of the structure of an 's2dv_cube' object by opening an R session and running: @@ -84,12 +86,8 @@ Independently of the tool used to read the data from your local storage to your - this function creates one NetCDF file per start date with the name of the variable and the start date: `$VARNAME$_$YEAR$$MONTH$.nc` - each file has dimensions: lon, lat, ensemble and time. - - ### 4. CST_Load example - - ``` library(CSTools) library(zeallot) @@ -120,7 +118,7 @@ object_size(exp) object_size(obs) # 3.09 MB library(s2dv) -PlotEquiMap(exp$data[1,1,1,1,,], lon = exp$lon, lat= exp$lat, +PlotEquiMap(exp$data[1,1,1,1,,], lon = exp$coords$lon, lat = exp$coords$lat, filled.continents = FALSE, fileout = "Meteofrance_r360x180.png") ``` diff --git a/vignettes/ENSclustering_vignette.Rmd b/vignettes/ENSclustering_vignette.Rmd index 3994db0077a6ef872b2d81a8670a59da1fc9c205..073a32100d71cf19991b1081eaa7a88ac5ce7d04 100644 --- a/vignettes/ENSclustering_vignette.Rmd +++ b/vignettes/ENSclustering_vignette.Rmd @@ -42,7 +42,7 @@ For our example we will use the sample seasonal temperature data provided within Data can be loaded as follows: ```r -datalist <- CSTools::lonlat_temp$exp +datalist <- lonlat_temp$exp ``` The data has the following dimensions: @@ -67,7 +67,7 @@ Let's launch the clustering using 4 clusters (numclus), 4 EOFs (numpcs), 'mean' ```r results <- CST_EnsClustering(datalist, numclus = numcl, numpcs = 4, - time_moment = 'mean', cluster_dim = c('member', 'sdate')) + time_moment = 'mean', cluster_dim = c('member', 'sdate')) ``` The EnsClustering produces the following outputs saved in object results: ```r names(results) #[1] "cluster" "freq" "closest_member" "repr_field" -#[5] "composites" "lat" "on" +#[5] "composites" "lon" "lat" ``` diff --git a/vignettes/Figures/Analogs1.png b/vignettes/Figures/Analogs1.png index db966fd89c24e19585e72bce7ae9d761142eaad3..5b4f05a94584422a6e58a638888be481a197b3b0 100644 Binary files a/vignettes/Figures/Analogs1.png and b/vignettes/Figures/Analogs1.png differ diff --git a/vignettes/Figures/Analogs2.png b/vignettes/Figures/Analogs2.png index 976bde0a08be6062a456a931018f9e13e5e3fa31..eb67ce3f2b5a3b405a1c36cfbf5ec466d170e73d 100644 Binary files a/vignettes/Figures/Analogs2.png and b/vignettes/Figures/Analogs2.png differ diff --git a/vignettes/Figures/Analogs3.png b/vignettes/Figures/Analogs3.png index 3731301e0ab8661fd1b92fc45f564edd475ee077..465510727fdd046a35c8318400df73f18c57d2a0 100644 Binary files a/vignettes/Figures/Analogs3.png and b/vignettes/Figures/Analogs3.png differ
diff --git a/vignettes/Figures/Analogs4.png b/vignettes/Figures/Analogs4.png index 2b1e1a364923d6da7c0cf252cc8bdf110dd1e79c..2e7c9c64c332e7521a0f140a5ed423e6bd4e5864 100644 Binary files a/vignettes/Figures/Analogs4.png and b/vignettes/Figures/Analogs4.png differ diff --git a/vignettes/Figures/Analogs5.png b/vignettes/Figures/Analogs5.png index f8632d0eb75b390892335b72a99c0dcbf5d0b737..79907c8065e536f3a7e70536ee8b79efc9eed71d 100644 Binary files a/vignettes/Figures/Analogs5.png and b/vignettes/Figures/Analogs5.png differ diff --git a/vignettes/Figures/Analogs6.png b/vignettes/Figures/Analogs6.png index c5319ed82d22e028b34cc9f6c1f180477882f35f..abbc0d7a93988ffd4f2350ba746474188896214e 100644 Binary files a/vignettes/Figures/Analogs6.png and b/vignettes/Figures/Analogs6.png differ diff --git a/vignettes/Figures/MostLikelyTercile_fig1.png b/vignettes/Figures/MostLikelyTercile_fig1.png index 86ba94a2c47762e182e936848a2529a305143434..bd282ed25fda569eaca3c29616e0e5c9d471eda5 100644 Binary files a/vignettes/Figures/MostLikelyTercile_fig1.png and b/vignettes/Figures/MostLikelyTercile_fig1.png differ diff --git a/vignettes/Figures/MostLikelyTercile_fig2.png b/vignettes/Figures/MostLikelyTercile_fig2.png index 342877bc82a89d4f355d3133332dd3b477c10296..b96854ad8a9a3ded41e0d3be2a148eda4b2c47b5 100644 Binary files a/vignettes/Figures/MostLikelyTercile_fig2.png and b/vignettes/Figures/MostLikelyTercile_fig2.png differ diff --git a/vignettes/Figures/MultiModelSkill_cor_tas_1992-2012.png b/vignettes/Figures/MultiModelSkill_cor_tas_1992-2012.png deleted file mode 100644 index 13cdd7e1203254f071e77311df840d4e706277b0..0000000000000000000000000000000000000000 Binary files a/vignettes/Figures/MultiModelSkill_cor_tas_1992-2012.png and /dev/null differ diff --git a/vignettes/Figures/MultiModelSkill_cor_tas_1993-2012.png b/vignettes/Figures/MultiModelSkill_cor_tas_1993-2012.png new file mode 100644 index 0000000000000000000000000000000000000000..d87618198c05bcc4ec3fa3e9cf26dd9c5432bb55 Binary files /dev/null and b/vignettes/Figures/MultiModelSkill_cor_tas_1993-2012.png differ diff --git a/vignettes/Figures/MultiModelSkill_rms_tas_1992-2012.png b/vignettes/Figures/MultiModelSkill_rms_tas_1992-2012.png deleted file mode 100644 index 7b10d1e9bff9b4b026a24aaf1820983b06225227..0000000000000000000000000000000000000000 Binary files a/vignettes/Figures/MultiModelSkill_rms_tas_1992-2012.png and /dev/null differ diff --git a/vignettes/Figures/MultiModelSkill_rms_tas_1993-2012.png b/vignettes/Figures/MultiModelSkill_rms_tas_1993-2012.png new file mode 100644 index 0000000000000000000000000000000000000000..199ab861e264bcb25a48959b8dfcbefb179dc473 Binary files /dev/null and b/vignettes/Figures/MultiModelSkill_rms_tas_1993-2012.png differ diff --git a/vignettes/Figures/MultiModelSkill_rmsss_tas_1992-2012.png b/vignettes/Figures/MultiModelSkill_rmsss_tas_1992-2012.png deleted file mode 100644 index 607bcb675f9cae4c44129e483f5fefedab6b8a8c..0000000000000000000000000000000000000000 Binary files a/vignettes/Figures/MultiModelSkill_rmsss_tas_1992-2012.png and /dev/null differ diff --git a/vignettes/Figures/MultiModelSkill_rmsss_tas_1993-2012.png b/vignettes/Figures/MultiModelSkill_rmsss_tas_1993-2012.png new file mode 100644 index 0000000000000000000000000000000000000000..c8ab0c65bee47265e5b1c0990ff72ce4a5f56cf2 Binary files /dev/null and b/vignettes/Figures/MultiModelSkill_rmsss_tas_1993-2012.png differ diff --git a/vignettes/Figures/MultivarRMSE_gloseas5_tas_prlr_1992-2012.png 
b/vignettes/Figures/MultivarRMSE_gloseas5_tas_prlr_1992-2012.png deleted file mode 100644 index c652c2d2d0f6b792b6658cdb27011f6c690211a5..0000000000000000000000000000000000000000 Binary files a/vignettes/Figures/MultivarRMSE_gloseas5_tas_prlr_1992-2012.png and /dev/null differ diff --git a/vignettes/Figures/MultivarRMSE_gloseas5_tas_prlr_1993-2012.png b/vignettes/Figures/MultivarRMSE_gloseas5_tas_prlr_1993-2012.png new file mode 100644 index 0000000000000000000000000000000000000000..14d7055e00914ddf144262d07dc8af2b6d1c8c3e Binary files /dev/null and b/vignettes/Figures/MultivarRMSE_gloseas5_tas_prlr_1993-2012.png differ diff --git a/vignettes/MostLikelyTercile_vignette.Rmd b/vignettes/MostLikelyTercile_vignette.Rmd index b9e6b6e6e95c535b01659f21027857d6519a4b93..aa9e998e1465afb5d9f9cdafb00c28e09d3de370 100644 --- a/vignettes/MostLikelyTercile_vignette.Rmd +++ b/vignettes/MostLikelyTercile_vignette.Rmd @@ -81,13 +81,13 @@ Finally, the data are loaded using `CST_Load`: ```r -c(exp,obs) %<-% CST_Load(var = clim_var, exp = forecastsys, obs = obs, - sdates = dateseq, leadtimemin = mon1, leadtimemax = monf, - lonmin = lon_min, lonmax = lon_max, - latmin = lat_min, latmax = lat_max, - storefreq = "monthly", sampleperiod = 1, nmember = 10, - output = "lonlat", method = "bilinear", - grid = paste("r", grid, sep = "")) +c(exp, obs) %<-% CST_Load(var = clim_var, exp = forecastsys, obs = obs, + sdates = dateseq, leadtimemin = mon1, leadtimemax = monf, + lonmin = lon_min, lonmax = lon_max, + latmin = lat_min, latmax = lat_max, + storefreq = "monthly", sampleperiod = 1, nmember = 10, + output = "lonlat", method = "bilinear", + grid = paste("r", grid, sep = "")) ``` Loading the data using CST_Load returns two objects, one for the experimental data and another one for the observe data, with the same elements and compatible dimensions of the data element: @@ -107,38 +107,32 @@ The latitude and longitude are saved for later use: ```r -Lat <- exp$lat -Lon <- exp$lon +Lat <- exp$coords$lat +Lon <- exp$coords$lon ``` ### 3. Computing probabilities First, anomalies of forecast and observations are computed using cross-validation on individual members: - ``` c(Ano_Exp, Ano_Obs) %<-% CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = TRUE) ``` - The seasonal mean of both forecasts and observations are computed by averaging over the ftime dimension. - ```r Ano_Exp$data <- MeanDims(Ano_Exp$data, 'ftime') Ano_Obs$data <- MeanDims(Ano_Obs$data, 'ftime') ``` - Finally, the probabilities of each tercile are computed by evaluating which tercile is forecasted by each ensemble member for the latest forecast (2020) using the function `ProbBins` in **s2dv** and then averaging the results along the member dimension to obtain the probability of each tercile. - ```r PB <- ProbBins(Ano_Exp$data, fcyr = numyears, thr = c(1/3, 2/3), compPeriod = "Without fcyr") prob_map <- MeanDims(PB, c('sdate', 'member', 'dataset')) ``` - ### 4. 
Visualization with PlotMostLikelyQuantileMap @@ -177,7 +171,6 @@ PlotEquiMap(RPSS$data[[1]], lat = Lat, lon = Lon, brks = seq(-1, 1, by = 0.1), filled.continents = FALSE) ``` - ![](./Figures/MostLikelyTercile_fig2.png) diff --git a/vignettes/MultiModelSkill_vignette.Rmd b/vignettes/MultiModelSkill_vignette.Rmd index fb66f94ff09199834a87e6b7d6c5d28a658e5015..5d9d123924276abd8a7df512fcdb6228345b3084 100644 --- a/vignettes/MultiModelSkill_vignette.Rmd +++ b/vignettes/MultiModelSkill_vignette.Rmd @@ -1,6 +1,8 @@ --- author: "Nuria Perez" date: "`r Sys.Date()`" +revisor: "Eva Rifà" +revision date: "March 2023" output: rmarkdown::html_vignette vignette: > %\VignetteEngine{knitr::knitr} @@ -11,7 +13,7 @@ vignette: > Multi-model Skill Assessment ----------------------------------------- -**reference**: Mishra, N., Prodhomme, C., & Guemas, V. (2018). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31. +**reference**: Mishra, N., Prodhomme, C., & Guemas, V. (2018). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31. The R package s2dv should be loaded by running: @@ -20,7 +22,6 @@ library(s2dv) ``` - The *CSTools* library should be installed from CRAN and loaded: @@ -29,7 +30,6 @@ install.packages("CSTools") library(CSTools) ``` - ### 1.- Load data In this case, the seasonal temperature forecasts, initialized in November, will be used to assess the EUROSIP multi-model seasonal forecasting system, which consists of a number of independent coupled seasonal forecasting systems integrated into a common framework. From September 2012, the systems include those from ECMWF, the Met Office, Meteo-France and NCEP. @@ -43,11 +43,11 @@ clim_var = 'tas' ``` -The simulations available for these models are covering the period 1992-2012. So, the starting and ending dates can be defined by running the following lines: +The simulations available for these models cover the period 1993-2012, so the starting and ending dates can be defined by running the following lines: ```r -ini <- 1992 +ini <- 1993 fin <- 2012 start <- as.Date(paste(ini, mth, "01", sep = ""), "%Y%m%d") end <- as.Date(paste(fin, mth, "01", sep = ""), "%Y%m%d") @@ -69,7 +69,7 @@ Ask nuria.perez at bsc.es to obtain the data to run the recipe. ```r require(zeallot) -glosea5 <- '/esarchive/exp/glosea5/glosea5c3s/$STORE_FREQ$_mean/$VAR_NAME$_f6h/$VAR_NAME$_$START_DATE$.nc' +glosea5 <- '/esarchive/exp/glosea5/glosea5c3s/$STORE_FREQ$_mean/$VAR_NAME$_f6h/$VAR_NAME$_$YEAR$$MONTH$.nc' c(exp, obs) %<-% CST_Load(var = clim_var, exp = list(list(name = 'glosea5', path = glosea5), @@ -80,13 +80,13 @@ c(exp, obs) %<-% storefreq = "monthly", sampleperiod = 1, nmember = 9, output = "lonlat", method = "bilinear", grid = paste("r", grid, sep = "")) -#save(exp, obs, file = "../tas_toydata.RData") +# save(exp, obs, file = "../tas_toydata.RData") # Or use the following line to load the file provided in .RData format: -load(file = "./tas_toydata.RData") +# load(file = "./tas_toydata.RData") ``` -There should be two new elements loaded in the R working environment: `exp` and `obs`, containing the experimental and the observed data for temperature. It's possible to check that they are of class `sd2v_cube` by running: +There should be two new elements loaded in the R working environment: `exp` and `obs`, containing the experimental and the observed data for temperature.
It is possible to check that they are of class `s2dv_cube` by running: ``` @@ -99,20 +99,18 @@ The corresponding data is saved in the element `data` of each object, while othe ```r > dim(exp$data) dataset member sdate ftime lat lon - 3 9 21 3 35 64 + 3 9 20 3 35 64 > dim(obs$data) dataset member sdate ftime lat lon - 1 1 21 3 35 64 -Lat <- exp$lat -Lon <- exp$lon + 1 1 20 3 35 64 +Lat <- exp$coords$lat +Lon <- exp$coords$lon ``` - ### 2.- Computing and plotting Anomaly Correlation Coefficient The Anomaly Correlation Coefficient (ACC) is the most widely used skill metric for Seasonal Climate Forecast quality (Mishra et al., 2018). - The first step is to compute the anomalies over the loaded data, applying a cross-validation technique on individual members, by running: ``` c(ano_exp, ano_obs) %<-% CST_Anomaly(exp = exp, obs = obs, cross = TRUE, memb = TRUE) ``` @@ -123,11 +121,9 @@ The dimensions are preserved: ``` > str(ano_exp$data) - num [1:3, 1:9, 1:21, 1:3, 1:35, 1:64] -1.647 -0.478 -0.096 1.575 1.086 ... - - attr(*, "dimensions")= chr [1:6] "dataset" "member" "sdate" "ftime" ... + num [1:20, 1:3, 1:9, 1:3, 1:35, 1:64] -1.3958 -0.0484 -0.1326 0.3621 -5.6905 ... > str(ano_obs$data) - num [1, 1, 1:21, 1:3, 1:35, 1:64] 0.0235 1.546 1.3885 -0.344 -5.972 ... - - attr(*, "dimensions")= chr [1:6] "dataset" "member" "sdate" "ftime" ... + num [1:20, 1, 1, 1:3, 1:35, 1:64] 1.551 1.393 -0.344 -5.986 -0.27 ... ``` The ACC is obtained by running the `CST_MultiMetric` function defining the parameter 'metric' as correlation. The function also includes the option of computing the Multi-Model Mean ensemble (MMM). @@ -135,10 +131,9 @@ The ACC is obtained by running the `CST_MultiMetric` function defining the param ```r AnomDJF <- CST_MultiMetric(exp = ano_exp, obs = ano_obs, metric = 'correlation', - multimodel = TRUE) + multimodel = TRUE) ``` - The output of the function `CST_MultiMetric` is an object of class `s2dv_cube`; it contains the result of the metric, in this case correlation, in the `data` element (including the correlation for the MMM in the last position). While other relevant data is stored in the corresponding element of the object: @@ -146,14 +141,13 @@ While other relevant data is being stored in the corresponding element of the ob ```r > str(AnomDJF$data) List of 4 - $ corr : num [1:4, 1, 1:35, 1:64] 0.586 0.614 0.143 0.501 0.419 ... - $ p.val : num [1:4, 1, 1:35, 1:64] 0.0026 0.00153 0.26805 0.01036 0.02931 ... - $ conf.lower: num [1:4, 1, 1:35, 1:64] 0.2073 0.2485 -0.3076 0.0883 -0.0154 ... - $ conf.upper: num [1:4, 1, 1:35, 1:64] 0.812 0.827 0.541 0.767 0.72 ... + $ corr : num [1:4, 1, 1:35, 1:64] 0.584 0.649 0.131 0.565 0.484 ... + $ p.val : num [1:4, 1, 1:35, 1:64] 0.0034 0.000989 0.291589 0.00475 0.015262 ... + $ conf.lower: num [1:4, 1, 1:35, 1:64] 0.192 0.289 -0.331 0.163 0.053 ... + $ conf.upper: num [1:4, 1, 1:35, 1:64] 0.816 0.848 0.542 0.806 0.763 ...
> names(AnomDJF) -[1] "data" "lon" "lat" "Variable" "Datasets" "Dates" -[7] "when" "source_files" "load_parameters" -> names(AnomDJF$Datasets) +[1] "data" "dims" "coords" "attrs" +> names(AnomDJF$attrs$Datasets) [1] "glosea5" "ecmwf/system4_m1" "meteofrance/system5_m1" "erainterim" ``` @@ -169,15 +163,14 @@ PlotCombinedMap(AnomDJF$data$corr[,1,,], lon = Lon, lat = Lat, map_select_fun = c('white', 'darkblue'), c('white', 'darkred'), c('white', 'darkorange')), - bar_titles = c("MMM", names(AnomDJF$Datasets)), - fileout = "./vignettes/Figures/MultiModelSkill_cor_tas_1992-2012.png", + bar_titles = c("MMM", names(AnomDJF$attrs$Datasets)), width = 14, height = 8) ``` -The next figure is the map of the maximum positive Anomaly Correlation Coefficient (ACC) among the three individual models from EUROSIP and the multimodel ensemble. ACC for each model is calculated between their respective predicted ensemble mean anomalies and the anomalies of the observed temperature obtained from ERAINT for winter (DJF) seasons over the period 1992-2012. Blue, red, yellow and black colors indicate that the maximum correlation is obtained for GloSea5, ECMWF, MF and the Multi-Model Mean respectively (similar to figure 3 in Mishra et al., 2018). +The next figure is the map of the maximum positive Anomaly Correlation Coefficient (ACC) among the three individual models from EUROSIP and the multimodel ensemble. ACC for each model is calculated between their respective predicted ensemble mean anomalies and the anomalies of the observed temperature obtained from ERAINT for winter (DJF) seasons over the period 1993-2012. Blue, red, yellow and black colors indicate that the maximum correlation is obtained for GloSea5, ECMWF, MF and the Multi-Model Mean respectively (similar to figure 3 in Mishra et al., 2018). -![Max Skills Correlation](../vignettes/Figures/MultiModelSkill_cor_tas_1992-2012.png) +![Max Skills Correlation](./Figures/MultiModelSkill_cor_tas_1993-2012.png) @@ -187,7 +180,7 @@ The same function can be used to compute the RMS error by defining the parameter ```r AnomDJF <- CST_MultiMetric(exp = ano_exp, obs = ano_obs, metric = 'rms', - multimodel = TRUE) + multimodel = TRUE) ``` The following lines are necessary to obtain the plot which visualizes the best model given this metric for each grid point. 
@@ -201,12 +194,11 @@ PlotCombinedMap(AnomDJF$data$rms[,1,,], lon = Lon, lat = Lat, map_select_fun = m c('darkblue', 'white'), c('darkred', 'white'), c('darkorange', 'white')), - bar_titles = c("MMM", names(AnomDJF$Datasets)), - fileout = "./vignettes/Figures/MultiModelSkill_rms_tas_1992-2012.png", + bar_titles = c("MMM", names(AnomDJF$attrs$Datasets)), width = 14, height = 8) ``` -![Max Skills RMS](../vignettes/Figures/MultiModelSkill_rms_tas_1992-2012.png) +![Max Skills RMS](./Figures/MultiModelSkill_rms_tas_1993-2012.png) ### 4.- Computing and plotting Root Mean Square error Skill Scores (RMSSS) @@ -230,9 +222,8 @@ PlotCombinedMap(AnomDJF$data$rmsss[,1,,], lon = Lon, lat = Lat, c('white', 'darkblue'), c('white', 'darkred'), c('white', 'darkorange')), - bar_titles = c("MMM", names(AnomDJF$Datasets)), - fileout = "./vignettes/Figures/MultiModelSkill_rmsss_tas_1992-2012.png", + bar_titles = c("MMM", names(AnomDJF$attrs$Datasets)), width = 14, height = 8) ``` -![Max Skills RMSSS](../vignettes/Figures/MultiModelSkill_rmsss_tas_1992-2012.png) +![Max Skills RMSSS](./Figures/MultiModelSkill_rmsss_tas_1993-2012.png) diff --git a/vignettes/MultivarRMSE_vignette.Rmd b/vignettes/MultivarRMSE_vignette.Rmd index edfbe0b8a5b95507dadac736943146ef7cf3bb07..744350a8a083a9b0fdbe1a4e75cfec78c45951d6 100644 --- a/vignettes/MultivarRMSE_vignette.Rmd +++ b/vignettes/MultivarRMSE_vignette.Rmd @@ -1,6 +1,8 @@ --- author: "Deborah Verfaillie" date: "`r Sys.Date()`" +revisor: "Eva Rifà" +revision date: "March 2023" output: rmarkdown::html_vignette vignette: > %\VignetteEngine{knitr::knitr} @@ -20,10 +22,8 @@ library(s2dv) library(RColorBrewer) ``` - Library *CSTools*, should be installed from CRAN and loaded: - ```r install.packages("CSTools") library(CSTools) @@ -36,26 +36,23 @@ In this example, the seasonal temperature and precipitation forecasts, initializ The parameters defined are the initializing month and the variables: - ```{r cars} mth = '11' temp = 'tas' precip = 'prlr' ``` - -The simulations available for this model cover the period 1992-2012. So, the starting and ending dates can be defined by running the following lines: +The simulations available for this model cover the period 1993-2012. So, the starting and ending dates can be defined by running the following lines: ```r -ini <- 1992 +ini <- 1993 fin <- 2012 start <- as.Date(paste(ini, mth, "01", sep = ""), "%Y%m%d") end <- as.Date(paste(fin, mth, "01", sep = ""), "%Y%m%d") dateseq <- format(seq(start, end, by = "year"), "%Y%m%d") ``` - The grid in which all data will be interpolated should be also specified. The observational dataset used in this example is the EraInterim. @@ -70,28 +67,30 @@ Ask nuria.perez at bsc.es for the data to run the recipe. 
```r require(zeallot) -glosea5 <- list(name = 'glosea5', path = '/esarchive/exp/glosea5/glosea5c3s/$STORE_FREQ$_mean/$VAR_NAME$_f6h/$VAR_NAME$_$START_DATE$.nc') +glosea5 <- '/esarchive/exp/glosea5/glosea5c3s/$STORE_FREQ$_mean/$VAR_NAME$_f6h/$VAR_NAME$_$YEAR$$MONTH$.nc' + +c(exp_T, obs_T) %<-% + CST_Load(var = temp, exp = list(list(name = 'glosea5', path = glosea5)), + obs = obs, sdates = dateseq, leadtimemin = 2, leadtimemax = 4, + latmin = 25, latmax = 75, lonmin = -20, lonmax = 70, output = 'lonlat', + nprocs = 1, storefreq = "monthly", sampleperiod = 1, nmember = 9, + method = "bilinear", grid = paste("r", grid, sep = "")) - c(exp_T, obs_T) %<-% - CST_Load(var = temp, exp = list(glosea5), - obs = obs, sdates = dateseq, leadtimemin = 2, leadtimemax = 4, - latmin = 25, latmax = 75, lonmin = -20, lonmax = 70, output = 'lonlat', - nprocs = 1, storefreq = "monthly", sampleperiod = 1, nmember = 9, - method = "bilinear", grid = paste("r", grid, sep = "")) +glosea5 <- '/esarchive/exp/glosea5/glosea5c3s/$STORE_FREQ$_mean/$VAR_NAME$_f24h/$VAR_NAME$_$YEAR$$MONTH$.nc' c(exp_P, obs_P) %<-% - CST_Load(var = precip, exp = list(glosea5), - obs = obs, sdates = dateseq, leadtimemin = 2, leadtimemax = 4, - latmin = 25, latmax = 75, lonmin = -20, lonmax = 70, output = 'lonlat', - nprocs = 1, storefreq = "monthly", sampleperiod = 1, nmember = 9, - method = "bilinear", grid = paste("r", grid, sep = "")) -#save(exp_T, obs_T, exp_P, obs_P, file = "./tas_prlr_toydata.RData") + CST_Load(var = precip, exp = list(list(name = 'glosea5', path = glosea5)), + obs = obs, sdates = dateseq, leadtimemin = 2, leadtimemax = 4, + latmin = 25, latmax = 75, lonmin = -20, lonmax = 70, output = 'lonlat', + nprocs = 1, storefreq = "monthly", sampleperiod = 1, nmember = 9, + method = "bilinear", grid = paste("r", grid, sep = "")) +# save(exp_T, obs_T, exp_P, obs_P, file = "./tas_prlr_toydata.RData") # Or use the following line to load the file provided in .RData format: -load(file = "./tas_prlr_toydata.RData") +# load(file = "./tas_prlr_toydata.RData") ``` -There should be four new elements loaded in the R working environment: `exp_T`, `obs_T`, `exp_P` and `obs_P`. The first two elements correspond to the experimental and observed data for temperature and the other are the equivalent for the precipitation data. It's possible to check that they are of class `sd2v_cube` by running: +There should be four new elements loaded in the R working environment: `exp_T`, `obs_T`, `exp_P` and `obs_P`. The first two elements correspond to the experimental and observed data for temperature and the other two are the equivalent for the precipitation data.
It is possible to check that they are of class `sd2v_cube` by running: ``` @@ -107,18 +106,18 @@ Loading the data using `CST_Load` allows to obtain two lists, one for the experi ``` > dim(exp_T$data) dataset member sdate ftime lat lon - 1 9 21 3 35 64 + 1 9 20 3 35 64 > dim(obs_T$data) dataset member sdate ftime lat lon - 1 1 21 3 35 64 + 1 1 20 3 35 64 ``` Latitudes and longitudes of the common grid can be saved: ```r -Lat <- exp_T$lat -Lon <- exp_T$lon +Lat <- exp_T$coords$lat +Lon <- exp_T$coords$lon ``` The next step is to compute the anomalies of the experimental and observational data using `CST_Anomaly` function, which could be applied over data from each variable, and in this case it's compute applying cross validation technique over individual members: @@ -132,11 +131,9 @@ The original dimensions are preserved and the anomalies are stored in the `data` ``` > str(ano_exp_T$data) - num [1, 1:9, 1:21, 1:3, 1:35, 1:64] -1.647 1.575 2.77 0.048 -1.886 ... - - attr(*, "dimensions")= chr [1:6] "dataset" "member" "sdate" "ftime" ... + num [1:20, 1, 1:9, 1:3, 1:35, 1:64] -1.3958 -0.0484 -0.1326 0.3621 -5.6905 ... > str(ano_obs_T$data) - num [1, 1, 1:21, 1:3, 1:35, 1:64] 0.0235 1.546 1.3885 -0.344 -5.972 ... - - attr(*, "dimensions")= chr [1:6] "dataset" "member" "sdate" "ftime" ... + num [1:20, 1, 1, 1:3, 1:35, 1:64] 1.551 1.393 -0.344 -5.986 -0.27 ... ``` Two lists containing the experiment ,`ano_exp`, and the observation, `ano_obs`, lists should be put together to serve as input of the function to compute multivariate RMSEs. @@ -160,31 +157,68 @@ It is obtained by running the `CST_MultivarRMSE` function: mvrmse <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight) ``` - The function `CST_MultivarRMSE` returns the multivariate RMSE value for 2 or more variables. The output is a CSTool object containing the RMSE values in the `data` element and other relevant information: ```r > class(mvrmse) > str(mvrmse$data) - num [1, 1, 1, 1:35, 1:64] 0.764 0.8 0.67 0.662 0.615 ... -> str(mvrmse$Variable) - Named chr [1:2] "tas" "prlr" - - attr(*, "names")= chr [1:2] "varName" "varName" + num [1, 1, 1:35, 1:64] 1002261 1034354 1041180 1034907 1238147 ... +> str(mvrmse$attrs$Variable) +List of 2 + $ varName : chr [1:2] "tas" "prlr" + $ metadata:List of 6 + ..$ tas :List of 7 + .. ..$ use_dictionary : logi FALSE + .. ..$ units : chr "K" + .. ..$ longname : chr "2 metre temperature" + .. ..$ description : chr "none" + .. ..$ daily_agg_cellfun : chr "none" + .. ..$ monthly_agg_cellfun: chr "none" + .. ..$ verification_time : chr "none" + ..$ lon : num [1:64(1d)] 0 1.41 2.81 4.22 5.62 ... + .. ..- attr(*, "cdo_grid_name")= chr "r256x128" + .. ..- attr(*, "data_across_gw")= logi TRUE + .. ..- attr(*, "array_across_gw")= logi FALSE + .. ..- attr(*, "first_lon")= num 340 + .. ..- attr(*, "last_lon")= num 68.9 + .. ..- attr(*, "projection")= chr "none" + ..$ lat : num [1:35(1d)] 73.8 72.4 71 69.6 68.2 ... + .. ..- attr(*, "cdo_grid_name")= chr "r256x128" + .. ..- attr(*, "first_lat")= num [1(1d)] 26 + .. ..- attr(*, "last_lat")= num [1(1d)] 73.8 + .. ..- attr(*, "projection")= chr "none" + ..$ prlr:List of 7 + .. ..$ use_dictionary : logi FALSE + .. ..$ units : chr "m s-1" + .. ..$ longname : chr "Total precipitation" + .. ..$ description : chr "none" + .. ..$ daily_agg_cellfun : chr "none" + .. ..$ monthly_agg_cellfun: chr "none" + .. ..$ verification_time : chr "none" + ..$ lon : num [1:64(1d)] 0 1.41 2.81 4.22 5.62 ... + .. ..- attr(*, "cdo_grid_name")= chr "r256x128" + .. 
..- attr(*, "data_across_gw")= logi TRUE + .. ..- attr(*, "array_across_gw")= logi FALSE + .. ..- attr(*, "first_lon")= num 340 + .. ..- attr(*, "last_lon")= num 68.9 + .. ..- attr(*, "projection")= chr "none" + ..$ lat : num [1:35(1d)] 73.8 72.4 71 69.6 68.2 ... + .. ..- attr(*, "cdo_grid_name")= chr "r256x128" + .. ..- attr(*, "first_lat")= num [1(1d)] 26 + .. ..- attr(*, "last_lat")= num [1(1d)] 73.8 + .. ..- attr(*, "projection")= chr "none" ``` - The following lines plot the multivariate RMSE ```r PlotEquiMap(mvrmse$data, lon = Lon, lat = Lat, filled.continents = FALSE, - toptitle = "Multivariate RMSE tas, prlr 1992 - 2012", colNA = "white", - bar_limits = c(0,2.5), cols = brewer.pal(n=5,name='Reds'), - fileout = "./MultivarRMSE_gloseas5_tas_prlr_1992-2012.png") + toptitle = "Multivariate RMSE tas, prlr 1993 - 2012", colNA = "white") ``` -![Multivariate RMSE](./Figures/MultivarRMSE_gloseas5_tas_prlr_1992-2012.png) +![Multivariate RMSE](./Figures/MultivarRMSE_gloseas5_tas_prlr_1993-2012.png) diff --git a/vignettes/PlotForecastPDF.Rmd b/vignettes/PlotForecastPDF.Rmd index 457e7d29a328e8fbccdf6999c11bdd89854558a7..bafbe7d6d001f804329b9219eb5aa823e71e1d9b 100644 --- a/vignettes/PlotForecastPDF.Rmd +++ b/vignettes/PlotForecastPDF.Rmd @@ -24,7 +24,7 @@ The first step is to put your forecasts in an appropriate format. For this vigne ```{r,fig.show = 'hide',warning=F} fcst <- data.frame(fcst1 = rnorm(mean = 25, sd = 3, n = 30), - fcst2 = rnorm(mean = 23, sd = 4.5, n = 30)) + fcst2 = rnorm(mean = 23, sd = 4.5, n = 30)) PlotForecastPDF(fcst, tercile.limits = c(20, 26)) ``` @@ -42,11 +42,11 @@ Some parameters allow to customize your plot by changing the title, the forecast ```{r,fig.show = 'hide',warning=F} fcst <- data.frame(fcst1 = rnorm(mean = 25, sd = 3, n = 30), - fcst2 = rnorm(mean = 23, sd = 4.5, n = 30)) + fcst2 = rnorm(mean = 23, sd = 4.5, n = 30)) PlotForecastPDF(fcst, tercile.limits = c(20, 26), var.name = "Temperature (ºC)", - title = "Forecasts valid for 2019-01-01 at Sunny Hills", - fcst.names = c("model a", "model b"), - color.set = "s2s4e") + title = "Forecasts valid for 2019-01-01 at Sunny Hills", + fcst.names = c("model a", "model b"), + color.set = "s2s4e") ``` ![Example 2](./Figures/PlotForecastPDF_ex2.png) @@ -55,12 +55,12 @@ Optionally, we can include the probability of extreme values or the actually obs ```{r,fig.show = 'hide',warning=F} fcst <- data.frame(fcst1 = rnorm(mean = 25, sd = 3, n = 30), - fcst2 = rnorm(mean = 28, sd = 4.5, n = 30), fcst3 = rnorm(mean = 17, sd = 3, n = 30)) + fcst2 = rnorm(mean = 28, sd = 4.5, n = 30), fcst3 = rnorm(mean = 17, sd = 3, n = 30)) PlotForecastPDF(fcst, tercile.limits = rbind(c(20, 26), c(22, 28), c(15, 22)), - var.name = "Temperature (ºC)", title = "Forecasts at Sunny Hills", - fcst.names = c("January", "February", "March"), obs = c(21, 24, 17), - extreme.limits = rbind(c(18, 28), c(20, 30), c(12, 24)), - color.set="s2s4e") + var.name = "Temperature (ºC)", title = "Forecasts at Sunny Hills", + fcst.names = c("January", "February", "March"), obs = c(21, 24, 17), + extreme.limits = rbind(c(18, 28), c(20, 30), c(12, 24)), + color.set = "s2s4e") ``` ![Example 3](./Figures/PlotForecastPDF_ex3.png) @@ -71,8 +71,8 @@ PlotForecastPDF uses ggplot2, so you can save the output of the function to a va ``` library(ggplot2) fcst <- array(rnorm(mean = 25, sd = 2, n = 90), dim = c(member = 30, 3)) -plot <-PlotForecastPDF(fcst, tercile.limits = c(23, 27)) -ggsave("outfile.pdf", plot, width=7, height=5) +plot <- PlotForecastPDF(fcst, tercile.limits = 
### 5.- A reproducible example using lonlat_temp

@@ -82,9 +82,9 @@ This final example uses the sample lonlat data from CSTools. It is suitable for
 fcst <- data.frame(fcst1 = lonlat_temp$exp$data[1,,1,1,1,1] - 273.15,
                    fcst2 = lonlat_temp$exp$data[1,,1,2,1,1] - 273.15)
 PlotForecastPDF(fcst, tercile.limits = c(5, 7), extreme.limits = c(4, 8),
-               var.name = "Temperature (ºC)",
-               title = "Forecasts initialized on Nov 2000 at sample Mediterranean region",
-               fcst.names = c("November", "December"))
+                var.name = "Temperature (ºC)",
+                title = "Forecasts initialized on Nov 2000 at sample Mediterranean region",
+                fcst.names = c("November", "December"))
 ```

![Example 4](./Figures/PlotForecastPDF_ex4.png)

diff --git a/vignettes/RainFARM_vignette.Rmd b/vignettes/RainFARM_vignette.Rmd
index 5fe249f3ab7888babb68c592df3d41a1d460f7dd..28ab753ded112a1d670fb8a00abc6dc9d0f8e0b8 100644
--- a/vignettes/RainFARM_vignette.Rmd
+++ b/vignettes/RainFARM_vignette.Rmd
@@ -45,25 +45,25 @@ exp <- lonlat_prec
 This gives us a CSTools object `exp`, containing an element `exp$data` with dimensions:
 ```{r}
 dim(exp$data)
-#dataset member sdate ftime lat lon
-# 1 6 3 31 4 4
+# dataset member sdate ftime lat lon
+#       1      6     3    31   4   4
 ```
 There are 6 ensemble members available in the data set, 3 starting dates and 31 forecast times, which refer to daily values in the month of March following starting dates on November 1st in the years 2010, 2011, 2012.

 Please notice that RainFARM (in this version) only accepts square domains, possibly with an even number of pixels on each side, so we always need to select an appropriate cutout. Also, there are time and memory limitations when a large ensemble of downscaled realizations is generated with RainFARM, so selecting a smaller target area is advised. On the other hand, if spectral slopes are to be determined from the large scales, we will still need enough resolution to allow this estimation. In this example we have preselected a 4x4 pixel cutout at resolution 1 degree in a smaller area lon=[6,9], lat=[44,47] covering Northern Italy.
 ```{r}
-ilon <- which ( exp$lon %in% 5:12 )
-ilat <- which ( exp$lat %in% 40:47 )
-exp$data <- exp$data[ , , , , ilon, ilat, drop=FALSE]
+ilon <- which(exp$coords$lon %in% 5:12)
+ilat <- which(exp$coords$lat %in% 40:47)
+exp$data <- exp$data[ , , , , ilon, ilat, drop = FALSE]
 names(dim(exp$data)) <- names(dim(lonlat_prec$data))
-exp$lon <- exp$lon[ilon]
-exp$lat <- exp$lat[ilat]
+exp$coords$lon <- exp$coords$lon[ilon]
+exp$coords$lat <- exp$coords$lat[ilat]
 ```

### Standard downscaling without climatological weights

Our goal is to downscale these data with RainFARM from the resolution of 1 degree (about 100 km at these latitudes) to 0.05 degrees (about 5 km) using the `CST_RainFARM()` function. This means that we need to increase resolution by a factor `nf = 20`.
RainFARM can compute automatically its only free parameter, i.e. the spatial spectral slope, from the large-scale field (here only with size 4x4 pixels, but in general we recommend selecting at least 8x8 pixels).
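The slope that `CST_RainFARM()` estimates internally can also be computed beforehand, to check whether the coarse field supports a reasonable estimate. A minimal sketch, assuming the `CST_RFSlope()` function exported by CSTools accepts the same `kmin` and `time_dim` arguments used below:

```r
# Sketch (assumed interface): estimate spatial spectral slopes from the
# coarse field itself, averaging over the dimensions listed in time_dim
slopes <- CST_RFSlope(exp, kmin = 1, time_dim = c("member", "ftime"))
slopes  # one slope for each remaining dataset / sdate combination
```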
-In this example we would like to compute this slope as an average over the _member_ and _ftime_ dimensions, while we will use different slopes for the remaining _dataset_ and _sdate_ dimensions (a different choice may be more appropriate in a real application). To obtain this we specify the parameter `time_dim = c("member", "ftime")`. The slope is computed starting from the wavenumber corresponding to the box, `kmin=1`. We create 3 stochastic realizations for each dataset, member, starting date and forecast time with `nens=5`. The command to donwscale and the resulting fields are:
+In this example we would like to compute this slope as an average over the _member_ and _ftime_ dimensions, while we will use different slopes for the remaining _dataset_ and _sdate_ dimensions (a different choice may be more appropriate in a real application). To obtain this we specify the parameter `time_dim = c("member", "ftime")`. The slope is computed starting from the wavenumber corresponding to the box, `kmin = 1`. We create 3 stochastic realizations for each dataset, member, starting date and forecast time with `nens = 3`. The command to downscale and the resulting fields are:
 ```{r}
 exp_down <- CST_RainFARM(exp, nf = 20, kmin = 1, nens = 3,
@@ -71,10 +71,10 @@ exp_down <- CST_RainFARM(exp, nf = 20, kmin = 1, nens = 3,
 dim(exp_down$data)
 # dataset member realization sdate ftime lat lon
-# 1 6 3 3 31 80 80
-str(exp_down$lon)
+#       1      6           3     3    31  80  80
+str(exp_down$coords$lon)
 # num [1:80] 5.53 5.58 5.62 5.67 5.72 ...
-str(exp_down$lat)
+str(exp_down$coords$lat)
 # num [1:80] 47.5 47.4 47.4 47.3 47.3 ...
 ```
 The function returns an array `exp_down$data` with the additional "realization" dimension for the stochastic ensemble with 3 members. The longitudes and latitudes have been correspondingly interpolated to the finer resolution.
@@ -82,7 +82,7 @@ The function returns an array `exp_down$data` with the additional "realization"
 Alternatively, we could have used the "reduced" function `RainFARM`, which directly accepts a data array (with arbitrary dimensions, provided a longitude, a latitude and a "time" dimension exist) and two arrays describing longitudes and latitudes:
 ```{r}
-downscaled <- RainFARM(exp$data, exp$lon, exp$lat,
+downscaled <- RainFARM(exp$data, exp$coords$lon, exp$coords$lat,
                        nf = 20, kmin = 1, nens = 3,
                        time_dim = c("member", "ftime"))
 ```
@@ -92,18 +92,18 @@ Each instant and each realization will of course be different, but let's plot an
 ```{r}
 a <- exp$data[1, 1, 1, 17, , ] * 86400 * 1000
 a[a > 60] <- 60
-image(exp$lon, rev(exp$lat), t(apply(a, 2, rev)), xlab = "lon", ylab = "lat",
+image(exp$coords$lon, rev(exp$coords$lat), t(apply(a, 2, rev)), xlab = "lon", ylab = "lat",
       col = rev(terrain.colors(20)), zlim = c(0,60))
 map("world", add = TRUE)
 title(main = "pr 17/03/2010 original")
 a <- exp_down$data[1, 1, 1, 1, 17, , ] * 86400 * 1000
 a[a > 60] <- 60
-image(exp_down$lon, rev(exp_down$lat), t(apply(a, 2, rev)), xlab = "lon", ylab = "lat",
+image(exp_down$coords$lon, rev(exp_down$coords$lat), t(apply(a, 2, rev)), xlab = "lon", ylab = "lat",
       col = rev(terrain.colors(20)), zlim = c(0, 60))
 map("world", add = TRUE)
 title(main = "pr 17/03/2010 downscaled")
@@ -118,10 +118,10 @@ RainFARM has downscaled the original field with a realistic fine-scale correlati
 The area of interest in our example presents a complex orography, but the basic RainFARM algorithm used does not consider topographic elevation in deciding how to distribute fine-scale precipitation. A long-term climatology of the downscaled fields would have a resolution comparable to that of the original coarse fields and would not resemble the fine-scale structure of an observed climatology. If an external fine-scale climatology of precipitation is available, we can use the method discussed in Terzago et al. (2018) to change the distribution of precipitation by RainFARM for each timestep, so that the long-term average is close to this reference climatology in terms of precipitation distribution (while the total precipitation amount of the original fields to downscale is preserved).
-Suitable climatology files could be for example a fine-scale precipitation climatology from a high-resolution regional climate model (see e.g. Terzago et al. 2018), a local high-resolution gridded climatology from observations, or a reconstruction such as those which can be downloaded from the WORLDCLIM (https://www.worldclim.org) or CHELSA (chelsa-climate.org) websites. The latter data will need to be converted to NetCDF format before being used (see for example the GDAL tools (https://gdal.org).
+Suitable climatology files could be, for example, a fine-scale precipitation climatology from a high-resolution regional climate model (see e.g. Terzago et al. 2018), a local high-resolution gridded climatology from observations, or a reconstruction such as those which can be downloaded from the WORLDCLIM (https://www.worldclim.org) or CHELSA (https://chelsa-climate.org/) websites. The latter data will need to be converted to NetCDF format before being used (see for example the GDAL tools, https://gdal.org/).

 We will assume that a copy of the WORLDCLIM precipitation climatology at 30 arcseconds (about 1 km resolution) is available in the local file `worldclim.nc`. From this file we can derive suitable weights to be used with RainFARM using the `CST_RFWeights` function as follows:
 ```{r}
-ww <- CST_RFWeights("./worldclim.nc", nf = 20, lon = exp$lon, lat = exp$lat)
+ww <- CST_RFWeights("./worldclim.nc", nf = 20, lon = exp$coords$lon, lat = exp$coords$lat)
 ```
 The result is a two-dimensional weights matrix with the same `lon` and `lat` dimensions as requested. The weights (varying around an average value of 1) encode how to distribute precipitation differently in each stochastic realization of RainFARM.
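The call that generates the `exp_down_weights` object used in the next hunk lies outside the hunks shown in this diff. A minimal sketch of how the weights would be passed on, assuming the `weights` argument of `CST_RainFARM()` takes the field stored in `ww$data`:

```r
# Sketch (assumed call): repeat the downscaling, now redistributing fine-scale
# precipitation according to the climatological weights derived above
exp_down_weights <- CST_RainFARM(exp, nf = 20, kmin = 1, nens = 3,
                                 weights = ww$data,
                                 time_dim = c("member", "ftime"))
```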
@@ -148,19 +148,19 @@ png("Figures/RainFARM_fig2.png", width = 640, height = 243)
 par(mfrow = c(1,3))
 a <- exp_down_weights$data[1, 1, 1, 1, 17, , ] * 86400 * 1000
 a[a > 60] <- 60
-image(exp_down$lon, rev(exp_down$lat), t(apply(a, 2, rev)), xlab = "lon",
+image(exp_down$coords$lon, rev(exp_down$coords$lat), t(apply(a, 2, rev)), xlab = "lon",
       ylab = "lat", col = rev(terrain.colors(20)), zlim = c(0, 60))
 map("world", add = TRUE)
 title(main = "pr 17/03/2010 with weights")
 a <- ad * 86400 * 1000
 a[a > 5] <- 5
-image(exp_down$lon, rev(exp_down$lat), t(apply(a, 2, rev)), xlab = "lon",
-      ylab="lat", col = rev(terrain.colors(20)), zlim = c(0, 5))
+image(exp_down$coords$lon, rev(exp_down$coords$lat), t(apply(a, 2, rev)), xlab = "lon",
+      ylab = "lat", col = rev(terrain.colors(20)), zlim = c(0, 5))
 map("world", add = TRUE)
 title(main = "climatology no weights")
 a <- adw * 86400 * 1000
 a[a > 5] <- 5
-image(exp_down$lon, rev(exp_down$lat), t(apply(a, 2, rev)), xlab = "lon",
+image(exp_down$coords$lon, rev(exp_down$coords$lat), t(apply(a, 2, rev)), xlab = "lon",
       ylab = "lat", col = rev(terrain.colors(20)), zlim = c(0, 5))
 map("world", add = TRUE)
 title(main = "climatology with weights")
diff --git a/vignettes/WeatherRegimes_vignette.Rmd b/vignettes/WeatherRegimes_vignette.Rmd
index 788b25761eaac40a729cd90132b2d2abf5335a23..899067c556ab846bb6170122e0f0b7cb462b622d 100644
--- a/vignettes/WeatherRegimes_vignette.Rmd
+++ b/vignettes/WeatherRegimes_vignette.Rmd
@@ -45,12 +45,10 @@ c(exp, obs) %<-% CST_Load(var = 'psl', exp = 'system4_m1',
                           lonmin = 274.5, lonmax = 45, output = 'lonlat')
 ```
-
 Notice that the files need to be stored locally on your computer or server, with the correct configuration file. If you are interested in running this vignette, contact nuria.perez at bsc.es to get a data sample.

 The objects returned by `CST_Load()` are of the s2dv_cube class. They contain, among other elements, the array with the requested data.
-
 ```r
 > dim(exp$data)
 dataset member sdate ftime lat lon
@@ -59,13 +57,10 @@ dataset member sdate ftime lat lon
 1 1 20 31 77 186
 ```
-
-
 ### 3- Daily anomalies based on a smoothed climatology

 The weather regimes classification is based on daily anomalies, which have been computed by following these steps:
-
 ```r
 c(ano_exp, ano_obs) %<-% CST_Anomaly(exp = exp, obs = obs, filter_span = 1)
 ```
@@ -78,25 +73,19 @@ The LOESS filter has been applied to the climatology to remove the short-term va
 The `CST_WeatherRegimes()` function is used to define the clusters based on the sea level pressure anomalies from ERA-Interim. This function is based on the [*kmeans function*](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/kmeans.html) from the stats R package. In this example we have made different assumptions: four clusters (`ncenters = 4`) will be produced, and empirical orthogonal functions are not used to filter the data (`EOFs = FALSE`), so that extreme values are taken into account. More details about the methodology can be found in Cortesi et al. 2018 (submitted).
-
 ```r
 WR_obs <- CST_WeatherRegimes(data = ano_obs, EOFs = FALSE, ncenters = 4)
 ```
-
 `CST_WeatherRegimes()` returns an s2dv_cube object with several elements. In `$data` the four weather regime composites are stored, while `$statistics` contains extra information (`$pvalue`, `$cluster`, `$persistence` and `$frequency`), the parameters needed for the weather regimes assessment.
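A quick way to inspect these elements in the R session (element names taken from the list above; the exact structure of each element may differ):

```r
# Inspect the regime composites and the clustering statistics
dim(WR_obs$data)                  # the four regime composites
names(WR_obs$statistics)          # "pvalue" "cluster" "persistence" "frequency"
str(WR_obs$statistics$frequency)  # monthly frequency of each regime
```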
 Further details about the outputs provided by the `CST_WeatherRegimes()` function can be found in the package documentation or by typing `?CST_WeatherRegimes` in the R session.
-
-
 ### 5- Visualisation of the observed weather regimes
-
 To plot the composite maps of each regime and the mean frequencies of each cluster, we have employed the `PlotLayout()` and `PlotEquiMap()` functions available in s2dv. The object `WR_obs$data` is divided by 100 to convert from Pa to hPa. As `WR_obs$statistics$frequency` provides the monthly frequencies, the climatological frequencies are obtained as the average of the monthly frequencies across the 20 years. Note that these frequencies could change slightly as a consequence of the randomness inherent to the iterative processes involved in the k-means.

 ```r
-clim_frequencies <- paste0('freq = ',
-                           round(Mean1Dim(WR_obs$statistics$frequency, 1), 1), '%')
-PlotLayout(PlotEquiMap, c(1, 2), lon = obs$lon, lat = obs$lat,
+clim_frequencies <- paste0('freq = ', round(Mean1Dim(WR_obs$statistics$frequency, 1), 1), '%')
+PlotLayout(PlotEquiMap, c(1, 2), lon = obs$coords$lon, lat = obs$coords$lat,
            var = WR_obs$data / 100,
            titles = paste0(paste0('Cluster ', 1:4), ' (', clim_frequencies,' )'),
            filled.continents = FALSE,
@@ -117,21 +106,19 @@ freq_obs[is.na(freq_obs)] <- 0
 dim(freq_obs) <- c(dimy = 20, dimcat = 4, dimx = 1)
 PlotTriangles4Categories(freq_obs, toptitle = 'Persistence',
-                        xtitle = 'Start Dates', ytitle = '', xlab = FALSE,
-                        ylabels = substr(sdates, 1, 4), cex_leg = 0.6,
-                        lab_legend = c('AR', 'NAO-', 'BL', 'NAO+'), figure.width = .7)
+                         xtitle = 'Start Dates', ytitle = '', xlab = FALSE,
+                         ylabels = substr(sdates, 1, 4), cex_leg = 0.6,
+                         lab_legend = c('AR', 'NAO-', 'BL', 'NAO+'), figure.width = .7)
 ```
-
 ### 7- Weather regimes in the predictions

 Predicted anomalies for each day, month, member and lead time are matched with the observed clusters (obtained in step 4). The assignment of the anomalies to a pre-defined set of clusters guarantees that the predicted weather regimes have very similar spatial structures to the observed regimes, which is an essential requirement for the verification of weather regimes. This is an example of how to produce a set of weather regimes based on the predictions that can be verified with the observational dataset, but this approach can also be used in an operational context, for which the probability of occurrence of each cluster could be estimated.
-
 The matching is based on the minimization of the Euclidean distance (`method = 'distance'`), but it can also be done in terms of spatial correlation (`method = 'ACC'`). However, the distance method is computationally more efficient. The `WR_exp` object plotted below is produced with `CST_RegimesAssign()`; a sketch of this call is given at the end of this vignette.
@@ -149,7 +136,7 @@ The outputs of `RegimesAssign()` have been represented to be compared with those
 ```r
-PlotLayout(PlotEquiMap, c(1, 2),lon = exp$lon, lat = exp$lat,
+PlotLayout(PlotEquiMap, c(1, 2), lon = exp$coords$lon, lat = exp$coords$lat,
            var = WR_exp$data/100,
            titles = paste0(paste0('Cluster ',1:4), ' (',paste0('freq = ',
                            round(WR_exp$statistics$frequency,1),'%'),' )'),
@@ -160,5 +147,4 @@ PlotLayout(PlotEquiMap, c(1, 2),lon = exp$lon, lat = exp$lat,
 ```
-
 Observed and predicted weather regimes are very similar, although their frequencies are slightly different. Cluster 1 is the Atlantic Ridge and cluster 3 the Blocking pattern, while clusters 4 and 2 are the positive and negative phases of the NAO. These patterns can change depending on the period analyzed.
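As noted above, the `CST_RegimesAssign()` call that creates `WR_exp` falls outside the hunks shown in this diff. A minimal sketch, assuming an interface consistent with the surrounding text (matching by `method = 'distance'` and returning regime composites):

```r
# Sketch (assumed call; the actual line is not part of the hunks above):
# assign the predicted anomalies to the observed reference regimes
WR_exp <- CST_RegimesAssign(data = ano_exp, ref_maps = WR_obs,
                            method = 'distance', composite = TRUE)
```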