diff --git a/inst/doc/faq.md b/inst/doc/faq.md
index 087e00a227b98979683e405cd55fd7e7829b49ce..572bff8260a990ac035ae3c2d5d9f4d270c3041d 100644
--- a/inst/doc/faq.md
+++ b/inst/doc/faq.md
@@ -13,6 +13,7 @@ This document intends to be the first reference for any doubts that you may have
    7. [Avoid or specify a node from cluster in Compute()](#7-avoid-or-specify-a-node-from-cluster-in-compute)
    8. [Define a path with multiple dependencies](#8-define-a-path-with-multiple-dependencies)
    9. [Use CDORemap() in function](#9-use-cdoremap-in-function)
+   10. [The number of members depends on the start date](#10-the-number-of-members-depends-on-the-start-date)
 
 2. **Something goes wrong...**
 
@@ -387,6 +388,19 @@ If you want to interpolate data by s2dverification::CDORemap in function, you ne
 machine which CDO module to use. Therefore, `CDO_module = 'CDO/1.9.5-foss-2018b'` should be
 added in Compute() cluster list. See the example in usecase [ex2_3_cdo.R](inst/doc/usecase/ex2_3_cdo.R).
 
+### 10. The number of members depends on the start date
+
+In seasonal forecasting, some start dates, such as November 1st, are more widely used than others. For these widely used start dates, more members are available than for other start dates. This is the case for the seasonal forecast system ECMWF SEAS5 (system5_m1):
+  - for the start date November 1st, 1999, there are 51 members available, while
+  - for the start date September 1st, 1999, there are 25 members available.
+
+When loading both start dates at once with Start(), the order in which the start dates are specified affects the dimensions of the dataset if all members are loaded with `member = 'all'`:
+  - with `sdates = c('19991101', '19990901')`, the member dimension will be of length 51, showing missing values for members 26 to 51 in the second start date;
+  - with `sdates = c('19990901', '19991101')`, the member dimension will be of length 25, with no missing values.
+
+The code to reproduce this behaviour can be found in the Use Cases section, [example 1.4](/inst/doc/usecase/ex1_4_variable_nmember.R).
+
+
 
 ## Something goes wrong...
 
diff --git a/inst/doc/usecase.md b/inst/doc/usecase.md
index 8f02b8e1628a0911a394b4019be1970b26e29c7e..fcaa7823d585a1237a447a81642d5cc3f2bc669a 100644
--- a/inst/doc/usecase.md
+++ b/inst/doc/usecase.md
@@ -18,6 +18,9 @@ In this document, you can link to the example scripts for various demands. For t
    3. [Use experimental data attribute to load in oberservational data](inst/doc/usecase/ex1_3_attr_loadin.R)
    Load the experimental data first (with `retrieve = FALSE`), then retreive its dates and time attributes to use in the observational data load-in. It also shows how to use parameters `xxx_tolerance`, `xxx_across`, `merge_across_dims`, `split_multiselected_dims`.
 
+
+   4. [Check the impact of start date order on the number of members](inst/doc/usecase/ex1_4_variable_nmember.R)
+   Mixing start dates from different months can result in a different number of members being loaded. Check the code provided and [FAQ 10](/inst/doc/faq.md).
 
 2. **Execute computation (use `Compute()`)**
 
diff --git a/inst/doc/usecase/ex1_4_variable_nmember.R b/inst/doc/usecase/ex1_4_variable_nmember.R
new file mode 100644
index 0000000000000000000000000000000000000000..495c6f8278685482cd865378c59ed40faca3e38f
--- /dev/null
+++ b/inst/doc/usecase/ex1_4_variable_nmember.R
@@ -0,0 +1,52 @@
+# This code shows that the number of members can depend on the start date
+# and on the order in which the start dates are requested
+# See FAQ 10 [The number of members depends on the start date](/inst/doc/faq.md)
+
+library(startR)
+
+path_list <- list(list(name = 'system5',
+                       path = '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'))
+sdates_exp <- c('19991101', '19990901')
+data_Nov_Sep <- Start(dat = path_list,
+                      var = 'psl',
+                      member = 'all',
+                      sdate = sdates_exp,
+                      time = indices(1),
+                      latitude = values(list(0, 20)),
+                      latitude_reorder = Sort(),
+                      longitude = values(list(0, 5)),
+                      synonims = list(latitude = c('lat', 'latitude'),
+                                      longitude = c('lon', 'longitude'),
+                                      member = c('ensemble', 'realization')),
+                      retrieve = TRUE)
+# 51 members
+dim(data_Nov_Sep)
+# dat var member sdate time latitude longitude
+#   1   1     51     2    1       71        19
+apply(data_Nov_Sep, 4, function(x){sum(is.na(x))})
+# members 26 to 51 (26 members) are missing for the second start date
+
+sdates_exp <- c('19990901', '19991101')
+data_Sep_Nov <- Start(dat = path_list,
+                      var = 'psl',
+                      member = 'all',
+                      sdate = sdates_exp,
+                      time = indices(1),
+                      latitude = values(list(0, 20)),
+                      latitude_reorder = Sort(),
+                      longitude = values(list(0, 5)),
+                      synonims = list(latitude = c('lat', 'latitude'),
+                                      longitude = c('lon', 'longitude'),
+                                      member = c('ensemble', 'realization')),
+                      retrieve = TRUE)
+
+# 25 members available
+dim(data_Sep_Nov)
+# dat var member sdate time latitude longitude
+#   1   1     25     2    1       71        19
+
+# No missing values:
+apply(data_Sep_Nov, 4, function(x){sum(is.na(x))})
+
+
+