Make Start() consider only the month value when loading monthly data
This issue proposes a potential development following the discussion here: #171 (comment 197274). When loading obs data using the time attribute of the exp data as the time parameter input, we run into problems if exp does not have exactly the same time values as the obs data. For example, if the exp monthly data has times "2000-11-30" and "2000-12-31" while obs has "2000-11-01" and "2000-12-01", the retrieved obs data will be wrong because startR looks for the closest value in the data. For November, obs will return December data, since "2000-12-01" is closer to "2000-11-30" than "2000-11-01" is.
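The mismatch can be reproduced without any data files. The snippet below is only a toy nearest-value search written in base R under the assumption that "closest" means smallest absolute time difference; it is not startR's internal code, and the variable names are made up for illustration.

```r
# Toy illustration only: a plain nearest-value search, NOT startR internals.
exp_dates <- as.Date(c("2000-11-30", "2000-12-31"))
obs_dates <- as.Date(c("2000-11-01", "2000-12-01"))

# For each exp date, find the index of the obs date with the smallest
# absolute distance in days.
nearest <- vapply(seq_along(exp_dates), function(i) {
  which.min(abs(as.numeric(difftime(obs_dates, exp_dates[i], units = "days"))))
}, integer(1))

matched <- obs_dates[nearest]
print(data.frame(exp = exp_dates, matched_obs = matched))
```

Both exp dates end up paired with "2000-12-01", so the November exp value is compared against December obs data.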
This mismatch can be worked around by adjusting the exp time attributes (see the workaround in the example below), but one potential development is to tell startR that the data is monthly, so that it compares only up to the month value rather than the complete time value. We could add a parameter such as time_freq to Start(): if it is NULL, Start() looks for the closest time value as it does now; if it is "monthly"/"daily"/"hourly", Start() matches the time value at "month"/"day"/"hour" granularity.
library(startR)
library(lubridate)
sdate <- as.vector(sapply(1995:1996, function(x) paste0(x, sprintf('%02d', 1:12), '01')))
exp <- Start(
  dat = '/esarchive/exp/ecmwf/system5c3s/monthly_mean/$var$_f6h/$var$_$sdate$.nc',
  var = 'tas',
  sdate = sdate,
  time = 1:3,
  ensemble = 1,
  latitude = indices(1),
  longitude = indices(1),
  synonims = list(latitude = c('lat', 'latitude'), longitude = c('lon', 'longitude')),
  return_vars = list(time = 'sdate', latitude = NULL, longitude = NULL),
  retrieve = TRUE)
dates <- attr(exp, 'Variables')$common$time
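#===========WORKAROUND======================
# Shift the exp dates back by 15 days so they fall near the middle of the
# month and the closest obs time lies in the same month
dates_mid <- dates - lubridate::days(15)
# Reassign the dimensions so dates_mid has the same shape as dates
dim(dates_mid) <- dim(dates)
#===========================================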
obs <- Start(
  dat = '/esarchive/recon/ecmwf/era5/monthly_mean/$var$_f1h-r1440x721cds/$var$_$file_date$.nc',
  var = 'tas',
  file_date = unique(format(dates, '%Y%m')),
  time = values(dates),  # wrong months are retrieved; with the workaround, use values(dates_mid)
  time_across = 'file_date',
  merge_across_dims = TRUE,
  split_multiselected_dims = TRUE,
  latitude = indices(1),
  longitude = indices(1),
  synonims = list(latitude = c('lat', 'latitude'), longitude = c('lon', 'longitude')),
  return_vars = list(time = 'file_date', latitude = NULL, longitude = NULL),
  retrieve = TRUE)
obs_dates <- attr(obs, 'Variables')$common$time
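To make the time_freq proposal more concrete, here is a minimal base-R sketch of the matching rule it describes. Everything in it is an assumption for discussion: match_time() is a hypothetical helper that does not exist in startR, the format strings are just one possible way to truncate times to month/day/hour, and times are compared in UTC.

```r
# Hypothetical helper, NOT part of startR: for each target time, return the
# index of the matching entry in `available`.
match_time <- function(target, available, time_freq = NULL) {
  if (is.null(time_freq)) {
    # Behaviour described in this issue as the current one: closest value wins.
    return(vapply(seq_along(target), function(i) {
      which.min(abs(as.numeric(difftime(available, target[i], units = "secs"))))
    }, integer(1)))
  }
  # Assumed mapping from frequency name to a truncating time format.
  fmt <- switch(time_freq,
                monthly = "%Y-%m",
                daily   = "%Y-%m-%d",
                hourly  = "%Y-%m-%d %H",
                stop("Unknown time_freq: ", time_freq))
  # Compare the truncated time stamps; NA means no entry at that granularity.
  match(format(target, fmt, tz = "UTC"), format(available, fmt, tz = "UTC"))
}

exp_time <- as.POSIXct(c("2000-11-30", "2000-12-31"), tz = "UTC")
obs_time <- as.POSIXct(c("2000-11-01", "2000-12-01"), tz = "UTC")

idx_nearest <- match_time(exp_time, obs_time)                         # closest value
idx_monthly <- match_time(exp_time, obs_time, time_freq = "monthly")  # month granularity
print(idx_nearest)
print(idx_monthly)
```

With the closest-value rule both exp times map to the second obs entry (December), whereas the monthly rule maps them to the first and second entries respectively. Returning NA when no entry matches at the requested granularity is one possible design; Start() could instead raise an error or fall back to the closest value.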