Issue created Jan 25, 2021 by aho (@aho), Maintainer

Start: Problem of time_tolerance and time attributes

This issue was detected with the case in #85 (closed). That case uses the time attributes of exp to retrieve the obs data, but the time attributes of exp and obs differ. For example, the exp time 2005-05-16 12:00:00 UTC corresponds to the obs time 2005-05-01 UTC, and the exp time 2005-06-16 00:00:00 UTC corresponds to the obs time 2005-06-01 UTC. The difference is therefore 15.5 or 15 days.
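To make the size of the mismatch concrete, here is a small base-R check using only the two timestamp pairs quoted above. It also converts the 372-hour `time_tolerance` used in the reproduction script into days. The variable names are mine and purely illustrative; nothing here calls startR.

```r
# Sample timestamps quoted in the issue text (UTC).
exp_time <- as.POSIXct(c("2005-05-16 12:00:00", "2005-06-16 00:00:00"), tz = "UTC")
obs_time <- as.POSIXct(c("2005-05-01 00:00:00", "2005-06-01 00:00:00"), tz = "UTC")

# Offset between each exp time and its corresponding obs time, in days.
offset_days <- as.numeric(difftime(exp_time, obs_time, units = "days"))
print(offset_days)

# The tolerance passed to Start() in the reproduction script, expressed in days.
tolerance <- as.difftime(372, units = "hours")
tolerance_days <- as.numeric(tolerance, units = "days")
print(tolerance_days)

# TRUE where the offset fits inside the tolerance window.
within_tolerance <- offset_days <= tolerance_days
print(within_tolerance)
```

For these two pairs the offsets do not exceed the tolerance, so on paper 372 hours should be enough to match them.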

The 'time_tolerance' parameter can be used to loosen the matching of time values. However, even when 'time_tolerance' is set large enough to cover this difference, Start() does not retrieve all the files and later fails with:

Error in while (indices_chunk[i + 1] == indices_chunk[i] & i < length(indices_chunk)) { : missing value where TRUE/FALSE needed

The error may be due to a mismatch between the expected dimension length and the number of files actually found. (Part 1)
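For reference, this kind of message appears in R whenever a `while` condition evaluates to NA. The sketch below reproduces it with the same condition as in the error above, applied to a hypothetical `indices_chunk` vector that contains an NA. That NA is my assumption, standing in for a time value that found no match; I have not confirmed that this is what happens inside Start().

```r
# Hypothetical index vector: the NA stands for a time value with no match.
indices_chunk <- c(1, 1, NA, 2)

# Same loop condition as in the error message, wrapped to capture the error text.
msg <- tryCatch({
  i <- 1
  while (indices_chunk[i + 1] == indices_chunk[i] & i < length(indices_chunk)) {
    i <- i + 1
  }
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)  # error text raised once the comparison hits the NA
print(i)    # position reached before the failure
```

Note that `NA & FALSE` is `FALSE` in R, so running past the end of the vector alone does not trigger this error; the condition only becomes NA when an NA is compared while `i < length(indices_chunk)` still holds.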

Another possible reason for the incomplete file search is that the time attributes of obs are retrieved incorrectly. If obs is read independently of exp (i.e., the exp time values are not used in the obs call), the returned data is correct but the time metadata is wrong. For example, the first time should be 2005-05-01 UTC but is returned as 2005-08-28 12:00:00 UTC. (Part 2)

library(startR)

# ATL
lonmin <- -80
lonmax <-  50
latmin <- -60
latmax <-  50

# exp
repos_exp <- paste0('/esarchive/scratch/eexarcho/Eleftheria/TRIATLAS/Analysis_a33d_a33e_a33f/',
                    'Data/a33d/tos/',
                    'tos_Amon_EC-Earth3-CC_historical_S$sdate$_$member$_gr_$chunk$.nc')
sdates <-  paste0(c(2005:2006), '0501')

exp <- Start(dat = repos_exp,
             var = 'tos',
             member = 'all',
             sdate = sdates,
             chunk = 'all',
          #   time = 'all',
             time = indices(1:12),  #first time step per day
             chunk_depends = 'sdate',
             time_across = 'chunk',
             merge_across_dims = TRUE,
             lat = values(list(latmin, latmax)),
             lat_reorder = Sort(decreasing = TRUE),
             lon = values(list(lonmin, lonmax)),
             lon_reorder = CircularSort(0, 360),
             transform = CDORemapper,
             transform_extra_cells = 2,
             transform_params = list(grid = 'r360x180',
                                     method = 'conservative',
                                     crop = c(lonmin, lonmax, latmin, latmax)),
             transform_vars = c('lat', 'lon'),
             synonims = list(lat = c('lat', 'latitude'),
                             lon = c('lon', 'longitude')),
             return_vars = list(lon = 'dat',
                                lat = 'dat',
                                time = 'sdate'),
             retrieve = TRUE)
lons <- attr(exp, 'Variables')$common$tos$dim[[1]]$vals
lats <- attr(exp, 'Variables')$common$tos$dim[[2]]$vals
dates <- attr(exp, 'Variables')$common$time
dim(dates)
#sdate  time 
#    2    12 

#================Part 1========================
dates_file <- sort(unique(gsub('-', '', sapply(as.character(dates), substr, 1, 7))))
repos_obs <- '/esarchive/obs/ukmo/hadisst_v1.1/monthly_mean/$var$/$var$_$date$.nc'
obs <- Start(dat = repos_obs,
             var = 'tos',
             date = dates_file,
             time = values(dates),  #dim: [sdate = 2, time = 12]
             lat = values(lats),
             lon = values(lons),
             time_var = 'time',
             # PROBLEM!!! Cannot find a proper value for time_tolerance
             time_tolerance = as.difftime(372, units = 'hours'),
             #time values are across all the files
             time_across = 'date',
             merge_across_dims = TRUE,
             merge_across_dims_narm = TRUE,
             split_multiselected_dims = TRUE,
             synonims = list(lat = c('lat', 'latitude'),
                             lon = c('lon', 'longitude')),
             return_vars = list(latitude = NULL,
                                longitude = NULL,
                                time = 'date'),
             retrieve = TRUE)

#Error in while (indices_chunk[i + 1] == indices_chunk[i] & i < length(indices_chunk)) { : 
#  missing value where TRUE/FALSE needed


#================Part 2====================
obs <- Start(dat = repos_obs,
             var = 'tos',
             date = dates_file,
             time = 'all', 
             lat = values(lats),
             lon = values(lons),
             time_across = 'date',  
             #combine time and file_date dims 
             merge_across_dims = TRUE,
             #exclude the additional NAs generated by merge_across_dims
             merge_across_dims_narm = TRUE,
             synonims = list(lat = c('lat', 'latitude'),
                             lon = c('lon', 'longitude')),
             return_vars = list(latitude = 'dat',
                                longitude = 'dat',
                                time = 'date'),
             retrieve = TRUE)

attr(obs, 'Variables')$common$time  #WRONG!!!!
# [1] "2005-08-28 12:00:00 UTC" "2005-09-28 00:00:00 UTC"
# [3] "2005-10-28 12:00:00 UTC" "2005-11-28 00:00:00 UTC"
# [5] "2005-12-28 12:00:00 UTC" "2006-01-28 00:00:00 UTC"
# [7] "2006-02-27 12:00:00 UTC" "2006-03-30 00:00:00 UTC"
# [9] "2006-04-29 12:00:00 UTC" "2006-05-30 00:00:00 UTC"
#[11] "2006-06-29 12:00:00 UTC" "2006-07-30 00:00:00 UTC"
#[13] "2006-08-29 12:00:00 UTC" "2006-09-29 00:00:00 UTC"
#[15] "2006-10-29 12:00:00 UTC" "2006-11-29 00:00:00 UTC"
#[17] "2006-12-29 12:00:00 UTC" "2007-01-29 00:00:00 UTC"
#[19] "2007-02-28 12:00:00 UTC" "2007-03-31 00:00:00 UTC"
#[21] "2007-04-30 12:00:00 UTC" "2007-05-31 00:00:00 UTC"
#[23] "2007-06-30 12:00:00 UTC" "2007-07-31 00:00:00 UTC"
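
To quantify the Part 2 symptom, the sketch below compares the first two returned time values with the obs times stated in the description (2005-05-01 and 2005-06-01 UTC). It uses base R only, with values copied from the output above; the variable names are illustrative. The same kind of check could be run on `attr(obs, 'Variables')$common$time` directly.

```r
# First two time values returned by the Part 2 call (copied from the output above).
returned <- as.POSIXct(c("2005-08-28 12:00:00", "2005-09-28 00:00:00"), tz = "UTC")

# Times the obs data should carry for those two months, per the issue description.
expected <- as.POSIXct(c("2005-05-01 00:00:00", "2005-06-01 00:00:00"), tz = "UTC")

# Shift of the returned metadata relative to the expected times, in days.
shift_days <- as.numeric(difftime(returned, expected, units = "days"))
print(shift_days)

# Sanity check: monthly-mean obs times are expected on the first day of the month.
on_month_start <- format(returned, "%d") == "01"
print(on_month_start)
```

The shift is close to four months rather than a few days, so it does not look like something a larger time_tolerance alone would absorb.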

@nperez, I am tagging you to keep you in the loop.
