Earth Sciences / startR / Merge requests / !239

Remove NA values from inner_dim_lengths

Merged: vagudets requested to merge dev-NA_inner_dim_lengths into master on Sep 06, 2024

When the 'all' selector is used for a depending file dimension whose size varies along the dimension it depends on, Start() raises an error because the inner_dim_lengths variable contains unexpected NA values.

Example for illustration:

path <- "/esarchive/exp/CMIP6/$dcpp$/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/$dcpp$/r1i1p1f2/Omon/tos/gn/v20200417/$var$_Omon_HadGEM3-GC31-MM_$dcpp$_s$sdate$-r1i1p1f2_gn_$chunk$.nc"

# Start dates matching the files listed below
sdates <- c('2018', '2019')

dat1 <- Start(dat = path,
              var = 'tos',
              chunk = 'all',
              time = indices(1:14),
              time_across = 'chunk',
              sdate = sdates,
              dcpp = list('2018' = "dcppA-hindcast", '2019' = "dcppB-forecast"),
              dcpp_depends = 'sdate',
              chunk_depends = 'sdate',
              merge_across_dims = TRUE,
              largest_dims_length = TRUE,
              i = indices(450:460),
              j = indices(685:700),
              return_vars = list(time = c('chunk', 'sdate')),
              retrieve = TRUE)

Let's suppose we only had the following files:

# sdate = 2018; 'chunk' has a total of three files
/esarchive/exp/CMIP6/dcppA-hindcast/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/dcppA-hindcast/r1i1p1f2/Omon/tos/gn/v20200417/tos_Omon_HadGEM3-GC31-MM_dcppA-hindcast_s2018-r1i1p1f2_gn_201811-201812.nc

/esarchive/exp/CMIP6/dcppA-hindcast/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/dcppA-hindcast/r1i1p1f2/Omon/tos/gn/v20200417/tos_Omon_HadGEM3-GC31-MM_dcppA-hindcast_s2018-r1i1p1f2_gn_201901-201912.nc

/esarchive/exp/CMIP6/dcppA-hindcast/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/dcppA-hindcast/r1i1p1f2/Omon/tos/gn/v20200417/tos_Omon_HadGEM3-GC31-MM_dcppA-hindcast_s2018-r1i1p1f2_gn_202001-202012.nc

# sdate = 2019; 'chunk' only has two files
/esarchive/exp/CMIP6/dcppB-forecast/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/dcppB-forecast/r1i1p1f2/Omon/tos/gn/v20200417/tos_Omon_HadGEM3-GC31-MM_dcppB-forecast_s2019-r1i1p1f2_gn_201911-201912.nc

/esarchive/exp/CMIP6/dcppB-forecast/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/dcppB-forecast/r1i1p1f2/Omon/tos/gn/v20200417/tos_Omon_HadGEM3-GC31-MM_dcppB-forecast_s2019-r1i1p1f2_gn_202001-202012.nc

With this fix, Start() can retrieve the data as long as the request only needs files that exist for every sdate. In this case, time = indices(1:14) only requires the first two files of each sdate.
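
The arithmetic behind this can be sketched in plain R. This is only an illustration, not startR's internal code: the matrix `lengths_by_chunk` and the helper `files_needed()` are hypothetical names, and the per-file lengths (2, 12 and 12 time steps) are read off the date ranges in the file names above, assuming monthly (Omon) data.

```r
# Hypothetical per-file time lengths (rows: chunk position, columns: sdate).
# Values are inferred from the YYYYMM-YYYYMM ranges in the file names,
# assuming monthly data. sdate 2019 has no third file, hence the NA.
lengths_by_chunk <- matrix(c(2, 12, 12,
                             2, 12, NA),
                           nrow = 3,
                           dimnames = list(chunk = c("1", "2", "3"),
                                           sdate = c("2018", "2019")))

# A plain sum propagates the NA to the 2019 total.
colSums(lengths_by_chunk)

# Dropping the NAs gives the time steps actually available per sdate.
available <- colSums(lengths_by_chunk, na.rm = TRUE)

# Number of leading files needed to cover indices 1:n for one sdate.
files_needed <- function(file_lengths, n) {
  file_lengths <- file_lengths[!is.na(file_lengths)]
  which(cumsum(file_lengths) >= n)[1]
}

needed <- apply(lengths_by_chunk, 2, files_needed, n = 14)
```

Under these assumptions, `available` holds the total number of time steps per sdate and `needed` the number of files each sdate must open to cover indices 1:14.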

If the required files are missing for at least one selector of the depended dimension, Start() raises an error stating the largest range of indices it can retrieve. For example, with time = indices(1:15), Start() reports that only indices 1:14 are available.
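
A minimal sketch of this kind of check, again in plain R and not taken from startR: the function `check_indices()` and the wording of its error message are hypothetical, and the available time steps (26 for sdate 2018, 14 for sdate 2019) assume monthly data in the files listed above.

```r
# Hypothetical helper: validate requested indices against what every
# sdate can provide. The error text is illustrative only.
check_indices <- function(available, requested) {
  max_common <- min(available)  # largest index present for all sdates
  if (max(requested) > max_common) {
    stop("Requested indices exceed the available ones; only indices 1:",
         max_common, " can be retrieved.")
  }
  invisible(TRUE)
}

# Time steps available per sdate in the example, assuming monthly data.
available <- c("2018" = 26, "2019" = 14)

# indices(1:14) fits within both sdates, so the check passes silently.
check_indices(available, 1:14)

# indices(1:15) exceeds what sdate 2019 offers, so an error is raised.
msg <- tryCatch(check_indices(available, 1:15),
                error = function(e) conditionMessage(e))
```

Taking the minimum over sdates reflects the idea that the request must be satisfiable for every selector of the depended dimension.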

Edited Sep 06, 2024 by vagudets
Source branch: dev-NA_inner_dim_lengths