  • #198
Closed
Issue created May 15, 2024 by vagudets (@vagudets, Maintainer)

Start(): Retrieve correct time steps when time is across file dimension and the time steps of the first files are skipped


Hi me,

Summary

When loading time steps across multiple files, Start() retrieves the wrong time steps if the first requested index lies beyond the first file and the files have different lengths along the time dimension.

For example, in the HadGEM3-GC31-MM model, the files have the following structure:

tas_Amon_HadGEM3-GC31-MM_dcppA-hindcast_s1991-r3i1p1f2_gn_199111-199112.nc
tas_Amon_HadGEM3-GC31-MM_dcppA-hindcast_s1991-r3i1p1f2_gn_199201-199212.nc
tas_Amon_HadGEM3-GC31-MM_dcppA-hindcast_s1991-r3i1p1f2_gn_199301-199312.nc
...

The model is initialized in November and the first file contains the first two time steps for November and December. The subsequent files contain all the months in each year from January to December, meaning 12 time steps per file.
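To make the expected behaviour concrete, here is a minimal base-R sketch of how a global time index should map onto a file and a position inside that file. `locate_time_index()` is a hypothetical helper written for illustration, not part of startR, and the file lengths (2, 12, 12) are assumed from the description above.

```r
# Hypothetical helper (not part of startR): map a global time index to the
# file (chunk) that holds it and to the position inside that file.
locate_time_index <- function(time_ind, file_lengths) {
  ends <- cumsum(file_lengths)        # last global index held by each file
  starts <- ends - file_lengths + 1   # first global index held by each file
  chunk <- findInterval(time_ind, starts)
  data.frame(time = time_ind,
             chunk = chunk,
             local = time_ind - starts[chunk] + 1)
}

# Assumed layout for one start date: 2 time steps, then 12 per file.
file_lengths <- c(2, 12, 12)
res <- locate_time_index(c(2, 3, 14, 15), file_lengths)
print(res)
#   time chunk local
# 1    2     1     2
# 2    3     2     1
# 3   14     2    12
# 4   15     3     1
```

Under these assumptions, global index 3 is the first time step of the second file, not the third.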

The following Start() call can be used to load the time steps in order:

library(startR)

path_list <- paste0("/esarchive/exp/CMIP6/dcppA-hindcast/HadGEM3-GC31-MM/DCPP/MOHC/HadGEM3-GC31-MM/dcppA-hindcast/",
                    "$ensemble$/Amon/$var$/gn/v20200417/",
                    "$var$_Amon_*_dcppA-hindcast_s$syear$-$ensemble$_gn_$chunk$.nc")

# Where `$chunk$` refers to each of the strings designating the time steps 
# in the file: `199111-199112`, `199201-199212`, `199301-199312`, etc.

sdates_hcst <- c("1990", "1991", "1992", "1993")
time_ind <- seq(2, 24)
lats.min <- 10
lats.max <- 20
lons.min <- 0
lons.max <- 15

exp <- Start(dat = path_list,
             var = "tas",
             syear = paste0(sdates_hcst),
             chunk = 'all',
             chunk_depends = 'syear',
             time = indices(time_ind),
             time_across = 'chunk',
             merge_across_dims = TRUE,
             largest_dims_length = TRUE,
             latitude = values(list(10, 20)),
             latitude_reorder = Sort(decreasing = TRUE),
             longitude = values(list(0, 15)),
             longitude_reorder = CircularSort(0, 360),
             ensemble = c("r1i1p1f2", "r2i1p1f2", "r3i1p1f2"),
             synonims = list(longitude = c('lon', 'longitude'),
                             latitude = c('lat', 'latitude')),
             return_vars = list(latitude = NULL, longitude = NULL,
                                time = c('syear', 'chunk')),
             retrieve = TRUE)

# The first time step is forecast time 2, December 1990, as expected.
attr(exp, "Variables")$common$time[1]
# [1] "1990-12-16 UTC"
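
As a cross-check on the expected dates, a forecast time can be converted into a calendar month with base R. This is only a sketch: `expected_month()` is a hypothetical helper, and it assumes monthly data whose first time step is November of the start year, as described above.

```r
# Hypothetical helper: calendar month (YYYY-MM) expected for each forecast time.
# Assumption: monthly time steps, with forecast time 1 in November of `syear`.
expected_month <- function(syear, forecast_time) {
  first <- as.Date(paste0(syear, "-11-01"))
  months <- seq(first, by = "month", length.out = max(forecast_time))
  format(months[forecast_time], "%Y-%m")
}

expected <- expected_month("1990", c(2, 3))
print(expected)
# [1] "1990-12" "1991-01"
```

The month part of `attr(exp, "Variables")$common$time[1]` can then be compared against the first element returned for the requested indices.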

However, if the first requested time step (index n, where n > 2) falls outside the first file defined by $chunk$, the resulting dates are wrong, because Start() retrieves the forecast times starting from the nth index of the second file instead of from the correct position:


time_ind <- seq(3, 24)

exp <- Start(dat = path_list,
             var = "tas",
             syear = paste0(sdates_hcst),
             chunk = 'all',
             chunk_depends = 'syear',
             time = indices(time_ind),
             time_across = 'chunk',
             merge_across_dims = TRUE,
             largest_dims_length = TRUE,
             latitude = values(list(10, 20)),
             latitude_reorder = Sort(decreasing = TRUE),
             longitude = values(list(0, 15)),
             longitude_reorder = CircularSort(0, 360),
             ensemble = c("r1i1p1f2", "r2i1p1f2", "r3i1p1f2"),
             synonims = list(longitude = c('lon', 'longitude'),
                             latitude = c('lat', 'latitude')),
             return_vars = list(latitude = NULL, longitude = NULL,
                                time = c('syear', 'chunk')),
             retrieve = TRUE)

# The first time step is actually forecast time 5 (March 1991), not forecast time 3 (January 1991).
attr(exp, "Variables")$common$time[1]
# [1] "1991-03-16 UTC"

Furthermore, many of the time steps in the array are filled with NA values.
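
One way to see which time steps are entirely missing is to test each slice along the time dimension. The sketch below uses a hypothetical helper, `all_na_steps()`, and a toy array in place of the real output. It assumes the array carries named dimensions including one called "time", as the array returned by Start() is expected to.

```r
# Hypothetical helper: positions along `dim_name` where every value is NA.
all_na_steps <- function(arr, dim_name = "time") {
  margin <- which(names(dim(arr)) == dim_name)
  stopifnot(length(margin) == 1)
  which(apply(arr, margin, function(x) all(is.na(x))))
}

# Toy array standing in for `exp`; the third time step is set to NA.
toy <- array(1, dim = c(syear = 2, time = 4, latitude = 3))
toy[, 3, ] <- NA
na_steps <- all_na_steps(toy)
print(na_steps)
# [1] 3
```

Calling `all_na_steps(exp)` on the array from the second Start() call should list the affected forecast times.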

Module and Package Version

startR_2.3.1 with R/4.2.1 on Hub (pending testing with R/4.1.2)

Other Relevant Information

This bugfix is needed in order to correctly load forecast times in SUNSET for these decadal models.

Victòria

Edited May 15, 2024 by vagudets