Closed
Open
Issue created Jan 23, 2021 by Eleftheria Exarchou (@eexarchou, Developer)

load exp / obs interpolated

Hi @nperez @aho

The code below attempts to load SST data for a selected region from a seasonal prediction exp (a33d) and from hadisst, with the same structure and on the same grid. My plan was to load the exp interpolated to 360x180 (the hadisst grid) and then use that information to load hadisst. However, it does not work for me, and I am lost as to how to proceed.

The code is

library(s2dv)
library(startR)
library(ncdf4)
var   <- 'tos'     # tos/mxl_heatc 
obs   <- "hadisst_v1.1"

# ATL
lonmin <- -80
lonmax <-  50
latmin <- -60
latmax <-  50


# exp
repos_exp <- paste0('/esarchive/scratch/eexarcho/Eleftheria/TRIATLAS/Analysis_a33d_a33e_a33f/',
                    'Data/a33d/tos/',
                    'tos_Amon_EC-Earth3-CC_historical_S$sdate$_$member$_gr_$chunk$.nc')
sdates <-  paste0(c(2005:2006), '0501')

exp <- Start(dat = repos_exp,
             var = 'tos',
             member = 'all',
             sdate = sdates,
             chunk = 'all', 
          #   time = 'all',
             time = indices(1:12),  # first 12 time steps (monthly data)
             chunk_depends = 'sdate',
             time_across = 'chunk',
             merge_across_dims = TRUE,
             lat = values(list(latmin, latmax)),
             lat_reorder = Sort(decreasing = T),
             lon = values(list(lonmin, lonmax)),
             lon_reorder = CircularSort(0, 360),
             transform = CDORemapper,
             transform_extra_cells = 2,
             transform_params = list(grid = 'r360x180',
                                     method = 'conservative',
                                     crop = c(lonmin, lonmax, latmin, latmax)),
             transform_vars = c('lat', 'lon'),
             synonims = list(lat = c('lat', 'latitude'),
                             lon = c('lon', 'longitude')),
             return_vars = list(lon = 'dat',
                                lat = 'dat',
                                time = 'sdate'),
             retrieve = T)

# Retrieve attributes for the following observation.
# Because latitude order in experiment is [-90, 90] but in observation is [90, -90],
# latitude values need to be retrieved and used below.
lons <- ( (attr(exp, 'Variables')$common$tos$dim)[[1]] )$vals
lats <- ( (attr(exp, 'Variables')$common$tos$dim)[[2]] )$vals

# The 'time' attribute is dependent on 'sdate'. You can see the dimension below.
dates <- attr(exp, 'Variables')$common$time
# dim(dates)
#sdate time 
#    4    3 
# Manually retrieve the observation dates in the required format 
dates_file <- sort(unique(gsub('-', '', sapply(as.character(dates), substr, 1, 7))))

#-------------------------------------------

# obs
# 1. For lat, use experiment attribute. For lon, it is not necessary because they have
# same values.
# 2. For dimension 'date', it is a vector involving the first 3 months (ftime) of the four years (sdate).
# 3. Dimension 'time' is assigned by the matrix, so we can separate 'sdate' and 'time' 
# using 'split_multiselected_dims' later.
# 4. Because 'time' actually spans all the files, we need to specify 
# 'time_across'. Then, use 'merge_across_dims' to make dimension 'date' disappear. 
# At this moment, the dimension is 'time = 12'. 
# 5. However, we want to separate year and month (which are 'sdate' and 'ftime' in 
# experimental data). So we use 'split_multiselected_dims' to split the two dimensions 
# of dimension 'time'.

repos_obs <- '/esarchive/obs/ukmo/hadisst_v1.1/monthly_mean/$var$/$var$_$date$.nc'

obs <- Start(dat = repos_obs,
             var = 'tos',
             file_date = dates_file,
             time = values(dates),  #dim: [sdate = 4, time = 3]
             lat = values(lats),
             lon = values(lons),
             #because time is assigned by 'values', set the tolerance to avoid matching values that are too far apart
             time_var = 'time',
             time_tolerance = as.difftime(1, units = 'hours'), 
             #time values are across all the files
             time_across = 'file_date',  
             #combine time and file_date dims 
             merge_across_dims = TRUE,
             #exclude the additional NAs generated by merge_across_dims
             merge_across_dims_narm = TRUE,
             #split time dim, because it is two-dimensional
             split_multiselected_dims = TRUE,
             synonims = list(lat = c('lat', 'latitude'),
                             lon = c('lon', 'longitude')),
             return_vars = list(latitude = NULL,
                                longitude = NULL,
                                time = 'file_date'),
             retrieve = TRUE)
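
As a side illustration of step 2 in the comments above, the `dates_file` line can be tried on its own with made-up timestamps. The dates below are hypothetical stand-ins for `attr(exp, 'Variables')$common$time`, not values read from the a33d files. `format()` is shown only as an equivalent base R way to get the same YYYYMM strings.

```r
# Hypothetical timestamps standing in for the 'time' attribute of exp.
dates <- as.POSIXct(c("2005-05-16 12:00:00",
                      "2005-06-16 00:00:00",
                      "2006-05-16 12:00:00",
                      "2005-05-16 12:00:00"),
                    tz = "UTC")

# Same expression as in the issue: keep "YYYY-MM", drop the dash,
# then de-duplicate and sort.
dates_file <- sort(unique(gsub('-', '', sapply(as.character(dates), substr, 1, 7))))
print(dates_file)   # "200505" "200506" "200605"

# Equivalent formulation with format(), for comparison.
dates_file_alt <- sort(unique(format(dates, "%Y%m")))
print(all(dates_file == dates_file_alt))   # TRUE
```

Each resulting string is meant to fill the date tag in the obs file name, one file per month.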

The exp seems to load fine. The error occurs when I try to load the obs:

*   on the issued request and the performance of the file server...
Error in Start(dat = repos_obs, var = "tos", file_date = dates_file, time = values(dates),  :
  All relationships specified in '_across' parameters must be between a inner dimension and a file dimension. Found wrong specification for 'dat1', which has the following file dimensions: 'dat', 'var', and the following inner dimensions: 'time', 'file_date', 'lat', 'lon'.
In addition: Warning messages:
1: ! Warning: Parameter 'pattern_dims' not specified. Taking the first dimension,
!   'dat' as 'pattern_dims'.
2: ! Warning: Could not find any pattern dim with explicit data set descriptions (in
!   the form of list of lists). Taking the first pattern dim, 'dat', as
!   dimension with pattern specifications.
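
Reading the error message, `file_date` is listed among the inner dimensions rather than the file dimensions, and the path in `repos_obs` uses the tag `$date$`, not `$file_date$`. A possible cause, offered as an assumption rather than a confirmed diagnosis, is that `time_across = 'file_date'` needs `file_date` to appear as a `$file_date$` tag in the path. The sketch below is plain R with a hypothetical helper, `path_tags()`, which is not part of startR; it only lists the tags found in a path pattern so they can be compared with the targets of the `_across` parameters.

```r
# Hypothetical helper (not a startR function): list the $tag$ placeholders
# that appear in a path pattern.
path_tags <- function(pattern) {
  m <- regmatches(pattern, gregexpr("\\$[^$/]+\\$", pattern))[[1]]
  unique(gsub("\\$", "", m))
}

# Path exactly as written in the issue.
repos_obs <- '/esarchive/obs/ukmo/hadisst_v1.1/monthly_mean/$var$/$var$_$date$.nc'
print(path_tags(repos_obs))                     # "var" "date"
print("file_date" %in% path_tags(repos_obs))    # FALSE

# Assumed fix: rename the tag so it matches the 'file_date' argument.
repos_obs_fixed <- sub("$date$", "$file_date$", repos_obs, fixed = TRUE)
print(path_tags(repos_obs_fixed))                   # "var" "file_date"
print("file_date" %in% path_tags(repos_obs_fixed))  # TRUE
```

If that is indeed the cause, passing `repos_obs_fixed` as `dat` in the obs call would be the thing to try. This has not been run against the archive.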

Also, a bonus question: how do I mask the land? The land points in the exp are not masked, so they need to be masked BEFORE interpolation to make sure they are not considered in the interpolation. And after this first masking, how do I apply the hadisst mask (/esarchive/exp/ecearth/constant/land_sea_mask_360x180.nc)?
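
To make the masking step concrete, here is a minimal base R sketch of setting land points to NA on an array whose first two dimensions are lat and lon. Everything in it is hypothetical: the toy `sst` array, the `lsm` matrix, the convention that 1 means land, and the helper `mask_land()`. It assumes the mask is already on the same grid as the data, and it does not show how to run the masking before the transform inside `Start()`.

```r
# Toy field: 3 lat x 4 lon x 2 time steps (hypothetical values).
sst <- array(1:24, dim = c(lat = 3, lon = 4, time = 2))

# Hypothetical land-sea mask on the same grid: 1 = land, 0 = sea.
lsm <- matrix(c(0, 1, 0,
                0, 0, 0,
                1, 0, 0,
                0, 0, 1), nrow = 3)

# Set every land point to NA at all time steps.
# Assumes lat and lon are the first two dimensions of x.
mask_land <- function(x, lsm, land_value = 1) {
  stopifnot(all(dim(x)[1:2] == dim(lsm)))
  n_rest <- prod(dim(x)[-(1:2)])          # number of lat-lon slices
  x[rep(lsm == land_value, times = n_rest)] <- NA
  x
}

sst_masked <- mask_land(sst, lsm)
print(sum(is.na(sst_masked)))   # 6: three land points times two time steps
print(sst_masked[, , 1])
```

The same function could be applied a second time with the 360x180 mask after interpolation, assuming that mask uses the same land value and the same lat and lon ordering as the interpolated data.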

Thanks a lot!
