Examples · Changes

Update Examples authored Oct 04, 2019 by aho
This page collects example scripts for a variety of use cases. Beginners are strongly encouraged to read the [practical guide](https://earth.bsc.es/gitlab/es/startR/blob/master/inst/doc/practical_guide.md) carefully first; it contains the basic scripts and the configuration for different machines.
## Function working on time dimension
<details><summary>CLICK ME</summary>
<p>
<br/>
```r
# -----------------------------------------------------------------
# Function working on time dimension e.g.: Season
# ------------------------------------------------------------------
library(startR)
library(multiApply)

repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'
data <- Start(dat = repos,
              var = 'tas',
              sdate = c('20170101', '20180101'),
              ensemble = indices(1:20),
              time = 'all',
              latitude = 'all',
              longitude = indices(1:40),
              return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'),
              retrieve = FALSE)

# Spring season (moninf = 3 to monsup = 5). Season_v2() is sourced inside
# the function so that it is defined wherever the step is executed.
fun_spring <- function(x) {
  source("/esarchive/scratch/nperez/Season_v2.R")
  y <- Season_v2(x, monini = 1, moninf = 3, monsup = 5)
  return(y)
}

step1 <- Step(fun = fun_spring,
              target_dims = c('time'),
              output_dims = c('time'))
wf1 <- AddStep(data, step1)

# Locally
res1 <- Compute(wf1,
                chunks = list(ensemble = 2,
                              sdate = 2))
dim(res1$output1)
str(res1$output1)
summary(res1$output1)
# -----------------------------------------------------------------
# Output:
#> dim(res1$output1)
#     time       dat       var     sdate  ensemble  latitude longitude
#        1         1         1         2        20       640        40
#> str(res1$output1)
# num [1, 1, 1, 1:2, 1:20, 1:640, 1:40] 260 257 257 260 257 ...
#> summary(res1$output1)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#  220.5   273.3   282.7   279.9   295.6   306.8
# ----------------------------------------------------------------
# On Power9
#-----------modify according to your personal info---------
queue_host <- 'p1'  # your own host name for Power9
temp_dir <- '/gpfs/scratch/bsc32/bsc32734/startR_hpc/'
ecflow_suite_dir <- '/home/Earth/aho/startR_local/'  # your own local directory
#------------------------------------------------------------
res <- Compute(wf1,
               chunks = list(ensemble = 20,
                             sdate = 2),
               threads_load = 2,
               threads_compute = 4,
               cluster = list(queue_host = queue_host,
                              queue_type = 'slurm',
                              cores_per_job = 2,
                              temp_dir = temp_dir,
                              r_module = 'R/3.5.0-foss-2018b',
                              polling_period = 10,
                              job_wallclock = '01:00:00',
                              max_jobs = 40,
                              bidirectional = FALSE),
               ecflow_suite_dir = ecflow_suite_dir,
               wait = TRUE)
```
</p>
</details>
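`Season_v2.R` is not reproduced on this page, so the following is only a minimal base-R sketch of the kind of operation such a step performs: it receives the values along `time` and returns a shorter `time` dimension. The helper `season_mean()` and the input values are hypothetical, and the meaning given to `monini`, `moninf` and `monsup` (first month of the series, first and last month of the season) is an assumption based on the argument names.

```r
# Hypothetical helper, NOT part of startR or of Season_v2.R.
# x:      monthly values along the time dimension
# monini: calendar month of the first element of x (assumed meaning)
# moninf: first month of the season (assumed meaning)
# monsup: last month of the season (assumed meaning)
season_mean <- function(x, monini, moninf, monsup) {
  # Calendar month (1-12) of each element of x
  months <- ((monini - 1 + seq_along(x) - 1) %% 12) + 1
  # Keep the months inside the season (no wrap-around over December)
  in_season <- months >= moninf & months <= monsup
  mean(x[in_season])
}

# Made-up monthly values for January to July
x <- c(270, 272, 275, 280, 285, 290, 293)
spring <- season_mean(x, monini = 1, moninf = 3, monsup = 5)
print(spring)  # mean of the March, April and May values
```

With seven monthly values in and one seasonal value out, this matches the shape change seen in the output above, where `time` has length 1.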
## Function using attributes of the data
<details><summary>CLICK ME</summary>
<p>
<br/>
The `use_attributes` parameter is only available in startR_v0.1.3 or above.
```r
#-----------------------------------------------------------------
# Function using attributes of the data e.g.: latitudinal correction
# ------------------------------------------------------------------
library(startR)
library(multiApply)  # provides Apply(), used inside funp

repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'
data <- Start(dat = repos,
              var = 'tas',
              sdate = c('20170101', '20180101'),
              ensemble = indices(1:20),
              time = 'all',
              latitude = 'all',
              longitude = indices(1:40),
              return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'),
              retrieve = FALSE)

# Weight each value by sqrt(cos(latitude)). The latitudes are read from the
# 'Variables' attribute, which is made available through use_attributes below.
funp <- function(x) {
  lat <- attributes(x)$Variables$dat1$latitude
  weight <- sqrt(cos(lat * pi / 180))
  corrected <- Apply(list(x), target_dims = "latitude",
                     fun = function(x) {x * weight})
}

step2 <- Step(fun = funp,
              target_dims = 'latitude',
              output_dims = 'latitude',
              use_attributes = list(data = "Variables"))
wf2 <- AddStep(data, step2)

# Locally
res2 <- Compute(workflow = wf2,
                chunks = list(ensemble = 2,
                              sdate = 2))
dim(res2$output1)
head(res2$output1)
summary(res2$output1)
# ------------------------------------------------------------------
# Output:
#> dim(res2$output1)
#  latitude       dat       var     sdate  ensemble      time longitude
#       640         1         1         2        20         7        40
#> head(res2$output1)
#[1] 15.22683 23.09543 28.94978 33.82982 38.11695 41.99680
#> summary(res2$output1)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#  13.19  169.60  237.60  217.90  284.10  305.10
# -----------------------------------------------------------------
# On Power9
#-----------modify according to your personal info---------
queue_host <- 'p1'  # your own host name for Power9
temp_dir <- '/gpfs/scratch/bsc32/bsc32734/startR_hpc/'
ecflow_suite_dir <- '/home/Earth/aho/startR_local/'  # your own local directory
#------------------------------------------------------------
res2_P <- Compute(wf2,
                  chunks = list(ensemble = 20,
                                sdate = 2),
                  threads_load = 2,
                  threads_compute = 4,
                  cluster = list(queue_host = queue_host,  # your own host name for Power9
                                 queue_type = 'slurm',
                                 cores_per_job = 2,
                                 temp_dir = temp_dir,
                                 r_module = 'R/3.5.0-foss-2018b',
                                 #extra_queue_params = list('#SBATCH --mem-per-cpu=3000'),
                                 polling_period = 10,
                                 job_wallclock = '01:00:00',
                                 max_jobs = 40,
                                 bidirectional = FALSE),
                  ecflow_suite_dir = ecflow_suite_dir,  # your own local directory
                  wait = TRUE)
```
</p>
</details>
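The access pattern used in `funp` can be tried without any data files. The sketch below builds a small mock array by hand and attaches a `Variables` attribute with the same nesting that `funp` reads (`$Variables$dat1$latitude`). The mock values are made up, and the attribute here contains only what the example needs; it is not a full reproduction of what `Start()` attaches.

```r
# Three made-up latitudes (degrees) and temperatures (K)
lat <- c(-60, 0, 60)
x <- array(c(250, 300, 250), dim = c(latitude = 3))

# Mock attribute with the nesting that funp expects (assumption: only the
# latitude entry is filled in here)
attr(x, 'Variables') <- list(dat1 = list(latitude = lat))

# Same steps as inside funp, applied to the mock array
lat_in_fun <- attributes(x)$Variables$dat1$latitude
weight <- sqrt(cos(lat_in_fun * pi / 180))
corrected <- as.vector(x) * weight
print(round(corrected, 2))
```

The weight is 1 at the equator and decreases towards the poles, which is why the first values in the output above (the latitudes nearest a pole) are much smaller than typical temperatures in kelvin.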
## Function doing regridding CDO
<details><summary>CLICK ME</summary>
<p>
<br/>
The 'CDO_module' parameter is only available in startR_v0.1.3 or above.
```r
# --------------------------------------------------------------
# Function doing regridding CDO (e.g.: s2dverification::CDORemap)
#---------------------------------------------------------------
library(startR)

repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'
data <- Start(dat = repos,
              var = 'tas',
              sdate = c('20170101', '20180101'),
              ensemble = indices(1:2),
              time = 'all',
              latitude = 'all',
              longitude = 'all',
              return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'),
              retrieve = FALSE)

# Bilinear remapping to the 'r360x180' grid. The source coordinates are
# read from the 'Variables' attribute (see use_attributes below).
fun_deb2 <- function(x) {
  lons_data <- as.vector(attr(x, 'Variables')$dat1$longitude)
  lats_data <- as.vector(attr(x, 'Variables')$dat1$latitude)
  resgrid <- "r360x180" # prlr
  r <- s2dverification::CDORemap(x, lons_data, lats_data, resgrid,
                                 'bil', crop = FALSE, force_remap = TRUE)[[1]]
  return(r)
}

step3 <- Step(fun = fun_deb2,
              target_dims = c('latitude', 'longitude'),
              output_dims = c('latitude', 'longitude'),
              use_attributes = list(data = "Variables"))
wf3 <- AddStep(data, step3)

# Locally
res3 <- Compute(workflow = wf3,
                chunks = list(ensemble = 2,
                              sdate = 2))
dim(res3$output1)
head(res3$output1)
summary(res3$output1)
# --------------------------------------------------------------------
# Output:
#> dim(res3$output1)
#  latitude longitude       dat       var     sdate  ensemble      time
#       180       360         1         1         2         2         7
#> head(res3$output1)
#[1] 245.2156 244.6700 245.1513 244.7413 244.4198 245.0041
#> summary(res3$output1)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#  206.3   269.2   283.3   278.8   296.4   314.2
# --------------------------------------------------------------------
# On Power9
#-----------modify according to your personal info---------
queue_host <- 'p1'  # your own host name for Power9
temp_dir <- '/gpfs/scratch/bsc32/bsc32734/startR_hpc/'
ecflow_suite_dir <- '/home/Earth/aho/startR_local/'  # your own local directory
#------------------------------------------------------------
res3_P <- Compute(wf3,
                  chunks = list(ensemble = 2,
                                sdate = 2),
                  threads_load = 2,
                  threads_compute = 4,
                  cluster = list(queue_host = queue_host,  # your own host name for Power9
                                 queue_type = 'slurm',
                                 cores_per_job = 1,
                                 temp_dir = temp_dir,
                                 r_module = 'R/3.5.0-foss-2018b',
                                 CDO_module = 'CDO/1.9.5-foss-2018b',
                                 extra_queue_params = list('#SBATCH --mem-per-cpu=3000'),
                                 polling_period = 100,
                                 job_wallclock = '01:00:00',
                                 max_jobs = 4,
                                 bidirectional = FALSE),
                  ecflow_suite_dir = ecflow_suite_dir,  # your own local directory
                  wait = TRUE)
```
</p>
</details>
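The target grid name `"r360x180"` explains the output dimensions above (180 latitudes, 360 longitudes). As a rough illustration, the coordinates of such a grid can be written out in base R. This sketch assumes CDO's convention for predefined `r<NX>x<NY>` grids (a global regular grid whose longitudes start at 0 and whose latitudes run from south to north in equal steps); it does not call CDO, and the variable names are illustrative only.

```r
# Assumed meaning of "r360x180": NX = 360 longitudes, NY = 180 latitudes
nx <- 360
ny <- 180

# Longitudes start at 0 degrees and advance in steps of 360 / NX
lon <- seq(0, by = 360 / nx, length.out = nx)

# Latitudes are cell centres from south to north in steps of 180 / NY
lat <- seq(-90 + 90 / ny, by = 180 / ny, length.out = ny)

print(range(lon))
print(range(lat))
print(c(latitude = length(lat), longitude = length(lon)))
```

Under that assumption the target grid has a 1 degree spacing in both directions, whatever the resolution of the input files.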
## Two functions
<details><summary>CLICK ME</summary>
<p>
<br/>
```r
# --------------------------------------------------------------
# Two functions (e.g.: s2dverification::CDORemap and Season)
#---------------------------------------------------------------
library(startR)

repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'
data <- Start(dat = repos,
              var = 'tas',
              sdate = c('20170101', '20180101'),
              ensemble = indices(1:2),
              time = 'all',
              latitude = 'all',
              longitude = 'all',
              return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'),
              retrieve = FALSE)

# First compute the seasonal mean along 'time', then remap the result
# bilinearly to the 'r360x180' grid.
fun_deb3 <- function(x) {
  source("/esarchive/scratch/nperez/Season_v2.R")
  lons_data <- as.vector(attr(x, 'Variables')$dat1$longitude)
  lats_data <- as.vector(attr(x, 'Variables')$dat1$latitude)
  resgrid <- "r360x180" # prlr
  y <- Season_v2(x, posdim = 'time', monini = 1, moninf = 1, monsup = 3)
  r <- s2dverification::CDORemap(y, lons_data, lats_data, resgrid,
                                 'bil', crop = FALSE, force_remap = TRUE)[[1]]
  return(r)
}

step4 <- Step(fun = fun_deb3,
              target_dims = c('latitude', 'longitude', 'time'),
              output_dims = c('latitude', 'longitude', 'time'),
              use_attributes = list(data = "Variables"))
wf4 <- AddStep(data, step4)

# Locally
res4 <- Compute(workflow = wf4,
                chunks = list(ensemble = 2,
                              sdate = 2))
dim(res4$output1)
head(res4$output1)
summary(res4$output1)
# ------------------------------------------------------------------
# Output:
#> dim(res4$output1)
#  latitude longitude      time       dat       var     sdate  ensemble
#       180       360         1         1         1         2         2
#> head(res4$output1)
#[1] 237.1389 237.2601 238.0882 238.0312 237.7883 238.4835
#> summary(res4$output1)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#  227.3   259.6   280.8   277.1   296.2   306.7
# ------------------------------------------------------------------
# On Power9
#-----------modify according to your personal info-----------
queue_host <- 'p1'  # your own host name for Power9
temp_dir <- '/gpfs/scratch/bsc32/bsc32734/startR_hpc/'
ecflow_suite_dir <- '/home/Earth/aho/startR_local/'  # your own local directory
#------------------------------------------------------------
res4_P <- Compute(wf4,
                  chunks = list(ensemble = 2,
                                sdate = 2),
                  threads_load = 2,
                  threads_compute = 4,
                  cluster = list(queue_host = queue_host,  # your own host name for Power9
                                 queue_type = 'slurm',
                                 cores_per_job = 1,
                                 temp_dir = temp_dir,
                                 r_module = 'R/3.5.0-foss-2018b',
                                 CDO_module = 'CDO/1.9.5-foss-2018b',
                                 extra_queue_params = list('#SBATCH --mem-per-cpu=3000'),
                                 polling_period = 10,
                                 job_wallclock = '01:00:00',
                                 max_jobs = 6,
                                 bidirectional = FALSE),
                  ecflow_suite_dir = ecflow_suite_dir,  # your own local directory
                  wait = TRUE)
```
</p>
</details>
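Every `Compute()` call on this page takes a `chunks` list giving the number of pieces each dimension is split into. The base-R arithmetic below is a rough sketch of what a given setting implies: how many chunks result and roughly how much data one chunk holds. The dimension sizes are taken from the outputs of the first two examples (20 members, 2 start dates, 7 time steps, 640 latitudes, 40 longitudes); the 8 bytes per value assume double precision, and even splitting is assumed.

```r
# Dimension sizes as shown in the outputs of the first two examples
dims <- c(dat = 1, var = 1, sdate = 2, ensemble = 20,
          time = 7, latitude = 640, longitude = 40)

# Same setting as in the local Compute() calls
chunks <- list(ensemble = 2, sdate = 2)

# Total number of chunks: product of the pieces per dimension
n_chunks <- prod(unlist(chunks))

# Size of each dimension inside one chunk
chunk_dims <- dims
chunk_dims[names(chunks)] <- ceiling(dims[names(chunks)] / unlist(chunks))

# Approximate size of one chunk in MB (assumption: 8 bytes per value)
chunk_mb <- prod(chunk_dims) * 8 / 1e6

print(n_chunks)
print(chunk_dims)
print(chunk_mb)
```

A quick estimate like this can help when choosing `chunks` and `max_jobs` so that one chunk fits comfortably in the memory available to a job.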
## Using experimental and (date-corresponding) observational data
<details><summary>CLICK ME</summary>
<p>
<br/>
```r
library(startR)

# Experimental data (paths point to ECMWF System4, 6-hourly)
repos <- paste0('/esnas/exp/ecmwf/system4_m1/6hourly/',
                '$var$/$var$_$sdate$.nc')
system4 <- Start(dat = repos,
                 var = 'sfcWind',
                 #sdate = paste0(1981:2015, '1101'),
                 sdate = paste0(1981:1984, '1101'),
                 #time = indices((30*4+1):(120*4)),
                 time = indices((30*4+1):(30*4+4)),
                 ensemble = 'all',
                 #ensemble = indices(1:6),
                 #latitude = 'all',
                 latitude = indices(1:10),
                 #longitude = 'all',
                 longitude = indices(1:10),
                 return_vars = list(latitude = NULL,
                                    longitude = NULL,
                                    time = c('sdate')))

# Observational data (paths point to ERA-Interim, 6-hourly), selected at
# the dates of the experimental data
repos <- paste0('/esnas/recon/ecmwf/erainterim/6hourly/',
                '$var$/$var$_$file_date$.nc')
dates <- attr(system4, 'Variables')$common$time
# One 'YYYYMM' string per month covered by 'dates'
dates_file <- sort(unique(gsub('-', '', sapply(as.character(dates),
                                               substr, 1, 7))))
erai <- Start(dat = repos,
              var = 'sfcWind',
              file_date = dates_file,
              time = values(dates),
              #latitude = 'all',
              latitude = indices(1:10),
              #longitude = 'all',
              longitude = indices(1:10),
              time_var = 'time',
              time_tolerance = as.difftime(1, units = 'hours'),
              time_across = 'file_date',
              return_vars = list(latitude = NULL,
                                 longitude = NULL,
                                 time = 'file_date'),
              merge_across_dims = TRUE,
              split_multiselected_dims = TRUE)

# NOTE: eqmcv_atomic is not defined on this page; define or load it
# before running this step.
step <- Step(eqmcv_atomic,
             list(a = c('ensemble', 'sdate'),
                  b = c('sdate')),
             list(c = c('ensemble', 'sdate')))

# NOTE: unlike the examples above, the step and its inputs are passed
# directly to Compute() here rather than through AddStep(); this call
# may need adapting to the startR version in use.
res <- Compute(step, list(system4, erai),
               chunks = list(latitude = 5,
                             longitude = 5,
                             time = 2),
               cluster = list(queue_host = 'bsceslogin01.bsc.es',
                              max_jobs = 4,
                              cores_per_job = 2),
               shared_dir = '/esnas/scratch/nmanuben/test_bychunk',
               wait = FALSE)
```
</p>
</details>
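The `dates_file` expression in the example above turns the forecast dates into the `YYYYMM` strings used for `$file_date$`. The standalone sketch below applies the same expression to three made-up dates so the intermediate result can be inspected without loading any data; the `format()` variant is an equivalent alternative shown for comparison, not something taken from the original script.

```r
# Made-up 6-hourly dates standing in for attr(system4, 'Variables')$common$time
dates <- as.POSIXct(c('1981-12-01 00:00:00', '1981-12-01 06:00:00',
                      '1982-01-01 00:00:00'), tz = 'UTC')

# Same expression as in the example: keep 'YYYY-MM', drop the dash,
# then remove duplicates and sort
dates_file <- sort(unique(gsub('-', '', sapply(as.character(dates),
                                               substr, 1, 7))))
print(dates_file)

# Alternative giving the same strings for these dates
dates_file_alt <- sort(unique(format(dates, '%Y%m')))
print(dates_file_alt)
```

Two of the three dates fall in the same month, so only two file dates remain, one per monthly file to open.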
## Computation of weekly means
<details><summary>CLICK ME</summary>
<p>
<br/>
</p>
</details>
## Data on an irregular grid with selection of a region
<details><summary>CLICK ME</summary>
<p>
<br/>
</p>
</details>
## CTE-Power using GPUs
<details><summary>CLICK ME</summary>
<p>
<br/>
</p>
</details>
## Seasonal forecast verification example on cca
<details><summary>CLICK ME</summary>
<p>
<br/>
```r
library(startR)

# Mean CRPS across dates; x holds the ensemble forecasts ('date' x 'number')
# and y the corresponding observations ('date')
crps <- function(x, y) {
  mean(SpecsVerification::EnsCrps(x, y, R.new = Inf))
}

# Seasonal forecasts
repos <- '/perm/ms/spesiccf/c3ah/qa4seas/data/seasonal/g1x1/ecmf-system4/msmm/atmos/seas/tprate/12/ecmf-system4_msmm_atmos_seas_sfc_$date$_tprate_g1x1_init12.nc'
data <- Start(dat = repos,
              var = 'tprate',
              date = 'all',
              time = 'all',
              number = 'all',
              latitude = 'all',
              longitude = 'all',
              return_vars = list(time = 'date'))
dates <- attr(data, 'Variables')$common$time

# Reference data, selected at the forecast dates
repos <- '/perm/ms/spesiccf/c3ah/qa4seas/data/ecmf-ei_msmm_atmos_seas_sfc_19910101-20161201_t2m_g1x1_init02.nc'
obs <- Start(dat = repos,
             var = 't2m',
             time = values(dates),
             latitude = 'all',
             longitude = 'all',
             split_multiselected_dims = TRUE)

s <- Step(crps, target_dims = list(c('date', 'number'), c('date')),
          output_dims = NULL)
wf <- AddStep(list(data, obs), s)

r <- Compute(wf,
             chunks = list(latitude = 10,
                           longitude = 3),
             cluster = list(queue_host = 'cca',
                            queue_type = 'pbs',
                            max_jobs = 10,
                            init_commands = list('module load ecflow'),
                            r_module = 'R/3.3.1',
                            extra_queue_params = list('#PBS -l EC_billing_account=spesiccf')),
             ecflow_output_dir = '/perm/ms/spesiccf/c3ah/startR_test/',
             is_ecflow_output_dir_shared = FALSE)
```
</p>
</details>
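To see what the `crps` step computes for one grid point, the score can be written out in base R. The sketch below implements the fair (unbiased) ensemble CRPS, which is what `R.new = Inf` is understood to select in `SpecsVerification::EnsCrps`; that correspondence is an assumption and should be checked against the package documentation before relying on it. The helper `fair_crps()` and the input values are hypothetical.

```r
# Hypothetical helper, NOT part of SpecsVerification or startR.
# ens: matrix with one row per date and one column per ensemble member
# obs: vector with one observation per date
fair_crps <- function(ens, obs) {
  stopifnot(is.matrix(ens), nrow(ens) == length(obs), ncol(ens) >= 2)
  m <- ncol(ens)
  vapply(seq_len(nrow(ens)), function(i) {
    x <- ens[i, ]
    # Mean absolute error of the members against the observation
    term_obs <- mean(abs(x - obs[i]))
    # Mean absolute difference between distinct member pairs, halved
    term_spread <- sum(abs(outer(x, x, "-"))) / (2 * m * (m - 1))
    term_obs - term_spread
  }, numeric(1))
}

# Made-up data: 2 dates, 3 members
ens <- rbind(c(1, 2, 3),
             c(0, 1, 5))
obs <- c(2, 3)

scores <- fair_crps(ens, obs)
print(scores)
print(mean(scores))  # the quantity returned by a crps()-like step
```

The step in the example receives the `date` and `number` dimensions together for exactly this reason: the score needs all members for each date, and the final mean collapses the dates into a single value per grid point.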