# s2dverification

s2dverification (seasonal to decadal verification) is an R framework that supports the analysis of forecasts, from data retrieval, through computation of statistics and skill scores, to visualisation of data and results. While some of its components are targeted only at the verification of seasonal to decadal climate forecasts, it also provides tools that can be useful for verifying forecasts in any field.

Find out more in the overview below, on the wiki page at <https://earth.bsc.es/gitlab/es/s2dverification/wikis/home> or on the CRAN website at <https://cran.r-project.org/web/packages/s2dverification/index.html>. To keep abreast of internal discussions and the latest development releases, you can also sign up to the s2dverification mailing list by sending a message with the subject 'subscribe' to s2dverification-request@bsc.es.

## Installation

s2dverification has a system dependency, the CDO libraries, for interpolation of gridded data. Make sure you have these libraries installed, or install them from <https://code.zmaw.de/projects/cdo>.
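
As a quick sanity check before installing the package, you can ask R whether CDO is reachable. This is only a sketch: it assumes your CDO installation provides an executable named `cdo` on the `PATH`, which may not hold on every system.

```R
# Assumption: CDO is exposed as an executable called 'cdo' on the PATH.
# Sys.which() returns an empty string when the executable is not found.
cdo_path <- unname(Sys.which("cdo"))

if (nzchar(cdo_path)) {
  message("CDO found at: ", cdo_path)
} else {
  message("CDO not found on the PATH; install it before interpolating data.")
}
```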

You can then install the released version from CRAN:
```R
install.packages("s2dverification")
```

Or the development version from the GitLab repository:
```R
# install.packages("devtools")
devtools::install_git("https://earth.bsc.es/gitlab/es/s2dverification.git")
```

## Overview

The following diagram depicts the modules of s2dverification and how they interact:

![s2dverification module diagram](vignettes/s2dv_modules.png)

First, s2dverification allows you to gather and homogenize NetCDF data files stored in a local or remote file system. The first step, however, is to set up some configuration parameters so that the *Data retrieval* functions can locate the source files and recognize their format. In [Data retrieval](https://earth.bsc.es/gitlab/es/s2dverification/wikis/data_retrieval.md) you can find a full explanation of the configuration stage and the capabilities and constraints of the retrieval module.

Once the data has been loaded into an R object, basic statistics can be computed, such as climatologies, trends, bias correction and smoothing, among others. See [Basic statistics](https://earth.bsc.es/gitlab/es/s2dverification/wikis/basic_statistics.md) for a more detailed explanation.
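
To make the idea of a climatology and its anomalies concrete, here is a minimal base R sketch on synthetic data. It does not use the s2dverification functions; the array `fc`, its dimension order (member, start date, lead time) and its values are all hypothetical.

```R
set.seed(1)

# Hypothetical toy forecast: 3 members x 4 start dates x 5 lead times.
fc <- array(rnorm(3 * 4 * 5, mean = 280, sd = 2), dim = c(3, 4, 5))

# Climatology per lead time: average over members and start dates.
clim <- apply(fc, 3, mean, na.rm = TRUE)

# Anomalies: subtract the lead-time climatology from every value.
ano <- sweep(fc, 3, clim, FUN = "-")

# Ensemble-mean anomalies: average over members, keeping sdate x lead time.
ens_mean_ano <- apply(ano, c(2, 3), mean, na.rm = TRUE)
```

By construction, the anomalies average to zero at each lead time, which is a convenient check when adapting the sketch to real data.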

Either after computing basic statistics, or directly from the original data, the functions in the verification module allow you to compute deterministic and some probabilistic scores and skill scores, such as root mean square error, time or spatial correlation, or the Brier score. Find a full review in [Verification](https://earth.bsc.es/gitlab/es/s2dverification/wikis/verification.md).
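
The scores named above can be illustrated with a few lines of base R. This is a sketch on synthetic data, not the package's own implementation; the series `obs`, the forecast `fc`, the 10-member ensemble `ens` and the event threshold of 0 are all assumptions made for the example.

```R
set.seed(42)
n <- 20

# Hypothetical observations and a deterministic forecast with some error.
obs <- rnorm(n)
fc  <- obs + rnorm(n, sd = 0.5)

# Deterministic scores: root mean square error and Pearson correlation.
rmse <- sqrt(mean((fc - obs)^2))
r    <- cor(fc, obs)

# Hypothetical 10-member ensemble; column j holds the members for time j.
ens <- matrix(rep(obs, each = 10) + rnorm(10 * n, sd = 0.5), nrow = 10)

# Brier score for the event "value above 0": the forecast probability is
# the fraction of members predicting the event.
prob  <- colMeans(ens > 0)
event <- as.numeric(obs > 0)
bs    <- mean((prob - event)^2)

# Brier skill score against a constant climatological probability.
bs_ref <- mean((mean(event) - event)^2)
bss    <- 1 - bs / bs_ref
```

A Brier score of 0 is a perfect probabilistic forecast, and a positive skill score means the ensemble beats the climatological reference.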

Visualisation functions are also provided to plot the results obtained from any of the modules above. Check [Visualisation](https://earth.bsc.es/gitlab/es/s2dverification/wikis/visualisation.md) for a complete explanation. 

Next, you can see an example of usage of s2dverification with a complete workflow spanning its four modules.

### Data retrieval

```R
library(s2dverification)
data <- Load('tas', c('ExperimentID_A', 'ExperimentID_B'), 
                    c('ObservationID_X', 'ObservationID_Y'),
                    sdates = c('19901101', '19951101', '20001101'),
                    lonmin = 100, lonmax = 250, latmin = -10, latmax = 60,
                    leadtimemin = 1, leadtimemax = 30,
                    output = 'lonlat', grid = 't106grid', 
                    method = 'distance-weighted',
                    configfile = '/example/path/to/configfile.conf')
# * The load call you issued is:
# *   Load(var = "tas", exp = c("ExperimentID_A", "ExperimentID_B"), 
# *                     obs = c("ObservationID_X", "ObservationID_Y"), 
# *                     sdates = c("19901101", "19951101", "20001101"), 
# *                     grid = "t106grid", output = "lonlat", 
# *                     storefreq = "monthly", ...)
# * See the full call in '$load_parameters' after Load() finishes.
# * Reading configuration file: /example/path/to/configfile.conf 
# * Config file read successfully.
# * All pairs (var, exp) and (var, obs) have matching entries.
# * Fetching first experimental files to work out 'var_exp' size...
# * Exploring dimensions... /path/to/experimentA/monthly_mean/tas/tas_19901101.nc
# * Success. Detected dimensions of experimental data: 2, 5, 3, 30, 63, 134
# * Fetching first observational files to work out 'var_obs' size...
# * Exploring dimensions... /path/to/observationX/monthly_mean/tas/tas_199011.nc
# * Success. Detected dimensions of observational data: 2, 1, 3, 30, 63, 134
# * Will now proceed to read and process 96 data files:
# * The list is long. You can check after Load() finishes in '$source_files'.
# * Total size of requested data:  72938880 bytes.
# *   - Experimental data:  ( 2 x 5 x 3 x 30 x 63 x 134 ) x 8 bytes = 60782400 bytes.
# *   - Observational data: ( 2 x 1 x 3 x 30 x 63 x 134 ) x 8 bytes = 12156480 bytes.
# * If size of requested data is close to or above the free shared RAM memory, R will crash.
# * Loading... This may take several minutes...
# * Progress: 0% + 10% + 70% + 10% + 10%
str(data)
# List of 11
#  $ mod            : num [1:2, 1:5, 1:3, 1:30, 1:63, 1:134] 273 273 273 273 273 ...
#  $ obs            : num [1:2, 1, 1:3, 1:30, 1:63, 1:134] 273 273 273 NA 273 ...
#  $ lat            : num [1:63(1d)] 60 58.9 57.8 56.6 55.5 ...
#  $ lon            : num [1:134(1d)] 100 101 102 104 105 ...
#  $ source_files   : chr [1:96] "/path/to/experimentA/monthly_mean/tas/tas_19901101.nc"
#                                "/path/to/experimentA/monthly_mean/tas/tas_19951101.nc"
#                                "/path/to/experimentA/monthly_mean/tas/tas_20001101.nc"
#                                ...
#  $ not_found_files: NULL
#  $ load_parameters:List of 29
#   ..$ var         : chr "tas"
#   ..$ exp         : chr "ExperimentID_A" "ExperimentID_B"
#   ..$ obs         : chr "ObservationID_X" "ObservationID_Y"
#   ..$ sdates      : chr [1:3] "19901101" "19951101" "20001101"
#   ..$ grid        : chr "t106grid"
#   ..$ output      : chr "lonlat"
#   ..$ storefreq   : chr "monthly"
#   ..$ configfile  : chr "/example/path/to/configfile.conf"
#   ..$ dimnames    : NULL
#   ..$ latmax      : num 60
#   ..$ latmin      : num -10
#   ..$ leadtimemax : num 30
#   ..$ leadtimemin : num 1
#   ..$ lonmax      : num 250
#   ..$ lonmin      : num 100
#   ..$ maskmod     :List of 15
#   .. ..$ : NULL
#   .. ..$ : NULL
#   ...
#   ..$ maskobs     :List of 15
#   .. ..$ : NULL
#   .. ..$ : NULL
#   ...
#   ..$ method      : chr "distance-weighted"
#   ..$ nleadtime   : NULL
#   ..$ nmember     : NULL
#   ..$ nmemberobs  : NULL
#   ..$ nprocs      : NULL
#   ..$ remapcells  : num 2
#   ..$ sampleperiod: num 1
#   ..$ silent      : logi FALSE
#   ..$ suffixexp   : NULL
#   ..$ suffixobs   : NULL
#   ..$ varmax      : NULL
#   ..$ varmin      : NULL
#  $ when           : POSIXct[1:1], format: "2015-11-09 14:49:11"
#  $ dimnames       : chr [1:6] "dataset" "member" "sdate" "time" ...
#  $ units          : chr "K"
#  $ var_long_name  : chr "Sea surface temperature"
```
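
The memory figures that `Load()` reports in the log above follow directly from the detected dimensions, at 8 bytes per value. The short sketch below reproduces that arithmetic in base R; the variable names are hypothetical and the dimension sizes are copied from the log.

```R
# Dimension sizes as reported in the Load() log above.
exp_dims <- c(2, 5, 3, 30, 63, 134)
obs_dims <- c(2, 1, 3, 30, 63, 134)

# The log reports 8 bytes per value.
bytes_per_value <- 8

exp_bytes   <- prod(exp_dims) * bytes_per_value  # 60782400
obs_bytes   <- prod(obs_dims) * bytes_per_value  # 12156480
total_bytes <- exp_bytes + obs_bytes             # 72938880
```

Running the same calculation for your own request before calling `Load()` gives a rough idea of whether the data will fit in the available memory.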

### Basic statistics
```R
```

### Verification
```R
```

### Visualisation
```R
```