s2dverification
===============
s2dverification (seasonal to decadal verification) is an R framework
that aids in the analysis of forecasts from the data retrieval stage,
through computation of statistics and skill scores against observations,
to visualisation of data and results. While some of its components are
only targeted to verification of seasonal to decadal climate forecasts,
it provides tools that can be useful for verification of forecasts
in any field.
Find out more in the overview below, on the wiki page at
<https://earth.bsc.es/gitlab/es/s2dverification/wikis/home> or on the
CRAN website at
<https://cran.r-project.org/web/packages/s2dverification/index.html>.
You can also sign up to the s2dverification mailing list by sending a
message with the subject 'subscribe' to <s2dverification-request@bsc.es>
if you want to keep abreast of internal discussions or of the latest
development releases.
Installation
------------
s2dverification has one system dependency, the CDO libraries, which are used
to interpolate gridded data and to retrieve metadata. Make sure these
libraries are installed on your system, or download and install them from
<https://code.zmaw.de/projects/cdo>.
You can then install the publicly released version of s2dverification from CRAN:
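
```r
install.packages("s2dverification")
```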
Or the development version from the GitLab repository:
```r
# install.packages("devtools")
devtools::install_git("https://earth.bsc.es/gitlab/es/s2dverification.git")
```
Overview
--------

The following diagram depicts the modules of s2dverification and how
they interact:
<p align="center">
<img src="vignettes/s2dv_modules.png" width="800" />
The [**Data
retrieval**](https://earth.bsc.es/gitlab/es/s2dverification/wikis/data_retrieval.md)
module allows you to gather and homogenize NetCDF data files stored in a
local or remote file system. A few simple configuration steps are required
first, however, so that the module can locate the source files and recognize
the variables of interest.
Once the data has been loaded into an R object, [**Basic
statistics**](https://earth.bsc.es/gitlab/es/s2dverification/wikis/basic_statistics.md)
can be computed, such as climatologies, trends, bias correction and
smoothing, among others.
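
For instance, a typical first step is to compute climatologies and turn the
raw fields into anomalies. The following is a minimal sketch, not taken from
the package documentation, assuming `data` is an object returned by `Load()`
as in the example further below; check `?Clim` and `?Ano` for the exact
arguments:

```r
# Climatologies of the experimental and observational data
clim <- Clim(data$mod, data$obs)
# Anomalies of each dataset with respect to its climatology
ano_exp <- Ano(data$mod, clim$clim_exp)
ano_obs <- Ano(data$obs, clim$clim_obs)
```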
Either after computing basic statistics or directly from the original
data, the functions in the
[**Verification**](https://earth.bsc.es/gitlab/es/s2dverification/wikis/verification.md)
module allow you to compute deterministic and probabilistic scores and
skill scores, such as the root mean square error, time or spatial
correlation, or the Brier score.
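
As an illustration, the anomalies from the previous sketch could be verified
with `Corr()` after collapsing the member dimension with `Mean1Dim()`. This
is a hedged sketch rather than the package's reference example; see `?Corr`
for the exact arguments and defaults:

```r
# Ensemble-mean anomaly correlation of each experiment against the
# observations (dimension 2 is assumed to be the member dimension)
corr <- Corr(Mean1Dim(ano_exp, 2), Mean1Dim(ano_obs, 2))
```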
[**Visualisation**](https://earth.bsc.es/gitlab/es/s2dverification/wikis/visualisation.md)
functions are also provided to plot the results obtained from any of the
modules above.
If it's your first time using s2dverification you can review the
[**Tutorials**](https://earth.bsc.es/gitlab/es/s2dverification/wikis/tutorials.md)
section or check the example below. You will find more detailed examples in
the wiki pages of each module linked above.
You can also check the examples of usage of each function in its in-code
documentation after attaching the package. The functions available in the
package are listed below:

```r
library(s2dverification)
ls('package:s2dverification')
## [1] "ACC" "Alpha"
## [3] "Ano" "Ano_CrossValid"
## [5] "Clim" "ColorBar"
## [7] "ConfigAddEntry" "ConfigApplyMatchingEntries"
## [9] "ConfigEditDefinition" "ConfigEditEntry"
## [11] "ConfigFileCreate" "ConfigFileOpen"
## [13] "ConfigFileSave" "ConfigRemoveDefinition"
## [15] "ConfigRemoveEntry" "ConfigShowDefinitions"
## [17] "ConfigShowSimilarEntries" "ConfigShowTable"
## [19] "Consist_Trend" "Corr"
## [21] "CRPS" "Enlarge"
## [23] "Eno" "EnoNew"
## [25] "Filter" "FitAcfCoef"
## [27] "FitAutocor" "GenSeries"
## [29] "Histo2Hindcast" "IniListDims"
## [31] "InsertDim" "LeapYear"
## [33] "Load" "Mean1Dim"
## [35] "MeanListDim" "Plot2VarsVsLTime"
## [37] "PlotACC" "PlotAno"
## [39] "PlotClim" "PlotEquiMap"
## [41] "PlotSection" "PlotStereoMap"
## [43] "PlotVsLTime" "ProbBins"
## [45] "RatioRMS" "RatioSDRMS"
## [47] "Regression" "RMS"
## [49] "RMSSS" "sampleDepthData"
## [51] "sampleMap" "sampleTimeSeries"
## [53] "Season" "SelIndices"
## [55] "Smoothing" "Spectrum"
## [57] "Spread" "Trend"
```
Next you can see an example of usage of s2dverification spanning its
four modules.

First, the package is attached and a list is built for each dataset to load,
with information on its name and the path to its files:

```r
library(s2dverification)

expA <- list(name = 'experimentA',
             path = file.path('/path/to/experiments/$EXP_NAME$/monthly_mean',
                              '$VAR_NAME$/$VAR_NAME$_$START_DATE$.nc'))
expB <- list(name = 'experimentB',
             path = file.path('/path/to/experiments/$EXP_NAME$/monthly_mean',
                              '$VAR_NAME$/$VAR_NAME$_$START_DATE$.nc'))
obsX <- list(name = 'observationX',
             path = file.path('/path/to/observations/$OBS_NAME$/monthly_mean',
                              '$VAR_NAME$/$VAR_NAME$_$YEAR$$MONTH$.nc'))
```

Finally, the data is loaded with `Load()`, passing the previously built lists
to specify the desired datasets, together with other parameters that select
the region of the Earth's surface, the start dates and the lead times to load
data from. In this example a 2-dimensional output is requested: all the
loaded data will be remapped onto the specified common grid via the CDO
libraries.

```r
data <- Load('tas', list(expA, expB), list(obsX),
             sdates = c('19851101', '19911101', '19971101'),
             lonmin = 100, lonmax = 250, latmin = -10, latmax = 60,
             leadtimemin = 2, leadtimemax = 7,
             output = 'lonlat', grid = 't106grid',
             method = 'distance-weighted')
## * The load call you issued is:
## * Load(var = "tas", exp = list(structure(list(name = "experimentA", path
## * =
## * "/path/to/experiments/$EXP_NAME$/monthly_mean/$VAR_NAME$/$VAR_NAME$_$START_DATE$.nc"),
## * .Names = c("name", "path")), structure(list(name =
## * "experimentB", path =
## * "/path/to/experiments/$EXP_NAME$/monthly_mean/$VAR_NAME$/$VAR_NAME$_$START_DATE$.nc"),
## * .Names = c("name", "path"))), obs =
## * list(structure(list(name = "observationX", path =
## * "/path/to/observations/$OBS_NAME$/monthly_mean/$VAR_NAME$/$VAR_NAME$_$YEAR$$MONTH$.nc"),
## * .Names = c("name", "path"))), sdates = c("19851101",
## * "19911101", "19971101"), grid = "t106grid", output =
## * "lonlat", storefreq = "monthly", ...)
## * See the full call in '$load_parameters' after Load() finishes.
## * Fetching first experimental files to work out 'var_exp' size...
## * Exploring dimensions... /path/to/experiments/experimentA/monthly_mean/tas/tas_19851101.nc
## * Success. Detected dimensions of experimental data: 2, 11, 3, 6, 63, 134
## * Fetching first observational files to work out 'var_obs' size...
## * Exploring dimensions... /path/to/observations/observationX/monthly_mean/tas/tas_198512.nc
## * Success. Detected dimensions of observational data: 1, 1, 3, 6, 63, 134
## * Will now proceed to read and process 24 data files:
## * /path/to/experiments/experimentA/monthly_mean/tas/tas_19851101.nc
## * /path/to/experiments/experimentA/monthly_mean/tas/tas_19911101.nc
## * /path/to/experiments/experimentA/monthly_mean/tas/tas_19971101.nc
## * /path/to/experiments/experimentB/monthly_mean/tas/tas_19851101.nc
## * /path/to/experiments/experimentB/monthly_mean/tas/tas_19911101.nc
## * /path/to/experiments/experimentB/monthly_mean/tas/tas_19971101.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_198512.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_198601.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_198602.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_198603.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_198604.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_198605.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199112.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199201.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199202.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199203.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199204.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199205.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199712.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199801.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199802.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199803.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199804.nc
## * /path/to/observations/observationX/monthly_mean/tas/tas_199805.nc
## * Total size of requested data: 27959904 bytes.
## * - Experimental data: ( 2 x 11 x 3 x 6 x 63 x 134 ) x 8 bytes = 26744256 bytes.
## * - Observational data: ( 1 x 1 x 3 x 6 x 63 x 134 ) x 8 bytes = 1215648 bytes.
## * If size of requested data is close to or above the free shared RAM memory, R will crash.
## * Loading... This may take several minutes...
```

The output consists of two arrays of data with labelled dimensions (one for
the experimental datasets and one for the observational datasets), a list of
the loaded files, a list of the files that could not be found and a call
stamp that allows the load call to be reproduced exactly, among other
components.
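
A quick way to check what has been returned is to inspect the dimensions and
a few components of the output object. This is only a sketch; the component
names follow the description above, and `str(data)` shows the full structure:

```r
# Dimensions of the loaded arrays: c(dataset, member, sdate, ltime, lat, lon)
dim(data$mod)
dim(data$obs)
# Files that could not be found and the stamp to reproduce this load call
data$not_found_files
data$load_parameters
```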
See [**Data
retrieval**](https://earth.bsc.es/gitlab/es/s2dverification/wikis/data_retrieval.md)
for a full explanation of the capabilities and outputs of `Load()`.
First, a slice of the raw data of each dataset is visualised on an
equidistant projection with `PlotEquiMap()` to check that the loading process
has worked as expected:
```r
PlotEquiMap(data$mod[1, 1, 1, 1, , ], data$lon, data$lat)
PlotEquiMap(data$mod[2, 1, 1, 1, , ], data$lon, data$lat)
PlotEquiMap(data$obs[1, 1, 1, 1, , ], data$lon, data$lat)
```
<p align="center">
<img src="vignettes/equi_map_raw_all.png" width="800" />
</p>