This is the GitLab repository for SUNSET (SUbseasoNal to decadal climate forecast post-processing and asSEssmenT suite).

The datasets, forecast horizon, time period, skill metrics to compute and other parameters are specified by the user in a configuration file called a "recipe".

After defining the recipe, the user can create a script using the functions available in the tool, which apply the information in the recipe to perform the desired computations. You can find an example script [in the SUNSET code Snippets](https://earth.bsc.es/gitlab/es/sunset/-/snippets/111).

- Modules currently available: Loading, Units, Calibration, Downscaling, Anomalies, Skill, Indices, Visualization, Scorecards
- Modules in development: Indicators, Statistics
- Future modules: Aggregation

```yaml
Analysis:
  # (...)
  Regrid:
    type: to_system # Interpolate to: 'to_system', 'to_reference', 'none',
                    # or a CDO-accepted grid. (Mandatory, str)
  Workflow:
    # This is the section of the recipe where the parameters for each module are specified
    Calibration:
      method: mse_min # Calibration method. (Mandatory, str)
      save: 'all' # Options: 'all', 'none', 'exp_only', 'fcst_only' (Mandatory, str)
    # (...)
    Probabilities:
      percentiles: [[1/3, 2/3], [1/10, 9/10], [1/4, 2/4, 3/4]] # Thresholds
      # for quantiles and probability categories. Each set of thresholds should be
      # enclosed within brackets. (Optional)
      save: 'percentiles_only' # Options: 'all', 'none', 'bins_only', 'percentiles_only' (Mandatory, str)
    Visualization:
      plots: skill_metrics, most_likely_terciles, forecast_ensemble_mean # Types of plots to generate (Optional, str)
      multi_panel: yes # Multi-panel plot or single-panel plots. Default is 'no/false'. (Optional, bool)
      projection: 'cylindrical_equidistant' # Options: 'cylindrical_equidistant', 'robinson', 'lambert_europe'. Default is cylindrical equidistant. (Optional, str)
      mask_terciles: no # Whether to mask the non-significant points by rpss in the most likely tercile plot. yes/true, no/false or 'both'. Default is no/false. (Optional, str)
      dots_terciles: yes # Whether to dot the non-significant points by rpss in the most likely tercile plot. yes/true, no/false or 'both'. Default is no/false. (Optional, str)
    Indicators:
      index: no # This feature is not implemented yet
  ncores: 10 # Number of cores to be used in parallel computation.
             # If left empty, defaults to 1. (Optional, int)
  remove_NAs: yes # Whether to remove NAs.
```

- Monthly data: `'bias'`, `'evmos'`, `'mse_min'`, `'crps_min'`, and `'rpc-based'`.

For more details, see the [CSTools documentation](https://CRAN.R-project.org/package=CSTools) for `CST_Calibration()`.

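As a rough illustration of the simplest of these methods, `'bias'` calibration shifts the hindcast so that its climatological mean matches that of the observations. The sketch below is not SUNSET code (SUNSET relies on the R function `CST_Calibration()`); it is a minimal, hypothetical Python illustration of the idea on a one-dimensional time series:

```python
def bias_calibrate(hcst, obs):
    """Mean-bias calibration (illustrative only): shift the hindcast
    so its climatological mean matches the observed mean."""
    bias = sum(hcst) / len(hcst) - sum(obs) / len(obs)
    return [h - bias for h in hcst]

# A hindcast that runs 1 degree too warm on average: the shift removes it.
calibrated = bias_calibrate([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
```

The real methods operate member by member on multidimensional arrays and also adjust the ensemble spread; this only shows the mean adjustment.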
### Recipe template:

```yaml
Calibration:
  method: mse_min # Calibration method. (Mandatory, str)
  save: 'all' # Options: 'all', 'none', 'exp_only', 'fcst_only' (Mandatory, str)
```

## Anomalies module

The Anomalies module computes the anomalies of the data with respect to the climatological mean, with or without cross-validation, depending on what is specified in the recipe. It accepts the output of either the Loading or the Calibration module as input, and also requires the recipe. It makes use of the CSTools function `CST_Anomaly()`.

The output of the main function, `Anomalies()`, is a list of `s2dv_cube` objects containing the anomalies for the hcst, fcst and obs, as well as the original hcst and obs full fields in case they are needed for later computations.

If cross-validation is chosen, leave-one-out cross-validation will be applied. The cross-validation option is only available when the hindcast and the observations share the same grid.
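The leave-one-out idea can be sketched as follows. This is not the CSTools implementation (`CST_Anomaly()` operates on multidimensional `s2dv_cube` objects in R); it is a hypothetical one-dimensional Python illustration of how each year's climatology excludes that year:

```python
def loo_anomalies(series):
    """Leave-one-out anomalies: for each year, subtract the climatology
    computed from all the *other* years."""
    n = len(series)
    return [x - (sum(series) - x) / (n - 1) for x in series]

anoms = loo_anomalies([1.0, 2.0, 3.0])
```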

### Recipe template:

```yaml
Anomalies:
  compute: yes # Either yes/true or no/false (Mandatory, bool)
  cross_validation: no # Either yes/true or no/false (Mandatory if 'compute: yes', bool)
  save: 'fcst_only' # Options: 'all', 'none', 'exp_only', 'fcst_only' (Mandatory if 'compute: yes', str)
```

## Downscaling Module

The Downscaling module performs downscaling on the anomalies, making use of the functions in the [CSDownscale package](https://earth.bsc.es/gitlab/es/csdownscale). It accepts the output of the Anomalies module as input and also requires the recipe. The module applies the selected downscaling method to the hindcast anomalies, using the observed anomalies as the reference, and returns the downscaled data and its metadata as an `s2dv_cube` object.

The output of the main function, **Downscaling()**, is a list containing the downscaled hindcast and observations, named **hcst** and **obs**. Currently it is not possible to apply downscaling to the forecast.

**Downscaling methods currently available:**

The first step is to specify the type of downscaling, choosing from the following options: `'none'`, `'analogs'`, `'int'`, `'intbc'`, `'intlr'`, `'logreg'`. Only one downscaling type can be chosen per recipe. Detailed information about each methodology can be found in the CSDownscale documentation.

This specification is mandatory and must be defined in the recipe under **Workflow:Downscaling:type**. The downscaling method can be further specified through the corresponding `_method` options. Each type of downscaling has different methods available:

- **int_method (for int, intbc and intlr):** 'con', 'bil', 'bic', 'nn', 'con2', 'dis', 'laf'.
- **bc_method (for intbc):** 'quantile_mapping' (or 'qm'), 'bias', 'evmos', 'mse_min', 'crps_min' or 'rpc-based'.
- **lr_method (for intlr):** 'basic', 'large-scale', '4nn'.
- **log_reg_method (for logreg):** 'ens_mean', 'ens_mean_sd', 'sorted_members'.
- **nanalogs (for analogs):** Number of analogs to be searched. The default is 3.

When selecting the downscaling type `'intbc'`, both interpolation and bias correction methods should be specified; for `'intlr'`, both interpolation and linear regression methods are required; and for `'logreg'`, both interpolation and logistic regression methods should be provided. Leave-one-out cross-validation is always applied for all the methods in the module.

Another option in the recipe is **Workflow:Downscaling:target_grid**. This argument is a character string indicating the target grid (i.e., the grid to which the dataset will be downscaled). It can be the path to a netCDF file, or a grid string or grid description file accepted by CDO.

The **Workflow:Downscaling:size** argument can be used if the downscaling type is analogs and the input dataset has daily/daily_mean frequency. It indicates the window size (in days) within which the analogs will be searched (target month ± size days). To use this argument, the user must supply observed anomalies for both the preceding and succeeding months, in addition to the target month. The input data must be provided within the `smonth` dimension of the observed dataset, centered on the target month (i.e., the `smonth` dimension should be of length 3 for the observations, with the second entry corresponding to the target month). An example of this is available here: [Code Snippet: Daily Downscaling with time window](https://earth.bsc.es/gitlab/es/sunset/-/snippets/122).

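For the `'analogs'` type, the core operation is a nearest-neighbour search: the best analog is the candidate field at minimum distance from the target field. CSDownscale implements this in R with more options; the hypothetical Python sketch below only illustrates the minimum-distance selection:

```python
import math

def best_analog(target, candidates):
    """Return the index of the candidate field closest to the target
    (Euclidean distance), i.e. the best analog."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(candidates)), key=lambda i: dist(target, candidates[i]))

# Three candidate fields, flattened to vectors; the second one matches best.
idx = best_analog([1.0, 2.0], [[5.0, 5.0], [1.1, 2.1], [0.0, 0.0]])
```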
### Recipe template:

```yaml
Downscaling:
  # Assumption 1: leave-one-out cross-validation is always applied
  # Assumption 2: for analogs, we select the best analog (minimum distance)
  type: intbc # mandatory, 'none', 'int', 'intbc', 'intlr', 'analogs', 'logreg'.
  int_method: conservative # regridding method accepted by CDO.
  bc_method: bias # If type intbc. Options: 'bias', 'calibration', 'quantile_mapping', 'qm', 'evmos', 'mse_min', 'crps_min', 'rpc-based'.
  lr_method: # If type intlr. Options: 'basic', 'large_scale', '4nn'
  log_reg_method: # If type logreg. Options: 'ens_mean', 'ens_mean_sd', 'sorted_members'
  target_grid: /esarchive/recon/ecmwf/era5/monthly_mean/tas_f1h/tas_200002.nc # netCDF file or grid accepted by CDO
  nanalogs: # If type analogs. Number of analogs to be searched
  save: 'all' # Options: 'all', 'none', 'exp_only'
```

## Indices module

The Indices module aggregates the hindcast and reference data to compute climatological indices such as the North Atlantic Oscillation (NAO) or El Niño indices.

The main function, `Indices()`, returns the hcst and obs `s2dv_cube` objects for each requested index, in the form of a list of lists. The 'latitude' and 'longitude' dimensions of the original arrays are aggregated into a single 'region' dimension.

Indices currently available:

| **Niño 3.4** | Nino3.4 | -170º to -120º | -5º to 5º |
| **Niño 4** | Nino4 | 160º to -150º | -5º to 5º |

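The aggregation over the index region is, in essence, a latitude-weighted spatial average of the field inside the requested box. SUNSET performs this in R on `s2dv_cube` objects; the hypothetical Python sketch below assumes the usual cosine-of-latitude weighting:

```python
import math

def region_mean(field, lats):
    """Cosine-of-latitude weighted average of a field given as a list of
    rows (one row per latitude), e.g. for a Niño-type SST index."""
    total = 0.0
    wsum = 0.0
    for row, lat in zip(field, lats):
        w = math.cos(math.radians(lat))
        total += w * sum(row) / len(row)
        wsum += w
    return total / wsum

index_value = region_mean([[1.0, 1.0], [3.0, 3.0]], [0.0, 60.0])
```

Rows near the equator carry more weight than rows near the poles, so the value above sits below the unweighted mean of 2.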
### Recipe template:

```yaml
Indices:
  ## Indices available: NAO, Nino1+2, Nino3, Nino3.4, Nino4.
  ## Each index can only be computed if its area is within the selected region.
  # obsproj: NAO computation method (see s2dv::NAO()). Default is yes/true. (Optional, bool)
  # save: What to save. Options: 'all'/'none'. Default is 'all'. (Optional, str)
  # plot_ts: Generate time series plot? Default is yes/true. (Optional, bool)
  # plot_sp: Generate spatial pattern plot? Default is yes/true. (Optional, bool)
  # alpha: Significance threshold. Default value is 0.05. (Optional, numeric)
  NAO: {obsproj: yes, save: 'all', plot_ts: yes, plot_sp: yes}
  Nino1+2: {save: 'all', plot_ts: yes, plot_sp: yes, alpha: 0.05}
  Nino3: {save: 'all', plot_ts: yes, plot_sp: yes, alpha: 0.05}
  Nino3.4: {save: 'all', plot_ts: yes, plot_sp: yes, alpha: 0.05}
  Nino4: {save: 'all', plot_ts: yes, plot_sp: yes, alpha: 0.05}
```

## Skill module

The Skill module is the part of the workflow that computes the metrics to assess the quality of a forecast. It accepts the output of the Calibration module as input, and also requires the recipe. It consists of two main functions:

**Skill()**: Computes the verification metrics requested in `Workflow:Skill:metric`. The user can request an unlimited number of verification metrics per recipe.

The following metrics are currently available:

- `EnsCorr`: Ensemble Mean Correlation.
- `Corr`: Ensemble Correlation.
- `corr_individual_members`: Correlation for each individual member of the ensemble.
- `RPS`: Ranked Probability Score.
- `RPSS`: Ranked Probability Skill Score.
- `FRPS`: Fair Ranked Probability Score.

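As a pointer to what the probabilistic scores in this list measure, the RPS for a single forecast accumulates squared differences between the cumulative forecast and observed probabilities across the categories. This is a minimal Python sketch of the standard definition, not the s2dv/SpecsVerification implementation that SUNSET calls:

```python
def rps(fcst_probs, obs_category):
    """Ranked Probability Score for one forecast: sum over categories of
    the squared difference between cumulative forecast and observed
    probabilities (0 = perfect)."""
    cum_f = cum_o = score = 0.0
    for k, p in enumerate(fcst_probs):
        cum_f += p
        cum_o += 1.0 if k == obs_category else 0.0
        score += (cum_f - cum_o) ** 2
    return score

# A perfect tercile forecast scores 0; a flat forecast is penalised.
perfect = rps([0.0, 1.0, 0.0], obs_category=1)
flat = rps([1 / 3, 1 / 3, 1 / 3], obs_category=1)
```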
The output of `Skill()` is a list containing one or more arrays with named dimensions, usually 'var', 'time', 'longitude' and 'latitude'. For more details on the specific output for each metric, see the documentation for [s2dv](https://CRAN.R-project.org/package=s2dv) and [SpecsVerification](https://CRAN.R-project.org/package=SpecsVerification).

**Probabilities()** returns a list of lists. Inside the lists there are arrays containing the values corresponding to the thresholds defined in the recipe in `Workflow:Probabilities:percentiles` (`$percentiles`), as well as their probability bins (`$probs`). Each list contains arrays with named dimensions 'time', 'longitude' and 'latitude'.

For example, if the extremes ([1/10, 9/10]) are requested, the output will be:

`$percentiles`:

- `percentile_10`: The 10th percentile, or lower extreme.
**Note**: When naming the variables, the probability thresholds are converted to percentiles and rounded to the nearest integer to avoid dots in variable or file names. However, this is just a naming convention; the computations are performed based on the original thresholds specified in the recipe.
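The naming convention described in the note can be sketched as a one-liner (a hypothetical Python illustration; the actual conversion happens in the R code):

```python
def percentile_name(threshold):
    """Map a probability threshold to the output variable name:
    percentiles are rounded to the nearest integer to avoid dots."""
    return f"percentile_{round(threshold * 100)}"

names = [percentile_name(t) for t in (1 / 3, 2 / 3, 1 / 10, 9 / 10)]
# e.g. 1/3 becomes 'percentile_33', while the computation still uses 1/3.
```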

### Recipe template:

```yaml
Skill:
  metric: RPSS CRPSS # Skill metrics separated by spaces or commas. (Mandatory, str)
  save: 'all' # Options: 'all', 'none' (Mandatory, str)
```

```yaml
Probabilities:
  percentiles: [[1/3, 2/3], [1/10, 9/10]] # Thresholds for quantiles and probability categories. Each set of thresholds should be enclosed within brackets.
  save: 'percentiles_only' # Options: 'all', 'none', 'bins_only', 'percentiles_only' (Mandatory, str)
```

## Scorecards module

The Scorecards module takes the output netCDF files that are saved from the Skill module when SUNSET is run, and creates [Scorecard visualizations](https://earth.bsc.es/gitlab/ess/csscorecards) for the different systems and variables that were requested in the recipe.

### Recipe template:

```yaml
Scorecards:
  execute: yes # yes/no
  regions: # Mandatory: Define regions for which the spatial aggregation will be performed.
           # The regions must be included within the area defined in the 'Analysis:Region' section.
    Extra-tropical NH: {lon.min: 0, lon.max: 360, lat.min: 30, lat.max: 90}
    Tropics: {lon.min: 0, lon.max: 360, lat.min: -30, lat.max: 30}
    Extra-tropical SH: {lon.min: 0, lon.max: 360, lat.min: -90, lat.max: -30}
  start_months: 1, 2, 3 # Mandatory, int: start months to visualise in the scorecard table. Options: 'all' or a sequence of numbers.
  metric: mean_bias enscorr rpss crpss enssprerr # Mandatory: metrics to visualise in the scorecard table
  metric_aggregation: 'score' # Mandatory, str: level of aggregation for skill scores. Options: 'score' or 'skill'
  inf_to_na: True # Optional, bool: set Inf values in data to NA. Default is no/False.
  table_label: NULL # Optional, str: extra information to add in the scorecard table title
  fileout_label: NULL # Optional, str: extra information to add in the scorecard output filename
  col1_width: NULL # Optional, int: adjust the width of the first column in the scorecards table
  col2_width: NULL # Optional, int: adjust the width of the second column in the scorecards table
  calculate_diff: False # Mandatory, bool: True/False
```

## Saving

The Saving module contains several functions that export the data (the calibrated hindcast and forecast, the corresponding observations, the skill metrics, percentiles and probabilities) to netCDF files and save them.

The three functions that `Visualization()` calls can also be called independently.

**plot_most_likely_terciles(recipe, archive, fcst, percentiles, outdir)**: Computes the forecast tercile probability bins with respect to the terciles provided in 'percentiles', then generates a figure with one plot per time step and saves it to the directory `outdir` as `forecast_most_likely_terciles.png`.

### Recipe template:

```yaml
Visualization:
  plots: skill_metrics, most_likely_terciles, forecast_ensemble_mean # Types of plots to generate (Optional, str)
  multi_panel: yes # Multi-panel plot or single-panel plots. Default is 'no/false'. (Optional, bool)
  projection: 'cylindrical_equidistant' # Options: 'cylindrical_equidistant', 'robinson', 'lambert_europe'. Default is cylindrical equidistant. (Optional, str)
  mask_terciles: no # Whether to mask the non-significant points by rpss in the most likely tercile plot. yes/true, no/false or 'both'. Default is no/false. (Optional, str)
  dots_terciles: yes # Whether to dot the non-significant points by rpss in the most likely tercile plot. yes/true, no/false or 'both'. Default is no/false. (Optional, str)
```