# C3S-512 CDS Data Checker


## Install & Run

```bash
conda create -y -n dqc python=3
conda activate dqc
git clone https://earth.bsc.es/gitlab/external/c3s512-wp1-datachecker.git
cd c3s512-wp1-datachecker
pip install -r requirements.txt
cd dqc_checker

python checker.py <config_file>
```
**Note**: In the following section you will find information on how to write your own **config_file**.

## Configure

- To run the checker you must write a simple config file.
- A `[general]` section specifies the general dataset and path options.
- Every other config section represents a **check** (e.g. file_format or temporal_completeness).
- Each check section may have parameters specific to that check (see the example below).

**Note 1**: Config examples for **ALL** available checks can be found in the **dqc_wrapper/conf** folder.<br>
**Note 2**: The following config runs the temporal completeness check. Multiple checks can be stacked in one file.

````
[general]
input = /shared/cds_downloads/seasonal/seasonal-original-single-levels/2m_temperature
fpattern = ecmwf-5-*.grib
log_dir = /tmp/dqc_logs
type = grib

[temporal_completeness]
forms_dir = /data/cds-forms-c3s
cds_dataset = seasonal-original-single-levels
cds_variable = 2m_temperature
origin = ecmwf
system = 5
````
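
When the same check has to run over several variables or directories, it can be convenient to generate the config instead of editing it by hand. The following is a minimal sketch using Python's standard `configparser` module; the helper name `build_config` is hypothetical and not part of the checker, and the keys and values simply mirror the example above.

```python
import configparser
import io


def build_config(input_dir, fpattern, cds_variable):
    """Return config text mirroring the README example (hypothetical helper)."""
    config = configparser.ConfigParser()
    config["general"] = {
        "input": input_dir,
        "fpattern": fpattern,
        "log_dir": "/tmp/dqc_logs",
        "type": "grib",
    }
    config["temporal_completeness"] = {
        "forms_dir": "/data/cds-forms-c3s",
        "cds_dataset": "seasonal-original-single-levels",
        "cds_variable": cds_variable,
        "origin": "ecmwf",
        "system": "5",
    }
    # Serialize to a string; write it to a file to pass it to checker.py.
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


config_text = build_config(
    "/shared/cds_downloads/seasonal/seasonal-original-single-levels/2m_temperature",
    "ecmwf-5-*.grib",
    "2m_temperature",
)
print(config_text)
```

Saving `config_text` to a file gives a `<config_file>` that can be passed to `python checker.py`.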

## Config options (detailed)

The **config** uses the .ini format understood by Python's `configparser` module (ConfigParser). Each section other than `[general]` represents an independent data **check**.
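
To illustrate how such a file maps to checks, here is a small sketch that parses a config with `configparser` and lists the checks it requests. It is not the checker's own loading code; the function name `list_checks` is hypothetical, and writing a parameterless check as an empty section is an assumption.

```python
import configparser

# Example config text; an empty [file_format] section is assumed to be
# how a check without parameters is requested.
EXAMPLE = """
[general]
input = /shared/cds_downloads/seasonal/seasonal-original-single-levels/2m_temperature
log_dir = /tmp/dqc_logs
type = grib

[file_format]

[temporal_consistency]
time_step = 1
time_granularity = Month
"""


def list_checks(config_text):
    """Map each check name (every section except [general]) to its options."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name != "general"
    }


checks = list_checks(EXAMPLE)
for name, options in checks.items():
    print(name, options)
```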

````
[general]:
input: Directory or file to be checked.
pattern: If a directory is provided the pattern can be used to filter the files.
log_dir: Directory where DQC logs are stored
type: grib or CF
variable: Variable to analyze (if grib, see grib_ls command) **OPTIONAL**
datatype: Data type to analyze (if grib, see grib_ls command) **OPTIONAL**

[file_format]:
No parameters required

[standard_compliance]:
No parameters required

[spatial_completeness]:
mask_file: Land/Sea mask for nodata lookup. **OPTIONAL**
mask_var: if mask is a grib file (specify variable). See grib_ls for details. **OPTIONAL**
mask_dim: if mask is a grib file (specify dimension). See grib_ls for details. **OPTIONAL**

[temporal_completeness]:
forms_dir: directory where c3s forms metadata is stored
cds_dataset: dataset identifier as seen in the CDS
cds_variable: variable identifier as seen in the CDS
origin: origin as seen in the CDS
system: system as seen in the CDS 

[spatial_consistency]:
grid_interval: Resolution of the grid (positive value), typically xinc
grid_type: Type of Grid (gaussian, lonlat, ...)

[temporal_consistency]:
time_step: Time step, positive integer value
time_granularity: Time unit (Hour, Day, Month, Year)

[valid_ranges]:
valid_min: if defined used as minimum threshold **OPTIONAL**
valid_max: if defined used as maximum threshold **OPTIONAL**
````
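
The constraints listed above (positive integer `time_step`, a `time_granularity` of Hour, Day, Month or Year, optional `valid_min` and `valid_max`) can be checked before launching a run. The sketch below is only an illustration and not part of the checker: the function `validate` is hypothetical, and both the case-insensitive matching of `time_granularity` and the requirement that `valid_min` not exceed `valid_max` are assumptions.

```python
import configparser

# Time units listed in the option reference, compared case-insensitively
# (the case-insensitivity is an assumption of this sketch).
GRANULARITIES = {"hour", "day", "month", "year"}


def validate(parser):
    """Return a list of problems found in the parsed config (hypothetical helper)."""
    problems = []
    if parser.has_section("temporal_consistency"):
        section = parser["temporal_consistency"]
        step = section.get("time_step", "")
        if not (step.isdecimal() and int(step) > 0):
            problems.append("time_step must be a positive integer")
        if section.get("time_granularity", "").lower() not in GRANULARITIES:
            problems.append("time_granularity must be Hour, Day, Month or Year")
    if parser.has_section("valid_ranges"):
        section = parser["valid_ranges"]
        # Both thresholds are optional, so fall back to None when absent.
        low = section.getfloat("valid_min", fallback=None)
        high = section.getfloat("valid_max", fallback=None)
        if low is not None and high is not None and low > high:
            problems.append("valid_min is greater than valid_max")
    return problems


parser = configparser.ConfigParser()
parser.read_string("""
[temporal_consistency]
time_step = 0
time_granularity = Week

[valid_ranges]
valid_min = 300
valid_max = 200
""")
problems = validate(parser)
for problem in problems:
    print(problem)
```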

## Recent updates

Major modifications are tracked in the update log:<br>
* [UPDATE LOG](UPDATE_LOG.md)

## Description

The main purpose of this GitLab project is to gather the work done on evaluating the **C**limate **D**ata **S**tore (**CDS**).<br><br>
It contains:
* [Summary of Available Data Checkers](01_summary_data_checkers.md)
* [First dataset inventory of the CDS](02_cds_inventory.md)
* [First CF check LOG using existing cfchecker for NetCDF files](CF_checker_log/)