# C3S-512 CDS Data Checker

## Install & Run

```bash
conda create -y -n dqc python=3
conda activate dqc
git clone https://earth.bsc.es/gitlab/external/c3s512-wp1-datachecker.git
cd c3s512-wp1-datachecker
pip install -r requirements.txt
cd dqc_checker
python checker.py <config_file>
```

**Note**: The following section explains how to write your own **config_file**.

## Configure

- To run the checker, you must write a simple config file.
- A general section specifies general dataset and path options.
- Every other config section represents a **check** (e.g. file_format or temporal_completeness).
- Each config section may take parameters specific to its check (see the example below).

**Note 1**: Config examples for **ALL** available checks can be found in the **dqc_wrapper/conf** folder.

**Note 2**: The following config checks for temporal completeness. Multiple checks can be stacked in one file.

````
[general]
input = /shared/cds_downloads/seasonal/seasonal-original-single-levels/2m_temperature
fpattern = ecmwf-5-*.grib
log_dir = /tmp/dqc_logs
type = grib

[temporal_completeness]
forms_dir = /data/cds-forms-c3s
cds_dataset = seasonal-original-single-levels
cds_variable = 2m_temperature
origin = ecmwf
system = 5
````

## Config options (detailed)

The **config** is defined in the .ini format, compatible with the Python ConfigParser package. Each section represents an independent data **check**.
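Because the config is plain .ini, it can be inspected with Python's standard `configparser` module before running the checker. The snippet below is a minimal sketch and not part of the checker itself: the helper name `list_checks` is hypothetical, the extra `[file_format]` section is added only to illustrate stacking, and it assumes that every section other than `[general]` names a check.

```python
import configparser

# Sample config based on the example above, with a second check stacked on.
SAMPLE_CONFIG = """
[general]
input = /shared/cds_downloads/seasonal/seasonal-original-single-levels/2m_temperature
fpattern = ecmwf-5-*.grib
log_dir = /tmp/dqc_logs
type = grib

[temporal_completeness]
forms_dir = /data/cds-forms-c3s
cds_dataset = seasonal-original-single-levels
cds_variable = 2m_temperature
origin = ecmwf
system = 5

[file_format]
"""


def list_checks(config_text):
    """Hypothetical helper: map each non-general section to its options."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("general"):
        raise ValueError("config needs a [general] section")
    # Assumption: every section except [general] is a check to run.
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name != "general"
    }


if __name__ == "__main__":
    for check, options in list_checks(SAMPLE_CONFIG).items():
        print(check, options)
```

Note that `configparser` returns every value as a string (for instance `system` is `"5"`), so any numeric option would need an explicit conversion.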

````
[general]:
input: Directory or file to be checked.
pattern: If a directory is provided the pattern can be used to filter the files.
log_dir: Directory where DQC logs are stored
type: grib or CF
variable: Variable to analyze (if grib, see grib_ls command) **OPTIONAL**
datatype: Data type to analyze (if grib, see grib_ls command) **OPTIONAL**

[file_format]:
No parameters required

[standard_compliance]:
No parameters required

[spatial_completeness]:
mask_file: Land/Sea mask for nodata lookup. **OPTIONAL**
mask_var: if mask is a grib file (specify variable). See grib_ls for details. **OPTIONAL**
mask_dim: if mask is a grib file (specify dimension). See grib_ls for details. **OPTIONAL**

[temporal_completeness]
forms_dir: directory where c3s forms metadata is stored
cds_dataset: dataset identifier as seen in the CDS
cds_variable: variable identifier as seen in the CDS
origin: origin as seen in the CDS
system: system as seen in the CDS

[spatial_consistency]:
grid_interval: Resolution of the grid (positive value), typically xinc
grid_type: Type of Grid (gaussian, lonlat, ...)

[temporal_consistency]:
time_step: Time step, positive integer value
time_granularity: Time unit (Hour, Day, Month, Year)

[valid_ranges]:
valid_min: if defined used as minimum threshold **OPTIONAL**
valid_max: if defined used as maximum threshold **OPTIONAL**
````

## Recent updates

You can find an updated LOG to track new major modifications here:
* [UPDATE LOG](UPDATE_LOG.md)

## Description

The main purpose of this GitLab project is to bring together all the work done on evaluating the **C**limate **D**ata **S**tore (**CDS**).

You can find the following:

* [Summary of Available Data Checkers](01_summary_data_checkers.md)
* [First dataset inventory of the CDS](02_cds_inventory.md)
* [First CF check LOG using existing cfchecker for NetCDF files](CF_checker_log/)