# C3S-512 CDS Data Checker
```bash
git clone https://earth.bsc.es/gitlab/external/c3s512-wp1-datachecker.git
cd c3s512-wp1-datachecker
pip install -r requirements.txt
cd dqc_checker
python checker.py <config_file>
```
**Note**: In the following section you will find information on how to write your own **config_file**.
## Configure
- To run the checker, you must write a simple config file.
- A general section specifies the general dataset and path options.
- Every other config section represents a **check** (e.g. file_format or temporal_completeness).
- Each check section may take parameters specific to that check (see the example below).

**Note 1**: Config examples for **ALL** available checks can be found in the **dqc_wrapper/conf** folder.<br>
**Note 2**: The following config checks for temporal consistency. Multiple checks can be stacked in one file.

````
input = /shared/cds_downloads/seasonal/seasonal-original-single-levels/2m_temperature
fpattern = ecmwf-5-*.grib
log_dir = /tmp/dqc_logs
type = grib
cds_dataset = seasonal-original-single-levels
cds_variable = 2m_temperature
origin = ecmwf
system = 5
````
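As a rough illustration, a config with the options shown above could also be generated from Python using the standard `configparser` module. The section names `general` and `temporal_completeness` are assumptions made for this sketch, since the example above does not show its section headers; the files in **dqc_wrapper/conf** are the reference for the real layout. The helper names `build_config` and `render` are hypothetical.

```python
import configparser
import io


def build_config() -> configparser.ConfigParser:
    """Build an in-memory config holding the example options."""
    cfg = configparser.ConfigParser()
    # Assumed section name for the general dataset and path options.
    cfg["general"] = {
        "input": "/shared/cds_downloads/seasonal/"
                 "seasonal-original-single-levels/2m_temperature",
        "fpattern": "ecmwf-5-*.grib",
        "log_dir": "/tmp/dqc_logs",
        "type": "grib",
    }
    # Assumed check section; the keys match the temporal_completeness options.
    cfg["temporal_completeness"] = {
        "cds_dataset": "seasonal-original-single-levels",
        "cds_variable": "2m_temperature",
        "origin": "ecmwf",
        "system": "5",
    }
    return cfg


def render(cfg: configparser.ConfigParser) -> str:
    """Serialize the config to .ini text."""
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render(build_config()))
```

The rendered text can be saved to a file and passed to `checker.py` as `<config_file>`, provided the assumed section names match what the checker expects.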
## Config options (detailed)
The **config** is written in the .ini format, compatible with Python's `configparser` module. Each section represents an independent data **check**.
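Because the format is plain .ini, a config can be inspected with `configparser` before it is handed to the checker. The following is a minimal sketch, not the checker's own loading code: the section name `general`, the function `list_checks`, and the path `/data/example` are assumptions for illustration.

```python
import configparser

GENERAL_SECTION = "general"  # assumed name of the non-check section


def list_checks(config_text: str) -> dict:
    """Return {check_name: options} for every section except the general one."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name != GENERAL_SECTION
    }


# Hypothetical config stacking two checks in one file.
EXAMPLE = """
[general]
input = /data/example
type = grib

[file_format]

[temporal_consistency]
time_step = 1
time_granularity = Month
"""

if __name__ == "__main__":
    for check, options in list_checks(EXAMPLE).items():
        print(check, options)
```

All values come back as strings, so numeric options such as `time_step` still need converting before use.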
````
input: Directory or file to be checked.
pattern: If a directory is provided the pattern can be used to filter the files.
log_dir: Directory where DQC logs are stored
type: grib or CF
variable: Variable to analyze (if grib, see grib_ls command) **OPTIONAL**
datatype: Data type to analyze (if grib, see grib_ls command) **OPTIONAL**
[file_format]:
No parameters required
[standard_compliance]:
No parameters required
[spatial_completeness]:
mask_file: Land/Sea mask for nodata lookup. **OPTIONAL**
mask_var: if mask is a grib file (specify variable). See grib_ls for details. **OPTIONAL**
mask_dim: if mask is a grib file (specify dimension). See grib_ls for details. **OPTIONAL**
[temporal_completeness]:
forms_dir: directory where c3s forms metadata is stored
cds_dataset: dataset identifier as seen in the CDS
cds_variable: variable identifier as seen in the CDS
origin: origin as seen in the CDS
system: system as seen in the CDS
[spatial_consistency]:
grid_interval: Resolution of the grid (positive value), typically xinc
grid_type: Type of Grid (gaussian, lonlat, ...)
[temporal_consistency]:
time_step: Time step, positive integer value
time_granularity: Time unit (Hour, Day, Month, Year)
[valid_ranges]:
valid_min: if defined used as minimum threshold **OPTIONAL**
valid_max: if defined used as maximum threshold **OPTIONAL**
````
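The constraints listed above (positive `grid_interval`, positive integer `time_step`, a fixed set of time units) can be checked before a run. The sketch below is one possible pre-flight validation and is not part of the checker: the function `validate_check_options` is hypothetical, the case-insensitive matching of `time_granularity` is an assumption, and so is the rule that `valid_min` must not exceed `valid_max`.

```python
ALLOWED_GRANULARITIES = {"hour", "day", "month", "year"}


def _parse(value, cast):
    """Convert value with cast, returning None if it is missing or malformed."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def validate_check_options(check: str, options: dict) -> list:
    """Return a list of error messages for one check section (empty if fine)."""
    errors = []
    if check == "temporal_consistency":
        step = _parse(options.get("time_step"), int)
        if step is None or step <= 0:
            errors.append("time_step must be a positive integer")
        # Assumption: unit names are matched case-insensitively.
        unit = str(options.get("time_granularity", "")).lower()
        if unit not in ALLOWED_GRANULARITIES:
            errors.append("time_granularity must be Hour, Day, Month or Year")
    elif check == "spatial_consistency":
        interval = _parse(options.get("grid_interval"), float)
        if interval is None or not interval > 0:
            errors.append("grid_interval must be a positive value")
    elif check == "valid_ranges":
        # Both thresholds are optional; compare only when both are given.
        low = _parse(options.get("valid_min"), float)
        high = _parse(options.get("valid_max"), float)
        if low is not None and high is not None and low > high:
            errors.append("valid_min must not exceed valid_max")
    return errors


if __name__ == "__main__":
    print(validate_check_options(
        "temporal_consistency",
        {"time_step": "0", "time_granularity": "Week"},
    ))
```

Checks without parameters, such as `file_format`, pass through with no errors.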
## Recent updates
Major modifications are tracked in the update log:
* [UPDATE LOG](UPDATE_LOG.md)

The main purpose of this GitLab project is to bring together all the efforts made in evaluating the **C**limate **D**ata **S**tore (**CDS**).<br><br>
This repository contains:
* [Summary of Available Data Checkers](01_summary_data_checkers.md)
* [First dataset inventory of the CDS](02_cds_inventory.md)
* [First CF check LOG using existing cfchecker for NetCDF files](CF_checker_log/)