# C3S-512 CDS Data Checker
The main purpose of this GitLab project is to bring together all the data evaluation work done for the **C**limate **D**ata **S**tore (**CDS**).<br></br>
```bash
git clone https://earth.bsc.es/gitlab/external/c3s512-wp1-datachecker.git
cd c3s512-wp1-datachecker
pip install -r requirements.txt
cd dqc_checker
python checker.py <config_file>
```
**Note**: In the following section you will find information on how to write your own **config_file**.
## Configure
- To run the checker you must write a simple config file (ConfigParser ini format).
- A **general** section specifies the general path options.
- A **dataset** section specifies dataset-dependent information.
- Each further config section represents a check/test (e.g. file_format or temporal_completeness).
- Each check section may have parameters specific to that check (see the example below).

**Note 1**: Config examples for **ALL** available checks can be found in the **dqc_wrapper/conf** folder.<br></br>
**Note 2**: The following config checks for temporal consistency. Multiple checks can be stacked in one file.

```bash
[general]
input = /data/dqc_test_data/seasonal/seasonal-monthly-single-levels/2m_temperature
fpattern = ecmwf-5_fcmean*.grib
log_dir = /my/log/directory
res_dir = /my/output/directory
[dataset]
variable = t2m
datatype = fcmean
cds_dataset = seasonal-monthly-single-levels
```
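
To illustrate how such a file is organized, the sketch below reads a config with Python's standard `configparser` module and separates the `general` and `dataset` sections from the check sections. It is only a minimal sketch, not the checker's own loading code. The helper name `split_sections`, the rule that both `general` and `dataset` must be present, and the two empty check sections (named after the checks mentioned above) are assumptions made for the example.

```python
import configparser

# Sections that describe the run rather than a check (names taken from this README).
RESERVED = {"general", "dataset"}

# Example config; the two empty check sections are illustrative assumptions.
EXAMPLE = """
[general]
input = /data/dqc_test_data/seasonal/seasonal-monthly-single-levels/2m_temperature
fpattern = ecmwf-5_fcmean*.grib
log_dir = /my/log/directory
res_dir = /my/output/directory

[dataset]
variable = t2m
datatype = fcmean
cds_dataset = seasonal-monthly-single-levels

[file_format]

[temporal_completeness]
"""


def split_sections(text):
    """Hypothetical helper: return (general, dataset, checks) from ini text."""
    # Interpolation is disabled so that '%' in paths or patterns is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    missing = RESERVED - set(parser.sections())
    if missing:
        raise ValueError(f"missing required sections: {sorted(missing)}")
    general = dict(parser["general"])
    dataset = dict(parser["dataset"])
    # Every remaining section is treated as one check with its own parameters.
    checks = {
        name: dict(parser[name])
        for name in parser.sections()
        if name not in RESERVED
    }
    return general, dataset, checks


if __name__ == "__main__":
    general, dataset, checks = split_sections(EXAMPLE)
    print("input:", general["input"])
    print("variable:", dataset["variable"])
    print("checks:", list(checks))
```
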
The **config** is defined in the .ini format compatible with the Python ConfigParser package.<br></br>
Each section represents an independent data **check**. The following example is for **ALL** available tests:<br></br>
````
[general]
# Directory or file to be checked.
input = /data/dqc_test_data/seasonal/seasonal-monthly-single-levels/2m_temperature
# If a directory is provided, the pattern can be used to filter the files. If empty, every file is taken
fpattern = ecmwf-5*.grib
# Directory where DQC logs are stored
log_dir = /tmp/dqc_logs
# Directory where DQC test results are stored (will be created if it does not exist)
res_dir = /tmp/dqc_res
# Directory with constraints.json per product (a.k.a. c3sforms)
forms_dir = /data/cds-forms-c3s
[dataset]
# Variable to analyze (if grib, see grib_ls command) **OPTIONAL**
variable = t2m
# Data type to analyze (if grib, see grib_ls command) **OPTIONAL**
datatype = fcmean
# Dataset (as available in c3s catalogue form)
cds_dataset = seasonal-monthly-single-levels
# Variable (form variable)
cds_variable = 2m_temperature
# Land/Sea mask if available
mask_file =
# Variable name within the mask grib file (default is lsm)
mask_var =
# Origin (for seasonal products, otherwise optional)
origin = ecmwf
# System (for seasonal products, otherwise optional)
system = 5
# Flag indicating if dataset is seasonal (monthly, daily)
is_seasonal =
# Resolution of the grid (positive value), typically xinc
grid_interval = 1
# Type of grid (gaussian, lonlat, ...)
grid_type = lonlat
# Time step, positive integer value
time_step = 1
# Time unit (Hour, Day, Month, Year) or (h,d,m,y)
time_granularity = month
[valid_ranges]
# If the valid minimum for the data is known (otherwise, thresholds are set statistically)
valid_min =
# If the valid maximum for the data is known (otherwise, thresholds are set statistically)
valid_max =
````
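
Several options in the full example are left empty (`mask_file`, `valid_min`, `valid_max`), and the time unit accepts both long and short spellings. The sketch below shows one way such values could be interpreted with `configparser`: blank values become `None`, numeric thresholds become floats, and the time unit is reduced to its one-letter form. This is an illustration under stated assumptions, not the checker's actual parsing logic. The helper names `load`, `optional_float` and `normalize_granularity` are hypothetical, and the `valid_max = 350` value is invented for the example.

```python
import configparser

# Trimmed-down config; valid_max = 350 is an illustrative value, not a recommended threshold.
SNIPPET = """
[dataset]
mask_file =
grid_interval = 1
time_step = 1
time_granularity = month

[valid_ranges]
valid_min =
valid_max = 350
"""

# Spellings listed in the config comments: (Hour, Day, Month, Year) or (h, d, m, y).
GRANULARITY = {"hour": "h", "day": "d", "month": "m", "year": "y"}


def load(text):
    """Hypothetical helper: parse ini text without value interpolation."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def optional_float(section, key):
    """Return the option as a float, or None when it is absent or left empty."""
    raw = section.get(key, "").strip()
    return float(raw) if raw else None


def normalize_granularity(value):
    """Map 'Month', 'month' or 'm' to the one-letter code 'm'."""
    v = value.strip().lower()
    if v in GRANULARITY:
        return GRANULARITY[v]
    if v in GRANULARITY.values():
        return v
    raise ValueError(f"unknown time granularity: {value!r}")


if __name__ == "__main__":
    cfg = load(SNIPPET)
    ranges = cfg["valid_ranges"]
    print("valid_min:", optional_float(ranges, "valid_min"))
    print("valid_max:", optional_float(ranges, "valid_max"))
    print("granularity:", normalize_granularity(cfg["dataset"]["time_granularity"]))
```

Returning `None` for an empty threshold keeps "not provided" distinct from a legitimate value of `0`, which matters if thresholds are otherwise derived statistically.
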
## Result
Each test run produces a result inside the **res_dir** specified in the **general** section.<br></br>
The result zip file contains a PDF report for each of the tests launched.<br></br>
For each check, the corresponding **_result** section contains a status (ok/err) indicating success or failure, a short message and the log location.<br></br>
````
[spatial_consistency]
grid_interval = 0.25
grid_type = lonlat
[spatial_consistency_result]
res = ok
msg = Files are spatially consistent
````
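
If the result is available as ini-style text like the example above, the outcome of each check can be collected with a few lines of Python. The sketch below assumes that every result section is named `<check>_result` and carries the `res` and `msg` keys shown in the example. The helper name `summarize` is hypothetical, and how the result text is extracted from the output in **res_dir** is not covered here.

```python
import configparser

RESULT_SUFFIX = "_result"

# Result text copied from the example in this README.
EXAMPLE_RESULT = """
[spatial_consistency]
grid_interval = 0.25
grid_type = lonlat

[spatial_consistency_result]
res = ok
msg = Files are spatially consistent
"""


def summarize(text):
    """Hypothetical helper: return {check_name: (passed, message)} for each *_result section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    summary = {}
    for name in parser.sections():
        if not name.endswith(RESULT_SUFFIX):
            continue  # plain check sections only hold input parameters
        check = name[: -len(RESULT_SUFFIX)]
        section = parser[name]
        # Anything other than "ok" (for instance "err" or a missing key) counts as a failure.
        passed = section.get("res", "").strip().lower() == "ok"
        summary[check] = (passed, section.get("msg", ""))
    return summary


if __name__ == "__main__":
    for check, (passed, msg) in summarize(EXAMPLE_RESULT).items():
        status = "OK" if passed else "FAILED"
        print(f"{check}: {status} ({msg})")
```
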
## Recent updates
Major modifications are tracked in the update log:<br>
* [UPDATE LOG](UPDATE_LOG.md)