Universal Kriging is a common geostatistical technique used for spatial interpolation that takes into account the spatial structure of the data. In our case, we have applied this methodology as a post-process of the CALIOPE-Urban dispersion model, developed by the Earth Science Department of the Barcelona Supercomputing Center (BSC). To implement it, we have used the hourly observational NO2 data from 12 monitoring stations as the principal variable and the CALIOPE-Urban hourly NO2 output as the covariate. In addition, we have studied the added value of incorporating as a second covariate our time-invariant microscale Land Use Regression (LUR) model, developed using two different NO2 passive-dosimeter campaigns and 8 predictors (urban geometric variables, simulated vehicular traffic densities, annually-averaged data bi-linearly interpolated from the regional CALIOPE system, and the annually-averaged NO2 output of CALIOPE-Urban) through a machine learning approach. Our implementation is a data-fusion procedure used as a spatial NO2 bias correction in an urban area, the city of Barcelona. Moreover, this correction can be applied directly to the daily maximum NO2 concentrations. The implementation of this methodology is currently under revision.
{{ : }}
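To give an idea of the core operation, here is a minimal sketch of a universal-kriging correction with an external drift in R, using the //gstat// and //sp// packages. All object names and the synthetic data are hypothetical illustrations, not the repository's actual code:
<code r>
# Minimal universal-kriging sketch with the model output as external drift.
library(sp)
library(gstat)

# Hypothetical station data: observed NO2 plus the collocated model value.
set.seed(1)
stations <- data.frame(x = runif(12, 0, 5000), y = runif(12, 0, 5000),
                       no2_obs = runif(12, 20, 60))
stations$no2_caliope <- stations$no2_obs + rnorm(12, 0, 5)
coordinates(stations) <- ~x + y

# Hypothetical target mesh carrying the model output as the covariate.
grid <- expand.grid(x = seq(0, 5000, 100), y = seq(0, 5000, 100))
grid$no2_caliope <- runif(nrow(grid), 20, 60)
coordinates(grid) <- ~x + y

# Fit a variogram to the drift residuals, then krige onto the mesh.
vg        <- variogram(no2_obs ~ no2_caliope, stations)
vg_fit    <- fit.variogram(vg, vgm("Exp"))
corrected <- krige(no2_obs ~ no2_caliope, stations, grid, model = vg_fit)
</code>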
__Meriem Hajji__, meriem.hajji@bsc.es
===== Visualization =====

It is recommended to follow and run the scripts through RStudio. To configure it in the machine:
  * Create the file init_rstudio.sh and save it:
<code bash>
module load RStudio/1.1.463-foss-2015a
module load R/3.6.1-foss-2015a-bare
rstudio &
</code>
  * Open the RStudio application:
<code bash>
bash init_rstudio.sh
</code>
To launch the scripts in the workstation, the user has to charge the modules in the //bashrc//:
  * Open your bashrc:
<code bash>
vi ~/.bashrc
</code>
  * Add the following modules:
<code bash>
module load RStudio/1.1.463-foss-2015a
module load R/3.6.1-foss-2015a-bare
</code>
  * Reload your bashrc:
<code bash>
source ~/.bashrc
</code>
  * Launch the script:
<code bash>
Rscript [script]
</code>

===== First Steps: understanding the scripts =====

Please follow the next steps to understand the scripts before applying the methodology.

=== Using the GitLab repository and copying all the needed archives ===

  - Copy all the functions and archives needed by cloning the repository:
<code bash>
git clone https://...
</code>
After that, a folder (named by default after the repository) will appear with all the copied files.

A list of different archives will be inside. They are classified into **R scripts** (among them the principal script, explained below), the configuration file, and the job scripts.
=== The configuration file ===
The configuration file is an archive used as a setup structure, and it can be opened and modified in two ways:
  * Through the RStudio visualization:
{{ : }}
  * Through the terminal. To do that, go to the directory where all the archives copied from the repository are kept, and type the following (in this case, the visualization is done through the program MobaXterm):
<code bash>
vi config_file.yml
</code>
{{ : }}
Now we are going to see the meaning and the implications of each item that has to be filled in the configuration file:

  * **//repository_path//** : (a folder path) \\ this would be the folder where the repository has been cloned, used as the main folder.
  * **//full_year//** : (//TRUE// or //FALSE//) \\ whether the user wants to bias-correct the whole year or only a specific period.
  * **//date_begin & date_end//** : (R-vector format: c(year, month, day)) \\ if **//full_year//** is //FALSE//, the user has to fill each of these parameters as an R vector containing the beginning and end dates of the chosen period.
  * **//GHOST_no2_path//** : (a folder path: "/gpfs/projects/bsc32/AC_cache/obs/ghost/EEA_AQ_eReporting/1.4/...") \\ the path of the hourly NO2 observational data.

In summary, notice that the user only has to do the following with the configuration file:
  * Choosing the period to bias-correct (**//full_year//**, or **//date_begin//** and **//date_end//**).
  * Choosing one of the **//...//** options.
  * Choosing the Universal Kriging mode in terms of the covariates, the application, and the outputs.
  * (The rest of the configuration should not be changed; only fill it in as explained).
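As an orientation, //config_file.yml// might look like the sketch below. Only the keys documented above are shown; the values and exact key spellings are hypothetical:
<code yaml>
# Hypothetical sketch of config_file.yml (documented keys only).
repository_path: "/path/to/the/cloned/repository"
full_year: FALSE
date_begin: [2019, 1, 1]    # read on the R side as c(year, month, day)
date_end:   [2019, 1, 31]
</code>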
=== The structure of the folders ===
Before applying the methodology and obtaining the results, it is important to understand the structure of the folders where the outputs are saved.

This is an example of the structure using the 2019 dataset:
{{ : }}

Remember that the parallelization is carried out in terms of the day. We are applying the methodology on a mesh of approximately 49000 points, for each hour of the chosen period. The output of this methodology is __daily__, which means that each output file refers to one day. Thus, each file will contain the correction on the 49000 points, 24 times, one per hour of the day. Please see the examples to visualize the outputs of this methodology.
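Conceptually, the daily parallelization could be organized as in the sketch below; //correct_day()// is a hypothetical wrapper, not the repository's actual function:
<code r>
# Conceptual sketch: one task per day, 24 hourly corrections per task.
library(parallel)

days <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")

correct_day <- function(day) {
  for (hour in 0:23) {
    # ... universal kriging over the ~49000-point mesh for this hour ...
  }
  # ... write one daily output file holding the 24 hourly corrections ...
}

# One independent worker per day, spread over the requested cores.
results <- parallel::mclapply(days, correct_day, mc.cores = 50)
</code>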
=== The main script and its explanation ===

The //...// script is structured in different sections:
  * The first section, **//Config file//**, is about setting up the procedure by reading the configuration file.
  * The section **//initial setting//** takes into account:
    - The main libraries that will be used and the coordinates reference.
    - The pollutant; in this case we have to type //NO2// and //no2// for the model and observations configurations, respectively.
    - The variogram's configuration.
  * In the section **//initial setting: directories//**, the needed directories are defined.
  * In the section **//initial setting: variables//**:
    - Everything regarding dates is set up.
    - The mesh where the bias correction is applied is loaded.
  * In the section **//...//**, ...
  * In the section **//caliope evaluation, mean and max//**, different scripts are used to prepare the files regarding the model (CALIOPE-Urban) output at the monitoring stations (caliope evaluation), together with its mean and maximum values.
  * In the section **//...//**, ...
  * The section //...// ...
  * In the section //...//, ...
  * In the section **//...//**, ...
  * In the section **//...//**, ...

Notice that the majority of the __//...//__ ...
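For orientation, the first sections of such a main script might look like this sketch; the key names follow the configuration items documented above, while everything else is a hypothetical illustration:
<code r>
# Sketch of the Config-file and date set-up sections (hypothetical names).
library(yaml)

cfg <- yaml::read_yaml("config_file.yml")

if (isTRUE(cfg$full_year)) {
  days <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
} else {
  # Turn the c(year, month, day) vectors from the config into Date objects.
  as_date <- function(v) as.Date(sprintf("%04d-%02d-%02d", v[1], v[2], v[3]))
  days <- seq(as_date(cfg$date_begin), as_date(cfg$date_end), by = "day")
}
</code>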
=== First Steps: Submitting jobs ===

  * It is recommended first to look at the machines' guidelines.
To apply the methodology, the user has to prepare and submit a job, setting the computational time according to the chosen period, for example:
  * //#SBATCH --time=48:00:00// (the machine's maximum)
  * //#SBATCH --time=01:00:00//
The user has to charge the modules in the //bashrc// (according to the machine):
  * Open your bashrc:
<code bash>
vi ~/.bashrc
</code>

  * Add the modules:
<code bash>
module load R
</code>

  * Reload your bashrc:
<code bash>
source ~/.bashrc
</code>

To submit the job regarding this methodology, use //sbatch// with the job script.
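For example, assuming the job script is called //job_UK.sh// (a hypothetical name):
<code bash>
sbatch job_UK.sh
</code>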
==== Universal Kriging using CALIOPE-Urban as the unique covariate and the whole 2019 data, using all the possible applications ====
  - Preparing the configuration file: in this case, it is the same as in the example above.
  - Enter the machine:
<code bash>
ssh bscXXXXX@nord4.bsc.es
# ... the BSC welcome banner is printed on login ...
</code>
  * //#SBATCH --time// : the computational time required.
  * //#SBATCH -n// : the number of the machine's cores to use.
  * //#SBATCH --constraint// : the machine's constraint (for example, requiring high-memory nodes).
As this would be the first submitted job, we use the maximum computational time (48h) and, in this case, we choose to use 50 cores. The queue has to be //bsc_es// in this case.
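Putting these directives together, a job script for this run could look like the sketch below; the job, script, and log file names are hypothetical, and the queue is set here via //--qos// (the exact directive may differ per machine):
<code bash>
#!/bin/bash
#SBATCH --job-name=UK_2019
#SBATCH --qos=bsc_es
#SBATCH --time=48:00:00
#SBATCH -n 50
#SBATCH --output=UK_2019_%j.out
#SBATCH --error=UK_2019_%j.err

# Hypothetical name for the repository's main script.
Rscript main_script.R
</code>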
  * //USER//: the user's number.
  * //ST//: the status of the job, first whether it is pending (PD) or running (R). Other options are completed (CD), completing (CG), failed (F), preempted (PR), suspended (S) or stopped (ST). All of this can be seen in the machine's guidelines.
  * //TIME//: the time that has passed since the job started.
  * //NODES//: the number of the machine's nodes used by the job.
  * //...//: ...
{{ : }}
6. Waiting until the job finishes. Notice that when a job is submitted, two files are created: the output and the error ones (the user has defined their names in the job directives). In the output file, the user can visualize the indications that appear while the job runs. In the error one, the user can see secondary errors that may not be enough to cancel the job's run.
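Both files can be inspected while the job runs, for instance (the file names are hypothetical):
<code bash>
tail -f UK_2019_1234567.out   # follow the progress messages
tail -f UK_2019_1234567.err   # watch for warnings and secondary errors
</code>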
7. The job is completed. A folder named by **//2019//** will be created at the **//...//** path.
<code bash>
ssh bscXXXXX@nord4.bsc.es
</code>
3. Preparing the job. In this case we will reduce the number of cores and the computational time, change the queue accordingly, and not require high memory.
<code bash>