working_groups:ukurbangroup [2022/09/23 09:09] acriado
Universal Kriging is a common geostatistical technique used for spatial interpolation that takes into account the spatial structure of the data. In our case, we have applied this methodology as a post-process of the CALIOPE-Urban dispersion model, developed by the Earth Sciences Department of the Barcelona Supercomputing Center (BSC). To implement it, we have used the hourly observational NO2 data from 12 monitoring stations as the principal variable and the CALIOPE-Urban hourly NO2 output as the covariate. In addition, we have studied the added value of incorporating as a second covariate our time-invariant microscale Land Use Regression (LUR) model, developed using two different NO2 passive dosimeter campaigns and 8 predictors (urban geometric variables, simulated vehicular traffic densities, annually-averaged data bi-linearly interpolated from the regional CALIOPE system, and the annually-averaged NO2 output of CALIOPE-Urban) through a machine learning approach. Our implementation is a data-fusion procedure used as a spatial NO2 bias correction in an urban area, the city of Barcelona. Moreover, this correction can be applied directly to the daily maximum NO2 concentrations. The implementation of this methodology is under revision.
{{ : }}
===== Visualization =====
It is recommended to follow the tutorial using the RStudio program to visualize the different scripts.

To configure the RStudio app and open it from your workstation, follow these steps:

  * Create the file init_rstudio.sh and save it:
<code bash>
module load RStudio/
module load R/
rstudio &
</code>
  * Open the RStudio application:
<code bash>
bash init_rstudio.sh
</code>

To launch an R script directly in the workstation, proceed as follows:

  * Open your bashrc:
<code bash>
vi ~/.bashrc
</code>

  * Add the modules:
<code bash>
module load RStudio/
module load R/
</code>

  * Reload your bashrc:
<code bash>
source ~/.bashrc
</code>

  * Launch the script:
<code bash>
Rscript [script]
</code>
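If you prefer to automate this //bashrc// edit, the same change can be scripted. The sketch below is only an illustration: the module names are written without versions (an assumption; use the versions available on your machine), and the function avoids adding duplicate lines when run more than once.

```shell
#!/bin/bash
# Append the module-load lines to a bashrc file only if they are not
# already present, so running this twice does not duplicate them.
# Versionless module names are an assumption; adapt them to your machine.
add_modules() {
    local rcfile="$1"
    local line
    for line in "module load RStudio" "module load R"; do
        grep -qxF "$line" "$rcfile" 2>/dev/null || echo "$line" >> "$rcfile"
    done
}

# Example: add_modules "$HOME/.bashrc" && source "$HOME/.bashrc"
```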
===== First Steps : understanding the scripts =====
To work through a basic tutorial using this methodology, please follow the next steps. The procedure is implemented in the R software.
=== Using the GitLab repository and copying all the needed files ===
  - Copy all the functions and files needed to implement the procedure into your own directory. To do that, open a terminal and run the following command:
<code bash>
git clone https://
</code>
After that, a folder with the repository's default name will appear in your directory.
In the folder //general//, a list of different files will appear. They are classified into **R scripts** (one of which is the principal script) and the other files needed by the procedure.
=== The configuration file ===
The configuration file is used as a setup structure, meaning that the variables that appear in it can be changed to produce a different output. It is a separate file, so the main advantage is that it can be modified without altering the rest of the scripts. The first step consists of filling it in. Notice that this is the __only file__ that has to be modified to suit your goal. Before you start modifying it, it looks like this:
  * Through the RStudio visualization:
{{ : }}
Now we are going to see the meaning and the implications of each of the items that have to be filled in the configuration file:
  * **//repository_path//** : (a folder path) \\ This will be the main folder, the one that is created after copying the repository.
  * **//...//** :
  * **//...//** :
  * **//...//** :
  * **//...//** :
  * **//...//** :
  * **//...//** :
  * **//...//** :
  * **//...//** :
    - //cross//: if we want to apply the Leave-One-Out Cross-Validation.
    - //UK_max// and //...//: ...
    - //...//: ...
    - //...//: ...
  * **//...//** :
  * **//...//** :
  * **//...//** :
To summarize, notice that the user only has to do the following in the configuration file:
  * Choosing your own **//...//**.
  * Choosing one of the **//...//**.
  * Choosing the Universal Kriging mode in terms of the covariates and the application.
  * (The rest of the configuration should not be changed; only fill it in as explained).
=== The structure of the folders ===
Before applying the methodology and obtaining the results, it is important to understand the structure of the folders.
This is an example of the structure using the 2019 dataset:
{{ : }}
Remember that the parallelization is carried out in terms of the day. We are applying the methodology on a mesh of approximately ...
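Since the procedure expects this folder structure to exist, it can help to check it before launching anything. The sketch below is hypothetical: the base folder and subfolder names passed in are placeholders, so take the real ones from your configuration file.

```shell
#!/bin/bash
# Sketch: report any expected subfolder that is missing under a base path.
# The folder names are placeholders; use the ones your configuration
# file defines.
check_folders() {
    local base="$1"; shift
    local d missing=0
    for d in "$@"; do
        if [ ! -d "$base/$d" ]; then
            echo "missing folder: $base/$d"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_folders "$HOME/UniversalKriging" general 2019
```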
=== The main script and its explanation ===
The //...// script is divided into the following sections:
  * The first section, **//Config file//**, is about setting up the procedure by reading the configuration file.
  * The section **//initial setting//** takes into account:
    - The top libraries that will be used and the coordinate reference system.
    - The pollutant; in this case we have to type //NO2// and //no2// for the model and observation configurations, respectively.
    - The variogram's setup.
  * In the section **//initial setting: directories//**, the working directories are set.
  * In the section **//initial setting: variables//**:
    - Everything regarding the dates is set up.
    - The mesh where the bias correction will be applied is defined.
  * In the section **//...//**, ...
  * In the section **//caliope evaluation, mean and max//**, different scripts are used to prepare the files regarding the model (CALIOPE-Urban) output at the monitoring stations (caliope evaluation), as well as its mean and maximum values.
  * In the section **//...//**, ...
  * The section //...// ...
  * In the section //...//, ...
  * In the section **//...//**, ...
  * In the section **//...//**, ...
Notice that the majority of the __//...//__ ...
=== First Steps : Submitting jobs ===
  * It is recommended to first take a look at the machine's guidelines.
To apply the methodology, a job has to be submitted to the machine. Depending on the case, a different computational time is required, for example:
  * //#SBATCH --time=48:00:00//
  * //#SBATCH --time=01:00:00//

The user has to load the modules in the //bashrc// (this refers to the machine):
  * Open your bashrc:
<code bash>
vi ~/.bashrc
</code>

  * Add the modules:
<code bash>
module load R
</code>

  * Reload your bashrc:
<code bash>
source ~/.bashrc
</code>

To submit the job regarding this methodology, use the //sbatch// command with the corresponding job script.
==== Universal Kriging using CALIOPE-Urban as the unique covariate and the whole 2019 data, using all the possible applications ====
  - Preparing the configuration file: this is the same as the example above.
  - Enter the machine:
<code bash>
ssh bscXXXXX@nord4.bsc.es
</code>
<code>
      .-.--_
    ,','.'   `.
    | |  BSC  |
    `.`.`. _ .'
      `·`··
</code>
  * //#SBATCH --time// : the computational time required
  * //#SBATCH -n// : the machine's number of cores to use
  * //#SBATCH --constraint// : the memory requirement (for example, high memory)
As it is the first submitted job, we use the maximum computational time (48h), and in this case we choose to use 50 cores. The queue has to be //bsc_es// in this case.
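Putting these directives together, a job script for this first run could look like the sketch below. This is an assumption-laden template, not the repository's actual job file: the job name, the output/error file names, the main script name, and even the exact queue flag (//--qos// here) are placeholders that may differ on your machine.

```shell
#!/bin/bash
#SBATCH --job-name=UK_2019        # hypothetical job name
#SBATCH --qos=bsc_es              # the queue mentioned above (flag may differ)
#SBATCH --time=48:00:00           # maximum computational time (48 h)
#SBATCH -n 50                     # number of cores, 50 in this example
#SBATCH --output=UK_2019_%j.out   # output file (%j expands to the job id)
#SBATCH --error=UK_2019_%j.err    # error file

module load R                     # R as loaded in the bashrc
Rscript main_script.R             # placeholder name for the principal script
```

It would then be submitted with //sbatch// and monitored with //squeue//.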
  * //USER//: the user's number.
  * //ST//: the status of the job: first, whether it is pending (PD) or running (R). Other options are completed (CD), completing (CG), failed (F), preempted (PR), suspended (S) or stopped (ST). All of this can be seen in the machine's documentation.
  * //TIME//: the time that has passed since the job started.
  * //NODES//: the machine's number of nodes used.
  * //...//: ...
{{ : }}
6. Waiting until the job is finished. Notice that when a job is submitted, two files are created: the output file and the error file (the user has defined their names in the job directives). In the output file, the user can visualize some indications that appear while running the job. In the error file, the user can see secondary errors that may not have been enough to cancel the job's run.
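As a quick way to triage that error file once the job ends, a grep-based check can flag whether it contains error messages at all. A minimal sketch, assuming the error file name is the one set in your job script (the name in the usage example is hypothetical):

```shell
#!/bin/bash
# Sketch: report whether a job's error file contains error messages.
# Pass the file name you set in the #SBATCH directives.
scan_errors() {
    local errfile="$1"
    if [ -s "$errfile" ] && grep -qi "error" "$errfile"; then
        echo "possible problems found in $errfile"
    else
        echo "no error messages detected"
    fi
}

# Example: scan_errors UK_2019_123456.err
```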
7. The job is completed. A folder named **//2019//** will be created at the **//...//** folder.
ssh bscXXXXX@nord4.bsc.es
</code>
3. Preparing the job: in this case we will reduce the number of cores and the computational time (so change the queue accordingly), and not require high memory.
<code bash>