From 49a6ed51ed04378e45f3380e08989c4f272d12e8 Mon Sep 17 00:00:00 2001 From: Eva Rifa Date: Thu, 9 Jun 2022 15:38:31 +0200 Subject: [PATCH 1/7] add space --- inst/doc/practical_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index 378f648..ef1a345 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -4,7 +4,7 @@ This guide includes explanations and practical examples for you to learn how to If you would like to start using startR rightaway on the BSC infrastructure, you can directly go through the "Configuring startR" section, copy/paste the basic startR script example shown at the end of the "Introduction" section onto the text editor of your preference, adjust the paths and user names specified in the `Compute()` call, and run the code in an R session after loading the R and ecFlow modules. -## Index +## Index 1. [**Motivation**](inst/doc/practical_guide.md#motivation) 2. [**Introduction**](inst/doc/practical_guide.md#introduction) -- GitLab From 6a125f06ae59d58c963c9f0f9841c703adaaa3b3 Mon Sep 17 00:00:00 2001 From: Eva Rifa Date: Mon, 13 Jun 2022 12:55:37 +0200 Subject: [PATCH 2/7] Added Nord3-v2 settings to the practical guide --- inst/doc/practical_guide.md | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index ef1a345..226b548 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -1,10 +1,10 @@ # Practical guide for processing large data sets with startR -This guide includes explanations and practical examples for you to learn how to use startR to efficiently process large data sets in parallel on the BSC's HPCs (CTE-Power 9, Marenostrum 4, ...). See the main page of the [**startR**](README.md) project for a general overview of the features of startR, without actual guidance on how to use it. +This guide includes explanations and practical examples for you to learn how to use startR to efficiently process large data sets in parallel on the BSC's HPCs (Nord3v2, CTE-Power 9, Marenostrum 4, ...). See the main page of the [**startR**](README.md) project for a general overview of the features of startR, without actual guidance on how to use it. If you would like to start using startR rightaway on the BSC infrastructure, you can directly go through the "Configuring startR" section, copy/paste the basic startR script example shown at the end of the "Introduction" section onto the text editor of your preference, adjust the paths and user names specified in the `Compute()` call, and run the code in an R session after loading the R and ecFlow modules. -## Index +## Index 1. [**Motivation**](inst/doc/practical_guide.md#motivation) 2. [**Introduction**](inst/doc/practical_guide.md#introduction) @@ -53,7 +53,7 @@ Afterwards, you will need to understand and use five functions, all of them incl - **Compute()**, for specifying the HPC to be employed, the execution parameters (e.g. number of chunks and cores), and to trigger the computation - **Collect()** and the **EC-Flow graphical user interface**, for monitoring of the progress and collection of results -Next, you can see an example startR script performing the ensemble mean of a small data set on CTE-Power9, for you to get a broad picture of how the startR functions interact and the information that is represented in a startR script. 
Note that the `queue_host`, `temp_dir` and `ecflow_suite_dir` parameters in the `Compute()` call are user-specific. +Next, you can see an example startR script performing the ensemble mean of a small data set on an HPC cluster such as Nord3v2 or CTE-Power9, for you to get a broad picture of how the startR functions interact and the information that is represented in a startR script. Note that the `queue_host`, `temp_dir` and `ecflow_suite_dir` parameters in the `Compute()` call are user-specific. ```r library(startR) @@ -121,13 +121,23 @@ Specifically, you need to set up passwordless, userless access from your machine After following these steps for the connections in both directions (although from the HPC to the workstation might not be possible), you are good to go. -Do not forget adding the following lines in your .bashrc on CTE-Power if you are planning to run on CTE-Power: +Do not forget adding the following lines in your .bashrc on the HPC machine. + +If you are planning to run it on CTE-Power: ``` if [[ $BSC_MACHINE == "power" ]] ; then module unuse /apps/modules/modulefiles/applications module use /gpfs/projects/bsc32/software/rhel/7.4/ppc64le/POWER9/modules/all/ fi ``` +If you are on Nord3-v2, then you'll have to add: +``` +if [ $BSC_MACHINE == "nord3v2" ]; then + module purge + module use /gpfs/projects/bsc32/software/suselinux/11/modules/all + module unuse /apps/modules/modulefiles/applications /apps/modules/modulefiles/compilers /apps/modules/modulefiles/tools /apps/modules/modulefiles/libraries /apps/modules/modulefiles/environment +fi +``` You can add the following lines in your .bashrc file on your workstation for convenience: ``` @@ -356,7 +366,7 @@ It is not possible for now to define workflows with more than one step, but this Once the data sources are declared and the workflow is defined, you can proceed to specify the execution parameters (including which platform to run on) and trigger the execution with the `Compute()` function. -Next, a few examples are shown with `Compute()` calls to trigger the processing of a dataset locally (only on the machine where the R session is running) and on two different HPCs (the Earth Sciences fat nodes and CTE-Power9). However, let's first define a `Start()` call that involves a smaller subset of data in order not to make the examples too heavy. +Next, a few examples are shown with `Compute()` calls to trigger the processing of a dataset locally (only on the machine where the R session is running) and different HPCs (the Earth Sciences fat nodes, CTE-Power9 and other HPCs). However, let's first define a `Start()` call that involves a smaller subset of data in order not to make the examples too heavy. ```r library(startR) @@ -669,7 +679,7 @@ As mentioned above in the definition of the `cluster` parameters, it is strongly The `Compute()` call with the parameters to run the example in this section on the BSC ES fat nodes is provided below (you will need to adjust some of the parameters before using it). As you can see, the only thing that needs to be changed to execute startR on a different HPC is the definition of the `cluster` parameters. -The `cluster` configuration for the fat nodes, CTE-Power 9, Marenostrum 4, Nord III, Minotauro and ECMWF cca/ccb are all provided at the very end of this guide. +The `cluster` configuration for the fat nodes, CTE-Power 9, Marenostrum 4, Nord3, Minotauro and ECMWF cca/ccb are all provided at the very end of this guide. 
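Since the `cluster` list is the only part of the call that changes from one machine to another, one possible pattern is to keep it in its own variable and leave the rest of the `Compute()` call untouched. The sketch below is illustrative only: the host name, paths and resources are placeholders to be replaced with the values for your account and the target HPC.

```r
# Illustrative sketch: isolate the machine-specific settings so that
# switching HPCs only means swapping this list. All values below are
# placeholders, not a configuration prescribed by this guide.
my_cluster <- list(queue_host = 'your_hpc_login_node',
                   queue_type = 'slurm',
                   temp_dir = '/path/to/scratch/startR_hpc/',
                   cores_per_job = 4,
                   job_wallclock = '00:10:00',
                   max_jobs = 4,
                   bidirectional = FALSE,
                   polling_period = 10)

res <- Compute(wf,
               chunks = list(latitude = 2,
                             longitude = 2),
               threads_load = 2,
               threads_compute = 4,
               cluster = my_cluster,
               ecflow_suite_dir = '/path/to/startR_local/')
```

The ready-made `cluster` lists at the end of the guide can be dropped into this pattern directly.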
```r res <- Compute(wf, @@ -1062,7 +1072,7 @@ cluster = list(queue_host = 'mn2.bsc.es', ) ``` -### Nord III +### Nord3 ```r cluster = list(queue_host = 'nord1.bsc.es', -- GitLab From 111a22ff83bebbb9987862a0730e7ce2706555de Mon Sep 17 00:00:00 2001 From: Eva Rifa Date: Tue, 14 Jun 2022 14:56:37 +0200 Subject: [PATCH 3/7] Changed queue_host to nord4 login --- inst/doc/practical_guide.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index 226b548..647b708 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -1,6 +1,6 @@ # Practical guide for processing large data sets with startR -This guide includes explanations and practical examples for you to learn how to use startR to efficiently process large data sets in parallel on the BSC's HPCs (Nord3v2, CTE-Power 9, Marenostrum 4, ...). See the main page of the [**startR**](README.md) project for a general overview of the features of startR, without actual guidance on how to use it. +This guide includes explanations and practical examples for you to learn how to use startR to efficiently process large data sets in parallel on the BSC's HPCs (Nord3-v2, CTE-Power 9, Marenostrum 4, ...). See the main page of the [**startR**](README.md) project for a general overview of the features of startR, without actual guidance on how to use it. If you would like to start using startR rightaway on the BSC infrastructure, you can directly go through the "Configuring startR" section, copy/paste the basic startR script example shown at the end of the "Introduction" section onto the text editor of your preference, adjust the paths and user names specified in the `Compute()` call, and run the code in an R session after loading the R and ecFlow modules. @@ -53,7 +53,7 @@ Afterwards, you will need to understand and use five functions, all of them incl - **Compute()**, for specifying the HPC to be employed, the execution parameters (e.g. number of chunks and cores), and to trigger the computation - **Collect()** and the **EC-Flow graphical user interface**, for monitoring of the progress and collection of results -Next, you can see an example startR script performing the ensemble mean of a small data set on an HPC cluster such as Nord3v2 or CTE-Power9, for you to get a broad picture of how the startR functions interact and the information that is represented in a startR script. Note that the `queue_host`, `temp_dir` and `ecflow_suite_dir` parameters in the `Compute()` call are user-specific. +Next, you can see an example startR script performing the ensemble mean of a small data set on an HPC cluster such as Nord3-v2 or CTE-Power9, for you to get a broad picture of how the startR functions interact and the information that is represented in a startR script. Note that the `queue_host`, `temp_dir` and `ecflow_suite_dir` parameters in the `Compute()` call are user-specific. ```r library(startR) @@ -679,7 +679,7 @@ As mentioned above in the definition of the `cluster` parameters, it is strongly The `Compute()` call with the parameters to run the example in this section on the BSC ES fat nodes is provided below (you will need to adjust some of the parameters before using it). As you can see, the only thing that needs to be changed to execute startR on a different HPC is the definition of the `cluster` parameters. 
-The `cluster` configuration for the fat nodes, CTE-Power 9, Marenostrum 4, Nord3, Minotauro and ECMWF cca/ccb are all provided at the very end of this guide. +The `cluster` configuration for the fat nodes, CTE-Power 9, Marenostrum 4, Nord3-v2, Minotauro and ECMWF cca/ccb are all provided at the very end of this guide. ```r res <- Compute(wf, @@ -1072,10 +1072,10 @@ cluster = list(queue_host = 'mn2.bsc.es', ) ``` -### Nord3 +### Nord3-v2 ```r -cluster = list(queue_host = 'nord1.bsc.es', +cluster = list(queue_host = 'nord4.bsc.es', queue_type = 'lsf', data_dir = '/gpfs/projects/bsc32/share/startR_data_repos/', temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', -- GitLab From 372a08c30c3a91e83edbdb168f262b01f2048e97 Mon Sep 17 00:00:00 2001 From: aho Date: Wed, 15 Jun 2022 13:18:15 +0200 Subject: [PATCH 4/7] Correct the cluster config of Nord3v2 and prioritize Nord3v2 in the guideline. --- inst/doc/practical_guide.md | 184 ++++++++++++++++++------------------ 1 file changed, 93 insertions(+), 91 deletions(-) diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index 647b708..a2b12bb 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -366,7 +366,7 @@ It is not possible for now to define workflows with more than one step, but this Once the data sources are declared and the workflow is defined, you can proceed to specify the execution parameters (including which platform to run on) and trigger the execution with the `Compute()` function. -Next, a few examples are shown with `Compute()` calls to trigger the processing of a dataset locally (only on the machine where the R session is running) and different HPCs (the Earth Sciences fat nodes, CTE-Power9 and other HPCs). However, let's first define a `Start()` call that involves a smaller subset of data in order not to make the examples too heavy. +Next, a few examples are shown with `Compute()` calls to trigger the processing of a dataset locally (only on the machine where the R session is running) and different HPCs (Nord3-v2, CTE-Power9 and other HPCs). However, let's first define a `Start()` call that involves a smaller subset of data in order not to make the examples too heavy. ```r library(startR) @@ -561,34 +561,38 @@ res <- Compute(wf, * max: 8.03660178184509 ``` -#### Compute() on CTE-Power 9 +#### Compute() on HPCs -In order to run the computation on a HPC, such as the BSC CTE-Power 9, you will need to make sure the passwordless connection with the login node of that HPC is configured, as shown at the beginning of this guide. If possible, in both directions. Also, you will need to know whether there is a shared file system between your workstation and that HPC, and will need information on the number of nodes, cores per node, threads per core, RAM memory per node, and type of workload used by that HPC (Slurm, PBS and LSF supported). +In order to run the computation on a HPC, you will need to make sure the passwordless connection with the login node of that HPC is configured, as shown at the beginning of this guide. If possible, in both directions. Also, you will need to know whether there is a shared file system between your workstation and that HPC, and will need information on the number of nodes, cores per node, threads per core, RAM memory per node, and type of workload used by that HPC (Slurm, PBS and LSF supported). You will need to add two parameters to your `Compute()` call: `cluster` and `ecflow_suite_dir`. 
The parameter `ecflow_suite_dir` expects a path to a folder in the workstation where to store temporary files generated for the automatic management of the workflow. As you will see later, the EC-Flow workflow manager is used transparently for this purpose. -The parameter `cluster` expects a list with a number of components that will have to be provided a bit differently depending on the HPC you want to run on. You can see next an example cluster configuration that will execute the previously defined workflow on CTE-Power 9. +The parameter `cluster` expects a list with a number of components that will have to be provided a bit differently depending on the HPC you want to run on. You can see next an example cluster configuration that will execute the previously defined workflow on Nord3-v2. ```r -res <- Compute(wf, - chunks = list(latitude = 2, - longitude = 2), - threads_load = 2, - threads_compute = 4, - cluster = list(queue_host = 'p9login1.bsc.es', - queue_type = 'slurm', - temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', - r_module = 'R/3.5.0-foss-2018b', - cores_per_job = 4, - job_wallclock = '00:10:00', - max_jobs = 4, - extra_queue_params = list('#SBATCH --mem-per-cpu=3000'), - bidirectional = FALSE, - polling_period = 10 - ), - ecflow_suite_dir = '/home/Earth/nmanuben/startR_local/' - ) + # user-defined + temp_dir <- '/gpfs/scratch/bsc32/bsc32734/startR_hpc/' + ecflow_suite_dir <- '/home/Earth/aho/startR_local/' + + res <- Compute(wf, + chunks = list(latitude = 2, + longitude = 2), + threads_load = 2, + threads_compute = 4, + cluster = list(queue_host = 'nord4.bsc.es', + queue_type = 'slurm', + temp_dir = temp_dir, + cores_per_job = 4, + job_wallclock = '00:10:00', + max_jobs = 4, + extra_queue_params = list('#SBATCH --mem-per-cpu=3000'), + bidirectional = FALSE, + polling_period = 10 + ), + ecflow_suite_dir = ecflow_suite_dir, + wait = TRUE + ) ``` The cluster components and options are explained next: @@ -619,15 +623,15 @@ server is already started At this point, you may want to check the jobs are being dispatched and executed properly onto the HPC. For that, you can either use the EC-Flow GUI (covered in the next section), or you can `ssh` to the login node of the HPC and check the status of the queue with `squeue` or `qstat`, as shown below. ``` -[bsc32473@p9login1 ~]$ squeue +[bsc32734@login4 ~]$ squeue JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) - 1142418 main /STARTR_ bsc32473 R 0:12 1 p9r3n08 - 1142419 main /STARTR_ bsc32473 R 0:12 1 p9r3n08 - 1142420 main /STARTR_ bsc32473 R 0:12 1 p9r3n08 - 1142421 main /STARTR_ bsc32473 R 0:12 1 p9r3n08 + 757026 main /STARTR_ bsc32734 R 0:46 1 s02r2b24 + 757027 main /STARTR_ bsc32734 R 0:46 1 s04r1b61 + 757028 main /STARTR_ bsc32734 R 0:46 1 s04r1b63 + 757029 main /STARTR_ bsc32734 R 0:46 1 s04r1b64 ``` -Here the output of the execution on CTE-Power 9 after waiting for about a minute: +Here the output of the execution after waiting for about a minute: ```r * Remaining time estimate (neglecting queue and merge time) (at * 2019-01-28 01:16:59): 0 mins (46.22883 secs per chunk) @@ -675,55 +679,33 @@ Usually, in use cases with larger data inputs, it will be preferrable to add the As mentioned above in the definition of the `cluster` parameters, it is strongly recommended to check the section on "How to choose the number of chunks, jobs and cores". 
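As a quick, worked illustration of how those settings relate to each other, the numbers below are simply the ones used in the example above; they are not a general recommendation, since the right values depend on your data size and the queue limits of the machine.

```r
# Worked illustration with the values from the example above (not a
# recommendation): how many pieces of work are created and how many
# of them can run at the same time.
chunks <- list(latitude = 2, longitude = 2)
n_chunks <- prod(unlist(chunks))         # 2 x 2 = 4 independent chunks
max_jobs <- 4                            # at most 4 jobs queued/running at once
cores_per_job <- 4                       # cores requested for each job
threads_compute <- 4                     # kept equal to cores_per_job in the example
n_waves <- ceiling(n_chunks / max_jobs)  # here all 4 chunks fit in a single wave
c(n_chunks = n_chunks, n_waves = n_waves)
```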
-#### Compute() on the fat nodes and other HPCs - -The `Compute()` call with the parameters to run the example in this section on the BSC ES fat nodes is provided below (you will need to adjust some of the parameters before using it). As you can see, the only thing that needs to be changed to execute startR on a different HPC is the definition of the `cluster` parameters. +You can find the `cluster` configuration for other HPCs at the end of this guide [Compute() cluster templates](#compute-cluster-templates) -The `cluster` configuration for the fat nodes, CTE-Power 9, Marenostrum 4, Nord3-v2, Minotauro and ECMWF cca/ccb are all provided at the very end of this guide. - -```r -res <- Compute(wf, - chunks = list(latitude = 2, - longitude = 2), - threads_load = 2, - threads_compute = 4, - cluster = list(queue_host = 'bsceslogin01.bsc.es', - queue_type = 'slurm', - temp_dir = '/home/Earth/nmanuben/startR_hpc/', - cores_per_job = 2, - job_wallclock = '00:10:00', - max_jobs = 4, - bidirectional = TRUE - ), - ecflow_suite_dir = '/home/Earth/nmanuben/startR_local/') -``` ### Collect() and the EC-Flow GUI Usually, in use cases where large data inputs are involved, it is convenient to add the parameter `wait = FALSE` to your `Compute()` call. With this parameter, `Compute()` will immediately return an object with information about your startR execution. You will be able to store this object onto disk. After doing that, you will not need to worry in case your workstation turns off in the middle of the computation. You will be able to close your R session, and collect the results later on with the `Collect()` function. ```r -res <- Compute(wf, - chunks = list(latitude = 2, - longitude = 2), - threads_load = 2, - threads_compute = 4, - cluster = list(queue_host = 'p9login1.bsc.es', - queue_type = 'slurm', - temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', - r_module = 'R/3.5.0-foss-2018b', - cores_per_job = 4, - job_wallclock = '00:10:00', - max_jobs = 4, - extra_queue_params = list('#SBATCH --mem-per-cpu=3000'), - bidirectional = FALSE, - polling_period = 10 - ), - ecflow_suite_dir = '/home/Earth/nmanuben/startR_local/', - wait = FALSE - ) - -saveRDS(res, file = 'test_collect.Rds') + res <- Compute(wf, + chunks = list(latitude = 2, + longitude = 2), + threads_load = 2, + threads_compute = 4, + cluster = list(queue_host = 'nord4.bsc.es', + queue_type = 'slurm', + temp_dir = '/gpfs/scratch/bsc32/bsc32734/startR_hpc/', + cores_per_job = 4, + job_wallclock = '00:10:00', + max_jobs = 4, + extra_queue_params = list('#SBATCH --mem-per-cpu=3000'), + bidirectional = FALSE, + polling_period = 10 + ), + ecflow_suite_dir = '/home/Earth/aho/startR_local/', + wait = FALSE + ) + saveRDS(res, file = 'test_collect.Rds') ``` At this point, after storing the descriptor of the execution and before calling `Collect()`, you may want to visually check the status of the execution. You can do that with the EC-Flow graphical user interface. 
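For later reference, once the execution has finished, gathering the results from a fresh R session might look like the minimal sketch below. It assumes the `test_collect.Rds` descriptor saved above; `wait = TRUE` makes `Collect()` block until all chunks are available.

```r
# Minimal sketch: recover the execution descriptor saved earlier and
# gather the merged results (assumes the jobs have finished, or will
# finish while Collect() waits).
library(startR)
res_descriptor <- readRDS('test_collect.Rds')
result <- Collect(res_descriptor, wait = TRUE)
dim(result$output1)   # the merged output array, as with Compute()
```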
You need to open a new terminal, load the EC-Flow module if needed, and start the GUI: @@ -1027,13 +1009,52 @@ r <- Compute(wf, ## Compute() cluster templates +### Nord3-v2 + +```r + res <- Compute(wf, + chunks = list(latitude = 2, + longitude = 2), + threads_load = 2, + threads_compute = 4, + cluster = list(queue_host = 'nord4.bsc.es', + queue_type = 'slurm', + temp_dir = '/gpfs/scratch/bsc32/bsc32734/startR_hpc/', + cores_per_job = 2, + job_wallclock = '01:00:00', + max_jobs = 4, + bidirectional = FALSE, + polling_period = 10 + ), + ecflow_suite_dir = '/home/Earth/aho/startR_local/', + wait = TRUE + ) +``` + +### Nord3 (deprecated) + +```r +cluster = list(queue_host = 'nord1.bsc.es', + queue_type = 'lsf', + data_dir = '/gpfs/projects/bsc32/share/startR_data_repos/', + temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', + init_commands = list('module load intel/16.0.1'), + cores_per_job = 2, + job_wallclock = '00:10', + max_jobs = 4, + extra_queue_params = list('#BSUB -q bsc_es'), + bidirectional = FALSE, + polling_period = 10, + special_setup = 'marenostrum4' + ) +``` + ### CTE-Power9 ```r cluster = list(queue_host = 'p9login1.bsc.es', queue_type = 'slurm', temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', - r_module = 'R/3.5.0-foss-2018b', cores_per_job = 4, job_wallclock = '00:10:00', max_jobs = 4, @@ -1042,7 +1063,7 @@ cluster = list(queue_host = 'p9login1.bsc.es', ) ``` -### BSC ES fat nodes +### BSC ES fat nodes (deprecated) ```r cluster = list(queue_host = 'bsceslogin01.bsc.es', @@ -1072,25 +1093,6 @@ cluster = list(queue_host = 'mn2.bsc.es', ) ``` -### Nord3-v2 - -```r -cluster = list(queue_host = 'nord4.bsc.es', - queue_type = 'lsf', - data_dir = '/gpfs/projects/bsc32/share/startR_data_repos/', - temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', - init_commands = list('module load intel/16.0.1'), - r_module = 'R/3.3.0', - cores_per_job = 2, - job_wallclock = '00:10', - max_jobs = 4, - extra_queue_params = list('#BSUB -q bsc_es'), - bidirectional = FALSE, - polling_period = 10, - special_setup = 'marenostrum4' - ) -``` - ### MinoTauro ```r -- GitLab From b361b861134c3c098e4f2aee37663cc668e072db Mon Sep 17 00:00:00 2001 From: aho Date: Wed, 15 Jun 2022 14:34:04 +0200 Subject: [PATCH 5/7] Revise nord3v2 cluster template --- inst/doc/practical_guide.md | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index a2b12bb..45897ee 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -1012,23 +1012,15 @@ r <- Compute(wf, ### Nord3-v2 ```r - res <- Compute(wf, - chunks = list(latitude = 2, - longitude = 2), - threads_load = 2, - threads_compute = 4, - cluster = list(queue_host = 'nord4.bsc.es', - queue_type = 'slurm', - temp_dir = '/gpfs/scratch/bsc32/bsc32734/startR_hpc/', - cores_per_job = 2, - job_wallclock = '01:00:00', - max_jobs = 4, - bidirectional = FALSE, - polling_period = 10 - ), - ecflow_suite_dir = '/home/Earth/aho/startR_local/', - wait = TRUE - ) +cluster = list(queue_host = 'nord4.bsc.es', + queue_type = 'slurm', + temp_dir = '/gpfs/scratch/bsc32/bsc32734/startR_hpc/', + cores_per_job = 2, + job_wallclock = '01:00:00', + max_jobs = 4, + bidirectional = FALSE, + polling_period = 10 + ) ``` ### Nord3 (deprecated) -- GitLab From 9c8f313a205e06b7a718768e999dc4387a0ebd76 Mon Sep 17 00:00:00 2001 From: aho Date: Wed, 15 Jun 2022 14:40:49 +0200 Subject: [PATCH 6/7] Change Power9 to Nord3v2 --- inst/doc/practical_guide.md | 34 
++++++++++++++++++---------------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index 45897ee..73e1ead 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -79,22 +79,24 @@ step <- Step(fun, wf <- AddStep(data, step) -res <- Compute(wf, - chunks = list(latitude = 2, - longitude = 2), - threads_load = 2, - threads_compute = 4, - cluster = list(queue_host = 'p9login1.bsc.es', - queue_type = 'slurm', - temp_dir = '/gpfs/scratch/bsc32/bsc32473/startR_hpc/', - r_module = 'R/3.5.0-foss-2018b', - job_wallclock = '00:10:00', - cores_per_job = 4, - max_jobs = 4, - bidirectional = FALSE, - polling_period = 10 - ), - ecflow_suite_dir = '/home/Earth/nmanuben/startR_local/') + res <- Compute(wf, + chunks = list(latitude = 2, + longitude = 2), + threads_load = 2, + threads_compute = 4, + cluster = list(queue_host = 'nord4.bsc.es', + queue_type = 'slurm', + temp_dir = '/gpfs/scratch/bsc32/bsc32734/startR_hpc/', + cores_per_job = 4, + job_wallclock = '00:10:00', + max_jobs = 4, + extra_queue_params = list('#SBATCH --mem-per-cpu=3000'), + bidirectional = FALSE, + polling_period = 10 + ), + ecflow_suite_dir = '/home/Earth/aho/startR_local/', + wait = TRUE + ) ``` ## Configuring startR -- GitLab From 37886ebf560057c760bb75c56b54f6be3997a7bb Mon Sep 17 00:00:00 2001 From: Eva Rifa Date: Thu, 16 Jun 2022 16:35:11 +0200 Subject: [PATCH 7/7] Add special_setup to cluster list --- R/Compute.R | 2 +- inst/doc/practical_guide.md | 25 +++++++++++-------------- 2 files changed, 12 insertions(+), 15 deletions(-) diff --git a/R/Compute.R b/R/Compute.R index 0aa9424..1450b01 100644 --- a/R/Compute.R +++ b/R/Compute.R @@ -25,7 +25,7 @@ #' to use for the computation. The default value is 1. #'@param cluster A list of components that define the configuration of the #' machine to be run on. The comoponents vary from the different machines. -#' Check \href{https://earth.bsc.es/gitlab/es/startR/}{startR GitLab} for more +#' Check \href{https://earth.bsc.es/gitlab/es/startR/-/blob/master/inst/doc/practical_guide.md}{Practical guide on GitLab} for more #' details and examples. Only needed when the computation is not run locally. #' The default value is NULL. #'@param ecflow_suite_dir A character string indicating the path to a folder in diff --git a/inst/doc/practical_guide.md b/inst/doc/practical_guide.md index 73e1ead..c56fc0b 100644 --- a/inst/doc/practical_guide.md +++ b/inst/doc/practical_guide.md @@ -125,14 +125,7 @@ After following these steps for the connections in both directions (although fro Do not forget adding the following lines in your .bashrc on the HPC machine. 
-If you are planning to run it on CTE-Power:
-```
-if [[ $BSC_MACHINE == "power" ]] ; then
-  module unuse /apps/modules/modulefiles/applications
-  module use /gpfs/projects/bsc32/software/rhel/7.4/ppc64le/POWER9/modules/all/
-fi
-```
-If you are on Nord3-v2, then you'll have to add:
+If you are planning to run it on Nord3-v2, you will have to add:
 ```
 if [ $BSC_MACHINE == "nord3v2" ]; then
   module purge
   module use /gpfs/projects/bsc32/software/suselinux/11/modules/all
   module unuse /apps/modules/modulefiles/applications /apps/modules/modulefiles/compilers /apps/modules/modulefiles/tools /apps/modules/modulefiles/libraries /apps/modules/modulefiles/environment
 fi
 ```
+If you are using CTE-Power:
+```
+if [[ $BSC_MACHINE == "power" ]] ; then
+  module unuse /apps/modules/modulefiles/applications
+  module use /gpfs/projects/bsc32/software/rhel/7.4/ppc64le/POWER9/modules/all/
+fi
+```
 
 You can add the following lines in your .bashrc file on your workstation for convenience:
 
 ```
@@ -585,6 +585,7 @@ The parameter `cluster` expects a list with a number of components that will hav
 cluster = list(queue_host = 'nord4.bsc.es',
                queue_type = 'slurm',
                temp_dir = temp_dir,
+               r_module = 'R/4.1.2-foss-2019b',
                cores_per_job = 4,
                job_wallclock = '00:10:00',
                max_jobs = 4,
@@ -609,6 +610,7 @@ The cluster components and options are explained next:
 - `extra_queue_params`: list of character strings with additional queue headers for the jobs to be submitted to the HPC. Mainly used to specify the amount of memory to book for each job (e.g. '#SBATCH --mem-per-cpu=30000'), to request special queuing (e.g. '#SBATCH --qos=bsc_es'), or to request use of specific software (e.g. '#SBATCH --reservation=test-rhel-7.5').
 - `bidirectional`: whether the connection between the R workstation and the HPC login node is bidirectional (TRUE) or unidirectional from the workstation to the login node (FALSE).
 - `polling_period`: when the connection is unidirectional, the workstation will ask the HPC login node for results every `polling_period` seconds. An excessively small value can overload the login node or result in temporary banning.
+- `special_setup`: name of the machine if the computation requires a special setup. Only Marenostrum 4 needs this parameter (e.g. special_setup = 'marenostrum4').
 
 After the `Compute()` call is executed, an EC-Flow server is automatically started on your workstation, which will orchestrate the work and dispatch jobs onto the HPC. Thanks to the use of EC-Flow, you will also be able to monitor visually the progress of the execution. See the "Collect and the EC-Flow GUI" section.
 
@@ -903,10 +905,6 @@ res <- Compute(step, list(system4, erai), wait = FALSE)
 
 ```
 
-### Example of computation of weekly means
-
-### Example with data on an irregular grid with selection of a region
-
 ### Example on MareNostrum 4
 
 ```r
@@ -1038,8 +1036,7 @@
                max_jobs = 4,
                extra_queue_params = list('#BSUB -q bsc_es'),
                bidirectional = FALSE,
-               polling_period = 10,
-               special_setup = 'marenostrum4'
+               polling_period = 10
                )
--
GitLab
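For quick reference, a complete Nord3-v2 `cluster` list combining the options discussed above might look like the sketch below. All values are taken from the examples in this guide; paths, module version and resources should be adapted to your own account, and `special_setup` is omitted because only Marenostrum 4 needs it.

```r
# Sketch of a Nord3-v2 cluster configuration assembled from the values
# used throughout this guide; adjust paths, module and resources to
# your own account before passing it to Compute().
cluster <- list(queue_host = 'nord4.bsc.es',
                queue_type = 'slurm',
                temp_dir = '/gpfs/scratch/bsc32/bsc32734/startR_hpc/',
                r_module = 'R/4.1.2-foss-2019b',
                cores_per_job = 4,
                job_wallclock = '00:10:00',
                max_jobs = 4,
                extra_queue_params = list('#SBATCH --mem-per-cpu=3000'),
                bidirectional = FALSE,
                polling_period = 10)
```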