From d5d4d564e799e0a6bd790a661cc6a535d151f519 Mon Sep 17 00:00:00 2001 From: aho Date: Thu, 21 Dec 2023 15:07:23 +0100 Subject: [PATCH 1/3] faq for Collect() --- inst/doc/faq.md | 44 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 42 insertions(+), 2 deletions(-) diff --git a/inst/doc/faq.md b/inst/doc/faq.md index ffe91a5..7ff7604 100644 --- a/inst/doc/faq.md +++ b/inst/doc/faq.md @@ -31,6 +31,8 @@ This document intends to be the first reference for any doubts that you may have 25. [What to do if your function has too many target dimensions](#25-what-to-do-if-your-function-has-too-many-target-dimensions) 26. [Use merge_across_dims_narm to remove NAs](#26-use-merge_across_dims_narm-to-remove-nas) 27. [Utilize chunk number in the function](#27-utilize-chunk-number-in-the-function) + 28. [Run startR in the background](#28-run-startr-in-the-background) + 29. [Collect result on HPCs](#29-collect-result-on-hpcs) 2. **Something goes wrong...** @@ -1008,6 +1010,38 @@ shows how to get start date for each chunk using chunk number; (2) [ex2_14](inst There are many other possible applications of this parameter. Please share with us other uses cases you may create. +### 28. Run startR in the background + +For heavy execution, we usually launch the jobs on HPCs with parallel computation. Sometimes, it takes a lot of time (days, weeks) to finish all the jobs. +It is much handier to let the jobs run in the background, so we don't need to keep the R session on the workstation open during the whole process. +To do this: + +(1) Use the parameter `wait = FALSE` in the Compute() call. The execution therefore won't block the R session. + +(2) Save the object as a .Rds file with saveRDS(). This file contains all the information needed for collecting the result later. You can close the R session and turn off the workstation now. + +(3) When you want to collect the result, use Collect() with the saved .Rds file. 
+If you use the parameter `wait = TRUE`, the command keeps running until all the jobs are finished and can be collected. +If you use `wait = FALSE`, it tells you whether the jobs are still running, and you can try again later. + +Note that if you use ecFlow as the job manager with Compute(wait = FALSE), the ecFlow-UI won't be updated because the connection is uni-directional. +Check [ecFlow UI remains blue and does not update status](#2-ecflow-ui-remains-blue-and-does-not-update-status) for details. + +### 29. Collect result on HPCs +After using Compute() to run the execution on HPCs, you can collect the result either on the local workstation or on the HPCs. Here are the instructions for doing it on the HPCs. + +(1) Run the startR workflow as usual on the workstation until Compute(). + +(2) In Compute(), use `wait = FALSE`. The execution therefore won't block the R session. + +(3) Save the object as a .Rds file somewhere that can be accessed from the HPCs. E.g. `saveRDS(res, "/esarchive/scratch//res_startR_Collect.rds")` + +(4) ssh to the HPCs (e.g., Nord3) and open an R session. + +(5) Read the saved .Rds file. E.g. `obj_startR <- readRDS("/esarchive/scratch//res_startR_Collect.rds")` + +(6) Collect() the result with parameter `on_remote = TRUE`. E.g. `res <- Collect(obj_startR, on_remote = TRUE)` + # Something goes wrong... @@ -1042,9 +1076,15 @@ To solve this problem, use `Collect()` in the R terminal after running Compute() ### 3. Compute() successfully but then killed on R session -When Compute() on HPCs, the machines are able to process data which are much larger than the local workstation, so the computation works fine (i.e., on ec-Flow UI, the chunks show yellow in the end.) However, after the computation, the output will be sent back to local workstation. **If the returned data is larger than the available local memory space, your R session will be killed.** Therefore, always pre-check if the returned data will fit in your workstation free memory or not. 
If not, subset the input data or reduce the output size through more computation. +When we use Compute() to run jobs on HPCs, each job/chunk saves its result as an individual .Rds file when it finishes. +When all the jobs are finished, the next step is to merge all the chunks into one array and return it to the workstation. +**If the returned data is larger than the available local memory space on your workstation, +your R session will be killed.** Therefore, always pre-check whether the returned data will fit in the free memory of your workstation. + +If the result can fit on the HPCs, you can also choose to collect the data there. Check [How-to-29](#29-collect-result-on-hpcs) for details. -Further explanation: though the complete output (i.e., merging all the chunks into one returned array) cannot be sent back to workstation, but the chunking results (.Rds file) are completed and saved in the directory '/STARTR_CHUNKING_'. If you still want to use the chunking results, you can find them there. +Note that even though the complete output (i.e., merging all the chunks into one returned array) cannot be sent back to the workstation and the R session is killed, +the chunking results (.Rds files) are complete and saved in the local directory '/STARTR_CHUNKING_', so you can still use the chunk files. ### 4. 
My jobs work well in workstation and fatnodes but not on Power9 (or vice versa) -- GitLab From c81fc57121b14c6188f1445b0b3feaa8b522403a Mon Sep 17 00:00:00 2001 From: aho Date: Thu, 21 Dec 2023 15:44:03 +0100 Subject: [PATCH 2/3] version bump --- .Rbuildignore | 2 +- DESCRIPTION | 10 ++++++---- NEWS.md | 6 ++++++ 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/.Rbuildignore b/.Rbuildignore index 98316cc..aa7059a 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -9,7 +9,7 @@ ^inst/doc$ ^\.gitlab-ci\.yml$ ## unit tests should be ignored when building the package for CRAN -#^tests$ +^tests$ ^inst/PlotProfiling\.R$ ^.gitlab$ # Suggested by http://r-pkgs.had.co.nz/package.html diff --git a/DESCRIPTION b/DESCRIPTION index 90b03a7..8fd5ee1 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,14 +1,16 @@ Package: startR Title: Automatically Retrieve Multidimensional Distributed Data Sets -Version: 2.3.0 +Version: 2.3.1 Authors@R: c( person("Nicolau", "Manubens", , "nicolau.manubens@bsc.es", role = c("aut")), - person("An-Chi", "Ho", , "an.ho@bsc.es", role = c("aut", "cre")), + person("An-Chi", "Ho", , "an.ho@bsc.es", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-4182-5258")), person("Nuria", "Perez-Zanon", , "nuria.perez@bsc.es", role = c("aut"), comment = c(ORCID = "0000-0001-8568-3071")), + person("Eva", "Rifa", , "eva.rifarovira@bsc.es", role = "ctb"), + person("Victoria", "Agudetse", , "victoria.agudetse@bsc.es", role = "ctb"), + person("Bruno", "de Paula Kinoshita", , "bruno.depaulakinoshita@bsc.es", role = "ctb"), person("Javier", "Vegas", , "javier.vegas@bsc.es", role = c("ctb")), person("Pierre-Antoine", "Bretonniere", , "pierre-antoine.bretonniere@bsc.es", role = c("ctb")), - person("Roberto", "Serrano", , "rsnotivoli@gmal.com", role = c("ctb")), - person("Eva", "Rifa", , "eva.rifarovira@bsc.es", role = "ctb"), + person("Roberto", "Serrano", , "rsnotivoli@gmail.com", role = c("ctb")), person("BSC-CNS", role = c("aut", "cph"))) Description: Tool 
to automatically fetch, transform and arrange subsets of multi- dimensional data sets (collections of files) stored in local and/or diff --git a/NEWS.md b/NEWS.md index 9219f96..c19d7a3 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,9 @@ +# startR v2.3.1 (Release date: 2023-12-22) +- Use Autosubmit as workflow manager on hub +- New feature: Collect result by Collect() on HPCs +- Bugfix: Correct Collect_autosubmit() .Rds files update +- Bugfix: Collect() correctly recognizes the finished chunks (.Rds files) in the local ecFlow folder. This prevents a never-ending Collect() when using `wait = F` in Compute() and calling Collect() on the result later on + # startR v2.3.0 (Release date: 2023-08-31) - Load variable metadata when retreive = F - Change Compute() "threads_load" to 1 to be consistent with documentation -- GitLab From 254ced13d8e2635351561d67cd679a00607d73c7 Mon Sep 17 00:00:00 2001 From: aho Date: Thu, 21 Dec 2023 16:23:55 +0100 Subject: [PATCH 3/3] fix syntax error --- R/Start.R | 2 +- man/Start.Rd | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/R/Start.R b/R/Start.R index 89f87e9..5bfb3bf 100644 --- a/R/Start.R +++ b/R/Start.R @@ -674,7 +674,7 @@ #' to recognize files such as \cr #' \code{'/path/to/dataset/precipitation_zzz/19901101_yyy_foo.nc'}).\cr\cr #'Note that each glob expression can only represent one possibility (Start() -#'chooses the first). Because /code{*} is not the tag, which means it cannot +#'chooses the first). Because \code{*} is not the tag, which means it cannot #'be a dimension of the output array. Therefore, only one possibility can be #'adopted. 
For example, if \cr #'\code{'/path/to/dataset/precipitation_*/19901101_*_foo.nc'}\cr diff --git a/man/Start.Rd b/man/Start.Rd index 25eb8d7..640c5a9 100644 --- a/man/Start.Rd +++ b/man/Start.Rd @@ -651,7 +651,7 @@ For example, a path pattern could be as follows: \cr to recognize files such as \cr \code{'/path/to/dataset/precipitation_zzz/19901101_yyy_foo.nc'}).\cr\cr Note that each glob expression can only represent one possibility (Start() -chooses the first). Because /code{*} is not the tag, which means it cannot +chooses the first). Because \code{*} is not the tag, which means it cannot be a dimension of the output array. Therefore, only one possibility can be adopted. For example, if \cr \code{'/path/to/dataset/precipitation_*/19901101_*_foo.nc'}\cr -- GitLab