This document intends to be the first reference for any doubts that you may have...
11. [Select the longitude/latitude region](#11-select-the-longitudelatitude-region)
12. [What will happen if reorder function is not used](#12-what-will-happen-if-reorder-function-is-not-used)
13. [Load specific grid points data](#13-load-specific-grid-points-data)
14. [Find the error log when jobs are launched on Power9/Nord3](#14-find-the-error-log-when-jobs-are-launched-on-power9nord3)
15. [Specify extra function arguments in the workflow](#15-specify-extra-function-arguments-in-the-workflow)
16. [Use parameter 'return_vars' in Start()](#16-use-parameter-return_vars-in-start)
17. [Use parameter 'split_multiselected_dims' in Start()](#17-use-parameter-split_multiselected_dims-in-start)
...
...
When trying to load both start dates at once using Start(), the order in which t...
To retrieve all the members, use the argument `largest_dims_length = TRUE` in Start(). It makes Start() look for the largest dimension length among all the files. This way, the returned array will have a member dimension of length 51, no matter which start date comes first.
You can find examples to reproduce this behaviour in the use case [ex_1_4](/inst/doc/usecase/ex1_4_variable_nmember.R). See more explanation about `largest_dims_length` at [How-to-21](#21-retrieve-the-complete-data-when-the-dimension-length-varies-among-files).
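As a rough sketch of such a call (the path pattern, variable name, and start dates below are hypothetical; see ex1_4 for the tested versions):

```r
library(startR)

data <- Start(dat = '/path/to/$var$_$sdate$.nc',  # hypothetical path pattern
              var = 'tas',                        # hypothetical variable name
              sdate = c('19991101', '20181101'),
              member = 'all',
              time = 'all',
              # scan every file, not just the first one, to set the 'member' length
              largest_dims_length = TRUE,
              retrieve = TRUE)
```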
### 11. Select the longitude/latitude region
...
...
If the values do not match the defined spatial points in the files, Start() will...
An example of how to load several grid points and how to transform the data can be found in the use case [ex_1_6](/inst/doc/usecase/ex1_6_gridpoint_data.R).
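A minimal sketch of loading a single grid point (the coordinates, path pattern, and variable name are hypothetical; ex1_6 has complete examples, including transformation):

```r
library(startR)

data <- Start(dat = '/path/to/$var$_$sdate$.nc',  # hypothetical path pattern
              var = 'tas',                        # hypothetical variable name
              sdate = '20000101',
              time = 'all',
              latitude = values(40.5),            # one requested point
              longitude = values(350.1),
              latitude_reorder = Sort(),
              longitude_reorder = CircularSort(0, 360),
              retrieve = TRUE)
```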
### 14. Find the error log when jobs are launched on Power9/Nord3
Due to the uni-directional connection configuration, Power9 and Nord3 cannot send the log files back to the workstation.
When Compute() dispatches jobs to Power9 or Nord3, each job in the ecFlow UI shows a 'Z' (zombie) mark beside it, no matter whether the job completed or failed.
The zombie prevents the error log from being shown in the ecFlow UI output frame. Therefore, you need to ssh to the machine, go to the 'temp_dir' listed in the cluster list in Compute(), and enter the job folder.
There you will find another folder with the same name as the previous layer; go down to the innermost folder, where you will see 'Chunk.1.err'.
For example, the path can be: "/gpfs/scratch/bsc32/bsc32734/startR_hpc/STARTR_CHUNKING_1665710775/STARTR_CHUNKING_1665710775/computation/lead_year_CHUNK_1/lon_CHUNK_1/lat_CHUNK_1/sdate_CHUNK_1/var_CHUNK_1/dataset_CHUNK_1/Chunk.1.err".
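If you do not remember the chunk folder structure, one way to locate the log is to search from the top of 'temp_dir' (reusing the example path above; adjust the user and job ID to your own):

```shell
# List every chunk error log under the startR temp directory
find /gpfs/scratch/bsc32/bsc32734/startR_hpc -name 'Chunk.*.err'
```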
### 15. Specify extra function arguments in the workflow
...
...
The input arguments of the function may include not only the data but also extra information.
Such additional arguments should be specified in AddStep(). The following example shows how to assign 'na.rm' in mean().
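A minimal sketch (assuming 'data' comes from a previous Start() call; the function and dimension names are illustrative):

```r
library(startR)

# A simple function with an extra argument besides the data
fun <- function(x, na.rm = FALSE) {
  mean(x, na.rm = na.rm)
}

step <- Step(fun, target_dims = 'time', output_dims = NULL)
# The extra argument is specified in AddStep(), after the data and the step
wf <- AddStep(data, step, na.rm = TRUE)
```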
### 16. Use parameter 'return_vars' in Start()
...
The parameter 'return_vars' is used to request such variables.
This parameter expects to receive a named variable list. The names are the variables to be fetched from the netCDF files, and the corresponding value can be:
(1) NULL, if the variable is common along all the file dimensions (i.e., it will be retrieved only once from the first involved files)
(2) a vector of the file dimension names for which to retrieve the variable
(3) a vector that includes the file dimension for path pattern specification (i.e., 'dat' in the example below)
For the first and second options, the fetched variable values will be saved in *$Variables$common$<variable_name>*.
For the third option, the fetched variable values will be saved in *$Variables$<dataset_name>$<variable_name>*.
...
...
Notice that if the variable is specified by values(), it will be automatically a...
Here is an example showing the above three ways.
In the return_vars list, we request information about three variables: 'time' values differ for each sdate, while longitude and latitude are common variables among all the files.
You can use `str(data)` to see the information structure.
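A minimal sketch of such a call (the path pattern and variable name are hypothetical; only the return_vars part follows the description above):

```r
library(startR)

data <- Start(dat = '/path/to/$var$_$sdate$.nc',  # hypothetical path pattern
              var = 'tas',                        # hypothetical variable name
              sdate = c('20000101', '20010101'),
              time = 'all',
              latitude = 'all',
              longitude = 'all',
              # 'time' differs per sdate; longitude/latitude are common
              return_vars = list(time = 'sdate',
                                 longitude = NULL,
                                 latitude = NULL),
              retrieve = FALSE)
# The fetched values end up under data$Variables$common$<variable_name>
```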