multiApply issues
https://earth.bsc.es/gitlab/ces/multiApply/-/issues
2023-09-04T17:13:19+02:00

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/16
The usage of registerDoParallel()
2023-09-04T17:13:19+02:00
aho

In Apply() https://earth.bsc.es/gitlab/ces/multiApply/-/blob/master/R/Apply.R#L717
```r
if (parallel) registerDoParallel(ncores)
```
While in the documentation https://www.rdocumentation.org/packages/doParallel/versions/1.0.17/topics/registerDoParallel
> registerDoParallel(cl, cores=NULL, …)
Should "ncores" be the 2nd input? Then what is `cl`?

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/12
Wrong result or error when the result dimension length of each chunk differs
2023-03-27T15:55:38+02:00
aho

The problem occurs when the outputs of the chunks that Apply() loops over don't share the same dimension length. For example, the input is an array with dimensions [sdate = 10, member = 3] and the target dimension is `sdate`. Apply() applies the function to each member, so there are three chunks. If the three chunks all have output with length = 2, the final array will be [2, member = 3]. However, if the outputs of the 1st and 2nd chunks have length = 2 but the 3rd one has length = 1, Apply() doesn't know how to merge them into one array. It either returns a wrong array with a warning:
```
In arrays_of_results[[component]][(1:prod(component_dims)) + ... :
number of items to replace is not a multiple of replacement length
```
or directly returns an error:
```
Error in arrays_of_results[[component]][(1:prod(component_dims)) + (m - :
replacement has length zero
```
The problem was found in https://earth.bsc.es/gitlab/external/cstools/-/issues/96 and is also mentioned in https://earth.bsc.es/gitlab/ces/multiApply/-/issues/7#note_116762
There are two solutions I can think of:
(1) If the dimension lengths are not the same, return a meaningful error message. It's reasonable that Apply() expects the same length of chunk outputs because we can be sure that the returned array has robust meaning (that is, the index 1 means the first time step, 33% percentile, etc.) and the results of all the chunks are aligned.
(2) Detect the largest length among the chunk outputs, and use NAs to fill the shorter outputs. Taking the example above, the final array will be [2, member = 3] and the value of [1, 3] is NA. A warning also needs to be returned so the user is aware of this situation.
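A minimal sketch of the two options in base R, assuming the chunk outputs are plain vectors collected in a list. `pad_and_merge()` and its `strict` argument are hypothetical names for illustration, not part of multiApply. Padding at the end of the shorter output is an arbitrary choice here; which position should hold the NA is still open.

```r
# Hypothetical helper (not part of multiApply): merge chunk outputs of
# possibly different lengths into one matrix, one column per chunk.
pad_and_merge <- function(chunk_outputs, strict = FALSE) {
  lens <- lengths(chunk_outputs)
  if (length(unique(lens)) > 1) {
    msg <- paste0("Chunk outputs have different lengths: ",
                  paste(lens, collapse = ", "))
    # Option (1): refuse to merge misaligned outputs.
    if (strict) stop(msg)
    # Option (2): warn, then pad the shorter outputs with NA.
    warning(msg, "; padding with NA.")
  }
  max_len <- max(lens)
  padded <- lapply(chunk_outputs, function(x) {
    c(x, rep(NA, max_len - length(x)))  # assumption: pad at the end
  })
  do.call(cbind, padded)
}

chunks <- list(c(1, 2), c(3, 4), 5)  # the 3rd chunk is shorter
merged <- suppressWarnings(pad_and_merge(chunks))
dim(merged)  # 2 3
```

With `strict = TRUE` the same call stops with the length mismatch message instead of padding.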
@nperez What do you think? In my opinion, the 2nd option is more flexible. If it makes sense to you, I can fix the function following (2). Please let me know, thanks!
Best,
An-Chi
FYI @erifarov you may be interested in this issue, too

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/13
MultiApply not working when using package gstat
2023-01-12T18:02:45+01:00
jmateu

Hi @aho,
MultiApply is not working when using the function krige.cv from the package gstat. I've prepared a reproducible example in /esarchive/scratch/jmateu/Codes/geostats/UK_multiapply_test.R. There is a logical variable `force_working` that you can set to true to avoid the krige.cv function and see that the parallelized function is actually working. Also, when setting ncores=1, it works perfectly fine.
I've tried to run it in RStudio, in the terminal, and launching a job... nothing seems to work.
It bothers me because we have already parallelized a script with this krige.cv function, and it works well: https://earth.bsc.es/gitlab/es/universalkriging/-/blob/production/general/cross_UK2.R
I don't know what else I can try, do you have any suggestions?
Thanks so much and happy new year!
Jan
FYI: @mhajji, @nperez, @acriado

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/10
'C_cor' not found when using cor() in Apply
2021-12-15T16:19:22+01:00
Nuria Pérez-Zanón

As reported by @bertvs, Apply returns an error using 'cor' and 'cov' functions.
```
mod <- seq(1, 2 * 3)
obs <- seq(1, 2 * 3)
dim(mod) <- c(dataset = 2, member = 3)
dim(obs) <- c(dataset = 2, member = 3)
outp <- Apply(data = list(x = obs, x = mod),
target_dims = list(x = c("member"), y = c("member")),
fun = cor)
```
`Error in (function (x, y = NULL, use = "everything", method = c("pearson", :
object 'C_cor' not found`
To avoid it there are two options:
1.- Declare the function before using it
```
C_cor <- stats:::C_cor
outp <- Apply(data = list(x = obs, x = mod),
target_dims = list(x = c("member"), y = c("member")),
fun = cor)
```
2.- Define a function using cor:
```
outp <- Apply(data = list(x = obs, y = mod),
target_dims = list(x = c("member"), y = c("member")),
fun = function(x, y){cor(x, y)})
```
This last case is the one we can use in apply:
```
apply(mod, 1, function(x){cor(x, c(1,2,3))})
```
For now, I don't know of any fix I can make in the Apply code, since I think the problem comes from the cor function.
Suggestions and ideas are more than welcome.
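One possible explanation, offered as an assumption rather than a diagnosis: `cor()` refers to `C_cor`, which is not exported from stats, so the lookup only succeeds while `cor` keeps its namespace environment. The base R sketch below replaces that environment by hand to trigger the same kind of failure, and shows why a wrapper like the one in option 2 is unaffected. Whether Apply() does something comparable to `fun` is not confirmed here; `detached_cor` and `wrapper` are names made up for the illustration.

```r
# Copy of cor() whose enclosing environment is replaced by the global one.
# Done by hand for illustration; an assumption, not code taken from Apply().
detached_cor <- stats::cor
environment(detached_cor) <- globalenv()
msg <- tryCatch(detached_cor(1:3, 1:3),
                error = function(e) conditionMessage(e))
print(msg)

# A wrapper only needs to find `cor` itself, which stats exports and which
# still carries its own namespace environment when called.
wrapper <- function(x, y) cor(x, y)
environment(wrapper) <- globalenv()
res <- wrapper(1:3, 1:3)
print(res)
```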
Cheers,
Núria

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/8
Cleaning memory failure on Cran pretests submission
2021-01-21T18:41:26+01:00
Nuria Pérez-Zanón

Hi,
While submitting package CSTools (V2.0.0) to CRAN, the following error was reported:
```
> #Example 2: using CST_RainFARM for a CSTools object with parallel processing,
> #dropping the "realization" dimension to be able to save results using
> #\code{CST_SaveExp}.
> #Load dataset included in CSTools pacakge
> data <- lonlat_prec
> nf <- 8
> # Create a test array of weights
> ww <- array(1., dim = c(dim(data$lon) * nf, dim(data$lat) * nf))
> res <- CST_RainFARM(data, nf, weights=ww, nprocs=2, drop_realization_dim = TRUE)
Warning in RainFARM(data$data, data$lon, data$lat, nf, weights, nens, slope, :
Selected time dim: ftime
Warning: <anonymous>: ... may be used in an incorrect context: '.fun(piece, ...)'
Warning: <anonymous>: ... may be used in an incorrect context: '.fun(piece, ...)'
> dim(res$data)
dataset member sdate ftime lat lon
1 6 3 31 32 32
> # dataset member sdate ftime lat lon
> # 1 6 3 31 32 32
> rm(res)
>
>
>
> base::assign(".dptime", (proc.time() - get(".ptime", pos = "CheckExEnv")), pos = "CheckExEnv")
> base::cat("CST_RainFARM", base::get(".format_ptime", pos = 'CheckExEnv')(get(".dptime", pos = "CheckExEnv")), "\n", file=base::get(".ExTimings", pos = 'CheckExEnv'), append=TRUE, sep="\t")
> cleanEx()
Error: connections left open:
<-CRANwin.fb05.statistik.uni-dortmund.de:11694 (sockconn)
<-CRANwin.fb05.statistik.uni-dortmund.de:11694 (sockconn)
Execution halted
```
The only difference between Example 1 and Example 2 is that Example 2 uses the 'ncores' parameter of Apply.
Could it be related to memory cleanup in the chunking process?
(see the original issue in CSTools [28](https://earth.bsc.es/gitlab/external/cstools/issues/28))
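The `sockconn` entries in the log look like socket connections to parallel workers, which typically stay open until the cluster is stopped. A minimal sketch with the base `parallel` package, assuming (not confirmed for Apply()) that a socket cluster is created when running on several cores. `run_in_cluster()` is a hypothetical function used only to show the cleanup pattern.

```r
library(parallel)

# Hypothetical function: create a socket cluster, use it, and make sure it
# is stopped on exit, even if the computation fails.
run_in_cluster <- function(x, ncores = 2) {
  cl <- makeCluster(ncores)
  on.exit(stopCluster(cl), add = TRUE)
  parLapply(cl, x, function(v) v^2)
}

before <- nrow(showConnections())
res <- run_in_cluster(1:4)
after <- nrow(showConnections())

# No worker connections should remain once the function has returned.
after == before
```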
Núria

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/11
Assessing functions performance by compilation
2020-11-13T17:35:49+01:00
Nuria Pérez-Zanón

I have done some tests using compiled functions, following the issue in s2dv https://earth.bsc.es/gitlab/es/s2dv/-/issues/15 and the JIT compiler package https://earth.bsc.es/gitlab/es/requests/-/issues/1311.
The code is in /esarchive/scratch/nperez/git/Flor/s2dverification_tests/Compiled_test.R
| test |replications |elapsed |relative| user.self| sys.self|
|-------------------|-------------|--------|--------|----------|---------|
|2 Comp_.Regression | 10 | 46.083 | 1.028| 45.998| 0.079|
|3 Comp_Regression | 10 | 44.818 | 1.000| 44.806| 0.008|
|1 Regression | 10 | 44.837 | 1.000| 44.828| 0.003|
For the case in the test, the current version of Regression() in s2dv seems good enough (last line in table).
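For reference, a minimal sketch of this kind of comparison using the base `compiler` package and `system.time()`. The toy function `slow_sum()` is an assumption for illustration and is unrelated to Regression(). Since R 3.4.0 the JIT compiler is enabled by default, so closures are usually byte-compiled after a couple of calls anyway, which may be one reason explicit compilation shows little gain.

```r
library(compiler)

# Toy loop-heavy function standing in for the code under test (hypothetical).
slow_sum <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}

# Explicitly byte-compiled version of the same function.
slow_sum_cmp <- cmpfun(slow_sum)

t_plain <- system.time(r_plain <- slow_sum(1e6))
t_cmp <- system.time(r_cmp <- slow_sum_cmp(1e6))

# Both versions must return the same value; only the timings may differ.
r_plain == r_cmp
rbind(plain = t_plain, compiled = t_cmp)[, c("user.self", "sys.self", "elapsed")]
```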
The compiler could be evaluated in the case of Apply (given the results shown by An-Chi in the requests issue). It would be useful to use profvis() to find the bottlenecks.

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/9
quantile, Apply() incompatibility with R/3.6.1-foss-2015a-bare
2020-07-07T10:56:40+02:00
Andrea

Hello,
I have been testing R version R/3.6.1-foss-2015a-bare.
When using Apply() with the stats::quantile function I get the error below (with R/3.2.0-foss-2015a-bare it worked fine):
```
Error in UseMethod("quantile") :
no applicable method for 'quantile' applied to an object of class "c('matrix', 'double', 'numeric')"
```
Here is some reproducible code:
```
library(multiApply)
library(startR)
ecmwf_path_hc <- paste0('/esarchive/exp/ecmwf/s2s-monthly_ensforhc/weekly_mean/$var$_f24h/$sdate$/$var$_$syear$.nc')
hcst<-Start(dat = ecmwf_path_hc,
var = 'tas',
#sdate = '20170105',#forecast.fulldate,
sdate = "20161222",
syear = 'all',
time ='all',
latitude = indices(1),
longitude = indices(1),
ensemble = 'all',
syear_depends = 'sdate',
return_vars = list(latitude = 'dat',
longitude = 'dat'),
retrieve = T)
terciles_hcst<-Apply(hcst,c('syear','ensemble'),quantile,c(1/3,2/3))[[1]]
```

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/6
Apply() Error: Dimension names of arrays in 'data' must be at least one character long.
2019-10-14T14:44:13+02:00
Andrea

Hi,
When using Apply() with 2 vectors with no named dimensions and specifying dimensions by target_dims=c(1)
I get the following error:
Error in Apply(data = list(RPSS.m.separate, RPSS.m.win.separate), fun = random_walk, :
Dimension names of arrays in 'data' must be at least one character long.
Do the arrays need named dimensions?
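For context, a minimal base R sketch of how a plain vector can be given a named dimension. The name `time` and the variable `rpss` are assumptions for illustration. Whether Apply() then accepts `target_dims = 'time'` for such inputs has not been checked here.

```r
# A plain vector has no 'dim' attribute at all.
rpss <- rnorm(10)
is.null(dim(rpss))

# Assigning a named vector to dim() gives the vector one named dimension.
dim(rpss) <- c(time = length(rpss))
names(dim(rpss))
```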
Thanks,
Andrea

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/5
Apply not finding an object created inside a function
2019-07-16T18:41:40+02:00
Andrea

Hi @nmanubens,
With the new release, a simple function that was previously working fails. Here is the function and reproduction of the error:
```
library(multiApply)
#create input
forecast<-array(dim=c('31','12','4'),rnorm(31*12*4))
names(dim(forecast))<-c('sday','syear','ensemble')
anomaly_simple<-function(data){
avg<-Apply(data,c('syear','ensemble'),mean)[[1]]
anom<-Apply(data,c('sday'),function(x) x-avg)[[1]]
return(anom)
}
anomaly<-anomaly_simple(forecast)
```
I get the following error:
Error in (function (x) : object 'avg' not found
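A possible workaround sketch, under the assumption (not verified against the multiApply code) that the anonymous function is evaluated without access to the environment where `avg` was created. `call_detached()` is a hypothetical stand-in for such a caller, written in base R. Passing `avg` as an explicit argument removes the dependency on the enclosing environment.

```r
# Hypothetical stand-in for a caller that drops a function's enclosing
# environment before invoking it (an assumption, not multiApply's code).
call_detached <- function(fun, x, ...) {
  environment(fun) <- globalenv()
  fun(x, ...)
}

# Relies on the closure to find 'avg': fails once the environment is dropped.
anomaly_closure <- function(x) {
  avg <- mean(x)
  call_detached(function(v) v - avg, x)
}

# Passes 'avg' explicitly: works regardless of the enclosing environment.
anomaly_explicit <- function(x) {
  avg <- mean(x)
  call_detached(function(v, avg) v - avg, x, avg = avg)
}

closure_result <- tryCatch(anomaly_closure(c(1, 2, 3)),
                           error = function(e) conditionMessage(e))
explicit_result <- anomaly_explicit(c(1, 2, 3))
```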
If I run it line by line, it works, and it also worked with previous versions of multiApply.

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/4
Dimension names not propagated to atomic function
2019-01-20T19:16:07+01:00
Andrea

Hi @nmanubens, since the latest release of multiApply, the dimension names are not accessible from within the atomic function.

https://earth.bsc.es/gitlab/ces/multiApply/-/issues/1
Apply() on amdahl
2018-11-20T18:27:08+01:00
ncortesi

It seems this function is properly accelerating computations on gustafson but not on amdahl:
a <- array(10, c(1000,100,100))
system.time(b <- apply(a, c(2,3), mean))
user system elapsed
1.914 27.396 29.339
system.time(b <- Apply(a, c(2,3), "mean", parallel=TRUE))
user system elapsed
1.908 46.335 48.227
Computation times increase instead of decreasing, while on gustafson the same test takes only 10 seconds with Apply(). I didn't test whether this issue also affects Moore.