Should Apply() return attributes?
Apply() does not return attributes, even when the input data has attributes and the parameter use_attributes is used. The documentation does not make clear whether the attributes are expected to be returned. The definition of use_attributes is as follows:
#' @param use_attributes List of vectors of character strings with names of attributes of each object in 'data' to be propagated to the subsets of data sent as inputs to the function specified in 'fun'. If this parameter is not specified (NULL), all attributes are dropped. This parameter can be specified as a named list (then the names of this list must match those of the names of parameter 'data'), or as an unnamed list (then the vectors of attribute names will be assigned in order to the input arrays in 'data').
So we know that if use_attributes = NULL, the attributes are not taken by Apply(), so it makes sense that the returned array has no attributes. However, with use_attributes defined, the attributes are still lost.
Here is a minimal example. From the output printed inside func(), we can see that use_attributes works as described, passing the selected attributes through to func(). But none of the results from Apply() have attributes. If we call func() directly, the attributes are preserved. The attributes are lost **after iteration(), an inner function in Apply().
library(multiApply)
arr <- array(1:60, dim = c(sdate = 10, time = 2, region = 3))
attributes(arr)$units <- 'K'
time_attr <- c(paste0(1961:1970, "-11-01 12:00:00"), paste0(1961:1970, "-12-01 12:00:00"))
time_attr <- as.POSIXct(time_attr, tz = 'UTC')
dim(time_attr) <- dim(arr)[1:2]
attributes(arr)$time <- time_attr
func <- function(x) {
  # str() prints the structure itself, so no print() wrapper is needed
  str(x)
  attributes(x)$new_attr <- 'A new attribute!'
  return(x)
}
# res1 and res2 are equivalent; for res3, use_attributes is NULL, so no attributes are passed to func()
res1 <- Apply(list(data = arr), func, target_dims = 'sdate', output_dims = 'sdate', use_attributes = list(data = c('units', 'time')))
res2 <- Apply(arr, func, target_dims = 'sdate', output_dims = 'sdate', use_attributes = list(c('units', 'time')))
res3 <- Apply(arr, func, target_dims = 'sdate', output_dims = 'sdate')
# Call func directly. Attributes are returned
res4 <- func(arr)
There are three types of attributes that could be returned: (1) all the attributes of the input data, (2) the ones listed in use_attributes, and (3) the ones returned by fun, i.e., $new_attr in the example above. (1) and (2) are doable, since the function just needs to attach the original attributes to the returned object; however, (1) doesn't make much sense to me, since not all the attributes are wanted. (3) sounds reasonable, but it is actually difficult, because I don't know how to combine the attributes of all the chunks together.
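To make the difficulty with (3) concrete, here is a minimal sketch of one possible combining rule: keep only the attributes that every chunk returns with identical values. This is not how Apply() works. common_attributes() is a hypothetical helper, and the chunk_id attribute is invented purely for illustration.

```r
# Hypothetical helper (not part of multiApply): given a list holding the
# attribute list of each chunk, keep only the attributes that are present
# in every chunk with identical values.
common_attributes <- function(attr_list) {
  if (length(attr_list) == 0) return(list())
  first <- attr_list[[1]]
  keep <- Filter(function(name) {
    all(vapply(attr_list, function(a) {
      name %in% names(a) && identical(a[[name]], first[[name]])
    }, logical(1)))
  }, names(first))
  first[keep]
}

# Two made-up chunks: 'new_attr' is the same in both, 'chunk_id' differs.
chunk_attrs <- list(
  list(new_attr = "A new attribute!", chunk_id = 1L),
  list(new_attr = "A new attribute!", chunk_id = 2L)
)
combined <- common_attributes(chunk_attrs)
str(combined)
```

Under this rule, any attribute that varies between chunks (for example, a per-chunk subset of time) is silently dropped, which may or may not be what the user wants. That is exactly the part that is hard to generalize.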
As I understand it, @vagudets, you want (3) to facilitate the Compute() case in SUNSET. For "normal" Apply() usage and the startR case, I would say that manually saving the attributes and attaching them to the result should be enough, but I understand that it is difficult to generalize the code this way.
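As a rough sketch of the manual workaround, the snippet below copies selected attributes from the input onto a result that has lost them. restore_attributes() is a hypothetical helper, not part of multiApply, and the array() call is only a stand-in for an Apply() call so that the example runs with base R alone.

```r
# Hypothetical helper (not part of multiApply): copy the attributes named
# in 'which' from 'source' onto 'result'; names missing in 'source' are skipped.
restore_attributes <- function(result, source, which) {
  for (name in intersect(which, names(attributes(source)))) {
    attr(result, name) <- attr(source, name, exact = TRUE)
  }
  result
}

arr <- array(1:60, dim = c(sdate = 10, time = 2, region = 3))
attr(arr, "units") <- "K"

# Stand-in for the output of Apply(): same dimensions, custom attributes gone.
res <- array(as.vector(arr), dim = dim(arr))
is.null(attr(res, "units"))

# Re-attach the attributes we care about after the call.
res <- restore_attributes(res, arr, which = "units")
attr(res, "units")
```

This only works when the attribute is still valid for the result. An attribute such as time, whose dimensions are tied to those of the data, would need to be checked or reshaped if the call changes the output dimensions.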
This is all I have now. @nperez I tag you in case you have some insight about this issue. Please let me know what you think, thanks!
Best,
An-Chi
**To be specific, in iteration(), result is the output of fun applied to data, and it still has attributes. sub_arrays_of_results is the final object returned by iteration(), and it no longer has attributes.
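The snippet below is not the code of iteration(). It is only a base R illustration of one way attributes can disappear in this kind of workflow: when a chunk that has attributes is written into a pre-allocated output array, the output keeps its own attributes and ignores those of the chunk. Whether this is the actual mechanism inside iteration() is an assumption.

```r
# A chunk as it might come back from 'fun', carrying an extra attribute.
chunk <- 1:10
attr(chunk, "new_attr") <- "A new attribute!"

# A pre-allocated output array that the chunks are copied into.
out <- array(NA_integer_, dim = c(sdate = 10, time = 2))
out[, 1] <- chunk

# Only the attributes of 'out' survive; 'new_attr' from the chunk is gone.
names(attributes(out))
```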