Performance of CST_Anomaly() with large datasets

Hi @erifarov and @aho

I had to run the Verification Suite with global data on Nord3v2 and noticed that CST_Anomaly() was very slow, particularly when the anomalies are computed in cross-validation: it took several hours to finish with 1.7 GB of hindcast data, even though I requested multiple cores.

I then noticed that the parameter ncores is not passed in the calls to s2dv::Ano_CrossValid() and s2dv::Clim() inside CST_Anomaly(). I have added it to both calls in the branch dev-CST_Anomaly-ncores.

I have run a simple test with this sample data and 12 cores on the medmem nodes in Nord3v2:

obs_array <- rnorm(9383040)
dim(obs_array) <- c(sdate = 24, ftime = 6, lat = 181, lon = 360, ensemble = 1)
exp_array <- rnorm(9383040*25)
dim(exp_array) <- c(sdate = 24, ftime = 6, lat = 181, lon = 360, ensemble = 25)

You can find the full script in my personal GitLab.

For cross = F, we have:

master branch: "Time difference of 3.49159 mins"
new fix: "Time difference of 43.56318 secs"

For cross = T:

master branch: at least 2 hours (the run did not finish before my session ended)
new fix: "Time difference of 10.94001 mins"

While this already improves the situation, I am still surprised by the huge difference between cross = T and cross = F. Do you think it would be worth investigating whether the performance of Ano_CrossValid() can be improved?
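
As a rough illustration of why there might be room for improvement, here is a minimal base-R sketch of leave-one-out anomalies along the start-date dimension. It is not the s2dv implementation: loo_loop() and loo_closed() are hypothetical helper names, the data is a toy matrix (sdate x grid points), the member dimension is ignored, and NAs are not handled. It only shows that the climatology without one start date can be obtained from the full sum, without recomputing the mean for every start date left out.

```r
# Toy data: 24 start dates (rows) x 10 grid points (columns).
set.seed(1)
x <- matrix(rnorm(24 * 10), nrow = 24)

# Naive version: recompute the climatology once per start date left out.
loo_loop <- function(x) {
  out <- x
  for (i in seq_len(nrow(x))) {
    out[i, ] <- x[i, ] - colMeans(x[-i, , drop = FALSE])
  }
  out
}

# Closed form: the mean without row i is (column sum - x[i, ]) / (n - 1).
loo_closed <- function(x) {
  n <- nrow(x)
  total <- colSums(x)
  # Repeat the column sums on every row so they line up with x.
  total_mat <- matrix(total, nrow = n, ncol = ncol(x), byrow = TRUE)
  x - (total_mat - x) / (n - 1)
}

# Both versions should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(loo_loop(x), loo_closed(x))))
```

Whether something along these lines would fit Ano_CrossValid() depends on how it treats members and missing values, so this is only a starting point for discussion.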

Thanks,

Victòria