This shows you the differences between two versions of the page.
| Both sides previous revision Previous revision Next revision | Previous revision | ||
|
library:computing:ifs_impi_troubles [2017/08/11 14:30] mcastril [IFS @ MN4] |
library:computing:ifs_impi_troubles [2017/08/24 13:56] (current) mcastril |
||
|---|---|---|---|
| Line 2: | Line 2: | ||
| ===== IFS @ MN4 ===== | ===== IFS @ MN4 ===== | ||
| + | |||
| + | ==== Issue 1: IFS memory corrupted when activating AVX512 ==== | ||
| **Environment: | **Environment: | ||
| Line 7: | Line 9: | ||
| * Intel 2017.4 & Intel MPI 2017.4 | * Intel 2017.4 & Intel MPI 2017.4 | ||
| - | The problem was reported when using -O2 & -O3 optimization flags in conjunction with activation of [[http:// | + | The problem was reported when using -O2 & -O3 optimization flags in conjunction with activation of [[http:// |
| - | **Problem: | + | **Problem: |
| < | < | ||
| Line 43: | Line 45: | ||
| </ | </ | ||
| - | The error was happening at different steps or processes depending on the run. | + | (The error was happening at different steps or in different |
| - | However, if the model is run using __-O2__ (and also -xCORE-AVX512) in the compilation, | + | However, if the model is run using __-O2__ (and also -xCORE-AVX512) in the compilation, |
| < | < | ||
| Line 74: | Line 76: | ||
| </ | </ | ||
| - | **Actions taken:** We first debugged the regions of the code referred to in the error trace using Allinea DDT and though it is difficult to debug using -O2 or -O3 (because variables are optimized away and their values hidden, and code lines do not always correspond to the ones actually being executed), we could see where the problem | + | **Actions taken:** We first debugged the regions of the code referred to in the error trace by using Allinea DDT and, though it is difficult to debug using -O2 or -O3 (because variables are optimized away and their values hidden, and code lines do not always correspond to the ones actually being executed), we could see where the problem |
| - | The part of the code that was exiting in the O3 mode is in the ludcmp routine: | + | The part of the code that was exiting in the -O3 mode was in the //ludcmp// routine: |
| < | < | ||
| Line 87: | Line 89: | ||
| </ | </ | ||
| - | Basically it is checking | + | Basically it is checking |
| - | The part of the code that crashes in the O2 mode is in the surfexcdriver_ctl_mod module: | + | The part of the code that crashes in the -O2 mode is in the //surfexcdriver_ctl_mod// module: | ||
| < | < | ||
| Line 102: | Line 104: | ||
| </ | </ | ||
| - | Here the code is assigning a weighted value to PSSRFLTI, | + | Here the code is assigning a weighted value to PSSRFLTI, |
| - | The fact that using O2 provoked an "array index out of bounds" | + | The fact that using -O2 provoked an "array index out of bounds" |
| - | In order to have more information | + | In order to have more information |
| - | **Diagnosis: | + | **Diagnosis: |
| - | **Solution: | + | **Solution: |
| + | |||
| + | The first fix is to use [[https:// | ||
| + | |||
| + | < | ||
| + | !DIR$ NOVECTOR | ||
| + | DO JTILE=1, | ||
| + | !DIR$ NOVECTOR | ||
| + | DO JL=KIDIA, | ||
| + | ! Disaggregate solar flux but limit to 700 W/m2 (due to inconsistency | ||
| + | ! with albedo) | ||
| + | PSSRFLTI(JL, | ||
| + | & (1.0_JPRB-ZALB(JL)))*PSSRFL(JL) | ||
| + | IF (PSSRFLTI(JL, | ||
| + | LLHISSR(JL)=.TRUE. | ||
| + | PSSRFLTI(JL, | ||
| + | ENDIF | ||
| + | </ | ||
| + | |||
| + | |||
| + | The second is to move the conditional IF out of the loop and use two independent loops instead: | ||
| + | |||
| + | < | ||
| + | DO JTILE=1, | ||
| + | DO JL=KIDIA, | ||
| + | IF (LLHISSR(JL)) THEN | ||
| + | PSSRFLTI(JL, | ||
| + | ENDIF | ||
| + | ENDDO | ||
| + | |||
| + | DO JL=KIDIA, | ||
| + | ZSRFD(JL)=PSSRFLTI(JL, | ||
| + | ENDDO | ||
| + | </ | ||
| + | |||
| + | We could see in the vectorization report that, because some of the other loops in the same function were merged for optimization, | ||
| + | |||
| + | Both fixes work with both __-O2__ and __-O3__, so the matrix is no longer detected as singular in //ludcmp//. | ||
| **More information: | **More information: | ||
| + | |||
| + | Intel® AVX-512 Instructions introduction: | ||
| + | |||
| + | [[https:// | ||
| + | |||
| + | Compiling for the Intel® Xeon Phi™ Processor and the Intel® Advanced Vector Extensions 512 ISA: | ||
| + | |||
| + | [[https:// | ||
| + | |||
| + | Quick Reference Guide to Optimization with Intel® C++ and Fortran Compilers v16: | ||
| + | |||
| + | [[https:// | ||
| + | |||
| + | Vectorization and Optimization Reports: | ||
| + | |||
| + | [[https:// | ||
| + | |||
| + | Generating a Vectorization Report: | ||
| + | |||
| + | [[https:// | ||
| + | |||
| + | General Compiler Directives: | ||
| + | |||
| + | [[https:// | ||
| + | |||
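
The second fix listed under **Solution** (taking the conditional out of the vectorized loop and using two separate loops) is a form of loop fission. The following is only a minimal, self-contained sketch of that general idea and not the actual IFS change: every name (''tile_flux'', ''tile_albedo'', ''mean_albedo'', ''net_flux'', ''capped''), the array sizes, the input values and the arithmetic itself are hypothetical stand-ins. Only the 700 W/m2 limit is taken from the comment in the original loop.

```fortran
program loop_fission_sketch
  implicit none
  integer, parameter :: wp = selected_real_kind(13, 300)  ! double precision kind
  integer, parameter :: npoints = 4, ntiles = 2           ! hypothetical sizes
  real(wp), parameter :: flux_cap = 700.0_wp              ! W/m2 limit quoted in the original loop comment

  real(wp) :: net_flux(npoints), mean_albedo(npoints)
  real(wp) :: tile_albedo(npoints, ntiles), tile_flux(npoints, ntiles)
  logical  :: capped(npoints)
  integer  :: jl, jtile

  ! Made-up input values, only there to exercise the loops
  net_flux          = [100.0_wp, 400.0_wp, 650.0_wp, 690.0_wp]
  mean_albedo       = 0.30_wp
  tile_albedo(:, 1) = 0.10_wp
  tile_albedo(:, 2) = 0.50_wp
  capped            = .false.

  do jtile = 1, ntiles
    ! Loop 1: branch-free arithmetic (placeholder formula, not the IFS one)
    do jl = 1, npoints
      tile_flux(jl, jtile) = ((1.0_wp - tile_albedo(jl, jtile)) &
        &                    / (1.0_wp - mean_albedo(jl))) * net_flux(jl)
    end do

    ! Loop 2: the conditional clamp, kept apart from the arithmetic
    do jl = 1, npoints
      if (tile_flux(jl, jtile) > flux_cap) then
        capped(jl)           = .true.
        tile_flux(jl, jtile) = flux_cap
      end if
    end do
  end do

  print '(4f10.2)', tile_flux   ! one output line per tile
  print *, capped
end program loop_fission_sketch
```

Splitting the work this way gives the compiler a branch-free loop and a separate loop holding the conditional store. Whether it actually avoids a given miscompilation depends on the compiler version and flags, so the vectorization report should be checked after any such change.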