# How to use the Computational Kernel Analysis Tool

I've tried to make the tool as simple as possible to use on Marenostrum 3, but it definitely won't work anywhere else without a good number of modifications.

Here are the steps to follow to evaluate a kernel:

1. Clone the repository to your local workstation:
```
git clone https://earth.bsc.es/gitlab/otinto/ComputationalKernelAnalysisTool.git ComputationalKernelAnalysisTool
```

2. Pack the folder into a tar archive and upload it to Marenostrum 3 (replace ``user`` with your own username):
```
tar -cf ComputationalKernelAnalysisTool.tar ComputationalKernelAnalysisTool
scp ComputationalKernelAnalysisTool.tar user@mn1.bsc.es:~
```

3. Connect to Marenostrum 3, extract the archive and enter the folder:
```
ssh user@mn1.bsc.es
cd ~
tar -xf ComputationalKernelAnalysisTool.tar
cd ComputationalKernelAnalysisTool
```

4. Optionally, edit ``src/kernels.f90`` and write the kernel that you want to analyze on the line marked with the comment ``! Line (do not remove this comment )``.
   You can use the arrays A, B, C, D, the parameters pa, pb, pc, pd and the loop index jj (in the listing below, ii only counts repetitions of the inner loop). Be careful: the file has to compile, otherwise the tests will not work.
```
module kernels
  ! Double Precision -----------------------------------------------------
  INTEGER,PARAMETER :: wp = selected_real_kind(12, 307)
contains
  subroutine kernel(A,B,C,D,pa,pb,pc,pd,asiz)
    integer :: ii,jj
    integer :: asiz
    real(wp) , allocatable, dimension(:) :: A,B,C,D ! Arrays
    real(wp) :: pa,pb,pc,pd ! Parameters
    do ii=1,100
      !dir$ vector aligned
      do jj=1,asiz
        !Simple kernel
        a(jj) = pa*(pb+b(jj))+pc+pd*c(jj) ! Line (do not remove this comment )
      end do
    end do
  end subroutine
end module
```
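
Before launching a test, it may be worth confirming that the marker comment survived your edit. The following is only a sketch, not part of the tool: the ``check_kernel`` function is a made-up helper, and it assumes that the marker should appear exactly once in the file, as it does in the listing above. It does not check that the file compiles.

```shell
# Hypothetical helper (not shipped with the tool): verify that the kernels
# file exists and still contains the marker comment exactly once.
check_kernel() {
    local file="${1:-src/kernels.f90}"
    local marker='! Line (do not remove this comment )'

    # The file must exist.
    if [ ! -f "$file" ]; then
        echo "missing file: $file"
        return 1
    fi

    # Count the lines that carry the marker (fixed-string match).
    local count
    count=$(grep -cF -- "$marker" "$file")
    if [ "$count" -ne 1 ]; then
        echo "expected 1 marker line in $file, found $count"
        return 1
    fi

    echo "ok: $file"
}

# Usage, from the repository root:
#   check_kernel src/kernels.f90
```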

5. Run one of the tests:

   1. ``SingleCoreTest.sh`` runs the kernel on a **single core** for different array sizes and shows how close your kernel gets to the theoretical peak performance.
   2. ``FullNodeTest.sh`` runs the kernel on a **full node** for different array sizes and different numbers of cores (from 1 to 16) to show how memory contention affects the computational kernel.
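
To make "how close to the theoretical peak" concrete, here is a rough sketch of the arithmetic for the sample kernel from step 4. It is not taken from the test scripts, which may compute things differently. The sample kernel line performs 5 floating-point operations per element (3 additions and 2 multiplications), and the two loops execute it 100 × asiz times. The ``peak_fraction`` function, the timing in the usage line and the default peak of 20.8 GFLOP/s per core (2.6 GHz × 8 double-precision operations per cycle) are all assumptions for illustration; check the real value for your machine.

```shell
# Hypothetical helper: achieved GFLOP/s and percentage of peak for the
# sample kernel.
#   $1 = array size (asiz)
#   $2 = measured time in seconds
#   $3 = assumed peak in GFLOP/s per core (default 20.8, an assumption)
peak_fraction() {
    awk -v n="$1" -v t="$2" -v peak="${3:-20.8}" 'BEGIN {
        flops  = 5 * n * 100        # 5 FLOP per element, 100 repetitions
        gflops = flops / t / 1e9    # achieved GFLOP/s
        printf "%.2f GFLOP/s, %.1f%% of peak\n", gflops, 100 * gflops / peak
    }'
}

# Made-up measurement: asiz = 1000000, 0.05 s for the whole subroutine call.
peak_fraction 1000000 0.05
# prints: 10.00 GFLOP/s, 48.1% of peak
```

If you change the kernel line, the factor 5 has to be replaced by the operation count of your own expression.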

## That's all!