otintopr · 16f7f697
--- a/how-to-run-the-ComputationalKernelAnalysisTool.md
+++ b/how-to-run-the-ComputationalKernelAnalysisTool.md
+# How to use the Computational Kernel Analysis Tool
+I've tried to make the tool as simple as possible to use on Marenostrum 3, but it will not work anywhere else without a good number of modifications.
+Here are the steps to follow to evaluate a kernel:
+1. Clone the repository to your local workstation:
+```
+git clone https://earth.bsc.es/gitlab/otinto/ComputationalKernelAnalysisTool.git ComputationalKernelAnalysisTool
+```
+2. Pack the folder into a tar archive and upload it to Marenostrum 3 (replace ``user`` with your own user name):
+```
+tar -cf ComputationalKernelAnalysisTool.tar ComputationalKernelAnalysisTool
+scp ComputationalKernelAnalysisTool.tar user@mn1.bsc.es:~
+```
+3. Connect to Marenostrum 3, untar the files and enter the folder:
+```
+ssh user@mn1.bsc.es
+cd ~
+tar -xf ComputationalKernelAnalysisTool.tar
+cd ComputationalKernelAnalysisTool
+```
+4. Edit ``src/kernels.f90`` (or leave it as it is) and write the kernel that you want to analyze on the line marked with the comment ``! Line (do not remove this comment )``.
+You can use the A,B,C,D arrays, the pa,pb,pc,pd parameters and the jj,kk indices. (Be careful: the file has to compile, otherwise the tool will not work.)
+```
+module kernels
+   ! Double Precision -----------------------------------------------------
+   INTEGER,PARAMETER :: wp = selected_real_kind(12, 307)
+contains
+   subroutine kernel(A,B,C,D,pa,pb,pc,pd,asiz)
+      integer :: ii,jj
+      integer :: asiz
+      real(wp) , allocatable, dimension(:) :: A,B,C,D ! Arrays
+      real(wp) :: pa,pb,pc,pd                         ! Parameters
+      do ii=1,100
+            !dir$ vector aligned
+            do jj=1,asiz
+               !Simple kernel
+               a(jj) =  pa*(pb+b(jj))+pc+pd*c(jj)   ! Line (do not remove this comment )
+            end do
+      end do
+   end subroutine
+end module
+```
+5. Execute one of the tests:
+   1. ``SingleCoreTest.sh`` executes the kernel on a **single core** for different array sizes and shows how close your kernel gets to the theoretical peak performance.
+   2. ``FullNodeTest.sh`` executes the kernel on a **full node** for different array sizes and different numbers of cores (from 1 to 16) to show how memory contention impacts the computational kernel.
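+To give a feel for what "close to the theoretical peak" means, here is a rough hand count for the sample kernel line shown in step 4. It is only a sketch under stated assumptions: 8-byte reals (the ``wp`` kind is labelled double precision), two loads and one store per iteration, no write-allocate traffic, and no fused multiply-add. The tool's own accounting may differ.
+```math
+I = \frac{3\ \text{adds} + 2\ \text{multiplies}}{(2\ \text{loads} + 1\ \text{store}) \times 8\ \text{bytes}} = \frac{5\ \text{flop}}{24\ \text{bytes}} \approx 0.21\ \text{flop/byte}
+```
+With so few operations per byte moved, a kernel like this one will likely become limited by memory bandwidth rather than by arithmetic once the arrays no longer fit in cache, which is the kind of effect the full-node test is meant to expose.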
+## That's all!