|
Many factors can limit the performance we achieve.
|
|
|
|
A general overview of different GPU performance bottlenecks is given in the section below.
|
|
To overcome these issues, several debugging tools can help.
|
|
First, a general overview of different GPU performance bottlenecks is given.
|
|
These are discussed in the previous section [1.d. Profiling](1.d-Profiling).
|
|
Then a list of tools to combat these bottlenecks will be discussed.
|
|
|
|
|
|
|
|
## Potential problems
|
|
There are six general problems that can limit GPU performance.
|
... | @@ -54,12 +54,4 @@ There is also some implicit behavior by OpenMP which can penalize performance. |
|
The list below contains the ones we know of, but there may be more:
|
|
|
|
|
|
**Array copying**
|
|
If an array is copied (`a=b`, `a(:,:,:) = b(:,:,:)`, etc.) within an `!$omp target parallel do` region, every thread performs the full copy.
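
The pattern above can be sketched as follows. This is a minimal illustration, not code from our application: the arrays `b`, `tmp`, `slow` and `fast` and the sizes `n` and `m` are hypothetical, and the actual cost of the copy depends on the compiler and the hardware. The first loop copies the whole array in every iteration. The second loop shows one possible way to avoid the copy by reading the source array directly.

```fortran
program array_copy_sketch
  implicit none
  ! Hypothetical sizes and arrays, chosen only for illustration.
  integer, parameter :: n = 1024, m = 8
  real :: b(m), tmp(m)
  real :: slow(n), fast(n)
  integer :: i

  b = 1.0

  ! Pattern described above: the whole-array assignment sits inside the
  ! parallel loop, so every iteration copies all m elements of b into
  ! its private tmp.
  !$omp target parallel do private(tmp) map(to: b) map(from: slow)
  do i = 1, n
     tmp = b
     slow(i) = sum(tmp) + real(i)
  end do
  !$omp end target parallel do

  ! Possible workaround: read b directly, so no per-thread copy is made.
  !$omp target parallel do map(to: b) map(from: fast)
  do i = 1, n
     fast(i) = sum(b) + real(i)
  end do
  !$omp end target parallel do

  ! Both loops compute the same values; only the amount of copying differs.
  print *, 'results match: ', all(slow == fast)
end program array_copy_sketch
```

Compiled without OpenMP support, the directives are treated as comments and the program runs serially, which is a convenient way to check that both variants give the same result before comparing their timings on the GPU.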
|
|
|
|
## Debug tools
|
|
|
|
Much of what is discussed here extends the previous section [1.d. Profiling](1.d-Profiling).
|
|
|
|
Please refer to that page for specifics on how to use the tools.
|
|
|
|
|
|
|
|
Debugging is possible at compile time or at runtime.
|
|
|
|
The options are specific to the compiler and the hardware vendor, so that distinction is made first.
|
|
|
|
Then, in their subsections, the different compile-time and runtime diagnostics are discussed.
|
|
|
|
|