mpi/pmix: address UCX perf. issues (custom list implementation)
Analysis indicates that Slurm's `List` introduces significant overhead
(up to 7 us per append operation). A more lightweight, less generic
list container was developed for the PMIx plugin with performance
in mind.
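As a rough illustration of the idea (not the code from this commit), a
lightweight list can make append O(1) by linking caller-owned elements
and skipping per-operation allocation and locking. All names below
(`ll_elem_t`, `ll_list_t`, `ll_init`, `ll_append`, `ll_pop_front`) are
hypothetical, and the sketch assumes single-threaded access:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; these names are NOT the PMIx plugin's API. */
typedef struct ll_elem {
    struct ll_elem *next; /* next element, NULL at the tail */
    void *data;           /* caller-owned payload */
} ll_elem_t;

typedef struct {
    ll_elem_t *head;
    ll_elem_t *tail;
    size_t count;
} ll_list_t;

static void ll_init(ll_list_t *l)
{
    l->head = NULL;
    l->tail = NULL;
    l->count = 0;
}

/* O(1) append of a caller-provided element: no malloc, no lock
 * (assumption: the caller serializes access to the list). */
static void ll_append(ll_list_t *l, ll_elem_t *e, void *data)
{
    assert(l != NULL && e != NULL);
    e->data = data;
    e->next = NULL;
    if (l->tail != NULL)
        l->tail->next = e;
    else
        l->head = e;
    l->tail = e;
    l->count++;
}

/* O(1) removal from the front; returns NULL when the list is empty. */
static ll_elem_t *ll_pop_front(ll_list_t *l)
{
    ll_elem_t *e = l->head;
    if (e == NULL)
        return NULL;
    l->head = e->next;
    if (l->head == NULL)
        l->tail = NULL;
    e->next = NULL;
    l->count--;
    return e;
}
```

Because the caller owns the elements, they can be embedded in request
structures or recycled from a pool, so the hot path does no allocation.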
Results of an internal ping-pong latency test demonstrating the improvement:
size       UCX v1   UCX v2
1              46       30
2              56       30
4              72       30
8              85       30
16             93       30
32            122       31
64            120       31
128           126       31
256           137       32
512           155       33
1024          142       33
2048          153       34
4096          153       37
8192          170       42
16384         173       45
32768         177       52
65536         218       64
131072        236       88
262144       2735     2248
524288       3254     2879
1048576      4303     3770
2097152      6917     6234
4194304     11579    10473
Signed-off-by: Artem Polyakov <artpol84@gmail.com>