Hybrid Parallel Programming: Performance Problems and Chances
Rolf Rabenseifner
High-Performance Computing-Center Stuttgart (HLRS)
University of Stuttgart
Allmandring 30
D-70550 Stuttgart
Germany
rabenseifner@hlrs.de
http://www.hlrs.de/people/rabenseifner/
- ABSTRACT:
- Most HPC systems are clusters of shared-memory nodes.
Parallel programming must therefore combine distributed-memory
parallelization across the node interconnect with shared-memory
parallelization inside each node.
This paper compares several hybrid MPI+OpenMP programming models with pure MPI
and analyzes their strengths and weaknesses on clusters of SMP nodes.
Benchmark results from several platforms are presented.
Benchmark results show that the hybrid-masteronly
programming model can be used more efficiently on some
vector-type systems, although in this model the application
threads sleep while the master thread communicates.
The paper also analyzes strategies to overcome typical drawbacks of this
easy-to-use programming scheme on systems with weaker interconnects.
The best performance can be achieved by overlapping communication and computation,
but this scheme is not easy to use.
- KEYWORDS:
- OpenMP, MPI, Hybrid Parallel Programming, Threads and MPI, HPC, Performance.
- LOCAL LINKS:
- Full paper as PDF document, postscript, gzip'ed postscript.
Slides as PDF document.
- GLOBAL LINKS:
- Full paper as reference, PDF document, postscript, gzip'ed postscript.
Slides as reference, PDF document.
Benchmark code used: mpi_bench4
Information about MPI from the author