Automatic MPI Counter Profiling

Rolf Rabenseifner
High-Performance Computing-Center Stuttgart (HLRS)
Rechenzentrum Universität Stuttgart (RUS)
University of Stuttgart
Allmandring 30
D-70550 Stuttgart
Germany
rabenseifner@rus.uni-stuttgart.de
http://www.hlrs.de/people/rabenseifner/

ABSTRACT:
This paper presents an automatic counter instrumentation and profiling module added to the MPI library on Cray T3E and SGI Origin2000 systems. A detailed summary of the hardware performance counters and the MPI calls of any MPI production program is gathered during execution and written in MPI_Finalize to a special syslog file. The user can get the same information in a different file. Statistical summaries are computed weekly and monthly, and the user-specific part is sent by mail to each user.
The paper discusses scalability aspects of the new interface: how to get the right amount of performance data to the right person in time, and how to draw conclusions for the further optimization process, e.g. with the trace-based profiling tool Vampir.
The paper describes two different software designs that allow the integration of the profiling layer into a Unix MPI library and into a dynamic shared object MPI library without consuming the user's PMPI profiling interface.
Experiences with this library on the Cray T3E systems at HLRS Stuttgart and TU Dresden and a summary of 6 months are presented in this paper. It is the first time that all MPI applications on such a large system were automatically instrumented and profiled for such a period. The statistics give new insight into how efficiently the MPP system is really used by the MPI applications. Moreover, they give hints as to which applications and which MPI routines should be optimized.
After integrating the hardware performance counters into the MPI counter profiling, first results with these counters are presented. The software is portable to other systems.
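
The idea of counter profiling, as opposed to trace-based profiling, can be illustrated with a small language-neutral sketch: each call into an instrumented routine only updates a few accumulators (number of calls, elapsed time, bytes), and a compact summary is produced at the end of the run. The sketch below is written in Python purely for illustration. It is not the module described in the paper and uses no MPI library; the names CounterProfile, instrument, send and finalize are hypothetical stand-ins.

```python
import time
from collections import defaultdict
from functools import wraps


class CounterProfile:
    """Accumulates per-routine call counts, elapsed time, and bytes (hypothetical)."""

    def __init__(self):
        # routine name -> [calls, seconds, bytes]
        self.stats = defaultdict(lambda: [0, 0.0, 0])

    def instrument(self, name, nbytes_of=lambda *args, **kwargs: 0):
        """Return a decorator that updates the counters on every call."""
        def decorator(routine):
            @wraps(routine)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return routine(*args, **kwargs)
                finally:
                    # Only accumulators are touched; no per-call event is stored.
                    entry = self.stats[name]
                    entry[0] += 1
                    entry[1] += time.perf_counter() - start
                    entry[2] += nbytes_of(*args, **kwargs)
            return wrapper
        return decorator

    def summary(self):
        """Format one line per routine, as a shutdown step might write to a log."""
        lines = []
        for name in sorted(self.stats):
            calls, seconds, nbytes = self.stats[name]
            lines.append(f"{name} calls={calls} time={seconds:.6f}s bytes={nbytes}")
        return "\n".join(lines)


profile = CounterProfile()


@profile.instrument("send", nbytes_of=lambda buf, dest: len(buf))
def send(buf, dest):
    """Stand-in for a communication routine; it performs no real communication."""
    return len(buf)


@profile.instrument("finalize")
def finalize():
    """Stand-in for the shutdown call."""
    return None


send(b"abcd", 1)
send(b"abcdefgh", 2)
finalize()
print(profile.summary())
```

Because the memory used by such counters depends on the number of instrumented routines rather than on the number of calls, the summary stays small even for long production runs, which is what makes instrumenting every job feasible.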

KEYWORDS:
MPI, Counter Profiling, Instrumentation, Hardware Performance Counters, Trace-based Profiling, PerfAPI, PCL, Scalable User Interface.

LOCAL LINKS:
Full paper as PDF document, postscript, gzip'ed postscript
Slides as PDF document, postscript, gzip'ed postscript

GLOBAL LINKS:
Full paper as reference, PDF document, postscript, gzip'ed postscript
Slides as reference, PDF document, postscript, gzip'ed postscript
Further publications on automatic MPI counter profiling
Information about MPI from the author
Information about MPI on T3E
Information about MPI counter profiling on T3E