Trace Collection Mechanism

We used the dumpi framework to collect MPI traces. The dumpi tracing framework offers an efficient mechanism for collecting traces of MPI traffic. It provides a library that records events at runtime, and post-processing utilities for exploring and analyzing the generated traces.

For systems that support weak symbols, trace collection requires only linking the application with the profiling library. Dumpi uses this mechanism to redirect all MPI calls to dumpi functions for processing. The dumpi routines record the call times and function arguments, and also execute the communication call itself. Dumpi stores the recorded information in a binary format to reduce both the load on I/O resources and the size of the trace files.
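The interception pattern described above can be sketched in C. This is a minimal illustration, not dumpi's implementation: real_send, traced_send, send_record and the in-memory buffer are hypothetical stand-ins chosen so that the example compiles without mpi.h. In an MPI build, a wrapper of this kind would typically carry the name of the MPI routine itself (for example MPI_Send) and forward to the name-shifted PMPI_Send entry point defined by the MPI profiling interface. The record layout is invented for the example and is not dumpi's trace format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical fixed-size binary record; NOT dumpi's actual trace format. */
typedef struct {
    double  t_start;  /* wall-clock seconds when the call was entered */
    double  t_stop;   /* wall-clock seconds when the call returned */
    int32_t count;    /* recorded function arguments */
    int32_t dest;
    int32_t tag;
} send_record;

#define TRACE_CAPACITY 1024  /* assumed limit for this sketch */

static unsigned char trace_buf[TRACE_CAPACITY * sizeof(send_record)];
static size_t trace_len = 0;  /* number of bytes used in trace_buf */

static double now_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);  /* C11 wall-clock time */
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Stand-in for the real communication routine (PMPI_Send in an MPI build). */
int real_send(const void *buf, int count, int dest, int tag) {
    (void)buf; (void)dest; (void)tag;
    return count >= 0 ? 0 : 1;  /* 0 plays the role of MPI_SUCCESS */
}

/* Wrapper: record call times and arguments, and execute the real call. */
int traced_send(const void *buf, int count, int dest, int tag) {
    send_record rec;
    memset(&rec, 0, sizeof rec);  /* also zeroes any struct padding */
    rec.count = count;
    rec.dest = dest;
    rec.tag = tag;

    rec.t_start = now_seconds();
    int rc = real_send(buf, count, dest, tag);
    rec.t_stop = now_seconds();

    /* A real tracer would flush to a file; this sketch just stops when full. */
    assert(trace_len + sizeof rec <= sizeof trace_buf);
    memcpy(trace_buf + trace_len, &rec, sizeof rec);  /* raw binary append */
    trace_len += sizeof rec;
    return rc;
}

/* Number of records currently stored in the buffer. */
size_t trace_record_count(void) {
    return trace_len / sizeof(send_record);
}

/* Copy record i back out of the binary buffer. */
send_record trace_record_at(size_t i) {
    send_record rec;
    assert(i < trace_record_count());
    memcpy(&rec, trace_buf + i * sizeof rec, sizeof rec);
    return rec;
}
```

Calling traced_send(payload, 4, 1, 99) returns whatever real_send returns and appends one record holding the arguments 4, 1 and 99 together with the two timestamps. Writing fixed-size binary records instead of formatted text is what keeps the per-call overhead and the trace size small.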

Trace collection involves three steps:

  • Build the dumpi libraries and tools. For more details on obtaining and building the dumpi library, please consult the dumpi website.
  • Change the application build process to link with the dumpi library.
  • Run the application to collect the MPI traces.

Linking with Dumpi

After building the dumpi library and tools, the application build process needs to be modified. For instance, to collect traces for the LULESH benchmark, we added the following line to the Makefile:

LDFLAGS += -L/path/to/dumpi/lib -ldumpi 

This line is added after the definition of LDFLAGS. The path should point to the location of the dumpi libraries on your machine.

Some applications need more complicated changes, depending on how their build process works. Note also that you need to unload other profiling libraries to avoid conflicts. For instance, at NERSC, we typically unload the Darshan module on Cray machines. See the next section for details.

Conflicts between profiling libraries

Most profiling libraries, including dumpi, exploit compiler and linker support for weak external symbols for MPI calls, which allows calls to be redirected to the profiling library. Weak symbols allow several libraries to define the same routine. If no strong definition exists, the linker can choose any of the weak definitions. If a library with strong symbols is linked with others that provide weak symbols, the linker favors the strong symbol. A strong definition of a symbol must be unique; otherwise the linker reports multiple definitions.

Dumpi provides strong MPI symbols to intercept all MPI calls. At NERSC, Darshan uses the same mechanism to profile I/O. Because two strong definitions of the same symbol conflict, the darshan module should be unloaded before using dumpi (module unload darshan).

If weak symbols are not supported, then macro preprocessing of the MPI calls is required to redirect them to the profiler routines.
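The macro approach can be illustrated in a few lines of C. This is only a sketch of the general technique under stated assumptions, not a description of how dumpi implements it: lib_send stands in for an MPI routine, and profiled_lib_send, application_step and traced_call_count are hypothetical names. The idea is that a function-like macro, defined after the wrapper, rewrites every later call site at preprocessing time, so no linker support is needed.

```c
#include <assert.h>

static int traced_calls = 0;  /* how many calls went through the profiler */

/* Stand-in for a library routine such as an MPI call. */
int lib_send(int dest, int count) {
    return (dest >= 0 && count >= 0) ? 0 : 1;  /* 0 means success here */
}

/* Hypothetical profiler routine: do the bookkeeping, then call the original.
 * It is defined BEFORE the macro below, so this lib_send is the real one. */
int profiled_lib_send(int dest, int count) {
    traced_calls++;
    return lib_send(dest, count);
}

/* From this point on, the preprocessor rewrites every lib_send(...) call. */
#define lib_send(dest, count) profiled_lib_send((dest), (count))

/* Application code is unchanged; its call is redirected by the macro. */
int application_step(void) {
    int rc = lib_send(1, 16);  /* expands to profiled_lib_send((1), (16)) */
    assert(rc == 0);           /* the stand-in routine returns 0 on success */
    return rc;
}

/* Number of calls that were redirected through the profiler routine. */
int traced_call_count(void) {
    return traced_calls;
}
```

In practice the macro definitions would live in a header that the application includes, which means the application has to be recompiled, whereas the weak-symbol approach only requires relinking.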