The scheme given here does not directly support the nesting of
profiling functions, since it provides only a single alternative name
for each MPI function. Consideration was given to an implementation
that would allow multiple levels of call interception; however, we
were unable to construct such an implementation without incurring
the following disadvantages:
- assuming a particular implementation language.
- imposing a run time cost even when no profiling was taking place.
Since one of the objectives of MPI is to permit efficient, low-latency
implementations, and it is not the business of a standard to require a
particular implementation language, we decided to accept the scheme
outlined above.
Note, however, that it is possible to use the scheme above to
implement a multi-level system, since the function called by the user
may call many different profiling functions before calling the
underlying MPI function.
Unfortunately, such an implementation may require more cooperation
between the different profiling libraries than is required for the
single level implementation detailed above.
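The multi-level arrangement described above can be sketched in C as follows. This is only an illustration under stated assumptions, not part of the scheme itself: the names underlying_send, intercepted_send, byte_counter_on_send and tracer_on_send are hypothetical, and underlying_send is a stub standing in for the name-shifted MPI entry point so that the sketch compiles without an MPI installation. In a real profiling library the intercepting function would carry the MPI function's own name and would finish by calling its alternative name.

```c
#include <stdio.h>

/* Hypothetical stand-in for the underlying (name-shifted) MPI function.
   A stub is used so the sketch builds without an MPI installation. */
static int calls_to_underlying = 0;

static int underlying_send(const void *buf, int count, int dest)
{
    (void)buf; (void)count; (void)dest;   /* unused in the stub */
    calls_to_underlying++;
    return 0;                             /* plays the role of a success code */
}

/* Two independent, hypothetical profiling layers. */
static long bytes_counted = 0;
static int  trace_events  = 0;

static void byte_counter_on_send(int count)
{
    bytes_counted += count;
}

static void tracer_on_send(int dest)
{
    trace_events++;
    fprintf(stderr, "trace: send to rank %d\n", dest);
}

/* The single function the user calls. It invokes each profiling layer
   in turn and only then calls the underlying function. */
int intercepted_send(const void *buf, int count, int dest)
{
    byte_counter_on_send(count);
    tracer_on_send(dest);
    return underlying_send(buf, count, dest);
}
```

The cooperation cost is visible in intercepted_send: because only one function can own the intercepted name, the two layers must agree on who defines it and on the order in which their hooks run.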