Since parts of the MPI library may themselves be implemented using
more basic MPI functions (e.g., a portable implementation of the
collective operations implemented using point-to-point communications),
there is potential for profiling functions to be called from within an
MPI function which was called from a profiling function. This could
lead to ``double counting'' of the time spent in the inner routine.
Since this effect could actually be useful under some circumstances
(e.g., it might allow one to answer the question ``How much time is
spent in the point-to-point routines when they're called from
collective functions?''), we have decided not to enforce any
restrictions on the author of the MPI library that would overcome
this. Therefore, the author of the profiling library should be aware of
this problem and guard against it herself. In a single-threaded
world this is easily achieved through use of a static variable in the
profiling code that remembers whether you are already inside a profiling
routine. It becomes more complex in a multi-threaded environment (as
does the meaning of the times recorded!).
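
The static-variable guard described above might be sketched as follows. This is a minimal illustration, not a real profiling library: every name in it (lib_send, lib_bcast, demo_send, demo_bcast and the counters) is a hypothetical stand-in. lib_send and lib_bcast play the role of the name-shifted library entry points, demo_send and demo_bcast play the role of the profiling wrappers, and counting calls stands in for accumulating time so that the effect is easy to check.

```c
/* Set while a profiling wrapper is active; single-threaded use only. */
static int in_profiler = 0;

/* Calls attributed to each routine by the (hypothetical) profiler. */
static int counted_send_calls = 0;
static int counted_bcast_calls = 0;

int demo_send(int dest);   /* profiling wrapper, defined below */

/* Stand-in for the library's own point-to-point send. */
static int lib_send(int dest)
{
    (void)dest;            /* no real communication in this sketch */
    return 0;
}

/* Stand-in for a portable collective built on point-to-point calls.
   It calls the profiled entry point, which is what causes re-entry
   into the profiling layer. */
static int lib_bcast(int nprocs)
{
    for (int dest = 1; dest < nprocs; dest++)
        demo_send(dest);
    return 0;
}

/* Profiling wrapper for the send: record the call only when it is
   not nested inside another profiling wrapper. */
int demo_send(int dest)
{
    if (in_profiler)
        return lib_send(dest);   /* nested call: pass through, do not count */

    in_profiler = 1;
    counted_send_calls++;
    int rc = lib_send(dest);
    in_profiler = 0;
    return rc;
}

/* Profiling wrapper for the collective, guarded in the same way. */
int demo_bcast(int nprocs)
{
    if (in_profiler)
        return lib_bcast(nprocs);

    in_profiler = 1;
    counted_bcast_calls++;
    int rc = lib_bcast(nprocs);
    in_profiler = 0;
    return rc;
}
```

With this guard, a call to demo_bcast(4) is recorded once as a collective call and the three sends it issues internally are not recorded separately, whereas a direct call to demo_send is still recorded. In a multi-threaded program a single static flag is not sufficient; one possible approach, offered here only as an assumption, is to keep the flag in thread-local storage.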