Implicit in MPI-/ is the idea of a contiguous chunk of memory accessible through a linear address space. MPI-/ copies data to and from this memory. An MPI-/ program specifies the location of data by providing memory addresses and offsets. In the C language, sequence association rules plus pointers provide all the necessary low-level structure.
In Fortran 90, user data is not necessarily stored contiguously. For example, the array section A(1:N:2) involves only the elements of A with indices 1, 3, 5, ... . The same is true for a pointer array whose target is such a section. Most compilers ensure that an array that is a dummy argument is held in contiguous memory if it is declared with an explicit shape (e.g., B(N)) or is of assumed size (e.g., B(*)). If necessary, they do this by making a copy of the array into contiguous memory. Both Fortran 77 and Fortran 90 are carefully worded to allow such copying to occur, but few Fortran 77 compilers do it.10.1
Because MPI-/ dummy buffer arguments are assumed-size arrays, this leads to a serious problem for a non-blocking call: the compiler copies the temporary array back on return but MPI-/ continues to copy data to the memory that held it. For example, consider the following code fragment:
real a(100) call MPI_IRECV(a(1:100:2), MPI_REAL, 50, ...)Since the first dummy argument to MPI_IRECV is an assumed-size array (<type> buf(*)), the array section a(1:100:2) is copied to a temporary before being passed to MPI_IRECV, so that it is contiguous in memory. MPI_IRECV returns immediately, and data is copied from the temporary back into the array a. Sometime later, MPI-/ may write to the address of the deallocated temporary. Copying is also a problem for MPI_ISEND since the temporary array may be deallocated before the data has all been sent from it.
Most Fortran 90 compilers do not make a copy if the actual argument is the whole of an explicit-shape or assumed-size array or is a `simple' section such as A(1:N) of such an array. (We define `simple' more fully in the next paragraph.) Also, many compilers treat allocatable arrays the same as they treat explicit-shape arrays in this regard (though we know of one that does not). However, the same is not true for assumed-shape and pointer arrays; since they may be discontiguous, copying is often done. It is this copying that causes problems for MPI-/ as described in the previous paragraph.
Our formal definition of a `simple' array section is
name ( [:,]... [<subscript>]:[<subscript>] [,<subscript>]... )That is, there are zero or more dimensions that are selected in full, then one dimension selected without a stride, then zero or more dimensions that are selected with a simple subscript. Examples are
A(1:N), A(:,N), A(:,1:N,1), A(1:6,N), A(:,:,1:N)Because of Fortran's column-major ordering, where the first index varies fastest, a simple section of a contiguous array will also be contiguous.10.2
The same problem can occur with a scalar argument. Some compilers, even for Fortran 77, make a copy of some scalar dummy arguments within a called procedure. That this can cause a problem is illustrated by the example
call user1(a,rq) call MPI_WAIT(rq,status,ierr) write (*,*) a subroutine user1(buf,request) call MPI_IRECV(buf,...,request,...) end
If a is copied, MPI_IRECV will alter the copy when it completes the communication and will not alter a itself.
Note that copying will almost certainly occur for an argument that is a non-trivial expression (one with at least one operator or function call), a section that does not select a contiguous part of its parent (e.g., A(1:n:2)), a pointer whose target is such a section, or an assumed-shape array that is (directly or indirectly) associated with such a section.
If there is a compiler option that inhibits copying of arguments, in either the calling or called procedure, this should be employed.
If a compiler makes copies in the calling procedure of arguments that are explicit-shape or assumed-size arrays, simple array sections of such arrays, or scalars, and if there is no compiler option to inhibit this, then the compiler cannot be used for applications that use MPI_GET_ADDRESS, or any non-blocking MPI-/ routine. If a compiler copies scalar arguments in the called procedure and there is no compiler option to inhibit this, then this compiler cannot be used for applications that use memory references across subroutine calls as in the example above.