4.8 All-to-All Scatter/Gather



MPI_ALLTOALL(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm)

IN   sendbuf    starting address of send buffer (choice)
IN   sendcount  number of elements sent to each process (integer)
IN   sendtype   data type of send buffer elements (handle)
OUT  recvbuf    address of receive buffer (choice)
IN   recvcount  number of elements received from any process (integer)
IN   recvtype   data type of receive buffer elements (handle)
IN   comm       communicator (handle)

int MPI_Alltoall(void* sendbuf, int sendcount, MPI_Datatype sendtype, void* recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)



MPI_ALLTOALL(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT, RECVTYPE, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE, RECVCOUNT, RECVTYPE, COMM, IERROR



MPI_ALLTOALL is an extension of MPI_ALLGATHER to the case where each process sends distinct data to each of the receivers. The jth block sent from process i is received by process j and is placed in the ith block of recvbuf.

The type signature associated with sendcount, sendtype at a process must be equal to the type signature associated with recvcount, recvtype at any other process. This implies that the amount of data sent must be equal to the amount of data received, pairwise between every pair of processes. As usual, however, the type maps may be different.

The outcome is as if each process executed a send to each process (itself included) with a call to

\begin{displaymath}\tt
MPI\_Send(sendbuf+i\cdot sendcount\cdot extent(sendtype), sendcount,
sendtype, i,...),
\end{displaymath}

and a receive from every process (itself included) with a call to

\begin{displaymath}\tt
MPI\_Recv(recvbuf+i\cdot recvcount\cdot extent(recvtype), recvcount, recvtype, i,...).
\end{displaymath}

All arguments on all processes are significant. The argument comm must have identical values on all processes.
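The block placement rule can be checked without an MPI installation. The following is a minimal serial sketch, not part of the standard: the function name alltoall_model is hypothetical, every simulated process is represented by one entry of the send and recv pointer arrays, and all elements are assumed to be of type int, so that extent(sendtype) corresponds to one array element.

```c
#include <assert.h>
#include <stddef.h>

/* Serial model of the MPI_ALLTOALL placement rule (no MPI calls are made).
 * send[i] and recv[i] stand in for sendbuf and recvbuf on process i;
 * each must hold nprocs * count ints. Names are illustrative only. */
void alltoall_model(int nprocs, int count, int *const *send, int *const *recv)
{
    assert(nprocs > 0 && count >= 0);
    for (int i = 0; i < nprocs; i++) {        /* sending process */
        for (int j = 0; j < nprocs; j++) {    /* receiving process */
            /* jth block of sender i ... */
            const int *src = send[i] + (size_t)j * count;
            /* ... lands in the ith block of receiver j */
            int *dst = recv[j] + (size_t)i * count;
            for (int k = 0; k < count; k++)
                dst[k] = src[k];
        }
    }
}
```

Viewed as a matrix whose row i is the send buffer of process i, the operation amounts to a transpose of blocks: after the call, row j of the receive side holds block j of every sender, ordered by sender rank.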



MPI_ALLTOALLV(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts, rdispls, recvtype, comm)

IN   sendbuf     starting address of send buffer (choice)
IN   sendcounts  integer array (of length group size) specifying the number of elements to send to each process
IN   sdispls     integer array (of length group size). Entry j specifies the displacement (relative to sendbuf) from which to take the outgoing data destined for process j
IN   sendtype    data type of send buffer elements (handle)
OUT  recvbuf     address of receive buffer (choice)
IN   recvcounts  integer array (of length group size) specifying the number of elements that can be received from each process
IN   rdispls     integer array (of length group size). Entry i specifies the displacement (relative to recvbuf) at which to place the incoming data from process i
IN   recvtype    data type of receive buffer elements (handle)
IN   comm        communicator (handle)

int MPI_Alltoallv(void* sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void* recvbuf, int *recvcounts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)



MPI_ALLTOALLV(SENDBUF, SENDCOUNTS, SDISPLS, SENDTYPE, RECVBUF, RECVCOUNTS, RDISPLS, RECVTYPE, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNTS(*), SDISPLS(*), SENDTYPE, RECVCOUNTS(*), RDISPLS(*), RECVTYPE, COMM, IERROR



MPI_ALLTOALLV adds flexibility to MPI_ALLTOALL in that the location of data for the send is specified by sdispls and the location of the placement of the data on the receive side is specified by rdispls.

The jth block sent from process i is received by process j and is placed in the ith block of recvbuf. These blocks need not all have the same size.
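When the blocks are stored back to back, the displacement arrays are simply running sums of the count arrays. The helper below is a sketch under that assumption; the name packed_displs is hypothetical, and the standard does not require a packed layout (displacements may leave gaps or appear in any order). Displacements are counted in elements, that is, in units of the extent of the datatype.

```c
#include <assert.h>

/* Fill displs[] with the exclusive prefix sum of counts[], so that block j
 * starts immediately after block j-1. Returns the total element count,
 * which is the minimum buffer length for this packed layout. */
int packed_displs(int n, const int *counts, int *displs)
{
    int total = 0;
    for (int j = 0; j < n; j++) {
        assert(counts[j] >= 0);   /* counts must be nonnegative */
        displs[j] = total;        /* block j begins where block j-1 ended */
        total += counts[j];
    }
    return total;
}
```

The same helper could be applied to sendcounts to obtain sdispls and to recvcounts to obtain rdispls.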

The type signature associated with sendcounts[j], sendtype at process i must be equal to the type signature associated with recvcounts[i], recvtype at process j. This implies that the amount of data sent must be equal to the amount of data received, pairwise between every pair of processes. Distinct type maps between sender and receiver are still allowed.

The outcome is as if each process sent a message to every process (itself included) with a call to

\begin{displaymath}\tt
MPI\_Send(sendbuf+sdispls[i]\cdot extent(sendtype), sendcounts[i],
sendtype, i,...),
\end{displaymath}

and received a message from every process (itself included) with a call to

\begin{displaymath}\tt
MPI\_Recv(recvbuf+rdispls[i]\cdot extent(recvtype), recvcounts[i], recvtype, i,...).
\end{displaymath}

All arguments on all processes are significant. The argument comm must have identical values on all processes.
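The variable-size rule, including the pairwise agreement between send and receive counts, can be modeled serially in the same spirit. This is a sketch under stated assumptions rather than an implementation: the struct vargs and the function alltoallv_model are hypothetical names, elements are assumed to be int, and a mismatch in counts is reported through a return value, whereas a real MPI program that violates the rule is simply erroneous.

```c
#include <assert.h>

/* Arguments of one simulated process (int elements only; illustrative). */
struct vargs {
    int *sendbuf;
    const int *sendcounts;
    const int *sdispls;
    int *recvbuf;
    const int *recvcounts;
    const int *rdispls;
};

/* Serial model of MPI_ALLTOALLV (no MPI calls are made). Returns 0 on
 * success, or -1 if some pair of processes disagrees on the amount of
 * data exchanged between them. */
int alltoallv_model(int nprocs, const struct vargs *p)
{
    assert(nprocs > 0);

    /* Check the pairwise rule first so no buffer is touched on error:
     * what i sends to j must equal what j expects from i. */
    for (int i = 0; i < nprocs; i++)
        for (int j = 0; j < nprocs; j++)
            if (p[i].sendcounts[j] != p[j].recvcounts[i])
                return -1;

    for (int i = 0; i < nprocs; i++) {        /* sending process */
        for (int j = 0; j < nprocs; j++) {    /* receiving process */
            const int *src = p[i].sendbuf + p[i].sdispls[j];
            int *dst = p[j].recvbuf + p[j].rdispls[i];
            for (int k = 0; k < p[i].sendcounts[j]; k++)
                dst[k] = src[k];
        }
    }
    return 0;
}
```

Note that the sender indexes its arrays by the destination rank j, while the receiver indexes its arrays by the source rank i.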

Rationale. The definitions of MPI_ALLTOALL and MPI_ALLTOALLV give as much flexibility as one would achieve by specifying n independent, point-to-point communications, with two exceptions: all messages use the same datatype, and messages are scattered from (or gathered to) sequential storage. (End of rationale.)

Advice to implementors. Although the discussion of collective communication in terms of point-to-point operations implies that each message is transferred directly from sender to receiver, implementations may use a tree communication pattern. Messages can be forwarded by intermediate nodes where they are split (for scatter) or concatenated (for gather), if this is more efficient. (End of advice to implementors.)
