5.1 Introduction

MPI-1 provides an interface that allows processes in a parallel program to communicate with one another. MPI-1 specifies neither how the processes are created, nor how they establish communication. Moreover, an MPI-1 application is static; that is, no processes can be added to or deleted from an application after it has been started.

MPI-1 users have asked that the MPI-1 model be extended to allow process creation and management after an MPI-1 application has been started. A major impetus comes from the PVM [7] research effort, which has provided a wealth of experience with process management and resource control that illustrates their benefits and potential pitfalls.

The MPI Forum decided not to address resource control in MPI-2 because it was not able to design a portable interface that would be appropriate for the broad spectrum of existing and potential resource and process controllers. Resource control can encompass a wide range of abilities, including adding and deleting nodes from a virtual parallel machine, reserving and scheduling resources, managing compute partitions of an MPP, and returning information about available resources. MPI-2 assumes that resource control is provided externally -- probably by computer vendors, in the case of tightly coupled systems, or by a third-party software package when the environment is a cluster of workstations.

The reasons for adding process management to MPI-1 are both technical and practical. Important classes of message passing applications require process control. These include task farms, serial applications with parallel modules, and problems that require a run-time assessment of the number and type of processes that should be started. On the practical side, users of workstation clusters who are migrating from PVM to MPI-1 may be accustomed to using PVM's capabilities for process and resource management. The lack of these features is a practical stumbling block to migration.

While process management is essential, adding it to MPI-1 should not compromise the portability or performance of MPI-1 applications. In particular:

The MPI-2 process management model addresses these issues in two ways. First, MPI-1 remains primarily a communication library. It does not manage the parallel environment in which a parallel program executes, though it provides a minimal interface between an application and external resource and process managers.

Second, MPI-2 does not change the concept of communicator. Once a communicator is built, it behaves as specified in MPI-1. A communicator is never changed once created, and it is always created using deterministic collective operations.

MPI-Standard for MARMOT