Portals 4.0

Portals 4 has been under development since 2007, with the goal of reducing the coupling that the interface requires between the application processor and the portals processor. Additionally, we extended the API to better support lightweight communication models such as PGAS. A reference implementation of the Portals 4 API over InfiniBand and shared memory is available; see the Implementations Page for more information.

Foremost, Portals version 4.0 was substantially enhanced to better support the various PGAS programming models. Communication operations that do not include matching were added, along with key atomic operations, including a PtlAtomic() function, to support functionality commonly provided in PGAS models. In support of the lightweight communication semantics required by PGAS models, lightweight "counting" events and acknowledgments were added. Finally, the ordering definition was substantially strengthened relative to Portals version 3.3 for small messages to better support some PGAS models.

An equally fundamental change in Portals version 4.0 adds a mechanism to cope better with unexpected messages in MPI. Whereas version 3.3 used PtlMDUpdate() to atomically insert items into the match list so that the MPI implementation could manage unexpected messages, version 4.0 adds an overflow list, where the application provides buffer space that the implementation can use to store unexpected messages. The implementation is then responsible for matching new list insertions against messages that have already arrived and are resident in the overflow list space. This change was necessary to eliminate a round trip between the processor and the NIC for each item added to the match list (now named the priority list).
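
The interplay between the priority list and the overflow list can be sketched with a toy model. The code below is not the Portals 4 API: the type toy_portal, the functions toy_message_arrives() and toy_append_priority(), the fixed capacity, and exact-equality matching on 64-bit match bits are all assumptions made for illustration. It only shows the order of the two searches: an arriving message checks the priority list first, and a new priority entry checks the already-arrived unexpected messages before it is appended.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 16  /* arbitrary capacity chosen for this sketch */

/* Hypothetical toy model, not the Portals 4 API. */
typedef struct {
    uint64_t priority[MAX_ENTRIES];   /* match bits of posted receives */
    size_t   n_priority;
    uint64_t unexpected[MAX_ENTRIES]; /* match bits of messages parked in overflow space */
    size_t   n_unexpected;
} toy_portal;

typedef enum { DELIVERED_PRIORITY, STORED_OVERFLOW, DROPPED } arrival_result;
typedef enum { MATCHED_UNEXPECTED, APPENDED, LIST_FULL } append_result;

/* Remove element i from an array of *n elements, preserving order. */
static void remove_at(uint64_t *a, size_t *n, size_t i) {
    for (size_t j = i + 1; j < *n; j++) a[j - 1] = a[j];
    (*n)--;
}

/* An incoming message searches the priority list first; if nothing
 * matches, it is stored as an unexpected message. */
arrival_result toy_message_arrives(toy_portal *p, uint64_t bits) {
    for (size_t i = 0; i < p->n_priority; i++) {
        if (p->priority[i] == bits) {
            remove_at(p->priority, &p->n_priority, i); /* entry is consumed */
            return DELIVERED_PRIORITY;
        }
    }
    if (p->n_unexpected < MAX_ENTRIES) {
        p->unexpected[p->n_unexpected++] = bits;
        return STORED_OVERFLOW;
    }
    return DROPPED; /* simplification: no overflow space left */
}

/* A new receive is first matched against unexpected messages that have
 * already arrived; only if none matches is it appended at the tail. */
append_result toy_append_priority(toy_portal *p, uint64_t bits) {
    for (size_t i = 0; i < p->n_unexpected; i++) {
        if (p->unexpected[i] == bits) {
            remove_at(p->unexpected, &p->n_unexpected, i);
            return MATCHED_UNEXPECTED;
        }
    }
    if (p->n_priority < MAX_ENTRIES) {
        p->priority[p->n_priority++] = bits;
        return APPENDED;
    }
    return LIST_FULL;
}
```

In this model, a message that arrives before its receive is posted is held in overflow space, and the later call to toy_append_priority() resolves it locally without the entry ever reaching the priority list. That local resolution is the step that, per the text above, no longer needs a processor-to-NIC round trip.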

A third major change separated all resources for initiators and targets. Memory descriptors are used by the initiator to describe memory regions while list entries are used by targets to describe the memory region and matching criteria (in the case of match list entries). This separation of resources was also extended to events, where the number of event types was significantly reduced and initiator and target events were separated into different types with different accessor functions.

To better offload collective operations, a set of triggered operations was added. These operations allow an application to build non-blocking, offloaded collective operations with independent progress. They include variants of both the data movement operations (get and put) and the atomic operations.
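
As a rough illustration of how a triggered operation gives independent progress, the sketch below models an operation that is queued against a counter and fires once the counter reaches a threshold. Everything here is hypothetical and is not the Portals 4 API: toy_counter, toy_triggered_attach(), toy_counter_inc(), and the callback toy_send_to_parent() are invented names, and the real interface issues network operations rather than calling C functions.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRIGGERED 8  /* arbitrary capacity chosen for this sketch */

typedef void (*toy_op_fn)(void *arg);

/* Hypothetical: one deferred operation and the count that releases it. */
typedef struct {
    uint64_t  threshold;
    toy_op_fn fn;
    void     *arg;
    int       fired;
} toy_triggered_op;

/* Hypothetical: a counter with the operations queued against it. */
typedef struct {
    uint64_t         count;
    toy_triggered_op ops[MAX_TRIGGERED];
    size_t           n_ops;
} toy_counter;

/* Fire every pending operation whose threshold has been reached. */
static void toy_fire_ready(toy_counter *ct) {
    for (size_t i = 0; i < ct->n_ops; i++) {
        toy_triggered_op *op = &ct->ops[i];
        if (!op->fired && ct->count >= op->threshold) {
            op->fired = 1;
            op->fn(op->arg);
        }
    }
}

/* Queue an operation; returns 0 on success, -1 if the table is full. */
int toy_triggered_attach(toy_counter *ct, uint64_t threshold,
                         toy_op_fn fn, void *arg) {
    if (ct->n_ops == MAX_TRIGGERED) return -1;
    ct->ops[ct->n_ops++] = (toy_triggered_op){ threshold, fn, arg, 0 };
    toy_fire_ready(ct); /* the threshold may already have been met */
    return 0;
}

/* Models message arrivals bumping the counter by n. */
void toy_counter_inc(toy_counter *ct, uint64_t n) {
    ct->count += n;
    toy_fire_ready(ct);
}

/* Stand-in for "forward the partial result to the parent in a tree". */
void toy_send_to_parent(void *arg) { (*(int *)arg)++; }
```

For example, a node in a reduction tree with two children could attach toy_send_to_parent() with a threshold of 2. The send then happens as a side effect of the second child's contribution arriving, with no further call from the application, and it fires only once.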

Another set of changes arises from a desire to simplify hardware implementations. The threshold value was removed from the target and was replaced by the ability to specify that a match list entry is "use once" or "persistent". List insertions occur only at the tail of the list, since unexpected message handling has been moved to a separate list.
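
The difference between a "use once" and a "persistent" entry can be shown with a minimal model. The struct toy_entry and the function toy_match() are hypothetical and are not taken from the Portals 4 API; matching is reduced to equality on the match bits. A use-once entry is unlinked by its first match, while a persistent entry keeps matching.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list entry for illustration only. */
typedef struct {
    uint64_t match_bits;
    bool     use_once; /* true: unlink after the first match */
    bool     linked;   /* false once the entry has been unlinked */
} toy_entry;

/* Walk the list from head to tail and return the index of the first
 * linked entry whose match bits equal `bits`, or -1 if none matches.
 * A use-once entry is unlinked as a side effect of matching. */
int toy_match(toy_entry *list, size_t n, uint64_t bits) {
    for (size_t i = 0; i < n; i++) {
        if (list[i].linked && list[i].match_bits == bits) {
            if (list[i].use_once) list[i].linked = false;
            return (int)i;
        }
    }
    return -1;
}
```

With a use-once entry followed by a persistent entry carrying the same match bits, the first message lands in the use-once entry and every later message lands in the persistent one. A single boolean replaces the per-entry countdown that a threshold would need.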

Access control entries were found to be a non-scalable resource, so they have been eliminated. At the same time, it was recognized that the PTL_LE_OP_PUT and PTL_LE_OP_GET semantics required a form of matching. These two options, along with the ability to include user-ID-based authentication, were moved to permissions fields on the respective list entry or match list entry.
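
A per-entry permission check of this kind might look like the following sketch. The names TOY_OP_PUT, TOY_OP_GET, TOY_UID_ANY, toy_entry_perms, and toy_permitted() are hypothetical stand-ins, not the constants or types of the Portals 4 API, and the wildcard user ID is an assumption of this model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operation flags carried on each entry. */
enum { TOY_OP_PUT = 1u << 0, TOY_OP_GET = 1u << 1 };

/* Hypothetical wildcard meaning "accept any user ID". */
#define TOY_UID_ANY UINT32_MAX

/* Permission fields stored on the entry itself, so no separate
 * access control table has to be consulted. */
typedef struct {
    unsigned allowed_ops; /* bitwise OR of TOY_OP_* flags */
    uint32_t uid;         /* required user ID, or TOY_UID_ANY */
} toy_entry_perms;

/* An incoming operation is accepted only if the entry allows that
 * operation type and the sender's user ID is acceptable. */
bool toy_permitted(const toy_entry_perms *e, unsigned op, uint32_t sender_uid) {
    bool op_ok  = (e->allowed_ops & op) != 0;
    bool uid_ok = (e->uid == TOY_UID_ANY) || (e->uid == sender_uid);
    return op_ok && uid_ok;
}
```

Because the check reads only fields of the entry being examined, its cost does not grow with the number of peers, which is the scalability concern the text raises about access control entries.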

Ordering only at the message level was found to be insufficient for many PGAS models, which often require ordering of data. Unfortunately, uniformly requiring data ordering could create unnecessary performance constraints. As such, the ordering definition has been expanded to include data ordering and to let the user disable data ordering as well as message ordering.