Publications / Conference

An architecture to perform NIC based MPI matching

Hemmert, Karl S.; Underwood, Keith; Rodrigues, Arun

Modern supercomputers aggregate thousands of microprocessors through a high-performance network. Many of these systems place a processor on the network interface controller (NIC) to handle some portion of the MPI processing. This processing involves traversing a linked list and invoking a matching function for each item. Although this task is critical to the performance of the system, microprocessors perform it extremely poorly. Furthermore, the traditional network processor approaches of multicore and multithreading map poorly to the problem because the list is a shared data structure. While match processing can be implemented directly in hardware, hardware implementations can be extremely inflexible and carry very high risk. This paper presents a novel, programmable architecture for a processor to handle the matching function. The matching engine approaches the performance of a direct hardware implementation while maintaining a high degree of flexibility and programmability. More importantly, it requires a dramatically smaller area than a conventional processor. © 2007 IEEE.
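The list traversal described in the abstract can be illustrated with a minimal software sketch. It is not the architecture presented in the paper. It assumes the usual MPI matching rule: an incoming message matches the first posted receive whose communicator context, source, and tag agree, with wildcards allowed for source and tag. All names here (`PostedReceive`, `PostedReceiveList`, `ANY_SOURCE`, `ANY_TAG`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical wildcard values standing in for MPI_ANY_SOURCE / MPI_ANY_TAG.
ANY_SOURCE = -1
ANY_TAG = -1


@dataclass
class PostedReceive:
    """One entry in the posted-receive list (illustrative fields only)."""
    context: int                      # communicator context id
    source: int                       # expected sender rank or ANY_SOURCE
    tag: int                          # expected tag or ANY_TAG
    next: Optional["PostedReceive"] = None


def matches(entry: PostedReceive, context: int, source: int, tag: int) -> bool:
    """Matching function invoked once per list item."""
    return (
        entry.context == context
        and entry.source in (ANY_SOURCE, source)
        and entry.tag in (ANY_TAG, tag)
    )


class PostedReceiveList:
    """Singly linked list of posted receives, searched in posting order."""

    def __init__(self) -> None:
        self.head: Optional[PostedReceive] = None
        self.tail: Optional[PostedReceive] = None

    def append(self, entry: PostedReceive) -> None:
        if self.tail is None:
            self.head = self.tail = entry
        else:
            self.tail.next = entry
            self.tail = entry

    def match(self, context: int, source: int, tag: int) -> Optional[PostedReceive]:
        """Walk the list, unlink and return the first matching entry, else None."""
        prev = None
        node = self.head
        while node is not None:
            if matches(node, context, source, tag):
                if prev is None:
                    self.head = node.next
                else:
                    prev.next = node.next
                if node is self.tail:
                    self.tail = prev
                node.next = None
                return node
            prev, node = node, node.next
        return None
```

Each step of the walk depends on the previous pointer dereference, and every lookup may also modify the list. This is consistent with the abstract's remark that a shared list maps poorly to multicore and multithreaded approaches.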