SAND Report

Specification of Fenix MPI Fault Tolerance library (V.0.9)

Gamell, Marc; Van Der Wijngaart, Rob F.; Teranishi, Keita T.; Parashar, Manish

Fenix is a software library, compatible with the Message Passing Interface (MPI), that supports fault recovery without application shutdown. This specification is derived from a current implementation of Fenix that builds on the User Level Failure Mitigation (ULFM) proposal for MPI fault tolerance. We present only the C library interface for Fenix; the Fortran interface will be added once the C version is complete.