Publications Details
Hardware Evaluation Outreach: Application Development Challenges Now and for the Exascale Era
Bair, Ray; Cook, Jeanine C.; Donofrio, David; Kuehn, Jeff; Moore, Shirley
The intent of this document is to help programmers understand the details of contemporary and Exascale hardware system design, and how these designs both provide opportunities for and place constraints on next-generation simulation software design. We attempt to clarify hardware organization and component details for our most current and Exascale systems so that program developers can understand how software needs to change to take best advantage of the available performance. For the Exascale Computing Project (ECP), Exascale success is specifically defined as a 50x improvement over baseline in the aggregate "capability volume" across several key performance parameter (KPP) axes. Raw floating-point performance is only one of these axes; the others include characteristics such as problem size, system memory size, node memory size, power, and efficiency. This multi-axis approach is particularly important in the context of delivered improvements in real applications, since, for instance, floating-point computation may comprise less than 10% of the actual computational work required. Given the Exascale requirements and the constraints they place on the performance expectations of fundamental system components, programmers will be forced to rethink several application implementation details in order to achieve exaflop performance on these systems. The remainder of this document presents more detail on Exascale-era system hardware and the specific areas that programmers should address to extract performance from these systems. We attempt to give guidance at both a high and a low level, providing some abstract suggestions on how to refactor codes given the expected system architectures and some low-level recommendations on how to implement these modifications.
We also include a section on training resources that are helpful both for programmers who are just beginning to understand code modifications for contemporary and Exascale systems and for those who have done some refactoring and are now trying to extract maximal application performance from these systems.