Publications Search

Towards performance-portability of the Albany Finite Element analysis code using the Kokkos library of Trilinos

International Journal of HPC Applications

Demeshko, Irina; Salinger, Andrew G.; Spotz, William S.; Tezaur, Irina K.; Guba, Oksana; Heroux, Michael A.

Performance portability on heterogeneous high-performance computing (HPC) systems is a major challenge faced today by code developers: parallel code needs to execute correctly as well as with high performance on machines with different architectures, operating systems, and software libraries. The Finite Element Method (FEM) is a popular and flexible method for discretizing partial differential equations arising in a wide variety of scientific, engineering, and industry applications that require HPC. This paper presents some preliminary results pertaining to our development of a performance-portable implementation of the FEM-based Albany code. Performance portability is achieved using the Kokkos library of Trilinos. We present performance results for two different physics simulation modules in Albany: the Aeras global atmosphere dynamical code and the FELIX land-ice solver. Numerical experiments show that our single code implementation gives reasonable performance across two multi-core/many-core architectures: NVIDIA GPUs and multi-core CPUs.

TYPE Journal Article YEAR 2016

DOI OSTI

Sustainable & productive: Improving incentives for quality software

CEUR Workshop Proceedings

Heroux, Michael A.

Computational Science and Engineering (CSE) software can benefit substantially from an explicit focus on quality improvement. This is especially true as we face increased demands in both modeling and software complexities. At the same time, just desiring improved quality is not sufficient. We must work with the entities that provide CSE research teams with publication venues, funding, and professional recognition in order to increase incentives for improved software quality. In fact, software quality is precisely calibrated to the expectations, explicit and implicit, set by these entities. We will see broad improvements in sustainability and productivity only when publishers, funding agencies and employers raise their expectations for software quality. CSE software community leaders, those who are in a position to inform and influence these entities, have a unique opportunity to broadly and positively impact software quality by working to establish incentives that will spur creative and novel approaches to improve developer productivity and software sustainability.

TYPE Conference Poster YEAR 2016

OSTI Scopus

Preconditioning Communication-Avoiding Krylov Methods

Rajamanickam, Sivasankaran; Yamazaki, I.; Boman, Erik G.; Prokopenko, Andrey V.; Heroux, Michael A.; Dongarra, J.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Strategies for Next Generation HPC Applications and Systems

Heroux, Michael A.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Local recovery and failure masking for stencil-based applications at extreme scales

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Gamell, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Application resilience is a key challenge that has to be addressed to realize the exascale vision. Online recovery, even when it involves all processes, can dramatically reduce the overhead of failures as compared to the more traditional approach where the job is terminated and restarted from the last checkpoint. In this paper we explore how local recovery can be used for certain classes of applications to further reduce overheads due to resilience. Specifically, we develop programming support and scalable runtime mechanisms to enable online and transparent local recovery for stencil-based parallel applications on current leadership-class systems. We also show how multiple independent failures can be masked to effectively reduce the impact on the total time to solution. We integrate these mechanisms with the S3D combustion simulation, and experimentally demonstrate (using the Titan Cray-XK7 system at ORNL) the ability to tolerate high failure rates (i.e., node failures every 5 seconds) with low overhead while sustaining performance, at scales up to 262144 cores.

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

LFLR for MPI+X

Teranishi, Keita; Heroux, Michael A.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Preconditioning Communication-Avoiding Krylov Methods

Rajamanickam, Sivasankaran; Yamazaki, Ichitaro; Boman, Erik G.; Hoemmen, Mark F.; Heroux, Michael A.; Tomov, Stan; Dongarra, Jack

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Addressing sustainability and performance portability challenges in Albany

Demeshko, Irina; Heroux, Michael A.; Salinger, Andrew G.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Local Recovery and Failure Masking for Stencil-based Applications at Extreme Scales

Gamell Balmana, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Assessing a mini-application as a performance proxy for a finite element method engineering application

Concurrency and Computation: Practice and Experience

Lin, Paul T.; Heroux, Michael A.; Williams, Alan B.; Barrett, Richard F.

The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used both to improve the performance of the app and to provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Single compute node performance will be compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.

TYPE Journal Article YEAR 2015

OSTI DOI

Exploring failure recovery for stencil-based applications at extreme scales

HPDC 2015 - Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing

Gamell Balmana, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Application resilience is a key challenge that must be addressed in order to realize the exascale vision. Previous work has shown that online recovery, even when done in a global manner (i.e., involving all processes), can dramatically reduce the overhead of failures when compared to the more traditional approach of terminating the job and restarting it from the last stored checkpoint. In this paper we suggest going one step further, and explore how local recovery can be used for certain classes of applications to reduce the overheads due to failures. Specifically, we study the feasibility of local recovery for stencil-based parallel applications and we show how multiple independent failures can be masked to effectively reduce the impact on the total time to solution.

TYPE Conference Poster YEAR 2015

OSTI Scopus

ACM TOMS replicated computational results initiative

ACM Transactions on Mathematical Software

Heroux, Michael A.

The scientific community relies on the peer review process for assuring the quality of published material, the goal of which is to build a body of work we can trust. Computational journals such as the ACM Transactions on Mathematical Software (TOMS) use this process for rigorously promoting the clarity and completeness of content, and citation of prior work. At the same time, it is unusual to independently confirm computational results.

TYPE Journal Article YEAR 2015

DOI OSTI

Addressing sustainability and performance portability challenges in Albany

Demeshko, Irina; Salinger, Andrew G.; Heroux, Michael A.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Versioned Distributed Arrays for Resilience in Scientific Applications: Global View Resilience

Teranishi, Keita; Heroux, Michael A.; Hoemmen, Mark F.; Chien, Andrew; Balaji, Pavan; Beckman, Pete; Dun, Nan; Fang, Aiman; Fujita, Hajime; Iskra, Kamil; Rubenstein, Zachary; Zheng, Zimming; Schreiber, Robert; Hammond, Jeff; Dinan, James; Laguna, Ignacio; Richards, David; Dubey, Anshu; Van Straalen, Brian; Siegel, Andrew

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Exploring Failure Recovery for Stencil-based Applications at Extreme Scales

Gamell, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Local Failure Local Recovery for large scale SPMD applications

Heroux, Michael A.; Parashar, Manish; Gamell, Marc

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

A Kokkos Implementation of Albany: A Performance Portable Multiphysics Simulation Code

Demeshko, Irina; Bradley, Andrew M.; Cyr, Eric C.; Edwards, Harold C.; Heroux, Michael A.; Phipps, Eric T.; Salinger, Andrew G.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Preconditioning Communication-Avoiding Krylov Methods

Rajamanickam, Sivasankaran; Yamazaki, Ichitaro; Boman, Erik G.; Hoemmen, Mark F.; Heroux, Michael A.; Tomov, Stanimire; Dongarra, Jack

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Towards Exascale Implementation of the Finite Element Based Application Development Environment

Demeshko, Irina; Edwards, Harold C.; Heroux, Michael A.; Salinger, Andrew G.; Pawlowski, Roger; Phipps, Eric T.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Assessing the role of mini-applications in predicting key performance characteristics of scientific and engineering applications

Journal of Parallel and Distributed Computing

Barrett, R.F.; Crozier, Paul; Doerfler, Douglas W.; Heroux, Michael A.; Lin, Paul T.; Thornquist, Heidi K.; Trucano, Timothy G.; Vaughan, Courtenay T.

Computational science and engineering application programs are typically large, complex, and dynamic, and are often constrained by distribution limitations. As a means of making tractable rapid explorations of scientific and engineering application programs in the context of new, emerging, and future computing architectures, a suite of "miniapps" has been created to serve as proxies for full scale applications. Each miniapp is designed to represent a key performance characteristic that does or is expected to significantly impact the runtime performance of an application program. In this paper we introduce a methodology for assessing the ability of these miniapps to effectively represent these performance issues. We applied this methodology to three miniapps, examining the linkage between them and an application they are intended to represent. Herein we evaluate the fidelity of that linkage. This work represents the initial steps required to begin to answer the question, "Under what conditions does a miniapp represent a key performance characteristic in a full app?"

TYPE Journal Article YEAR 2015

Scopus OSTI DOI

Mantevo 3.0 Overview

Heroux, Michael A.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Failure Masking and Local Recovery for Stencil-based Applications at Extreme Scales

Gamell, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Abstract not provided.

TYPE Conference Poster YEAR 2014

OSTI

Algorithms and Abstractions for Assembly in PDE Codes: Workshop Report

Cyr, Eric C.; Phipps, Eric T.; Heroux, Michael A.; Brown, Jed; Coon, Ethan T.; Hoemmen, Mark F.; Kirby, Robert C.; Kolev, Tzanio V.; Sutherland, James C.; Trott, Christian R.

The emergence of high-concurrency architectures offering unprecedented performance has brought many high-performance partial differential equation (PDE) discretization codes to the precipice of a major refactor. To help address this challenge a workshop titled "Algorithms and Abstractions for Assembly in PDE Codes" was held in the Computer Science Research Institute at Sandia National Laboratories on May 12th-14th, 2014. This document summarizes the goals of the workshop and the results of the presentations and subsequent discussions.

TYPE SAND Report YEAR 2014

DOI OSTI