A Case Study in Using Containers to Build and Distribute HPC Applications: ALEGRA
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report documents the completion of milestone STPRO4-6 Kokkos Support for ASC applications and libraries. The team provided consultation and support for numerous ASC code projects including Sandias SPARC, EMPIRE, Aria, GEMMA, Alexa, Trilinos, LAMMPS and nimbleSM. Over the year more than 350 Kokkos github issues were resolved, with over 220 requiring fixes and enhancements to the code base. Resolving these requests, with many of them issued by ASC code teams, provided applications with the necessary capabilities in Kokkos to be successful.
Abstract not provided.
Abstract not provided.
Abstract not provided.
The SPARC (Sandia Parallel Aerodynamics and Reentry Code) will provide nuclear weapon qualification evidence for the random vibration and thermal environments created by re-entry of a warhead into the earth’s atmosphere. SPARC incorporates the innovative approaches of ATDM projects on several fronts including: effective harnessing of heterogeneous compute nodes using Kokkos, exascale-ready parallel scalability through asynchronous multi-tasking, uncertainty quantification through Sacado integration, implementation of state-of-the-art reentry physics and multiscale models, use of advanced verification and validation methods, and enabling of improved workflows for users. SPARC is being developed primarily for the Department of Energy nuclear weapon program, with additional development and use of the code is being supported by the Department of Defense for conventional weapons programs.
Abstract not provided.
23rd AIAA Computational Fluid Dynamics Conference, 2017
High performance computing (HPC) is undergoing a dramatic change in computing architectures. Nextgeneration HPC systems are being based primarily on many-core processing units and general purpose graphics processing units (GPUs). A computing node on a next-generation system can be, and in practice is, heterogeneous in nature, involving multiple memory spaces and multiple execution spaces. This presents a challenge for the development of application codes that wish to compute at the extreme scales afforded by these next-generation HPC technologies and systems - the best parallel programming model for one system is not necessarily the best parallel programming model for another. This inevitably raises the following question: how does an application code achieve high performance on disparate computing architectures without having entirely different, or at least significantly different, code paths, one for each architecture? This question has given rise to the term ‘performance portability’, a notion concerned with porting application code performance from architecture to architecture using a single code base. In this paper, we present the work being done at Sandia National Labs to develop a performance portable compressible CFD code that is targeting the ‘leadership’ class supercomputers the National Nuclear Security Administration (NNSA) is acquiring over the course of the next decade.
The Next Generation Global Atmosphere Model LDRD project developed a suite of atmosphere models: a shallow water model, an x - z hydrostatic model, and a 3D hydrostatic model, by using Albany, a finite element code. Albany provides access to a large suite of leading-edge Sandia high- performance computing technologies enabled by Trilinos, Dakota, and Sierra. The next-generation capabilities most relevant to a global atmosphere model are performance portability and embedded uncertainty quantification (UQ). Performance portability is the capability for a single code base to run efficiently on diverse set of advanced computing architectures, such as multi-core threading or GPUs. Embedded UQ refers to simulation algorithms that have been modified to aid in the quantifying of uncertainties. In our case, this means running multiple samples for an ensemble concurrently, and reaping certain performance benefits. We demonstrate the effectiveness of these approaches here as a prelude to introducing them into ACME.
Abstract not provided.
Abstract not provided.
This report provides in-depth information and analysis to help create a technical road map for developing next-generation programming models and runtime systems that support Advanced Simulation and Computing (ASC) work- load requirements. The focus herein is on asynchronous many-task (AMT) model and runtime systems, which are of great interest in the context of "Oriascale7 computing, as they hold the promise to address key issues associated with future extreme-scale computer architectures. This report includes a thorough qualitative and quantitative examination of three best-of-class AIM] runtime systems – Charm-++, Legion, and Uintah, all of which are in use as part of the Centers. The studies focus on each of the runtimes' programmability, performance, and mutability. Through the experiments and analysis presented, several overarching Predictive Science Academic Alliance Program II (PSAAP-II) Asc findings emerge. From a performance perspective, AIV runtimes show tremendous potential for addressing extreme- scale challenges. Empirical studies show an AM runtime can mitigate performance heterogeneity inherent to the machine itself and that Message Passing Interface (MP1) and AM11runtimes perform comparably under balanced conditions. From a programmability and mutability perspective however, none of the runtimes in this study are currently ready for use in developing production-ready Sandia ASC applications. The report concludes by recommending a co- design path forward, wherein application, programming model, and runtime system developers work together to define requirements and solutions. Such a requirements-driven co-design approach benefits the community as a whole, with widespread community engagement mitigating risk for both application developers developers. and high-performance computing runtime systein
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
20th AIAA Computational Fluid Dynamics Conference 2011
We analyze the artificial dissipation introduced by a streamline-upwind Petrov-Galerkin finite element method and consider its effect on the conservation of total enthalpy for the Euler and laminar Navier-Stokes equations. We also consider the chemically reacting case. We demonstrate that in general, total enthalpy is not conserved for the important special case of the steady-state Euler equations. A modification to the artificial dissipation is proposed and shown to significantly improve the conservation of total enthalpy.
A common purpose for performing an aerodynamic analysis is to calculate the resulting loads on a solid body immersed in the flow. Pressure or heat loads are often of interest for characterizing the structural integrity or thermal survivability of the structure. This document describes two algorithms for tightly coupling the mass, momentum and energy conservation equations for a compressible fluid and the energy conservation equation for heat transfer through a solid. We categorize both approaches as monolithically coupled, where the conservation equations for the fluid and the solid are assembled into a single residual vector. Newton's method is then used to solve the resulting nonlinear system of equations. These approaches are in contrast to other popular coupling schemes such as staggered coupling methods were each discipline is solved individually and loads are passed between as boundary conditions, and demonstrates the viability of the monolithic approach for aeroheating problems.
Abstract not provided.
Abstract not provided.
Abstract not provided.
VLSI Design
In this article we concisely present several modern strategies that are applicable to drift-dominated carrier transport in higher-order deterministic models such as the drift-diffusion, hydrodynamic, and quantum hydrodynamic systems. The approaches include extensions of `upwind' and artificial dissipation schemes, generalization of the traditional Scharfetter-Gummel approach, Petrov-Galerkin and streamline-upwind Petrov Galerkin (SUPG), `entropy' variables, transformations, least-squares mixed methods and other stabilized Galerkin schemes such as Galerkin least squares and discontinuous Galerkin schemes. The treatment is representative rather than an exhaustive review and several schemes are mentioned only briefly with appropriate reference to the literature. Some of the methods have been applied to the semiconductor device problem while others are still in the early stages of development for this class of applications. We have included numerical examples from our recent research tests with some of the methods. A second aspect of the work deals with algorithms that employ unstructured grids in conjunction with adaptive refinement strategies. The full benefits of such approaches have not yet been developed in this application area and we emphasize the need for further work on analysis, data structures and software to support adaptivity. Finally, we briefly consider some aspects of software frameworks. These include dial-an-operator approaches such as that used in the industrial simulator PROPHET, and object-oriented software, support such as those in the SANDIA National Laboratory framework SIERRA.