Publications

Results 1–100 of 135

A performance-portable nonhydrostatic atmospheric dycore for the energy exascale earth system model running at cloud-resolving resolutions

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Bertagna, Luca B.; Guba, Oksana G.; Taylor, Mark A.; Foucar, James G.; Larkin, Jeff; Bradley, Andrew M.; Rajamanickam, Sivasankaran R.; Salinger, Andrew G.

We present an effort to port the nonhydrostatic atmosphere dynamical core of the Energy Exascale Earth System Model (E3SM) to efficiently run on a variety of architectures, including conventional CPU, many-core CPU, and GPU. We specifically target cloud-resolving resolutions of 3 km and 1 km. To express on-node parallelism we use the C++ library Kokkos, which allows us to achieve performance-portable code in a largely architecture-independent way. Our C++ implementation is at least as fast as the original Fortran implementation on IBM Power9 and Intel Knights Landing processors, showing that the code refactor did not compromise efficiency on CPU architectures. On GPUs, our implementation achieves 0.97 Simulated Years Per Day when running on the full Summit supercomputer. To the best of our knowledge, this is the best performance achieved to date by any global atmosphere dynamical core running at such resolutions.

SCREAM: a performance-portable global cloud-resolving model based on the Energy Exascale Earth System Model

Hillman, Benjamin H.; Caldwell, Peter C.; Salinger, Andrew G.; Bertagna, Luca B.; Beydoun, Hassan B.; Bogenschutz, Peter; Bradley, Andrew M.; Donahue, Aaron D.; Eldred, Christopher; Foucar, James G.; Golaz, Chris G.; Guba, Oksana G.; Jacob, Robert J.; Johnson, Jeff J.; Keen, Noel K.; Krishna, Jayesh K.; Lin, Wuyin L.; Liu, Weiran L.; Pressel, Kyle P.; Singh, Balwinder S.; Steyer, Andrew S.; Taylor, Mark A.; Terai, Chris T.; Ullrich, Paul A.; Wu, Danqing W.; Yuan, Xingqui Y.

Abstract not provided.

HOMMEXX 1.0: A performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model

Geoscientific Model Development

Bertagna, Luca B.; Deakin, Michael; Guba, Oksana G.; Sunderland, Daniel S.; Bradley, Andrew M.; Kalashnikova, Irina; Taylor, Mark A.; Salinger, Andrew G.

We present an architecture-portable and performant implementation of the atmospheric dynamical core (High-Order Methods Modeling Environment, HOMME) of the Energy Exascale Earth System Model (E3SM). The original Fortran implementation is highly performant and scalable on conventional architectures using the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) programming models. We rewrite the model in C++ and use the Kokkos library to express on-node parallelism in a largely architecture-independent implementation. Kokkos provides an abstraction of a compute node or device, layout-polymorphic multidimensional arrays, and parallel execution constructs. The new implementation achieves the same or better performance on conventional multicore computers and is portable to GPUs. We present performance data for the original and new implementations on multiple platforms, on up to 5400 compute nodes, and study several aspects of the single- and multi-node performance characteristics of the new implementation on conventional CPU (e.g., Intel Xeon), many-core CPU (e.g., Intel Xeon Phi Knights Landing), and Nvidia V100 GPU.

Description and evaluation of the Community Ice Sheet Model (CISM) v2.1

Geoscientific Model Development

Lipscomb, William H.; Price, Stephen F.; Hoffman, Matthew J.; Leguy, Gunter R.; Bennett, Andrew R.; Bradley, Sarah L.; Evans, Katherine J.; Fyke, Jeremy G.; Kennedy, Joseph H.; Perego, Mauro P.; Ranken, Douglas M.; Sacks, William J.; Salinger, Andrew G.; Vargo, Lauren J.; Worley, Patrick H.

We describe and evaluate version 2.1 of the Community Ice Sheet Model (CISM). CISM is a parallel, 3-D thermomechanical model, written mainly in Fortran, that solves equations for the momentum balance and the thickness and temperature evolution of ice sheets. CISM's velocity solver incorporates a hierarchy of Stokes flow approximations, including shallow-shelf, depth-integrated higher order, and 3-D higher order. CISM also includes a suite of test cases, links to third-party solver libraries, and parameterizations of physical processes such as basal sliding, iceberg calving, and sub-ice-shelf melting. The model has been verified for standard test problems, including the Ice Sheet Model Intercomparison Project for Higher-Order Models (ISMIP-HOM) experiments, and has participated in the initMIP-Greenland initialization experiment. In multimillennial simulations with modern climate forcing on a 4 km grid, CISM reaches a steady state that is broadly consistent with observed flow patterns of the Greenland ice sheet. CISM has been integrated into version 2.0 of the Community Earth System Model, where it is being used for Greenland simulations under past, present, and future climates. The code is open-source with extensive documentation and remains under active development.

The Aeras Next Generation Global Atmosphere Model

Bosler, Peter A.; Bova, S.W.; Demeshko, Irina P.; Fike, Jeffrey A.; Guba, Oksana G.; Overfelt, James R.; Roesler, Erika L.; Salinger, Andrew G.; Smith, Thomas M.; Kalashnikova, Irina; Watkins, Jerry E.

The Next Generation Global Atmosphere Model LDRD project developed a suite of atmosphere models (a shallow water model, an x-z hydrostatic model, and a 3D hydrostatic model) using Albany, a finite element code. Albany provides access to a large suite of leading-edge Sandia high-performance computing technologies enabled by Trilinos, Dakota, and Sierra. The next-generation capabilities most relevant to a global atmosphere model are performance portability and embedded uncertainty quantification (UQ). Performance portability is the capability for a single code base to run efficiently on a diverse set of advanced computing architectures, such as multi-core CPUs with threading or GPUs. Embedded UQ refers to simulation algorithms that have been modified to aid in quantifying uncertainties. In our case, this means running multiple samples of an ensemble concurrently and reaping certain performance benefits. We demonstrate the effectiveness of these approaches here as a prelude to introducing them into ACME.
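
The idea of running several ensemble samples concurrently through one code path can be illustrated with a small sketch. This is not Albany or Trilinos code: the `Ensemble` class, the `decay_step` kernel, and the decay equation are all hypothetical, chosen only to show a kernel written once and evaluated for every sample in a single pass.

```python
class Ensemble:
    """Hypothetical ensemble scalar: one value per sample, elementwise arithmetic."""

    def __init__(self, samples):
        self.samples = tuple(float(s) for s in samples)

    def _lift(self, other):
        # Broadcast a plain number to every sample; check sizes otherwise.
        if isinstance(other, Ensemble):
            if len(other.samples) != len(self.samples):
                raise ValueError("ensemble sizes differ")
            return other.samples
        return (float(other),) * len(self.samples)

    def __add__(self, other):
        return Ensemble(a + b for a, b in zip(self.samples, self._lift(other)))

    __radd__ = __add__

    def __sub__(self, other):
        return Ensemble(a - b for a, b in zip(self.samples, self._lift(other)))

    def __rsub__(self, other):
        return Ensemble(b - a for a, b in zip(self.samples, self._lift(other)))

    def __mul__(self, other):
        return Ensemble(a * b for a, b in zip(self.samples, self._lift(other)))

    __rmul__ = __mul__


def decay_step(u, rate, dt):
    """One explicit Euler step of du/dt = -rate * u (illustrative kernel only)."""
    return u - dt * rate * u


if __name__ == "__main__":
    # Three uncertain decay rates advance together through the same kernel.
    rates = Ensemble([0.5, 1.0, 2.0])
    print(decay_step(1.0, rates, 0.1).samples)
```

Because the kernel only uses arithmetic operators, the same function accepts a plain float or an `Ensemble`; the performance benefits mentioned in the abstract would come from a compiled implementation, which this sketch does not attempt to reproduce.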

A matrix dependent/algebraic multigrid approach for extruded meshes with applications to ice sheet modeling

SIAM Journal on Scientific Computing

Tuminaro, R.; Perego, Mauro P.; Tezaur, I.; Salinger, Andrew G.; Price, S.

A multigrid method is proposed that combines ideas from matrix-dependent multigrid for structured grids and algebraic multigrid for unstructured grids. It targets problems where a three-dimensional mesh can be viewed as an extrusion of a two-dimensional, unstructured mesh in a third dimension. Our motivation comes from the modeling of thin structures via finite elements and, more specifically, the modeling of ice sheets. Extruded meshes are relatively common for thin structures and often give rise to anisotropic problems when the thin-direction mesh spacing is much smaller than the broad-direction mesh spacing. Within our approach, the first few multigrid hierarchy levels are obtained by applying matrix-dependent multigrid to semicoarsen in the structured thin direction. After sufficient structured coarsening, the resulting mesh contains only a single layer corresponding to a two-dimensional, unstructured mesh. Algebraic multigrid can then be employed in a standard manner to create further coarse levels, as the anisotropic phenomena are no longer present in the single-layer problem. The overall approach remains fully algebraic, with the minor exception that some additional information is needed to determine the extruded direction. This facilitates integration of the solver with a variety of different extruded mesh applications.

On the scalability of the Albany/FELIX first-order Stokes approximation ice sheet solver for large-scale simulations of the Greenland and Antarctic ice sheets

Procedia Computer Science

Kalashnikova, Irina; Tuminaro, Raymond S.; Perego, Mauro P.; Salinger, Andrew G.; Price, Stephen F.

We examine the scalability of the recently developed Albany/FELIX finite-element based code for the first-order Stokes momentum balance equations for ice flow. We focus our analysis on the performance of two possible preconditioners for the iterative solution of the sparse linear systems that arise from the discretization of the governing equations: (1) a preconditioner based on the incomplete LU (ILU) factorization, and (2) a recently-developed algebraic multigrid (AMG) preconditioner, constructed using the idea of semi-coarsening. A strong scalability study on a realistic, high resolution Greenland ice sheet problem reveals that, for a given number of processor cores, the AMG preconditioner results in faster linear solve times but the ILU preconditioner exhibits better scalability. A weak scalability study is performed on a realistic, moderate resolution Antarctic ice sheet problem, a substantial fraction of which contains floating ice shelves, making it fundamentally different from the Greenland ice sheet problem. Here, we show that as the problem size increases, the performance of the ILU preconditioner deteriorates whereas the AMG preconditioner maintains scalability. This is because the linear systems are extremely ill-conditioned in the presence of floating ice shelves, and the ill-conditioning has a greater negative effect on the ILU preconditioner than on the AMG preconditioner.

QCAD simulation and optimization of semiconductor double quantum dots

Nielsen, Erik N.; Gao, Xujiao G.; Kalashnikova, Irina; Muller, Richard P.; Salinger, Andrew G.; Young, Ralph W.

We present the Quantum Computer Aided Design (QCAD) simulator that targets modeling quantum devices, particularly silicon double quantum dots (DQDs) developed for qubits. The simulator has three differentiating features: (i) its core contains nonlinear Poisson, effective mass Schrodinger, and Configuration Interaction solvers that have massively parallel capability for high simulation throughput, and can be run individually or combined self-consistently for 1D/2D/3D quantum devices; (ii) the core solvers show superior convergence even at near-zero-Kelvin temperatures, which is critical for modeling quantum computing devices; (iii) it couples with the optimization engine Dakota, which enables optimization of gate voltages in DQDs for multiple desired targets. The Poisson solver includes Maxwell-Boltzmann and Fermi-Dirac statistics, supports Dirichlet, Neumann, interface charge, and Robin boundary conditions, and includes the effect of dopant incomplete ionization. The solver has shown robust nonlinear convergence even in the milli-Kelvin temperature range, and has been extensively used to quickly obtain the semiclassical electrostatic potential in DQD devices. The self-consistent Schrodinger-Poisson solver has achieved robust and monotonic convergence behavior for 1D/2D/3D quantum devices at very low temperatures by using a predictor-corrector iteration scheme. The QCAD simulator enables the calculation of dot-to-gate capacitances, and comparison with experiment and between solvers. It is observed that computed capacitances are in the right ballpark when compared to experiment, and that quantum confinement increases capacitance when the number of electrons is fixed in a quantum dot. In addition, the coupling of QCAD with Dakota allows rapid identification of the device layouts that are more likely to lead to few-electron quantum dots. Very efficient QCAD simulations on a large number of fabricated and proposed Si DQDs have made it possible to provide fast feedback for design comparison and optimization.

The QCAD framework for quantum device modeling

Computational Electronics (IWCE), 2012 15th International Workshop on

Gao, Xujiao G.; Nielsen, Erik N.; Muller, Richard P.; Young, Ralph W.; Salinger, Andrew G.; Carroll, Malcolm

We present the Quantum Computer Aided Design (QCAD) simulator that targets modeling quantum devices, particularly Si double quantum dots (DQDs) developed for quantum computing. The simulator core includes Poisson, Schrodinger, and Configuration Interaction solvers which can be run individually or combined self-consistently. The simulator is built upon Sandia-developed Trilinos and Albany components, and is interfaced with the Dakota optimization tool. It is being developed for seamless integration, high flexibility and throughput, and is intended to be open source. The QCAD tool has been used to simulate a large number of fabricated silicon DQDs and has provided fast feedback for design comparison and optimization.

Automating embedded analysis capabilities and managing software complexity in multiphysics simulation, Part I: Template-based generic programming

Scientific Programming

Pawlowski, Roger P.; Phipps, Eric T.; Salinger, Andrew G.

An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering. © 2012 - IOS Press and the authors. All rights reserved.
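
The operator-overloading idea in this abstract can be sketched in a few lines. The example below is only an illustration: it uses Python duck typing in place of C++ templates, it is not the Trilinos implementation, and the `Dual` class, the `residual` function (a made-up stirred-reactor balance), and the `newton` helper are all hypothetical names. The point it shows is that one residual routine, written once, yields both a value and a derivative depending on the scalar type passed in.

```python
class Dual:
    """Hypothetical forward-mode scalar carrying a value and one derivative."""

    def __init__(self, val, der=0.0):
        self.val = float(val)
        self.der = float(der)

    @staticmethod
    def _lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual._lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual._lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        o = Dual._lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__


def residual(c, k=2.0, c_in=1.0):
    """Assumed steady-state balance: inflow - outflow - second-order reaction."""
    return c_in - c - k * c * c


def value_and_derivative(f, x):
    """Evaluate f and df/dx at x by seeding the derivative slot with 1."""
    out = f(Dual(x, 1.0))
    return out.val, out.der


def newton(f, x, tol=1e-12, max_iter=50):
    """Newton's method using the derivative obtained from the same routine."""
    for _ in range(max_iter):
        val, der = value_and_derivative(f, x)
        if abs(val) < tol:
            break
        x -= val / der
    return x


if __name__ == "__main__":
    print(residual(0.5))                         # plain float evaluation
    print(value_and_derivative(residual, 0.5))   # value and derivative
    print(newton(residual, 0.0))                 # root of the residual
```

The residual never mentions derivatives; swapping the scalar type is what produces them, which is the property the abstract attributes to templating in C++.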

Peridigm summary report: lessons learned in development with agile components

Parks, Michael L.; Littlewood, David J.; Salinger, Andrew G.

This report details efforts to deploy Agile Components for rapid development of a peridynamics code, Peridigm. The goal of Agile Components is to enable the efficient development of production-quality software by providing a well-defined, unifying interface to a powerful set of component-based software. Specifically, Agile Components facilitate interoperability among packages within the Trilinos Project, including data management, time integration, uncertainty quantification, and optimization. Development of the Peridigm code served as a testbed for Agile Components and resulted in a number of recommendations for future development. Agile Components successfully enabled rapid integration of Trilinos packages into Peridigm. A cost of this approach, however, was a set of restrictions on Peridigm's architecture which impacted the ability to track history-dependent material data, dynamically modify the model discretization, and interject user-defined routines into the time integration algorithm. These restrictions resulted in modifications to the Agile Components approach, as implemented in Peridigm, and in a set of recommendations for future Agile Components development. Specific recommendations include improved handling of material states, a more flexible flow control model, and improved documentation. A demonstration mini-application, SimpleODE, was developed at the onset of this project and is offered as a potential supplement to Agile Components documentation.

Continuation and bifurcation analysis of large-scale dynamical systems with LOCA

Salinger, Andrew G.; Pawlowski, Roger P.

Dynamical systems theory provides a powerful framework for understanding the behavior of complex evolving systems. However, applying these ideas to large-scale dynamical systems such as discretizations of multi-dimensional PDEs is challenging. Such systems can easily give rise to problems with billions of dynamical variables, requiring specialized numerical algorithms implemented on high performance computing architectures with thousands of processors. This talk will describe LOCA, the Library of Continuation Algorithms, a suite of scalable continuation and bifurcation tools optimized for these types of systems that is part of the Trilinos software collection. In particular, we will describe continuation and bifurcation analysis techniques designed for large-scale dynamical systems that are based on specialized parallel linear algebra methods for solving augmented linear systems. We will also discuss several other Trilinos tools providing nonlinear solvers (NOX), eigensolvers (Anasazi), iterative linear solvers (AztecOO and Belos), preconditioners (Ifpack, ML, Amesos) and parallel linear algebra data structures (Epetra and Tpetra) that LOCA can leverage for efficient and scalable analysis of large-scale dynamical systems.
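
To make the phrase "augmented linear systems" concrete, here is a toy sketch of pseudo-arclength continuation, one standard continuation technique, applied to a single scalar equation. It does not use the LOCA API and is not claimed to be LOCA's algorithm; the function `trace_branch`, the example problem `fold` (x² − λ = 0), and all parameter names are assumptions made for illustration. The residual is augmented with an arclength constraint so that the 2×2 Newton system stays solvable at the turning point, where the derivative with respect to x vanishes.

```python
import math


def fold(x, lam):
    """Example problem (assumed): x**2 - lam = 0, with a turning point at (0, 0)."""
    return x * x - lam


def fold_x(x, lam):
    return 2.0 * x


def fold_lam(x, lam):
    return -1.0


def trace_branch(f, f_x, f_lam, x, lam, ds=0.1, steps=40, direction=(0.0, -1.0)):
    """Follow f(x, lam) = 0 from a known solution using pseudo-arclength steps."""
    tx_prev, tl_prev = direction
    branch = [(x, lam)]
    for _ in range(steps):
        # Unit tangent: null vector of the row [f_x, f_lam].
        # (A sketch: a point where both derivatives vanish is not handled.)
        tx, tl = -f_lam(x, lam), f_x(x, lam)
        norm = math.hypot(tx, tl)
        tx, tl = tx / norm, tl / norm
        # Keep the orientation consistent with the previous step.
        if tx * tx_prev + tl * tl_prev < 0.0:
            tx, tl = -tx, -tl
        x0, lam0 = x, lam
        # Predictor: step of length ds along the tangent.
        x, lam = x0 + ds * tx, lam0 + ds * tl
        # Corrector: Newton on the augmented system (residual + arclength constraint).
        for _ in range(20):
            r1 = f(x, lam)
            r2 = tx * (x - x0) + tl * (lam - lam0) - ds
            if abs(r1) < 1e-12 and abs(r2) < 1e-12:
                break
            a, b = f_x(x, lam), f_lam(x, lam)
            det = a * tl - b * tx
            # Cramer's rule for [[a, b], [tx, tl]] @ (dx, dl) = (-r1, -r2).
            dx = (-r1 * tl + b * r2) / det
            dl = (-a * r2 + tx * r1) / det
            x, lam = x + dx, lam + dl
        branch.append((x, lam))
        tx_prev, tl_prev = tx, tl
    return branch


if __name__ == "__main__":
    # Start on the upper branch at (1, 1) and head toward decreasing lam.
    for point in trace_branch(fold, fold_x, fold_lam, 1.0, 1.0)[::5]:
        print(point)
```

In a large-scale setting the scalar derivatives become a sparse Jacobian and the 2×2 solve becomes a bordered linear system, which is where the specialized parallel linear algebra mentioned in the abstract comes in.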
