Publications

Results 5951–5975 of 9,998

Search results

Breaking Computational Barriers: Real-time Analysis and Optimization with Large-scale Nonlinear Models via Model Reduction

Drohmann, Martin; Tuminaro, Raymond S.; Boggs, Paul T.; Ray, Jaideep R.; van Bloemen Waanders, Bart G.; Carlberg, Kevin T.

Model reduction for dynamical systems is a promising approach for reducing the computational cost of large-scale physics-based simulations to enable high-fidelity models to be used in many-query (e.g., Bayesian inference) and near-real-time (e.g., fast-turnaround simulation) contexts. While model reduction works well for specialized problems such as linear time-invariant systems, it is much more difficult to obtain accurate, stable, and efficient reduced-order models (ROMs) for systems with general nonlinearities. This report describes several advances that enable nonlinear ROMs to be deployed in a variety of time-critical settings. First, we present an error bound for the Gauss-Newton with Approximated Tensors (GNAT) nonlinear model reduction technique. This bound allows the state-space error for the GNAT method to be quantified when applied with the backward Euler time-integration scheme. Second, we present a methodology for preserving classical Lagrangian structure in nonlinear model reduction. This technique guarantees that important properties--such as energy conservation and symplectic time-evolution maps--are preserved when performing model reduction for models described by a Lagrangian formalism (e.g., molecular dynamics, structural dynamics). Third, we present a novel technique for decreasing the temporal complexity--defined as the number of Newton-like iterations performed over the course of the simulation--by exploiting time-domain data. Fourth, we describe a novel method for refining projection-based reduced-order models a posteriori using a goal-oriented framework similar to mesh-adaptive h-refinement in finite elements. The technique allows the ROM to generate arbitrarily accurate solutions, thereby providing the ROM with a 'failsafe' mechanism in the event of insufficient training data. Finally, we present the reduced-order model error surrogate (ROMES) method for statistically quantifying reduced-order-model errors. This enables ROMs to be rigorously incorporated in uncertainty-quantification settings, as the error model can be treated as a source of epistemic uncertainty. This work was completed as part of a Truman Fellowship appointment. We note that much additional work was performed as part of the Fellowship. One salient project is the development of the Trilinos-based model-reduction software module Razor, which is currently bundled with the Albany PDE code and allows nonlinear reduced-order models to be constructed for any application supported in Albany. Other important projects include the following: 1. ROMES-equipped ROMs for Bayesian inference: K. Carlberg, M. Drohmann, F. Lu (Lawrence Berkeley National Laboratory), M. Morzfeld (Lawrence Berkeley National Laboratory). 2. ROM-enabled Krylov-subspace recycling: K. Carlberg, V. Forstall (University of Maryland), P. Tsuji, R. Tuminaro. 3. A pseudo balanced POD method using only dual snapshots: K. Carlberg, M. Sarovar. 4. An analysis of discrete vs. continuous optimality in nonlinear model reduction: K. Carlberg, M. Barone, H. Antil (George Mason University). Journal articles for these projects are in progress at the time of this writing.
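
As a rough illustration of the "temporal complexity" measure defined in the abstract above (the number of Newton-like iterations over a simulation), the sketch below counts Newton updates for backward Euler applied to a scalar test problem, du/dt = -u^3. The test problem, the function names, and the use of linear extrapolation from the two previous states as the initial guess are all assumptions made for illustration; the extrapolation is only a simple stand-in for "exploiting time-domain data" and is not the technique developed in the report.

```python
def backward_euler_newton(u_prev, guess, dt, tol=1e-10, max_iter=50):
    """Solve the backward Euler equation u - u_prev + dt*u**3 = 0 with Newton.

    Returns the converged state and the number of Newton updates performed.
    """
    u = guess
    for updates in range(max_iter):
        residual = u - u_prev + dt * u**3
        if abs(residual) < tol:
            return u, updates
        # Newton update: divide by the derivative of the residual w.r.t. u.
        u -= residual / (1.0 + 3.0 * dt * u * u)
    raise RuntimeError("Newton iteration did not converge")


def integrate(n_steps, dt, use_history):
    """Integrate du/dt = -u**3 from u(0) = 1 and count all Newton updates.

    use_history=False: initial guess is the previous state.
    use_history=True:  initial guess is a linear extrapolation of the two
                       previous states (hypothetical stand-in, see lead-in).
    """
    u_older, u_prev = 1.0, 1.0
    total_updates = 0
    for _ in range(n_steps):
        guess = 2.0 * u_prev - u_older if use_history else u_prev
        u_new, updates = backward_euler_newton(u_prev, guess, dt)
        total_updates += updates
        u_older, u_prev = u_prev, u_new
    return u_prev, total_updates


u_base, iters_base = integrate(100, 0.1, use_history=False)
u_hist, iters_hist = integrate(100, 0.1, use_history=True)
print("Newton updates without history:", iters_base)
print("Newton updates with history:   ", iters_hist)
```

Both runs solve the same discrete equations to the same tolerance, so the final states agree closely; only the total number of Newton updates differs.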

Yearly Update: Exascale Projections for 2014

Kogge, Peter M.; Resnick, David R.

The HPC architectures of today are significantly different from those of a decade ago, with high odds that further changes will occur on the road to Exascale. This report discusses the "perfect storm" in technology that produced this change, the classes of architectures we are dealing with, and probable trends in how they will evolve. These properties and trends are then evaluated in terms of what they likely mean for future Exascale systems and applications.

Enabling communication concurrency through flexible MPI endpoints

International Journal of High Performance Computing Applications

Grant, Ryan E.

MPI defines a one-to-one relationship between MPI processes and ranks. This model captures many use cases effectively; however, it also limits communication concurrency and interoperability between MPI and programming models that utilize threads. Our paper describes the MPI endpoints extension, which relaxes the longstanding one-to-one relationship between MPI processes and ranks. Using endpoints, an MPI implementation can map separate communication contexts to threads, allowing them to drive communication independently. Endpoints also enable threads to be addressable in MPI operations, enhancing interoperability between MPI and other programming models. These characteristics are illustrated through several examples and an empirical study that contrasts current multithreaded communication performance with the need for high degrees of communication concurrency to achieve peak communication performance.

Coarse-grained energy modeling of rollback/recovery mechanisms

Proceedings of the International Conference on Dependable Systems and Networks

Ibtesham, Dewan; Debonis, David; Arnold, Dorian; Ferreira, Kurt

As high-performance computing systems continue to grow in size and complexity, energy efficiency and reliability have emerged as first-order concerns. Researchers have shown that data movement is a significant contributing factor to power consumption on these systems. Additionally, rollback/recovery protocols like checkpoint/restart can generate large volumes of data traffic, exacerbating the energy and power concerns. In this work, we show that a coarse-grained model can be used effectively to speculate about the energy footprints of rollback/recovery protocols. Using our validated model, we evaluate the energy footprint of checkpoint compression, a method that incurs higher computational demand to reduce data volumes and data traffic. Specifically, we show that while checkpoint compression leads to more frequent checkpoints (as per the optimal checkpoint frequency) and increases the per-checkpoint energy cost, compression still yields a decrease in total application energy consumption due to the overall runtime decrease.
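
The trade-off described in the abstract above can be sketched with a toy first-order model. Everything here is an assumption made for illustration and is not the validated model from the paper: the checkpoint interval uses Young's classical approximation, sqrt(2 x commit time x MTBF), rework after a failure is charged at compute power, and all parameter values (times in seconds, power in kW, energy in kJ) are made up.

```python
import math


def checkpoint_energy_model(solve_time, mtbf, commit_time,
                            commit_power, compute_power):
    """Toy coarse-grained runtime/energy estimate for checkpoint/restart.

    Assumed units: seconds for times, kW for power, kJ for energy.
    """
    # Young's first-order approximation of the optimal checkpoint interval.
    interval = math.sqrt(2.0 * commit_time * mtbf)
    # Total time spent writing checkpoints over the run.
    checkpoint_time = solve_time / interval * commit_time
    # Expected failures, each losing on average half an interval-plus-commit.
    expected_failures = (solve_time + checkpoint_time) / mtbf
    rework_time = expected_failures * (interval + commit_time) / 2.0
    runtime = solve_time + checkpoint_time + rework_time
    energy = (compute_power * (solve_time + rework_time)
              + commit_power * checkpoint_time)
    return {
        "interval": interval,
        "per_checkpoint_energy": commit_power * commit_time,
        "runtime": runtime,
        "energy": energy,
    }


# Hypothetical inputs: a 100-hour solve on a system with a 4-hour MTBF.
# Compression shortens the commit (600 s -> 350 s) but keeps the CPUs busy,
# so the power drawn while checkpointing is higher (100 kW -> 200 kW).
plain = checkpoint_energy_model(360000.0, 14400.0, 600.0, 100.0, 200.0)
compressed = checkpoint_energy_model(360000.0, 14400.0, 350.0, 200.0, 200.0)

for name, result in (("plain", plain), ("compressed", compressed)):
    print(name, {key: round(value, 1) for key, value in result.items()})
```

With these made-up inputs the compressed configuration checkpoints more often and spends more energy per checkpoint, yet finishes sooner and uses less total energy, which is the qualitative behavior the abstract reports. Different inputs can reverse the outcome, so the sketch shows the mechanism rather than a general result.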

Toward local failure local recovery resilience model using MPI-ULFM

ACM International Conference Proceeding Series

Teranishi, Keita T.; Heroux, Michael A.

The current system reaction to the loss of a single MPI process is to kill all the remaining processes and restart the application from the most recent checkpoint. This approach will become infeasible for future extreme-scale systems. We address this issue using an emerging resilient computing model called Local Failure Local Recovery (LFLR) that provides application developers with the ability to recover locally and continue application execution when a process is lost. We discuss the design of our software framework to enable the LFLR model using MPI-ULFM and demonstrate a resilient version of MiniFE that achieves scalable recovery from process failures.

Particle dynamics modeling methods for colloid suspensions

Computational Particle Mechanics

Bolintineanu, Dan S.; Grest, Gary S.; Lechman, Jeremy B.; Pierce, Flint P.; Plimpton, Steven J.; Schunk, Randy

We present a review and critique of several methods for the simulation of the dynamics of colloidal suspensions at the mesoscale. We focus particularly on simulation techniques for hydrodynamic interactions, including implicit solvents (Fast Lubrication Dynamics, an approximation to Stokesian Dynamics) and explicit/particle-based solvents (Multi-Particle Collision Dynamics and Dissipative Particle Dynamics). Several variants of each method are compared quantitatively for the canonical system of monodisperse hard spheres, with a particular focus on diffusion characteristics, as well as shear rheology and microstructure. In all cases, we attempt to match the relevant properties of a well-characterized solvent, which turns out to be challenging for the explicit solvent models. Reasonable quantitative agreement is observed among all methods, but overall the Fast Lubrication Dynamics technique shows the best accuracy and performance. We also devote significant discussion to the extension of these methods to more complex situations of interest in industrial applications, including models for non-Newtonian solvent rheology, non-spherical particles, drying and curing of solvent, and flows in complex geometries. This work identifies research challenges and motivates future efforts to develop techniques for quantitative, predictive simulations of industrially relevant colloidal suspension processes.

Compressed optimization of device architectures

Laros, James H.; Frees, Adam F.; Ward, Daniel R.; Blume-Kohout, Robin J.; Eriksson, M.A.; Friesen, Mark; Coppersmith, Susan N.

Recent advances in nanotechnology have enabled researchers to control individual quantum mechanical objects with unprecedented accuracy, opening the door for both quantum and extreme-scale conventional computation applications. As these devices become more complex, designing for facility of control becomes a daunting and computationally infeasible task. Here, motivated by ideas from compressed sensing, we introduce a protocol for the Compressed Optimization of Device Architectures (CODA). It leads naturally to a metric for benchmarking and optimizing device designs, as well as an automatic device control protocol that reduces the operational complexity required to achieve a particular output. Because this protocol is both experimentally and computationally efficient, it is readily extensible to large systems. For this paper, we demonstrate both the benchmarking and device control protocol components of CODA through examples of realistic simulations of electrostatic quantum dot devices, which are currently being developed experimentally for quantum computation.
