Publications

A combinatorial method for tracing objects using semantics of their shape

Diegert, Carl F.

We present a shape-first approach to finding automobiles and trucks in overhead images and include results from our analysis of an image from the Overhead Imaging Research Dataset [1]. For the OIRDS, our shape-first approach traces candidate vehicle outlines by exploiting knowledge about an overhead image of a vehicle: a vehicle's outline fits into a rectangle, this rectangle is sized to allow vehicles to use local roads, and rectangles from two different vehicles are disjoint. Our shape-first approach can efficiently process high-resolution overhead imaging over wide areas to provide tips and cues for human analysts, or for subsequent automatic processing using machine learning or other analysis based on color, tone, pattern, texture, size, and/or location (shape first). In fact, computationally-intensive complex structural, syntactic, and statistical analysis may be possible when a shape-first work flow sends a list of specific tips and cues down a processing pipeline rather than sending the whole of wide area imaging information. This data flow may fit well when bandwidth is limited between computers delivering ad hoc image exploitation and an imaging sensor. As expected, our early computational experiments find that the shape-first processing stage appears to reliably detect rectangular shapes from vehicles. More intriguing is that our computational experiments with six-inch GSD OIRDS benchmark images show that the shape-first stage can be efficient, and that candidate vehicle locations corresponding to features that do not include vehicles are unlikely to trigger tips and cues. We found that stopping with just the shape-first list of candidate vehicle locations, and then solving a weighted, maximal independent vertex set problem to resolve conflicts among candidate vehicle locations, often correctly traces the vehicles in an OIRDS scene.
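
As a rough illustration of the conflict-resolution step mentioned in this abstract, the sketch below builds a conflict relation between overlapping candidate rectangles and keeps a high-weight disjoint subset. It is a greedy stand-in, not the authors' exact weighted maximal independent vertex set solver, and the rectangles and scores are hypothetical.

    # Minimal sketch of resolving overlapping vehicle candidates by a greedy
    # weighted maximal independent vertex set heuristic (illustration only;
    # the paper solves the weighted problem rather than using a greedy pass).
    def overlaps(a, b):
        # a, b are axis-aligned rectangles (xmin, ymin, xmax, ymax)
        return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

    def resolve_candidates(rects, weights):
        """Keep a high-weight set of mutually disjoint candidate rectangles."""
        order = sorted(range(len(rects)), key=lambda i: weights[i], reverse=True)
        kept = []
        for i in order:
            if all(not overlaps(rects[i], rects[j]) for j in kept):
                kept.append(i)
        return kept

    candidates = [(0, 0, 4, 2), (3, 0, 7, 2), (10, 5, 14, 7)]   # hypothetical
    scores = [0.9, 0.6, 0.8]
    print(resolve_candidates(candidates, scores))               # -> [0, 2]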

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis. Version 5.0, user's manual

Adams, Brian M.; Dalbey, Keith D.; Eldred, Michael S.; Gay, David M.; Swiler, Laura P.; Bohnhoff, William J.; Eddy, John P.; Haskell, Karen H.; Hough, Patricia D.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user's manual for the DAKOTA software and provides capability overviews and procedures for software execution, as well as a variety of example studies.
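
The coupling pattern the abstract describes - an iterative method driving a black-box simulation through a generic parameters-in, responses-out interface - can be sketched as follows. This is not DAKOTA's API or input syntax; the objective function and the crude random-search iterator are stand-ins for illustration only.

    # Sketch of the loose-coupling pattern described above: an iterative method
    # treats the simulation as a black box mapping parameters to responses.
    # This is NOT DAKOTA's interface; names here are hypothetical.
    import random

    def run_simulation(params):
        # Stand-in for launching a simulation code and parsing its output.
        x, y = params
        return (x - 1.0) ** 2 + (y + 2.0) ** 2

    def random_search(objective, bounds, n_evals=200, seed=0):
        rng = random.Random(seed)
        best_p, best_f = None, float("inf")
        for _ in range(n_evals):
            p = [rng.uniform(lo, hi) for lo, hi in bounds]
            f = objective(p)
            if f < best_f:
                best_p, best_f = p, f
        return best_p, best_f

    print(random_search(run_simulation, bounds=[(-5, 5), (-5, 5)]))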

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis. Version 5.0, user's reference manual

Adams, Brian M.; Dalbey, Keith D.; Eldred, Michael S.; Gay, David M.; Swiler, Laura P.; Bohnhoff, William J.; Eddy, John P.; Haskell, Karen H.; Hough, Patricia D.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a reference manual for the commands specification for the DAKOTA software, providing input overviews, option descriptions, and example specifications.

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis. Version 5.0, developers manual

Adams, Brian M.; Dalbey, Keith D.; Eldred, Michael S.; Gay, David M.; Swiler, Laura P.; Bohnhoff, William J.; Eddy, John P.; Haskell, Karen H.; Hough, Patricia D.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a developers manual for the DAKOTA software and describes the DAKOTA class hierarchies and their interrelationships. It derives directly from annotation of the actual source code and provides detailed class documentation, including all member functions and attributes.

Teuchos C++ memory management classes, idioms, and related topics, the complete reference : a comprehensive strategy for safe and efficient memory management in C++ for high performance computing

Bartlett, Roscoe B.

Challenges for high-performance networking for exascale computing

Brightwell, Ronald B.; Barrett, Brian B.; Hemmert, Karl S.

Achieving the next three orders of magnitude performance increase to move from petascale to exascale computing will require significant advancements in several fundamental areas. Recent studies have outlined many of the hardware and software challenges that will need to be addressed. In this paper, we examine these challenges with respect to high-performance networking. We describe the repercussions of anticipated changes to computing and networking hardware and discuss the impact that alternative parallel programming models will have on the network software stack. We also present some ideas on possible approaches that address some of these challenges.

A new pressure relaxation closure model for two-material Lagrangian hydrodynamics

Kamm, James R.; Rider, William J.

We present a new model for closing a system of Lagrangian hydrodynamics equations for a two-material cell with a single velocity model. We describe a new approach that is motivated by earlier work of Delov and Sadchikov and of Goncharov and Yanilkin. Using a linearized Riemann problem to initialize volume fraction changes, we require that each material satisfy its own pdV equation, which breaks the overall energy balance in the mixed cell. To enforce this balance, we redistribute the energy discrepancy by assuming that the corresponding pressure change in each material is equal. This multiple-material model is packaged as part of a two-step time integration scheme. We compare results of our approach with other models and with corresponding pure-material calculations, on two-material test problems with ideal-gas or stiffened-gas equations of state.

Resolving local ambiguity using semantics of shape

Diegert, Carl F.

We demonstrate a new semantic method for automatic analysis of wide-area, high-resolution overhead imagery to tip and cue human intelligence analysts to human activity. In the open demonstration, we find and trace cars and rooftops. Our methodology, extended to analysis of voxels, may be applicable to understanding morphology and to automatic tracing of neurons in large-scale, serial-section TEM datasets. We defined an algorithm and software implementation that efficiently finds all combinations of image blobs that satisfy given shape semantics, where image blobs are formed as a general-purpose, first step that 'oversegments' image pixels into blobs of similar pixels. We will demonstrate the remarkable power (ROC) of this combinatorial-based work flow for automatically tracing any automobiles in a scene by applying semantics that require a subset of image blobs to fill out a rectangular shape, with width and height in given intervals. In most applications we find that the new combinatorial-based work flow produces alternative (overlapping) tracings of possible objects (e.g. cars) in a scene. To force an estimation (tracing) of a consistent collection of objects (cars), a quick-and-simple greedy algorithm is often sufficient. We will demonstrate a more powerful resolution method: we produce a weighted graph from the conflicts in all of our enumerated hypotheses, and then solve a maximal independent vertex set problem on this graph to resolve conflicting hypotheses. This graph computation is almost certain to be necessary to adequately resolve multiple, conflicting neuron topologies into a set that is most consistent with a TEM dataset.
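
A minimal sketch of the rectangle semantics described above: given blobs from an oversegmentation, test whether their union fills out a rectangle whose width and height fall in given intervals. The fill-ratio threshold and the toy blobs are assumptions made for illustration.

    # Do these image blobs fill out a rectangle of acceptable width and height?
    # (Illustration only; the fill-ratio threshold is an assumed parameter.)
    def fills_rectangle(blobs, width_range, height_range, min_fill=0.9):
        pixels = set().union(*blobs)            # each blob is a set of (row, col)
        rows = [r for r, _ in pixels]
        cols = [c for _, c in pixels]
        height = max(rows) - min(rows) + 1
        width = max(cols) - min(cols) + 1
        if not (width_range[0] <= width <= width_range[1]):
            return False
        if not (height_range[0] <= height <= height_range[1]):
            return False
        return len(pixels) / (width * height) >= min_fill

    blob_a = {(r, c) for r in range(4) for c in range(5)}        # hypothetical blobs
    blob_b = {(r, c) for r in range(4) for c in range(5, 9)}
    print(fills_rectangle([blob_a, blob_b], width_range=(8, 12), height_range=(3, 6)))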

The alliance for computing at the extreme scale

Ang, James A.; Doerfler, Douglas W.; Dosanjh, Sudip S.; Hemmert, Karl S.

Los Alamos and Sandia National Laboratories have formed a new high performance computing center, the Alliance for Computing at the Extreme Scale (ACES). The two labs will jointly architect, develop, procure and operate capability systems for DOE's Advanced Simulation and Computing Program. This presentation will discuss a petascale production capability system, Cielo, that will be deployed in late 2010, and a new partnership with Cray on advanced interconnect technologies.

Reliability-based design optimization using efficient global reliability analysis

Eldred, Michael S.

Finding the optimal (lightest, least expensive, etc.) design for an engineered component that meets or exceeds a specified level of reliability is a problem of obvious interest across a wide spectrum of engineering fields. Various methods for this reliability-based design optimization problem have been proposed. Unfortunately, this problem is rarely solved in practice because, regardless of the method used, solving the problem is too expensive or the final solution is too inaccurate to ensure that the reliability constraint is actually satisfied. This is especially true for engineering applications involving expensive, implicit, and possibly nonlinear performance functions (such as large finite element models). The Efficient Global Reliability Analysis method was recently introduced to improve both the accuracy and efficiency of reliability analysis for this type of performance function. This paper explores how this new reliability analysis method can be used in a design optimization context to create a method of sufficient accuracy and efficiency to enable the use of reliability-based design optimization as a practical design tool.

Integrating event detection system operation characteristics into sensor placement optimization

Hart, David B.; Hart, William E.; Mckenna, Sean A.; Phillips, Cynthia A.

We consider the problem of placing sensors in a municipal water network when we can choose both the location of sensors and the sensitivity and specificity of the contamination warning system. Sensor stations in a municipal water distribution network continuously send sensor output information to a centralized computing facility, and event detection systems at the control center determine when to signal an anomaly worthy of response. Although most sensor placement research has assumed perfect anomaly detection, signal analysis software has parameters that control the tradeoff between false alarms and false negatives. We describe a nonlinear sensor placement formulation, which we heuristically optimize with a linear approximation that can be solved as a mixed-integer linear program. We report the results of initial experiments on a real network and discuss tradeoffs between early detection of contamination incidents, and control of false alarms.
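
To make the placement objective concrete, the sketch below chooses sensor locations greedily to minimize expected impact over contamination scenarios. The paper instead solves a mixed-integer linear program; the scenario data, detection impacts, and greedy heuristic here are illustrative assumptions.

    # Greedy illustration of the placement objective: choose p sensor locations
    # to minimize expected impact over contamination scenarios.
    def expected_impact(placed, scenarios):
        total = 0.0
        for sc in scenarios:
            # impact if some placed sensor detects the incident, else undetected impact
            best = sc["undetected_impact"]
            for n in placed:
                best = min(best, sc["impact_if_detected_at"].get(n, best))
            total += sc["probability"] * best
        return total

    def greedy_placement(candidate_nodes, scenarios, p):
        placed = []
        for _ in range(p):
            best_node = min(
                (n for n in candidate_nodes if n not in placed),
                key=lambda n: expected_impact(placed + [n], scenarios),
            )
            placed.append(best_node)
        return placed

    scenarios = [   # hypothetical scenario data
        {"probability": 0.5, "undetected_impact": 100.0,
         "impact_if_detected_at": {"A": 10.0, "B": 40.0}},
        {"probability": 0.5, "undetected_impact": 80.0,
         "impact_if_detected_at": {"B": 5.0, "C": 30.0}},
    ]
    print(greedy_placement(["A", "B", "C"], scenarios, p=1))    # -> ['B']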

A coarsening method for linear peridynamics

Silling, Stewart A.

A method is obtained for deriving peridynamic material models for a sequence of increasingly coarsened descriptions of a body. The starting point is a known detailed, small scale linearized state-based description. Each successively coarsened model excludes some of the material present in the previous model, and the length scale increases accordingly. This excluded material, while not present explicitly in the coarsened model, is nevertheless taken into account implicitly through its effect on the forces in the coarsened material. Numerical examples demonstrate that the method accurately reproduces the effective elastic properties of a composite as well as the effect of a small defect in a homogeneous medium.

Xyce parallel electronic simulator

Keiter, Eric R.; Russo, Thomas V.; Schiek, Richard S.; Mei, Ting M.; Thornquist, Heidi K.; Coffey, Todd S.; Santarelli, Keith R.; Pawlowski, Roger P.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

Xyce parallel electronic simulator release notes

Keiter, Eric R.; Santarelli, Keith R.; Hoekstra, Robert J.; Russo, Thomas V.; Schiek, Richard S.; Mei, Ting M.; Thornquist, Heidi K.; Pawlowski, Roger P.; Coffey, Todd S.

The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance, and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.

High fidelity equation of state for xenon : integrating experiments and first principles simulations in developing a wide-range equation of state model for a fifth-row element

Magyar, Rudolph J.; Root, Seth R.; Carpenter, John H.; Mattsson, Thomas M.

The noble gas xenon is a particularly interesting element. At standard pressure xenon is an fcc solid which melts at 161 K and then boils at 165 K, thus displaying a rather narrow liquid range on the phase diagram. On the other hand, under pressure the melting point is significantly higher: 3000 K at 30 GPa. Under shock compression, electronic excitations become important at 40 GPa. Finally, xenon forms stable molecules with fluorine (XeF2), suggesting that the electronic structure is significantly more complex than expected for a noble gas. With these reasons in mind, we studied the xenon Hugoniot using DFT/QMD and validated the simulations with multi-Mbar shock compression experiments. The results show that existing equation of state models lack fidelity, and so we developed a wide-range free-energy based equation of state using experimental data and results from first-principles simulations.

Steps toward fault-tolerant quantum chemistry

Taube, Andrew G.

Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use of a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program - matrix multiplication. Also, we discuss the extensions that would be required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure and prone to constant faults, and that global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication (reducing the network load), does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determines the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources. The complexity of these algorithms means that MPI alone is insufficient to achieve parallel scaling; QC developers have been forced to use alternative approaches to achieve scalability and would be receptive to radical shifts in the programming paradigm. Initial work in adapting the simplest QC method, Hartree-Fock, to this new programming model indicates that the approach is beneficial for QC applications. However, the advantages of being able to scale to exascale computers are greatest for the computationally most expensive algorithms; within QC these are the high-accuracy coupled-cluster (CC) methods. Parallel coupled-cluster programs are available; however, they are based on the conventional MPI paradigm. Much of the effort is spent handling the complicated data dependencies between the various processors, especially as the size of the problem becomes large. The current paradigm will not survive the move to exascale computers. Here we discuss the initial steps toward designing and implementing a CC method within this model. First, we introduce the general concepts behind a CC method, focusing on the aspects that make these methods difficult to parallelize with conventional techniques. Then we outline the computational core of the CC method - a matrix multiply - within the task-based approach that the FAST-OS project is designed to take advantage of. Finally, we outline the general setup to implement the simplest CC method in this model, linearized CC doubles (LinCC).
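
The task-based, data-centric matrix-multiply kernel discussed above can be sketched as one task per output tile, each pulling only the input tiles it needs. This is a thread-pool illustration of the idea, not the FAST-OS HARE runtime or its API.

    # Data-centric, task-per-output-tile matrix multiply: each task owns one
    # output block and gathers the input tiles it needs, instead of an
    # imperative loop over a distributed global array. (Illustration only.)
    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def tile_task(A, B, i, j, tile):
        """Compute output tile (i, j) of C = A @ B from the needed input tiles."""
        acc = np.zeros((tile, tile))
        for k in range(0, A.shape[1], tile):
            acc += A[i:i + tile, k:k + tile] @ B[k:k + tile, j:j + tile]
        return (i, j, acc)

    def tiled_matmul(A, B, tile=64):
        C = np.zeros((A.shape[0], B.shape[1]))
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(tile_task, A, B, i, j, tile)
                       for i in range(0, A.shape[0], tile)
                       for j in range(0, B.shape[1], tile)]
            for f in futures:
                i, j, block = f.result()
                C[i:i + tile, j:j + tile] = block
        return C

    A, B = np.random.rand(256, 256), np.random.rand(256, 256)
    print(np.allclose(tiled_matmul(A, B), A @ B))    # -> True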

Adversary phase change detection using SOMs and text data

Doser, Adele D.; Speed, Ann S.; Warrender, Christina E.

In this work, we developed a self-organizing map (SOM) technique for using web-based text analysis to forecast when a group is undergoing a phase change. By 'phase change', we mean that an organization has fundamentally shifted attitudes or behaviors. For instance, when ice melts into water, the characteristics of the substance change. A formerly peaceful group may suddenly adopt violence, or a violent organization may unexpectedly agree to a ceasefire. SOM techniques were used to analyze text obtained from organization postings on the world-wide web. Results suggest it may be possible to forecast phase changes, and determine if an example of writing can be attributed to a group of interest.
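
For orientation, a minimal SOM training loop of the kind referred to above is sketched here; the grid size, learning-rate schedule, neighborhood width, and document feature vectors are illustrative choices, not the authors' configuration.

    # Minimal self-organizing map (SOM) training loop; documents map to their
    # best-matching unit (BMU) on the grid after training.
    import numpy as np

    def train_som(data, grid=(8, 8), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
        rng = np.random.default_rng(seed)
        weights = rng.random((grid[0], grid[1], data.shape[1]))
        coords = np.stack(np.meshgrid(np.arange(grid[0]), np.arange(grid[1]),
                                      indexing="ij"), axis=-1)
        for epoch in range(epochs):
            lr = lr0 * (1.0 - epoch / epochs)              # shrinking learning rate
            sigma = sigma0 * (1.0 - epoch / epochs) + 0.5  # shrinking neighborhood
            for x in rng.permutation(data):
                d = np.linalg.norm(weights - x, axis=2)    # distance to every unit
                bmu = np.unravel_index(np.argmin(d), d.shape)
                g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=2)
                           / (2.0 * sigma ** 2))           # Gaussian neighborhood
                weights += lr * g[..., None] * (x - weights)
        return weights

    docs = np.random.rand(100, 20)   # hypothetical document feature vectors
    print(train_som(docs).shape)     # -> (8, 8, 20)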

Failing in place for low-serviceability storage infrastructure using high-parity GPU-based RAID

Ward, Harry L.

In order to provide large quantities of high-reliability disk-based storage, it has become necessary to aggregate disks into fault-tolerant groups based on the RAID methodology. Most RAID levels do provide some fault tolerance, but there are certain classes of applications that require increased levels of fault tolerance within an array. Some of these applications include embedded systems in harsh environments that have a low level of serviceability, or uninhabited data centers servicing cloud computing. When describing RAID reliability, the Mean Time To Data Loss (MTTDL) calculations will often assume that the time to replace a failed disk is relatively low, or even negligible compared to rebuild time. For platforms that are in remote areas collecting and processing data, it may be impossible to access the system to perform system maintenance for long periods. A disk may fail early in a platform's life, but not be replaceable for much longer than typical for RAID arrays. Service periods may be scheduled at intervals on the order of months, or the platform may not be serviced until the end of a mission in progress. Further, this platform may be subject to extreme conditions that can accelerate wear and tear on a disk, requiring even more protection from failures. We have created a high parity RAID implementation that uses a Graphics Processing Unit (GPU) to compute more than two blocks of parity information per stripe, allowing extra parity to eliminate or reduce the requirement for rebuilding data between service periods. While this type of controller is highly effective for RAID 6 systems, an important benefit is the ability to incorporate more parity into a RAID storage system. Such RAID levels, as yet unnamed, can tolerate the failure of three or more disks (depending on configuration) without data loss. While this RAID system certainly has applications in embedded systems running applications in the field, similar benefits can be obtained for servers that are engineered for storage density, with less regard for serviceability or maintainability. A storage brick can be designed to have a MTTDL that extends well beyond the useful lifetime of the hardware used, allowing the disk subsystem to require less service throughout the lifetime of a compute resource. This approach is similar to the Xiotech ISE. Such a design can be deliberately placed remotely (without frequent support) in order to provide colocation, or meet cost goals. For workloads where reliability is key, but conditions are sub-optimal for routine serviceability, a high-parity RAID can provide extra reliability in extraordinary situations. For example, for installations requiring very high Mean Time To Repair, the extra parity can eliminate certain problems with maintaining hot spares, increasing overall reliability. Furthermore, in situations where disk reliability is reduced because of harsh conditions, extra parity can guard against early data loss due to lowered Mean Time To Failure. If used through an iSCSI interface with a streaming workload, it is possible to gain all of these benefits without impacting performance.
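
The serviceability argument can be made concrete with the standard rough MTTDL approximation for a group that tolerates m disk failures (assuming independent, exponentially distributed failures). The disk count, MTTF, and quarterly repair interval below are hypothetical and are not results from this work.

    # Rough MTTDL for a RAID group that tolerates m disk failures, using the
    # standard independent-failure approximation
    #     MTTDL ~ MTTF**(m+1) / (n*(n-1)*...*(n-m) * MTTR**m).
    # All numbers are hypothetical; the point is the sensitivity to long MTTR.
    from math import prod

    def mttdl_hours(n_disks, m_parity, mttf_hours, mttr_hours):
        denom = prod(n_disks - k for k in range(m_parity + 1)) * mttr_hours ** m_parity
        return mttf_hours ** (m_parity + 1) / denom

    mttf = 1.0e6                 # per-disk mean time to failure, in hours
    mttr = 24.0 * 90             # repair only at roughly quarterly service visits
    for m in (1, 2, 3):          # single, double, and triple parity
        years = mttdl_hours(16, m, mttf, mttr) / (24 * 365)
        print(f"{m} parity block(s): MTTDL ~ {years:,.0f} years")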

Application performance on the Tri-Lab Linux Capacity Cluster (TLCC)

International Journal of Distributed Systems and Technologies

Rajan, Mahesh; Doerfler, Douglas W.; Vaughan, Courtenay T.; Epperson, Marcus E.; Ogden, Jeff

In a recent acquisition by DOE/NNSA several large capacity computing clusters called TLCC have been installed at the DOE labs: SNL, LANL and LLNL. TLCC architecture with ccNUMA, multi-socket, multi-core nodes, and InfiniBand interconnect, is representative of the trend in HPC architectures. This paper examines application performance on TLCC contrasting them with Red Storm/Cray XT4. TLCC and Red Storm share similar AMD processors and memory DIMMs. Red Storm however has single socket nodes and custom interconnect. Micro-benchmarks and performance analysis tools help understand the causes for the observed performance differences. Control of processor and memory affinity on TLCC with the numactl utility is shown to result in significant performance gains and is essential to attenuate the detrimental impact of OS interference and cache-coherency overhead. While previous studies have investigated impact of affinity control mostly in the context of small SMP systems, the focus of this paper is on highly parallel MPI applications.

Description of the Sandia National Laboratories science, technology & engineering metrics process

Jordan, Gretchen B.; Oelschlaeger, Peter O.; Burns, A.R.; Watkins, Randall D.; Trucano, Timothy G.

There has been a concerted effort since 2007 to establish a dashboard of metrics for the Science, Technology, and Engineering (ST&E) work at Sandia National Laboratories. These metrics are to provide a self assessment mechanism for the ST&E Strategic Management Unit (SMU) to complement external expert review and advice and various internal self assessment processes. The data and analysis will help ST&E Managers plan, implement, and track strategies and work in order to support the critical success factors of nurturing core science and enabling laboratory missions. The purpose of this SAND report is to provide a guide for those who want to understand the ST&E SMU metrics process. This report provides an overview of why the ST&E SMU wants a dashboard of metrics, some background on metrics for ST&E programs from existing literature and past Sandia metrics efforts, a summary of work completed to date, specifics on the portfolio of metrics that have been chosen and the implementation process that has been followed, and plans for the coming year to improve the ST&E SMU metrics process.

Unstructured discontinuous Galerkin for seismic inversion

Collis, Samuel S.; Ober, Curtis C.; van Bloemen Waanders, Bart G.

This abstract explores the potential advantages of discontinuous Galerkin (DG) methods for the time-domain inversion of media parameters within the earth's interior. In particular, DG methods enable local polynomial refinement to better capture localized geological features within an area of interest while also allowing the use of unstructured meshes that can accurately capture discontinuous material interfaces. This abstract describes our initial findings when using DG methods combined with Runge-Kutta time integration and adjoint-based optimization algorithms for full-waveform inversion. Our initial results suggest that DG methods allow great flexibility in matching the media characteristics (faults, ocean bottom and salt structures) while also providing higher fidelity representations in target regions. Time-domain inversion using discontinuous Galerkin on unstructured meshes and with local polynomial refinement is shown to better capture localized geological features and accurately capture discontinuous-material interfaces. These approaches provide the ability to surgically refine representations in order to improve predicted models for specific geological features. Our future work will entail automated extensions to directly incorporate local refinement and adaptive unstructured meshes within the inversion process.

Importance sampling : promises and limitations

Swiler, Laura P.

Importance sampling is an unbiased sampling method used to sample random variables from different densities than originally defined. These importance sampling densities are constructed to pick 'important' values of input random variables to improve the estimation of a statistical response of interest, such as a mean or probability of failure. Conceptually, importance sampling is very attractive: for example one wants to generate more samples in a failure region when estimating failure probabilities. In practice, however, importance sampling can be challenging to implement efficiently, especially in a general framework that will allow solutions for many classes of problems. We are interested in the promises and limitations of importance sampling as applied to computationally expensive finite element simulations which are treated as 'black-box' codes. In this paper, we present a customized importance sampler that is meant to be used after an initial set of Latin Hypercube samples has been taken, to help refine a failure probability estimate. The importance sampling densities are constructed based on kernel density estimators. We examine importance sampling with respect to two main questions: is importance sampling efficient and accurate for situations where we can only afford small numbers of samples? And does importance sampling require the use of surrogate methods to generate a sufficient number of samples so that the importance sampling process does increase the accuracy of the failure probability estimate? We present various case studies to address these questions.
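
A minimal version of the workflow described above - an initial sample, a kernel density estimate built around the observed failure points, then importance sampling with likelihood-ratio weights - is sketched below. The limit-state function, sample sizes, and the use of plain Monte Carlo in place of Latin Hypercube sampling are assumptions for illustration.

    # Sketch: estimate a small failure probability with KDE-based importance
    # sampling. g(x) < 0 defines failure; the nominal input density is a
    # standard bivariate normal. All choices here are illustrative.
    import numpy as np
    from scipy import stats

    def g(x):
        return 2.0 - x[0] - x[1]         # hypothetical limit-state function

    rng = np.random.default_rng(0)
    dim, n0, n_is = 2, 200, 2000
    nominal = stats.multivariate_normal(mean=np.zeros(dim))

    x0 = rng.standard_normal((n0, dim))              # initial sample (LHS in the paper)
    fails = x0[np.array([g(x) < 0.0 for x in x0])]   # observed failure points

    kde = stats.gaussian_kde(fails.T)                # importance density q(x)
    xs = kde.resample(n_is).T                        # sample from q
    w = nominal.pdf(xs) / kde(xs.T)                  # likelihood ratio p(x)/q(x)
    indicator = np.array([g(x) < 0.0 for x in xs])
    print("estimated failure probability:", np.mean(indicator * w))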

Transparent redundant computing with MPI

Brightwell, Ronald B.; Ferreira, Kurt

Extreme-scale parallel systems will require alternative methods for applications to maintain current levels of uninterrupted execution. Redundant computation is one approach to consider, if the benefits of increased resiliency outweigh the cost of consuming additional resources. We describe a transparent redundancy approach for MPI applications and detail two different implementations that provide the ability to tolerate a range of failure scenarios, including loss of application processes and connectivity. We compare these two approaches and show performance results from micro-benchmarks that bound worst-case message passing performance degradation. We propose several enhancements that could lower the overhead of providing resiliency through redundancy.

Arctic sea ice modeling with the material-point method

Peterson, Kara J.; Bochev, Pavel B.

Arctic sea ice plays an important role in global climate by reflecting solar radiation and insulating the ocean from the atmosphere. Due to feedback effects, the Arctic sea ice cover is changing rapidly. To accurately model this change, high-resolution calculations must incorporate: (1) the annual cycle of growth and melt due to radiative forcing; (2) mechanical deformation due to surface winds, ocean currents and Coriolis forces; and (3) localized effects of leads and ridges. We have demonstrated a new mathematical algorithm for solving the sea ice governing equations using the material-point method (MPM) with an elastic-decohesive constitutive model. An initial comparison with the LANL CICE code indicates that the ice edge is sharper using MPM, but that many of the overall features are similar.

Scalable tensor factorizations with missing data

Dunlavy, Daniel D.; Kolda, Tamara G.

The problem of missing data is ubiquitous in domains such as biomedical signal processing, network traffic analysis, bibliometrics, social network analysis, chemometrics, computer vision, and communication networks - all domains in which data collection is subject to occasional errors. Moreover, these data sets can be quite large and have more than two axes of variation, e.g., sender, receiver, time. Many applications in those domains aim to capture the underlying latent structure of the data; in other words, they need to factorize data sets with missing entries. If we cannot address the problem of missing data, many important data sets will be discarded or improperly analyzed. Therefore, we need a robust and scalable approach for factorizing multi-way arrays (i.e., tensors) in the presence of missing data. We focus on one of the most well-known tensor factorizations, CANDECOMP/PARAFAC (CP), and formulate the CP model as a weighted least squares problem that models only the known entries. We develop an algorithm called CP-WOPT (CP Weighted OPTimization) using a first-order optimization approach to solve the weighted least squares problem. Based on extensive numerical experiments, our algorithm is shown to successfully factor tensors with noise and up to 70% missing data. Moreover, our approach is significantly faster than the leading alternative and scales to larger problems. To show the real-world usefulness of CP-WOPT, we illustrate its applicability on a novel EEG (electroencephalogram) application where missing data is frequently encountered due to disconnections of electrodes.
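
For a three-way tensor, the weighted least-squares objective described above and its factor-matrix gradients can be written compactly as below. Plain gradient descent stands in for the first-order solver used by CP-WOPT, and the rank, step size, and synthetic data are illustrative.

    # Weighted CP objective for a 3-way tensor: f = 0.5*||W * (X - [[A,B,C]])||^2,
    # where W is 1 on known entries and 0 on missing ones. A single small
    # gradient step stands in for the first-order solver used by CP-WOPT.
    import numpy as np

    def khatri_rao(U, V):
        # columnwise Kronecker product; row ordering matches the unfoldings below
        return np.einsum("ir,jr->ijr", U, V).reshape(-1, U.shape[1])

    def cp_full(A, B, C):
        return np.einsum("ir,jr,kr->ijk", A, B, C)

    def cp_wopt_step(X, W, A, B, C, step=1e-4):
        R = W * (cp_full(A, B, C) - X)                      # masked residual
        gA = R.reshape(X.shape[0], -1) @ khatri_rao(B, C)   # df/dA
        gB = R.transpose(1, 0, 2).reshape(X.shape[1], -1) @ khatri_rao(A, C)
        gC = R.transpose(2, 0, 1).reshape(X.shape[2], -1) @ khatri_rao(A, B)
        return A - step * gA, B - step * gB, C - step * gC

    rng = np.random.default_rng(0)
    I, J, K, rank = 20, 15, 10, 3
    X = cp_full(rng.random((I, rank)), rng.random((J, rank)), rng.random((K, rank)))
    W = (rng.random((I, J, K)) < 0.7).astype(float)         # ~30% of entries missing
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))

    def f(A, B, C):
        return 0.5 * np.sum((W * (cp_full(A, B, C) - X)) ** 2)

    before = f(A, B, C)
    A, B, C = cp_wopt_step(X, W, A, B, C)
    print(before, "->", f(A, B, C))    # one small step should reduce the misfit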

Effects of a conducting sphere moving through a gradient magnetic field

Ames, Thomas L.; Robinson, Allen C.

We examine several conducting spheres moving through a magnetic field gradient. An analytical approximation is derived and an experiment is conducted to verify the analytical solution. The experiment is simulated as well to produce a numerical result. Both the low and high magnetic Reynolds number regimes are studied. Deformation of the sphere is noted in the high Reynolds number case. It is suggested that this deformation effect could be useful for designing or enhancing present protection systems against space debris.

The X-caliber architecture for informatics supercomputers

Murphy, Richard C.

This talk discusses the unique demands that informatics applications, particularly graph-theoretic applications, place on computer systems. These applications tend to pose significant data movement challenges for conventional systems. Worse, underlying technology trends are moving computers to cost-driven optimization points that exacerbate the problem. The X-caliber architecture is an economically viable counter-example to conventional architectures based on the integration of innovative technologies that support the data movement requirements of large-scale informatics applications. This talk will discuss the technology drivers and architectural features of the platform, and present analysis showing the benefits for informatics applications, as well as our traditional science and engineering HPC applications.

On the path to exascale

International Journal of Distributed Systems and Technologies

Alvin, Kenneth F.; Barrett, Brian B.; Brightwell, Ronald B.; Dosanjh, Sudip S.; Geist, Al; Hemmert, Karl S.; Heroux, Michael; Kothe, Doug; Murphy, Richard C.; Nichols, Jeff; Oldfield, Ron A.; Rodrigues, Arun; Vetter, Jeffrey S.

There is considerable interest in achieving a 1000-fold increase in supercomputing power in the next decade, but the challenges are formidable. In this paper, the authors discuss some of the driving science and security applications that require exascale computing (a million trillion operations per second). Key architectural challenges include power, memory, interconnection networks and resilience. The paper summarizes ongoing research aimed at overcoming these hurdles. Topics of interest are architecture-aware and scalable algorithms, system simulation, 3D integration, new approaches to system-directed resilience and new benchmarks. Although significant progress is being made, a broader international program is needed.

Ceci n'est pas une micromachine

Diegert, Carl F.; Yarberry, Victor R.

The image created in reflected light DIC can often be interpreted as a true three-dimensional representation of the surface geometry, provided a clear distinction can be realized between raised and lowered regions in the specimen. It may be helpful if our definition of saliency embraces work on the human visual system (HVS) as well as the more abstract work on saliency, as it is certain that understanding by humans will always stand between recording of a useful signal from all manner of sensors and so-called actionable intelligence. A DARPA/DSO program lays down this requirement in a current program (Kruse 2010): The vision for the Neurotechnology for Intelligence Analysts (NIA) Program is to revolutionize the way that analysts handle intelligence imagery, increasing both the throughput of imagery to the analyst and overall accuracy of the assessments. Current computer-based target detection capabilities cannot process vast volumes of imagery with the speed, flexibility, and precision of the human visual system.

Modeling the fracture of ice sheets on parallel computers

Tuminaro, Raymond S.; Boman, Erik G.

The objective of this project is to investigate the complex fracture of ice and understand its role within larger ice sheet simulations and global climate change. At the present time, ice fracture is not explicitly considered within ice sheet models due in part to large computational costs associated with the accurate modeling of this complex phenomena. However, fracture not only plays an extremely important role in regional behavior but also influences ice dynamics over much larger zones in ways that are currently not well understood. Dramatic illustrations of fracture-induced phenomena most notably include the recent collapse of ice shelves in Antarctica (e.g. partial collapse of the Wilkins shelf in March of 2008 and the diminishing extent of the Larsen B shelf from 1998 to 2002). Other fracture examples include ice calving (fracture of icebergs) which is presently approximated in simplistic ways within ice sheet models, and the draining of supraglacial lakes through a complex network of cracks, a so called ice sheet plumbing system, that is believed to cause accelerated ice sheet flows due essentially to lubrication of the contact surface with the ground. These dramatic changes are emblematic of the ongoing change in the Earth's polar regions and highlight the important role of fracturing ice. To model ice fracture, a simulation capability will be designed centered around extended finite elements and solved by specialized multigrid methods on parallel computers. In addition, appropriate dynamic load balancing techniques will be employed to ensure an approximate equal amount of work for each processor.

Poblano v1.0 : a Matlab toolbox for gradient-based optimization

Dunlavy, Daniel D.; Kolda, Tamara G.

We present Poblano v1.0, a Matlab toolbox for solving gradient-based unconstrained optimization problems. Poblano implements three optimization methods (nonlinear conjugate gradients, limited-memory BFGS, and truncated Newton) that require only first order derivative information. In this paper, we describe the Poblano methods, provide numerous examples on how to use Poblano, and present results of Poblano used in solving problems from a standard test collection of unconstrained optimization problems.
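
Poblano itself is a Matlab toolbox; as a rough Python analogue of the usage pattern it supports (a single user callback returning both the objective value and its gradient, consumed by a first-order method), one might write the following. This uses SciPy's L-BFGS, not Poblano's interface.

    # First-order optimization from a single value-and-gradient callback.
    # (Python/SciPy analogue for illustration; not Poblano's Matlab API.)
    import numpy as np
    from scipy.optimize import minimize

    def rosenbrock(x):
        f = 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2
        g = np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                      200.0 * (x[1] - x[0] ** 2)])
        return f, g          # objective value and gradient from one callback

    result = minimize(rosenbrock, x0=np.array([-1.2, 1.0]),
                      jac=True, method="L-BFGS-B")
    print(result.x)          # -> approximately [1, 1]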

On calculating the equilibrium structure of molecular crystals

Wills, Ann E.; Wixom, Ryan R.; Mattsson, Thomas M.

The difficulty of calculating the ambient properties of molecular crystals, such as the explosive PETN, has long hampered much needed computational investigations of these materials. One reason for the shortcomings is that the exchange-correlation functionals available for Density Functional Theory (DFT) based calculations do not correctly describe the weak intermolecular van der Waals' forces present in molecular crystals. However, this weak interaction also poses other challenges for the computational schemes used. We will discuss these issues in the context of calculations of lattice constants and structure of PETN with a number of different functionals, and also discuss if these limitations can be circumvented for studies at non-ambient conditions.

HPC top 10 InfiniBand Machine : a 3D Torus IB interconnect on Red Sky

Naegle, John H.; Monk, Stephen T.; Schutt, James A.; Doerfler, Douglas W.; Rajan, Mahesh R.

This presentation discusses the following topics: (1) Red Sky Background; (2) 3D Torus Interconnect Concepts; (3) Difficulties of a Torus in IB; (4) New Routing Code for a 3D Torus in IB; (5) Red Sky 3D Torus Implementation; and (6) Managing a Large IB Machine. Computing at Sandia: (1) capability computing - designed for scaling of single large runs, usually proprietary for maximum performance; Red Storm is Sandia's current capability machine. (2) Capacity computing - computing for the masses, with hundreds of jobs and hundreds of users; extreme reliability required; flexibility for a changing workload; Thunderbird will be decommissioned this quarter; Red Sky is our future capacity computing platform; and the Red Mesa machine is for the National Renewable Energy Lab. Red Sky's main themes are: (1) Cheaper - 5X the capacity of Thunderbird at 2/3 the cost, and substantially cheaper per flop than our last large capacity machine purchase; (2) Leaner - lower operational costs, three security environments via a modular fabric, expandable/upgradeable/extensible, and designed for a 6-year life cycle; and (3) Greener - 15% less power (1/6th the power per flop), 40% less water (5 million gallons saved annually), 10X better cooling efficiency, and a 4X denser footprint.

Combining dynamical decoupling with optimal control for improved QIP

Carroll, Malcolm; Witzel, Wayne W.

Constructing high-fidelity control pulses that are robust to control and system/environment fluctuations is a crucial objective for quantum information processing (QIP). We combine dynamical decoupling (DD) with optimal control (OC) to identify control pulses that achieve this objective numerically. Previous DD work has shown that general errors up to (but not including) third order can be removed from {pi}- and {pi}/2-pulses without concatenation. By systematically integrating DD and OC, we are able to increase pulse fidelity beyond this limit. Our hybrid method of quantum control incorporates a newly-developed algorithm for robust OC, providing a nested DD-OC approach to generate robust controls. Motivated by solid-state QIP, we also incorporate relevant experimental constraints into this DD-OC formalism. To demonstrate the advantage of our approach, the resulting quantum controls are compared to previous DD results in open and uncertain model systems.

Solid-liquid phase coexistence of alkali nitrates from molecular dynamics simulations

Jayaraman, Saivenkataraman J.

Alkali nitrate eutectic mixtures are finding application as industrial heat transfer fluids in concentrated solar power generation systems. An important property for such applications is the melting point, or phase coexistence temperature. We have computed melting points for lithium, sodium and potassium nitrate from molecular dynamics simulations using a recently developed method, which uses thermodynamic integration to compute the free energy difference between the solid and liquid phases. The computed melting point for NaNO3 was within 15K of its experimental value, while for LiNO3 and KNO3, the computed melting points were within 100K of the experimental values [4]. We are currently extending the approach to calculate melting temperatures for binary mixtures of lithium and sodium nitrate.

A brief parallel I/O tutorial

Ward, Harry L.

This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about the I/O transfer per request.
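
The guidance to minimize interaction with the I/O system and maximize data moved per call can be illustrated with a simple aggregation pattern: gather small per-rank buffers to one rank and issue a single large write. The mpi4py sketch below is illustrative; the buffer size and output path are hypothetical.

    # Aggregate many small per-rank buffers into one large write on rank 0.
    # (mpi4py illustration; sizes and the output path are hypothetical.)
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    local_chunk = np.full(1000, rank, dtype=np.float64)   # this rank's data

    chunks = comm.gather(local_chunk, root=0)             # aggregate to rank 0
    if rank == 0:
        data = np.concatenate(chunks)
        with open("results.bin", "wb") as f:              # one large, contiguous request
            f.write(data.tobytes())

Launched under mpiexec with several ranks, this issues a single large request instead of one small request per rank, which is the behavior the tutorial recommends.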

Porting LAMMPS to GPUs

Brown, William M.; Crozier, Paul C.; Plimpton, Steven J.

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs on single processors or in parallel using message-passing techniques and a spatial-decomposition of the simulation domain. The code is designed to be easy to modify or extend with new functionality.

Foundational development of an advanced nuclear reactor integrated safety code

Schmidt, Rodney C.; Hooper, Russell H.; Humphries, Larry; Lorber, Alfred L.; Spotz, William S.

This report describes the activities and results of a Sandia LDRD project whose objective was to develop and demonstrate foundational aspects of a next-generation nuclear reactor safety code that leverages advanced computational technology. The project scope was directed towards the systems-level modeling and simulation of an advanced, sodium cooled fast reactor, but the approach developed has a more general applicability. The major accomplishments of the LDRD are centered around the following two activities. (1) The development and testing of LIME, a Lightweight Integrating Multi-physics Environment for coupling codes that is designed to enable both 'legacy' and 'new' physics codes to be combined and strongly coupled using advanced nonlinear solution methods. (2) The development and initial demonstration of BRISC, a prototype next-generation nuclear reactor integrated safety code. BRISC leverages LIME to tightly couple the physics models in several different codes (written in a variety of languages) into one integrated package for simulating accident scenarios in a liquid sodium cooled 'burner' nuclear reactor. Other activities and accomplishments of the LDRD include (a) further development, application and demonstration of the 'non-linear elimination' strategy to enable physics codes that do not provide residuals to be incorporated into LIME, (b) significant extensions of the RIO CFD code capabilities, (c) complex 3D solid modeling and meshing of major fast reactor components and regions, and (d) an approach for multi-physics coupling across non-conformal mesh interfaces.
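
The kind of code coupling an integrating environment like LIME orchestrates can be sketched, in its simplest form, as a fixed-point (Picard) iteration between two single-physics solves that exchange interface data until self-consistent. The toy thermal/hydraulic relations below are purely illustrative and are not LIME's API or the BRISC physics.

    # Schematic fixed-point (Picard) coupling of two single-physics solvers.
    # The stand-in "physics" (two coupled scalar balances) is illustrative only.
    def thermal_solve(flow_rate):
        # stand-in for a thermal code: temperature depends on coolant flow
        return 900.0 / (1.0 + 0.5 * flow_rate)

    def hydraulic_solve(temperature):
        # stand-in for a hydraulics code: flow depends on temperature
        return 0.002 * temperature

    def picard_couple(tol=1e-8, max_iters=100):
        temperature, flow = 600.0, 1.0
        for it in range(max_iters):
            new_temperature = thermal_solve(flow)
            new_flow = hydraulic_solve(new_temperature)
            if abs(new_temperature - temperature) < tol and abs(new_flow - flow) < tol:
                return new_temperature, new_flow, it
            temperature, flow = new_temperature, new_flow
        raise RuntimeError("coupled iteration did not converge")

    print(picard_couple())    # converges to a self-consistent (T, flow) pair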

A framework for reduced order modeling with mixed moment matching and peak error objectives

SIAM Journal on Scientific Computing

Santarelli, Keith R.

We examine a new method of producing reduced order models for LTI systems which attempts to minimize a bound on the peak error between the original and reduced order models subject to a bound on the peak value of the input. The method, which can be implemented by solving a set of linear programming problems that are parameterized via a single scalar quantity, is able to minimize an error bound subject to a number of moment matching constraints. Moreover, because all optimization is performed in the time domain, the method can also be used to perform model reduction for infinite dimensional systems, rather than being restricted to finite order state space descriptions. We begin by contrasting the method we present here with two classes of standard model reduction algorithms, namely, moment matching algorithms and singular value-based methods. After motivating the class of reduction tools we propose, we describe the algorithm (which minimizes the L1 norm of the difference between the original and reduced order impulse responses) and formulate the corresponding linear programming problem that is solved during each iteration of the algorithm. We then prove that, for a certain class of LTI systems, the method we propose can be used to produce reduced order models of arbitrary accuracy even when the original system is infinite dimensional. We then show how to incorporate moment matching constraints into the basic error bound minimization algorithm, and present three examples which utilize the techniques described herein. We conclude with some comments on extensions to multi-input, multi-output systems, as well as some general comments for future work. © 2010 Society for Industrial and Applied Mathematics.

A switched state feedback law for the stabilization of LTI systems

Proceedings of the 2010 American Control Conference, ACC 2010

Santarelli, Keith R.

Inspired by prior work in the design of switched feedback controllers for second order systems, we develop a switched state feedback control law for the stabilization of LTI systems of arbitrary dimension. The control law operates by switching between two static gain vectors in such a way that the state trajectory is driven onto a stable n - 1 dimensional hyperplane (where n represents the system dimension). We begin by briefly examining relevant geometric properties of the phase portraits in the case of two-dimensional systems and show how these geometric properties can be expressed as algebraic constraints on the switched vector fields that are applicable to LTI systems of arbitrary dimension. We then describe an explicit procedure for designing stabilizing controllers and illustrate the closed-loop transient performance via two examples. © 2010 AACC.

Advanced I/O for large-scale scientific applications

Oldfield, Ron A.

As scientific simulations scale to use petascale machines and beyond, the data volumes generated pose a dual problem. First, with increasing machine sizes, the careful tuning of IO routines becomes more and more important to keep the time spent in IO acceptable. It is not uncommon, for instance, to have 20% of an application's runtime spent performing IO in a 'tuned' system. Careful management of the IO routines can move that to 5% or even less in some cases. Second, the data volumes are so large, on the order of 10s to 100s of TB, that trying to discover the scientifically valid contributions requires assistance at runtime to both organize and annotate the data. Waiting for offline processing is not feasible due both to the impact on the IO system and the time required. To reduce this load and improve the ability of scientists to use the large amounts of data being produced, new techniques for data management are required. First, there is a need for techniques for efficient movement of data from the compute space to storage. These techniques should understand the underlying system infrastructure and adapt to changing system conditions. Technologies include aggregation networks, data staging nodes for a closer parity to the IO subsystem, and autonomic IO routines that can detect system bottlenecks and choose different approaches, such as splitting the output into multiple targets, staggering output processes. Such methods must be end-to-end, meaning that even with properly managed asynchronous techniques, it is still essential to properly manage the later synchronous interaction with the storage system to maintain acceptable performance. Second, for the data being generated, annotations and other metadata must be incorporated to help the scientist understand output data for the simulation run as a whole, to select data and data features without concern for what files or other storage technologies were employed. All of these features should be attained while maintaining a simple deployment for the science code and eliminating the need for allocation of additional computational resources.

Lightweight storage and overlay networks for fault tolerance

Oldfield, Ron A.

The next generation of capability-class, massively parallel processing (MPP) systems is expected to have hundreds of thousands to millions of processors. In such environments, it is critical to have fault-tolerance mechanisms, including checkpoint/restart, that scale with the size of applications and the percentage of the system on which the applications execute. For application-driven, periodic checkpoint operations, the state of the art does not provide a scalable solution. For example, on today's massive-scale systems that execute applications which consume most of the memory of the employed compute nodes, checkpoint operations generate I/O that consumes nearly 80% of the total I/O usage. Motivated by this observation, this project aims to improve I/O performance for application-directed checkpoints through the use of lightweight storage architectures and overlay networks. Lightweight storage provides direct access to underlying storage devices. Overlay networks provide caching and processing capabilities in the compute-node fabric. The combination has the potential to significantly reduce I/O overhead for large-scale applications. This report describes our combined efforts to model and understand overheads for application-directed checkpoints, as well as the implementation and performance analysis of a checkpoint service that uses available compute nodes as a network cache for checkpoint operations.

More Details

Peridynamic theory of solid mechanics

Proposed for publication in Advances in Applied Mechanics.

Silling, Stewart A.; Lehoucq, Richard B.

The peridynamic theory of mechanics attempts to unite the mathematical modeling of continuous media, cracks, and particles within a single framework. It does this by replacing the partial differential equations of the classical theory of solid mechanics with integral or integro-differential equations. These equations are based on a model of internal forces within a body in which material points interact with each other directly over finite distances. The classical theory of solid mechanics is based on the assumption of a continuous distribution of mass within a body. It further assumes that all internal forces are contact forces that act across zero distance. The mathematical description of a solid that follows from these assumptions relies on partial differential equations that additionally assume sufficient smoothness of the deformation for the PDEs to make sense in either their strong or weak forms. The classical theory has been demonstrated to provide a good approximation to the response of real materials down to small length scales, particularly in single crystals, provided these assumptions are met. Nevertheless, technology increasingly involves the design and fabrication of devices at smaller and smaller length scales, even interatomic dimensions. Therefore, it is worthwhile to investigate whether the classical theory can be extended to permit relaxed assumptions of continuity, to include the modeling of discrete particles such as atoms, and to allow the explicit modeling of nonlocal forces that are known to strongly influence the behavior of real materials.
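
For reference, the integral equation that replaces the classical balance of linear momentum in the bond-based peridynamic theory has the standard form below; the notation is the generic one used in the literature rather than anything specific to this article.

```latex
% Bond-based peridynamic equation of motion (standard form in the
% literature; notation is generic, not specific to this article):
\[
  \rho(\mathbf{x})\,\ddot{\mathbf{u}}(\mathbf{x},t)
    = \int_{\mathcal{H}_{\mathbf{x}}}
        \mathbf{f}\!\left(\mathbf{u}(\mathbf{x}',t)-\mathbf{u}(\mathbf{x},t),\,
                          \mathbf{x}'-\mathbf{x}\right) dV_{\mathbf{x}'}
      + \mathbf{b}(\mathbf{x},t),
\]
% where \mathcal{H}_{\mathbf{x}} is the finite neighborhood (horizon) over
% which material points interact and \mathbf{b} is a body force density.
```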

More Details

Elastic wave propagation in variable media using a discontinuous Galerkin method

Society of Exploration Geophysicists International Exposition and 80th Annual Meeting 2010, SEG 2010

Smith, Thomas M.; Collis, Samuel S.; Ober, Curtis C.; Overfelt, James R.; Schwaiger, Hans F.

Motivated by the needs of seismic inversion and building on our prior experience for fluid-dynamics systems, we present a high-order discontinuous Galerkin (DG) Runge-Kutta method applied to isotropic, linearized elasto-dynamics. Unlike other DG methods recently presented in the literature, our method allows for inhomogeneous material variations within each element, which enables representation of realistic earth models, a feature critical for future use in seismic inversion. Likewise, our method supports curved elements and hybrid meshes that include both simplicial and nonsimplicial elements. We demonstrate the capabilities of this method through a series of numerical experiments including hybrid mesh discretizations of the Marmousi2 model as well as a modified Marmousi2 model with an oscillatory ocean bottom that is exactly captured by our discretization.

More Details

A cognitive-consistency based model of population wide attitude change

AAAI Fall Symposium - Technical Report

Lakkaraju, Kiran L.; Speed, Ann S.

Attitudes play a significant role in determining how individuals process information and behave. In this paper we have developed a new computational model of population wide attitude change that captures the social level: how individuals interact and communicate information, and the cognitive level: how attitudes and concepts interact with each other. The model captures the cognitive aspect by representing each individual as a parallel constraint satisfaction network. The dynamics of this model are explored through a simple attitude change experiment where we vary the social network and distribution of attitudes in a population. Copyright © 2010, Association for the Advancement of Artificial Intelligence. All rights reserved.
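
A parallel constraint satisfaction network can be illustrated with the minimal, generic sketch below: nodes carry activations in [-1, 1], weights encode consistency or inconsistency between attitudes and concepts, and activations are relaxed iteratively. The weights and update rule here are hypothetical stand-ins, not the authors' model.

```python
# Minimal, generic sketch of a parallel constraint satisfaction network:
# nodes have activations in [-1, 1]; positive weights encode consistency,
# negative weights inconsistency.  Illustrative only.
import numpy as np

def relax(W, a0, rate=0.1, iters=200):
    """Iteratively settle activations toward a mutually consistent state."""
    a = a0.copy()
    for _ in range(iters):
        net = W @ a                          # support each node receives
        a = np.clip(a + rate * net, -1.0, 1.0)
    return a

# Three-node example: nodes 0 and 1 are mutually consistent,
# node 2 conflicts with both (hypothetical weights).
W = np.array([[ 0.0,  0.5, -0.4],
              [ 0.5,  0.0, -0.4],
              [-0.4, -0.4,  0.0]])
print(relax(W, np.array([0.1, 0.0, 0.3])))
```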

More Details

Simulation of dynamic fracture using peridynamics, finite element modeling, and contact

ASME International Mechanical Engineering Congress and Exposition, Proceedings (IMECE)

Littlewood, David J.

Peridynamics is a nonlocal extension of classical solid mechanics that allows for the modeling of bodies in which discontinuities occur spontaneously. Because the peridynamic expression for the balance of linear momentum does not contain spatial derivatives and is instead based on an integral equation, it is well suited for modeling phenomena involving spatial discontinuities such as crack formation and fracture. In this study, both peridynamics and classical finite element analysis are applied to simulate material response under dynamic blast loading conditions. A combined approach is utilized in which the portion of the simulation modeled with peridynamics interacts with the finite element portion of the model via a contact algorithm. The peridynamic portion of the analysis utilizes an elastic-plastic constitutive model with linear hardening. The peridynamic interface to the constitutive model is based on the calculation of an approximate deformation gradient, requiring the suppression of possible zero-energy modes. The classical finite element portion of the model utilizes a Johnson-Cook constitutive model. Simulation results are validated by direct comparison to expanding tube experiments. The coupled modeling approach successfully captures material response at the surface of the tube and the emerging fracture pattern. Copyright © 2010 by ASME.

More Details

Formulation and optimization of robust sensor placement problems for drinking water contamination warning systems

Journal of Infrastructure Systems

Watson, Jean P.; Murray, Regan; Hart, William E.

The sensor placement problem in contamination warning system design for municipal water distribution networks involves maximizing the protection level afforded by limited numbers of sensors, typically quantified as the expected impact of a contamination event; the issue of how to mitigate against high-consequence events is either handled implicitly or ignored entirely. Consequently, expected-case sensor placements run the risk of failing to protect against high-consequence 9/11-style attacks. In contrast, robust sensor placements address this concern by focusing strictly on high-consequence events and placing sensors to minimize the impact of these events. We introduce several robust variations of the sensor placement problem, distinguished by how they quantify the potential damage due to high-consequence events. We explore the nature of robust versus expected-case sensor placements on three real-world large-scale distribution networks. We find that robust sensor placements can yield large reductions in the number and magnitude of high-consequence events, with only modest increases in expected impact. The ability to trade off between robust and expected-case impacts is a key unexplored dimension in contamination warning system design. © 2009 ASCE.
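
Schematically, the distinction between expected-case and robust placements can be written as below in generic sensor-placement notation; this is an illustrative sketch, not the paper's exact formulation.

```latex
% Schematic objectives only (generic notation, not the paper's exact
% formulation): a indexes contamination events with weight \alpha_a,
% d_{ai} is the impact of event a when first detected at location i, and
% x_{ai} selects that detection location subject to a sensor budget.
\[
  \text{expected-case:}\quad
    \min \sum_{a \in A} \alpha_a \sum_{i \in L_a} d_{ai}\, x_{ai},
  \qquad
  \text{worst-case (robust):}\quad
    \min \; \max_{a \in A} \sum_{i \in L_a} d_{ai}\, x_{ai}.
\]
```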

More Details

Probabilistic methods in model validation

Conference Proceedings of the Society for Experimental Mechanics Series

Paez, Thomas L.; Swiler, Laura P.

Extensive experimentation over the past decade has shown that fabricated physical systems that are intended to be identical, and are nominally identical, in fact, differ from one another, and sometimes substantially. This fact makes it difficult to validate a mathematical model for any system and results in the requirement to characterize physical system behavior using the tools of uncertainty quantification. Further, because of the existence of system, component, and material uncertainty, the mathematical models of these elements sometimes seek to reflect the uncertainty. This presentation introduces some of the methods of probability and statistics, and shows how they can be applied in engineering modeling and data analysis. The ideas of randomness and some basic means for measuring and modeling it are presented. The ideas of random experiment, random variable, mean, variance and standard deviation, and probability distribution are introduced. The ideas are introduced in the framework of a practical, yet simple, example; measured data are included. This presentation is the third in a sequence of tutorial discussions on mathematical model validation. The example introduced here is also used in later presentations. © 2009 Society for Experimental Mechanics Inc.
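
The basic sample statistics the tutorial introduces can be computed as in the short sketch below; the measurement values are hypothetical stand-ins for data on nominally identical units.

```python
# Minimal sketch of the sample statistics introduced in the tutorial,
# applied to hypothetical repeated measurements of nominally identical units.
import numpy as np

measurements = np.array([9.8, 10.1, 10.4, 9.7, 10.2, 9.9])  # hypothetical data
mean = measurements.mean()
var = measurements.var(ddof=1)       # unbiased sample variance
std = measurements.std(ddof=1)       # sample standard deviation
print(f"mean={mean:.3f}, variance={var:.3f}, std={std:.3f}")
```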

More Details

Density functional theory (DFT) simulations of shocked liquid xenon

AIP Conference Proceedings

Mattsson, Thomas M.; Magyar, Rudolph J.

Xenon is not only a technologically important element used in laser technologies and jet propulsion, but it is also one of the most accessible materials in which to study the metal-insulator transition with increasing pressure. Because of its closed shell electronic configuration, xenon is often assumed to be chemically inert, interacting almost entirely through the van der Waals interaction, and at liquid density, is typically modeled well using Lennard-Jones potentials. However, such modeling has a limited range of validity as xenon is known to form compounds under normal conditions and likely exhibits considerably more chemistry at higher densities when hybridization of occupied orbitals becomes significant. We present DFT-MD simulations of shocked liquid xenon with the goal of developing an improved equation of state. The calculated Hugoniot to 2 MPa compares well with available experimental shock data. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. © 2009 American Institute of Physics.

More Details

Label-invariant mesh quality metrics

Proceedings of the 18th International Meshing Roundtable, IMR 2009

Knupp, Patrick K.

Mappings from a master element to the physical mesh element, in conjunction with local metrics such as those appearing in the Target-matrix paradigm, are used to measure quality at points within an element. The approach is applied to both linear and quadratic triangular elements; this enables, for example, one to measure quality within a quadratic finite element. Quality within an element may also be measured on a set of symmetry points, leading to so-called symmetry metrics. An important issue having to do with the labeling of the element vertices is relevant to mesh quality tools such as Verdict and Mesquite. Certain quality measures like area, volume, and shape should be label-invariant, while others such as aspect ratio and orientation should not. It is shown that local metrics whose Jacobian matrix is non-constant are label-invariant only at the center of the element, while symmetry metrics can be label-invariant anywhere within the element, provided the reference element is properly restricted.

More Details

Analysis of micromixers and biocidal coatings on water-treatment membranes to minimize biofouling

Altman, Susan J.; Clem, Paul G.; Cook, Adam W.; Hart, William E.; Hibbs, Michael R.; Ho, Clifford K.; Jones, Howland D.; Sun, Amy C.; Webb, Stephen W.

Biofouling of water-treatment membranes, the unwanted growth of biofilms on a surface, negatively impacts desalination and water treatment. With biofouling there is a decrease in permeate production, degradation of permeate water quality, and an increase in energy expenditure due to the increased cross-flow pressure needed. To date, a universally successful and cost-effective method for controlling biofouling has not been implemented. The overall goal of the work described in this report was to use high-performance computing to direct polymer, material, and biological research to create the next generation of water-treatment membranes. Both physical (micromixers - UV-curable epoxy traces printed on the surface of a water-treatment membrane that promote chaotic mixing) and chemical (quaternary ammonium groups) modifications of the membranes for the purpose of increasing resistance to biofouling were evaluated. Creation of low-cost, efficient water-treatment membranes helps assure the availability of fresh water for human use, a growing need in both the U. S. and the world.

More Details

Summary of the CSRI Workshop on Combinatorial Algebraic Topology (CAT): Software, Applications, & Algorithms

Mitchell, Scott A.; Bennett, Janine C.; Day, David M.

This report summarizes the Combinatorial Algebraic Topology: software, applications & algorithms workshop (CAT Workshop). The workshop was sponsored by the Computer Science Research Institute of Sandia National Laboratories. It was organized by CSRI staff members Scott Mitchell and Shawn Martin. It was held in Santa Fe, New Mexico, August 29-30. The CAT Workshop website has links to some of the talk slides and other information, http://www.cs.sandia.gov/CSRI/Workshops/2009/CAT/index.html. The purpose of the report is to summarize the discussions and recap the sessions. There is a special emphasis on technical areas that are ripe for further exploration, and the plans for follow-up amongst the workshop participants. The intended audiences are the workshop participants, other researchers in the area, and the workshop sponsors.

More Details

Host suppression and bioinformatics for sequence-based characterization of unknown pathogens

Misra, Milind; Patel, Kamlesh P.; Kaiser, Julia N.; Meagher, Robert M.; Branda, Steven B.; Schoeniger, Joseph S.

Bioweapons and emerging infectious diseases pose formidable and growing threats to our national security. Rapid advances in biotechnology and the increasing efficiency of global transportation networks virtually guarantee that the United States will face potentially devastating infectious disease outbreaks caused by novel ('unknown') pathogens either intentionally or accidentally introduced into the population. Unfortunately, our nation's biodefense and public health infrastructure is primarily designed to handle previously characterized ('known') pathogens. While modern DNA assays can identify known pathogens quickly, identifying unknown pathogens currently depends upon slow, classical microbiological methods of isolation and culture that can take weeks to produce actionable information. In many scenarios that delay would be costly, in terms of casualties and economic damage; indeed, it can mean the difference between a manageable public health incident and a full-blown epidemic. To close this gap in our nation's biodefense capability, we will develop, validate, and optimize a system to extract nucleic acids from unknown pathogens present in clinical samples drawn from infected patients. This system will extract nucleic acids from a clinical sample and amplify pathogen and specific host-response nucleic acid sequences. These sequences will then be suitable for ultra-high-throughput sequencing (UHTS) carried out by a third party. The data generated from UHTS will then be processed through a new data assimilation and bioinformatic analysis pipeline that will allow us to characterize an unknown pathogen in hours to days instead of weeks to months. Our methods will require no a priori knowledge of the pathogen, and no isolation or culturing; they will therefore circumvent many of the major roadblocks confronting a clinical microbiologist or virologist when presented with an unknown or engineered pathogen.

More Details

Xyce parallel electronic simulator : users' guide. Version 5.1

Keiter, Eric R.; Mei, Ting M.; Russo, Thomas V.; Pawlowski, Roger P.; Schiek, Richard S.; Santarelli, Keith R.; Coffey, Todd S.; Thornquist, Heidi K.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible range of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

More Details

Xyce™ Parallel Electronic Simulator: Reference Guide, Version 5.1

Keiter, Eric R.; Mei, Ting M.; Russo, Thomas V.; Pawlowski, Roger P.; Schiek, Richard S.; Santarelli, Keith R.; Coffey, Todd S.; Thornquist, Heidi K.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users’ Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users’ Guide.

More Details

Modeling aspects of human memory for scientific study

Bernard, Michael L.; Morrow, James D.; Taylor, Shawn E.; Verzi, Stephen J.; Vineyard, Craig M.

Working with leading experts in the field of cognitive neuroscience and computational intelligence, SNL has developed a computational architecture that represents neurocognitive mechanisms associated with how humans remember experiences in their past. The architecture represents how knowledge is organized and updated through information from individual experiences (episodes) via the cortical-hippocampal declarative memory system. We compared the simulated behavioral characteristics with those of humans measured under well-established experimental standards, controlling for unmodeled aspects of human processing, such as perception. We used this knowledge to create robust simulations of human memory behaviors that should help move the scientific community closer to understanding how humans remember information. These behaviors were experimentally validated against actual human subjects, and the results were published. An important outcome of the validation process will be the joining of specific experimental testing procedures from the field of neuroscience with computational representations from the field of cognitive modeling and simulation.

More Details

Crossing the mesoscale no-mans land via parallel kinetic Monte Carlo

Plimpton, Steven J.; Battaile, Corbett C.; Chandross, M.; Holm, Elizabeth A.; Thompson, Aidan P.; Tikare, Veena T.; Wagner, Gregory J.; Webb, Edmund B.; Zhou, Xiaowang Z.

The kinetic Monte Carlo method and its variants are powerful tools for modeling materials at the mesoscale, meaning at length and time scales in between the atomic and continuum. We have completed a 3 year LDRD project with the goal of developing a parallel kinetic Monte Carlo capability and applying it to materials modeling problems of interest to Sandia. In this report we give an overview of the methods and algorithms developed, and describe our new open-source code called SPPARKS, for Stochastic Parallel PARticle Kinetic Simulator. We also highlight the development of several Monte Carlo models in SPPARKS for specific materials modeling applications, including grain growth, bubble formation, diffusion in nanoporous materials, defect formation in erbium hydrides, and surface growth and evolution.
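
The core of a rejection-free kinetic Monte Carlo step (the Gillespie/BKL scheme that underlies codes of this kind) can be sketched as below; the event list and rates are hypothetical placeholders, and this is not SPPARKS code.

```python
# Minimal sketch of one rejection-free kinetic Monte Carlo step
# (Gillespie/BKL scheme).  The events and rates are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def kmc_step(rates, t):
    """Pick one event with probability proportional to its rate and
    advance time by an exponentially distributed increment."""
    total = rates.sum()
    event = np.searchsorted(np.cumsum(rates), rng.uniform(0.0, total))
    dt = -np.log(1.0 - rng.uniform()) / total
    return event, t + dt

rates = np.array([1.0, 0.5, 2.0, 0.1])   # per-event rates (hypothetical)
event, t = kmc_step(rates, t=0.0)
print(f"executed event {event}, time advanced to {t:.4f}")
```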

More Details

Evaluation of the impact chip multiprocessors have on SNL application performance

Doerfler, Douglas W.

This report describes trans-organizational efforts to investigate the impact of chip multiprocessors (CMPs) on the performance of important Sandia application codes. The impact of CMPs on the performance and applicability of Sandia's system software was also investigated. The goal of the investigation was to make algorithmic and architectural recommendations for next generation platform acquisitions.

More Details

Decision support for integrated water-energy planning

Tidwell, Vincent C.; Kobos, Peter H.; Malczynski, Leonard A.; Hart, William E.; Castillo, Cesar R.

Currently, electrical power generation uses about 140 billion gallons of water per day, accounting for over 39% of all freshwater withdrawals and thus competing with irrigated agriculture as the leading user of water. Coupled to this water use is the required pumping, conveyance, treatment, storage, and distribution of the water, which requires on average 3% of all electric power generated. While water and energy use are tightly coupled, planning and management of these fundamental resources are rarely treated in an integrated fashion. Toward this need, a decision support framework has been developed that targets the shared needs of energy and water producers, resource managers, regulators, and decision makers at the federal, state and local levels. The framework integrates analysis and optimization capabilities to identify trade-offs and 'best' alternatives among a broad list of energy/water options and objectives. The decision support framework is formulated in a modular architecture, facilitating tailored analyses over different geographical regions and scales (e.g., national, state, county, watershed, NERC region). An interactive interface allows direct control of the model and access to real-time results displayed as charts, graphs and maps. Ultimately, this open and interactive modeling framework provides a tool for evaluating competing policy and technical options relevant to the energy-water nexus.

More Details

Increasing fault resiliency in a message-passing environment

Ferreira, Kurt; Oldfield, Ron A.; Stearley, Jon S.; Laros, James H.; Pedretti, Kevin T.T.; Brightwell, Ronald B.

Petaflops systems will have tens to hundreds of thousands of compute nodes, which increases the likelihood of faults. Applications use checkpoint/restart to recover from these faults, but even under ideal conditions, applications running on more than 30,000 nodes will likely spend more than half of their total run time saving checkpoints, restarting, and redoing work that was lost. We created a library that performs redundant computations on additional nodes allocated to the application. An active node and its redundant partner form a node bundle which will only fail, and cause an application restart, when both nodes in the bundle fail. The goal of this library is to learn whether this can be done entirely at the user level, what requirements this library places on a Reliability, Availability, and Serviceability (RAS) system, and what its impact on performance and run time is. We find that our redundant MPI layer library imposes a relatively modest performance penalty for applications, but that it greatly reduces the number of application interrupts. This reduction in interrupts leads to huge savings in restart and rework time. For large-scale applications the savings compensate for the performance loss and the additional nodes required for redundant computations.
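
A back-of-the-envelope sketch of why bundles help: assuming independent node failures with probability p over a run, a bundle fails only when both of its nodes fail, so the chance of an application interrupt drops sharply. The numbers below are illustrative and are not taken from the report.

```python
# Illustrative calculation of application-interrupt probability with and
# without node bundles, assuming independent node failures with
# probability p over a run.  Numbers are hypothetical.
def interrupt_probability(n_units, p, redundant):
    per_unit = p * p if redundant else p      # failure prob of one bundle/node
    return 1.0 - (1.0 - per_unit) ** n_units

N, p = 30_000, 1e-4
print("no redundancy:", interrupt_probability(N, p, redundant=False))
print("node bundles: ", interrupt_probability(N, p, redundant=True))
```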

More Details

Graph algorithms in the titan toolkit

McLendon, William C.

Graph algorithms are a key component in a wide variety of intelligence analysis activities. The Graph-Based Informatics for Non-Proliferation and Counter-Terrorism project addresses the critical need of making these graph algorithms accessible to Sandia analysts in a manner that is both intuitive and effective. Specifically we describe the design and implementation of an open source toolkit for doing graph analysis, informatics, and visualization that provides Sandia with novel analysis capability for non-proliferation and counter-terrorism.

More Details

Final report LDRD project 105816 : model reduction of large dynamic systems with localized nonlinearities

Lehoucq, Richard B.; Dohrmann, Clark R.; Segalman, Daniel J.

Advanced computing hardware and software written to exploit massively parallel architectures greatly facilitate the computation of extremely large problems. On the other hand, these tools, though enabling higher fidelity models, have often resulted in much longer run-times and turn-around-times in providing answers to engineering problems. The impediments include smaller elements and consequently smaller time steps, much larger systems of equations to solve, and the inclusion of nonlinearities that had been ignored in days when lower fidelity models were the norm. The research effort reported here focuses on accelerating the analysis process for structural dynamics through combinations of model reduction and mitigation of some factors that lead to over-meshing.

More Details

Performance of a parallel algebraic multilevel preconditioner for stabilized finite element semiconductor device modeling

Journal of Computational Physics

Lin, Paul T.; Shadid, John N.; Sala, Marzio; Tuminaro, Raymond S.; Hennigan, Gary L.; Hoekstra, Robert J.

In this study results are presented for the large-scale parallel performance of an algebraic multilevel preconditioner for solution of the drift-diffusion model for semiconductor devices. The preconditioner is the key numerical procedure determining the robustness, efficiency and scalability of the fully-coupled Newton-Krylov based, nonlinear solution method that is employed for this system of equations. The coupled system is comprised of a source term dominated Poisson equation for the electric potential, and two convection-diffusion-reaction type equations for the electron and hole concentration. The governing PDEs are discretized in space by a stabilized finite element method. Solution of the discrete system is obtained through a fully-implicit time integrator, a fully-coupled Newton-based nonlinear solver, and a restarted GMRES Krylov linear system solver. The algebraic multilevel preconditioner is based on an aggressive coarsening graph partitioning of the nonzero block structure of the Jacobian matrix. Representative performance results are presented for various choices of multigrid V-cycles and W-cycles and parameter variations for smoothers based on incomplete factorizations. Parallel scalability results are presented for solution of up to 10^8 unknowns on 4096 processors of a Cray XT3/4 and an IBM POWER eServer system. © 2009 Elsevier Inc. All rights reserved.
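
The overall solver structure can be sketched as a Newton iteration with a preconditioned, restarted GMRES inner solve, as below; an incomplete-LU factorization stands in for the algebraic multilevel preconditioner, and the residual and Jacobian callbacks are hypothetical.

```python
# Minimal sketch of a Newton-Krylov iteration with a preconditioned,
# restarted GMRES inner solve.  ILU stands in for the paper's algebraic
# multilevel preconditioner; residual/jacobian are user-supplied callbacks.
import numpy as np
import scipy.sparse.linalg as spla

def newton_krylov(residual, jacobian, u0, newton_tol=1e-8, max_newton=20):
    u = u0.copy()
    for _ in range(max_newton):
        r = residual(u)
        if np.linalg.norm(r) < newton_tol:
            break
        J = jacobian(u)                                   # sparse Jacobian
        ilu = spla.spilu(J.tocsc())                       # preconditioner stand-in
        M = spla.LinearOperator(J.shape, matvec=ilu.solve)
        du, _ = spla.gmres(J, -r, M=M, restart=50)        # inner Krylov solve
        u = u + du
    return u
```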

More Details

A comparison of Lagrangian/Eulerian approaches for tracking the kinematics of high deformation solid motion

Ames, Thomas L.; Robinson, Allen C.

The modeling of solids is most naturally placed within a Lagrangian framework because it requires constitutive models which depend on knowledge of the original material orientations and subsequent deformations. Detailed kinematic information is needed to ensure material frame indifference which is captured through the deformation gradient F. Such information can be tracked easily in a Lagrangian code. Unfortunately, not all problems can be easily modeled using Lagrangian concepts due to severe distortions in the underlying motion. Either a Lagrangian/Eulerian or a pure Eulerian modeling framework must be introduced. We discuss and contrast several Lagrangian/Eulerian approaches for keeping track of the details of material kinematics.
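
For reference, the deformation gradient mentioned above is defined, in standard continuum-mechanics notation, as follows.

```latex
% Standard definition of the deformation gradient for a motion
% x = \varphi(X, t) mapping reference coordinates X to current
% coordinates x (generic notation, not specific to this report):
\[
  \mathbf{F}(\mathbf{X},t)
    = \frac{\partial \varphi(\mathbf{X},t)}{\partial \mathbf{X}},
  \qquad J = \det \mathbf{F} > 0 .
\]
```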

More Details

Investigating methods of supporting dynamically linked executables on high performance computing platforms

Laros, James H.; Kelly, Suzanne M.; Levenhagen, Michael J.; Pedretti, Kevin T.T.

Shared libraries have become ubiquitous and are used to achieve great resource efficiencies on many platforms. The same properties that enable efficiencies on time-shared computers and convenience on small clusters prove to be great obstacles to scalability on large clusters and High Performance Computing platforms. In addition, Light Weight operating systems such as Catamount have historically not supported the use of shared libraries specifically because they hinder scalability. In this report we will outline the methods of supporting shared libraries on High Performance Computing platforms using Light Weight kernels that we investigated. The considerations necessary to evaluate utility in this area are many and sometimes conflicting. While our initial path forward has been determined based on this evaluation, we consider this effort ongoing and remain prepared to re-evaluate any technology that might provide a scalable solution. This report is an evaluation of a range of possible methods of supporting dynamically linked executables on capability-class High Performance Computing platforms. Efforts are ongoing and extensive testing at scale is necessary to evaluate performance. While performance is a critical driving factor, supporting whatever method is used in a production environment is an equally important and challenging task.

More Details

Improving performance via mini-applications

Doerfler, Douglas W.; Crozier, Paul C.; Edwards, Harold C.; Williams, Alan B.; Rajan, Mahesh R.; Keiter, Eric R.; Thornquist, Heidi K.

Application performance is determined by a combination of many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, we find that the use of mini-applications - small self-contained proxies for real applications - is an excellent approach for rapidly exploring the parameter space of all these choices. Furthermore, use of mini-applications enriches the interaction between application, library and computer system developers by providing explicit functioning software and concrete performance results that lead to detailed, focused discussions of design trade-offs, algorithm choices and runtime performance issues. In this paper we discuss a collection of mini-applications and demonstrate how we use them to analyze and improve application performance on new and future computer platforms.

More Details

Efficient algorithms for mixed aleatory-epistemic uncertainty quantification with application to radiation-hardened electronics. Part I, algorithms and benchmark results

Eldred, Michael S.; Swiler, Laura P.

This report documents the results of an FY09 ASC V&V Methods level 2 milestone demonstrating new algorithmic capabilities for mixed aleatory-epistemic uncertainty quantification. Through the combination of stochastic expansions for computing aleatory statistics and interval optimization for computing epistemic bounds, mixed uncertainty analysis studies are shown to be more accurate and efficient than previously achievable. Part I of the report describes the algorithms and presents benchmark performance results. Part II applies these new algorithms to UQ analysis of radiation effects in electronic devices and circuits for the QASPR program.
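
The nested structure of a mixed analysis can be sketched crudely as an outer sweep over epistemic parameters bounded by intervals, with an inner aleatory sampling loop; the paper's algorithms use stochastic expansions and interval optimization instead, and the model below is hypothetical.

```python
# Crude sketch of nested (second-order) UQ: an outer sweep over an
# epistemic interval plus an inner aleatory sampling loop yields interval
# bounds on a statistic.  Plain sampling and a grid sweep stand in for the
# paper's stochastic expansions and interval optimization.
import numpy as np

rng = np.random.default_rng(1)

def model(aleatory, epistemic):
    return epistemic * aleatory ** 2 + 1.0          # hypothetical response

def mean_response(epistemic, n_samples=10_000):
    a = rng.normal(0.0, 1.0, n_samples)             # aleatory variable ~ N(0,1)
    return model(a, epistemic).mean()

epistemic_grid = np.linspace(0.5, 2.0, 16)          # epistemic interval [0.5, 2.0]
stats = [mean_response(e) for e in epistemic_grid]
print(f"mean response lies in [{min(stats):.3f}, {max(stats):.3f}]")
```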

More Details

Quantitative resilience analysis through control design

Vugrin, Eric D.; Camphouse, Russell C.; Sunderland, Daniel S.

Critical infrastructure resilience has become a national priority for the U. S. Department of Homeland Security. System resilience has been studied for several decades in many different disciplines, but no standards or unifying methods exist for critical infrastructure resilience analysis. Few quantitative resilience methods exist, and those existing approaches tend to be rather simplistic and, hence, not capable of sufficiently assessing all aspects of critical infrastructure resilience. This report documents the results of a late-start Laboratory Directed Research and Development (LDRD) project that investigated the development of quantitative resilience through application of control design methods. Specifically, we conducted a survey of infrastructure models to assess what types of control design might be applicable for critical infrastructure resilience assessment. As a result of this survey, we developed a decision process that directs the resilience analyst to the control method that is most likely applicable to the system under consideration. Furthermore, we developed optimal control strategies for two sets of representative infrastructure systems to demonstrate how control methods could be used to assess the resilience of the systems to catastrophic disruptions. We present recommendations for future work to continue the development of quantitative resilience analysis methods.

More Details

Toward improved branch prediction through data mining

Hemmert, Karl S.

Data mining and machine learning techniques can be applied to computer system design to aid in optimizing design decisions, improving system runtime performance. Data mining techniques have been investigated in the context of branch prediction. Specifically, the performance of traditional branch predictors has been compared to that of data mining algorithms. Additionally, whether additional features available within the architectural state might serve to further improve branch prediction has been evaluated. Results show that data mining techniques indicate potential for improved branch prediction, especially when register file contents are included as a feature set.
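
As a toy illustration of the approach (not the report's experiments), the sketch below trains a decision tree to predict a branch outcome from the previous k outcomes of a synthetic trace; a real study would add architectural-state features such as register file contents.

```python
# Illustrative sketch: predict a branch outcome from the last k outcomes
# of a synthetic trace with a decision tree.  The trace behavior and
# feature set are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
k, n = 8, 20_000

# Synthetic trace: branch is taken when the count of recently taken
# branches is even, plus a little noise (purely hypothetical behavior).
trace = [int(rng.random() < 0.5) for _ in range(k)]
for _ in range(n):
    taken = int(sum(trace[-k:]) % 2 == 0)
    if rng.random() < 0.05:
        taken ^= 1
    trace.append(taken)

X = np.array([trace[i:i + k] for i in range(len(trace) - k)])
y = np.array(trace[k:])
split = int(0.8 * len(y))
clf = DecisionTreeClassifier(max_depth=8).fit(X[:split], y[:split])
print("prediction accuracy:", clf.score(X[split:], y[split:]))
```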

More Details

Scalable analysis tools for sensitivity analysis and UQ (3160) results

Ice, Lisa I.; Fabian, Nathan D.; Moreland, Kenneth D.; Bennett, Janine C.; Thompson, David C.; Karelitz, David B.

The 9/30/2009 ASC Level 2 milestone, Scalable Analysis Tools for Sensitivity Analysis and UQ (Milestone 3160), contains feature recognition capabilities required by the user community for certain verification and validation tasks focused on sensitivity analysis and uncertainty quantification (UQ). These feature recognition capabilities include crater detection, characterization, and analysis from CTH simulation data; the ability to call fragment and crater identification code from within a CTH simulation; and the ability to output fragments in a geometric format that includes data values over the fragments. The feature recognition capabilities were tested extensively on sample and actual simulations. In addition, a number of stretch criteria were met, including the ability to visualize CTH tracer particles and the ability to visualize output from within an S3D simulation.

More Details

A fully implicit method for 3D quasi-steady state magnetic advection-diffusion

Siefert, Christopher S.; Robinson, Allen C.

We describe the implementation of a prototype fully implicit method for solving three-dimensional quasi-steady state magnetic advection-diffusion problems. This method allows us to solve the magnetic advection-diffusion equations in an Eulerian frame with a fixed, user-prescribed velocity field. We have verified the correctness of the method and implementation on two standard verification problems, the Solberg-White magnetic shear problem and the Perry-Jones-White rotating cylinder problem.
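
For orientation, the standard magnetic advection-diffusion (resistive induction) equation for a prescribed velocity field has the form below; the report's quasi-steady formulation may differ in detail.

```latex
% Standard magnetic advection-diffusion (resistive induction) equation
% for a prescribed velocity field u and magnetic diffusivity \eta; the
% report's quasi-steady formulation may differ in detail:
\[
  \frac{\partial \mathbf{B}}{\partial t}
    = \nabla \times (\mathbf{u} \times \mathbf{B})
    - \nabla \times (\eta\, \nabla \times \mathbf{B}),
  \qquad \nabla \cdot \mathbf{B} = 0 .
\]
```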

More Details

Highly scalable linear solvers on thousands of processors

Siefert, Christopher S.; Tuminaro, Raymond S.; Domino, Stefan P.; Robinson, Allen C.

In this report we summarize research into new parallel algebraic multigrid (AMG) methods. We first provide an introduction to parallel AMG. We then discuss our research in parallel AMG algorithms for very large scale platforms. We detail significant improvements in the AMG setup phase to a matrix-matrix multiplication kernel. We present a smoothed aggregation AMG algorithm with fewer communication synchronization points, and discuss its links to domain decomposition methods. Finally, we discuss a multigrid smoothing technique that utilizes two message passing layers for use on multicore processors.

More Details

Neural assembly models derived through nano-scale measurements

Fan, Hongyou F.; Forsythe, James C.; Branda, Catherine B.; Warrender, Christina E.; Schiek, Richard S.

This report summarizes the accomplishments of a three-year project focused on developing technical capabilities for measuring and modeling neuronal processes at the nanoscale. It was successfully demonstrated that nanoprobes could be engineered that were biocompatible, could be biofunctionalized, and responded within the range of voltages typically associated with a neuronal action potential. Furthermore, the Xyce parallel circuit simulator was employed and models were incorporated for simulating the ion channel and cable properties of neuronal membranes. The ultimate objective of the project had been to employ nanoprobes in vivo, with the nematode C. elegans, and derive a simulation based on the resulting data. Techniques were developed allowing the nanoprobes to be injected into the nematode and the neuronal response recorded. To the authors' knowledge, this is the first occasion in which nanoparticles have been successfully employed as probes for recording neuronal response in an in vivo animal experimental protocol.

More Details
Results 8201–8400 of 9,998