Homological Stabilizer Codes
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
High explosives are an important class of energetic materials used in many weapons applications. Even with modern computers, the simulation of the dynamic chemical reactions and energy release is exceedingly challenging. While the scale of the detonation process may be macroscopic, the dynamic bond breaking responsible for the explosive release of energy is fundamentally quantum mechanical. Thus, any method that does not adequately describe bonding is destined to lack predictive capability on some level. Performing quantum mechanics calculations on systems with more than dozens of atoms is a gargantuan task, and severe approximation schemes must be employed in practical calculations. We have developed and tested a divide and conquer (DnC) scheme to obtain total energies, forces, and harmonic frequencies within semi-empirical quantum mechanics. The method is intended as an approximate but faster solution to the full problem and is possible due to the sparsity of the density matrix in many applications. The resulting total energy calculation scales linearly with the number of subsystems, and the method provides a path forward to quantum mechanical simulations of millions of atoms.
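To make the divide-and-conquer idea concrete, the sketch below sums independently computed subsystem energies over a buffered partition of the atoms. It is only a minimal illustration: the partitioner, the buffer construction, and the subsystem_energy placeholder are hypothetical stand-ins for the semi-empirical quantum mechanics engine, not the actual implementation.

```python
# Hypothetical sketch of linear-scaling total energy via divide and conquer (DnC).
# Assumes density-matrix sparsity, so each subsystem needs only nearby "buffer" atoms.

def partition(atoms, n_subsystems):
    """Split the atom list into contiguous subsystems (placeholder partitioner)."""
    size = max(1, len(atoms) // n_subsystems)
    return [atoms[i:i + size] for i in range(0, len(atoms), size)]

def with_buffer(core, atoms, cutoff):
    """Add atoms within `cutoff` of the core region (crude 1-D neighbor search)."""
    buffered = list(core)
    core_ids = {a["id"] for a in core}
    for a in atoms:
        if a["id"] not in core_ids and any(abs(a["x"] - c["x"]) < cutoff for c in core):
            buffered.append(a)
    return buffered

def subsystem_energy(core, buffered):
    """Placeholder for a semi-empirical QM call on the buffered subsystem,
    returning only the energy attributed to the core atoms."""
    return sum(0.5 * a["x"] ** 2 for a in core)  # toy energy expression

def total_energy(atoms, n_subsystems=4, cutoff=3.0):
    # Cost scales linearly with the number of subsystems because each
    # subsystem calculation is independent (and trivially parallel).
    cores = partition(atoms, n_subsystems)
    return sum(subsystem_energy(c, with_buffer(c, atoms, cutoff)) for c in cores)

atoms = [{"id": i, "x": float(i)} for i in range(16)]
print(total_energy(atoms))
```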
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
International Journal of High Performance Computing
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Proposed for publication in Advances in Parallel Computing.
Abstract not provided.
Water Distribution Systems Analysis 2010 - Proceedings of the 12th International Conference, WDSA 2010
The optimization of sensor placements is a key aspect of the design of contaminant warning systems for automatically detecting contaminants in water distribution systems. Although researchers have generally assumed that all sensors are placed at the same time, in practice sensor networks will likely grow and evolve over time. For example, limitations of a water utility's budget may dictate a staged, incremental deployment of sensors over many years. We describe optimization formulations of multi-stage sensor placement problems. The objective of these formulations includes an explicit trade-off between the value of the initially deployed and final sensor networks. This trade-off motivates the deployment of sensors in initial stages of the deployment schedule, even though these choices typically lead to a solution that is suboptimal when compared to placing all sensors at once. These multi-stage sensor placement problems can be represented as mixed-integer programs, and we illustrate the impact of this trade-off using standard commercial solvers. We also describe a multi-stage formulation that models budget uncertainty, expressed as a tree of potential budget scenarios through time. Budget uncertainty is used to assess and hedge against risks due to a potentially incomplete deployment of a planned sensor network. This formulation is a multi-stage stochastic mixed-integer program, a class of problems that is notoriously difficult to solve. We apply standard commercial solvers to small-scale test problems, enabling us to effectively analyze multi-stage sensor placement problems subject to budget uncertainties, and assess the impact of accounting for such uncertainty relative to a deterministic multi-stage model. © 2012 ASCE.
Water Distribution Systems Analysis 2010 - Proceedings of the 12th International Conference, WDSA 2010
We present a mixed-integer linear programming formulation to determine optimal locations for manual grab sampling after the detection of contaminants in a water distribution system. The formulation selects optimal manual grab sample locations that maximize the total pair-wise distinguishability of candidate contamination events. Given an initial contaminant detection location, a source inversion is performed that eliminates unlikely events, resulting in a much smaller set of candidate contamination events. We then propose a cyclical process in which optimal grab sample locations are determined and manual grab samples taken. Relying only on YES/NO indicators of the presence of contaminant, source inversion is performed to reduce the set of candidate contamination events. The process is repeated until the number of candidate events is sufficiently small. Case studies testing this process are presented using water network models ranging from 4 to approximately 13,000 nodes. The results demonstrate that the contamination event can be identified within a remarkably small number of sampling cycles using very few sampling teams. Furthermore, solution times were reasonable, making this formulation suitable for real-time settings. © 2012 ASCE.
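The snippet below is not the paper's MILP; it is a short greedy illustration of the pair-wise distinguishability objective on made-up YES/NO event signatures, sketching how one sampling cycle might pick locations that separate the most remaining event pairs.

```python
# Illustrative greedy selection of manual grab sample locations that maximize
# pair-wise distinguishability of candidate contamination events. The actual
# formulation is a mixed-integer linear program; this greedy heuristic only
# illustrates the objective. The event "signatures" (YES/NO presence of
# contaminant at each node) are toy data.

from itertools import combinations

# signatures[event][node] = True if the event would show contaminant at that node
signatures = {
    "E1": {"n1": True,  "n2": False, "n3": True},
    "E2": {"n1": True,  "n2": True,  "n3": False},
    "E3": {"n1": False, "n2": True,  "n3": True},
}
nodes = ["n1", "n2", "n3"]

def pairs_distinguished(node, events):
    """Event pairs whose YES/NO readings differ at this node."""
    return {(a, b) for a, b in combinations(events, 2)
            if signatures[a][node] != signatures[b][node]}

def greedy_samples(n_teams):
    chosen, covered = [], set()
    events = sorted(signatures)
    for _ in range(n_teams):
        best = max(nodes, key=lambda n: len(pairs_distinguished(n, events) - covered))
        chosen.append(best)
        covered |= pairs_distinguished(best, events)
    return chosen, covered

locations, covered_pairs = greedy_samples(n_teams=2)
print(locations, covered_pairs)
```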
Engineering with Computers
A general-purpose algorithm for mesh optimization via node-movement, known as the Target-Matrix Paradigm, is introduced. The algorithm is general purpose in that it can be applied to a wide variety of mesh and element types, and to various commonly recurring mesh optimization problems such as shape improvement, and to more unusual problems like boundary-layer preservation with sliver removal, high-order mesh improvement, and edge-length equalization. The algorithm can be considered to be a direct optimization method in which weights are automatically constructed to enable definitions of application-specific mesh quality. The high-level concepts of the paradigm have been implemented in the Mesquite mesh improvement library, along with a number of concrete algorithms that address mesh quality issues such as those shown in the examples of the present paper. © Springer-Verlag (outside the USA) 2011.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Proposed exascale systems will present a number of considerable resiliency challenges. In particular, DRAM soft errors, or bit-flips, are expected to greatly increase due to the increased memory density of these systems. Current hardware-based fault-tolerance methods will be unsuitable for addressing the expected soft-error rate. As a result, additional software will be needed to address this challenge. In this paper we introduce LIBSDC, a tunable, transparent silent data corruption detection and correction library for HPC applications. LIBSDC provides comprehensive SDC protection for program memory by implementing on-demand page integrity verification. Experimental benchmarks with Mantevo HPCCG show that once tuned, LIBSDC is able to achieve SDC protection with a 50% resource overhead, less than the 100% required for double modular redundancy. © 2012 Springer-Verlag Berlin Heidelberg.
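The following is a toy illustration of page-integrity verification via hashing, the general mechanism named above; it is not the LIBSDC API, and the page size and data layout are arbitrary choices.

```python
# Toy illustration of hash-based silent data corruption (SDC) detection on
# memory "pages", in the spirit of on-demand page integrity verification.
# This is not the LIBSDC API; the page size and data layout are arbitrary.

import hashlib

PAGE = 4096  # bytes per page

def page_hashes(buf):
    return [hashlib.sha256(buf[i:i + PAGE]).digest() for i in range(0, len(buf), PAGE)]

def verify(buf, reference):
    """Return indices of pages whose current hash differs from the reference."""
    return [i for i, h in enumerate(page_hashes(buf)) if h != reference[i]]

data = bytearray(8 * PAGE)
baseline = page_hashes(data)          # record hashes while the data is known good
data[5000] ^= 0x01                    # simulate a single bit flip in page 1
print("corrupted pages:", verify(data, baseline))
```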
Journal of Computational Physics
Density Functional Theory (DFT) based Equation of State (EOS) construction is a prominent part of Sandia's capabilities to support engineering sciences. This capability is based on amending experimental data with information gained from computational investigations, in parts of the phase space where experimental data is hard, dangerous, or expensive to obtain. A prominent materials area where such computational investigations are hard to perform today because of limited accuracy is actinide and lanthanide materials. The Science of Extreme Environment Lab Directed Research and Development project described in this report has aimed to cure this accuracy problem. We have focused on the two major factors which would allow for accurate computational investigations of actinide and lanthanide materials: (1) the fully relativistic treatment needed for materials containing heavy atoms, and (2) the improved performance needed from DFT exchange-correlation functionals. We have implemented a fully relativistic treatment based on the Dirac equation into the LANL code RSPt and we have shown that such a treatment is imperative when calculating properties of materials containing actinides and/or lanthanides. The present standard treatment, which includes only some of the relativistic terms, is not accurate enough and can even give misleading results. Compared to calculations previously considered state of the art, the Dirac treatment gives a substantial change in equilibrium volume predictions for materials with large spin-orbit coupling. For actinide and lanthanide materials, a Dirac treatment is thus a fundamental requirement in any computational investigation, including those for DFT-based EOS construction. For a full capability, a DFT functional capable of describing strongly correlated systems such as actinide materials needs to be developed. Using the previously successful subsystem functional scheme developed by Mattsson et al., we have created such a functional. In this functional, the Harmonic Oscillator Gas provides the necessary reference system for the strong correlation and localization occurring in actinides. Preliminary testing shows that the new Hao-Armiento-Mattsson (HAM) functional gives a trend towards improved results for the crystalline copper oxide test system we have chosen. This test system exhibits the same exchange-correlation physics as the actinide systems do, but without the relativistic effects, giving access to a pure testing ground for functionals. Important insights have been gained during this work. An example is that currently available functionals, contrary to common belief, make large errors in so-called hybridization regions where electrons from different ions interact and form new states. Together with the new understanding of functional issues, the Dirac implementation in the RSPt code will permit us to gain more fundamental understanding, both quantitatively and qualitatively, of materials of importance for Sandia and the rest of the Nuclear Weapons complex.
Proposed for publication in Applied Numerical Mathematics.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Proposed for publication in Physical Review B.
Abstract not provided.
Computer Aided Chemical Engineering
This manuscript presents a unified software framework for modeling and optimizing large-scale engineered systems with uncertainty. We propose a Python-based "block-oriented" modeling approach for representing the discrete components within the system. Through the use of a modeling components library, the block-oriented approach facilitates a clean separation of system superstructure from the details of individual components. This approach also lends itself naturally to expressing design and operational decisions as disjunctive expressions over the component blocks. We then apply a Python-based risk and uncertainty analysis library that leverages the explicit representation of the mathematical program in Python to automatically expand the deterministic system model into a multi-stage stochastic program, which can then be solved either directly or via decomposition-based solution strategies. This manuscript demonstrates the application of this modeling approach for risk-aware analysis of an electric distribution system. © 2012 Elsevier B.V.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Proposed for publication in IEEE Transactions on Visualization and Computer Graphics.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report gives an overview of the work done as part of an Early Career LDRD aimed at modeling flow-induced damage of materials involving chemical reactions, deformation of the porous matrix, and complex flow phenomena. The numerical formulation is motivated by a mixture theory (theory of interacting continua) approach to coupling the behavior of the fluid and the porous matrix. Results for the proposed method are presented for several engineering problems of interest including carbon dioxide sequestration, hydraulic fracturing, and energetic materials applications. This work is intended to create a general framework for flow-induced damage that can be further developed in each of the particular areas addressed below. The results show both convincing proof of the methodology's potential and the need for further validation of the models developed.
Abstract not provided.
Decision makers increasingly rely on large-scale computational models to simulate and analyze complex man-made systems. For example, computational models of national infrastructures are being used to inform government policy, assess economic and national security risks, evaluate infrastructure interdependencies, and plan for the growth and evolution of infrastructure capabilities. A major challenge for decision makers is the analysis of national-scale models that are composed of interacting systems: effective integration of system models is difficult, there are many parameters to analyze in these systems, and fundamental modeling uncertainties complicate analysis. This project is developing optimization methods to effectively represent and analyze large-scale heterogeneous system of systems (HSoS) models, which have emerged as a promising approach for describing such complex man-made systems. These optimization methods enable decision makers to predict future system behavior, manage system risk, assess tradeoffs between system criteria, and identify critical modeling uncertainties.
Software lifecycles are becoming an increasingly important issue for computational science and engineering (CSE) software. The process by which a piece of CSE software begins life as a set of research requirements and then matures into a trusted high-quality capability is both commonplace and extremely challenging. Although an implicit lifecycle is obviously being used in any effort, the challenges of this process - respecting the competing needs of research vs. production - cannot be overstated. Here we describe a proposal for a well-defined software lifecycle process based on modern Lean/Agile software engineering principles. What we propose is appropriate for many CSE software projects that are initially heavily focused on research but also are expected to eventually produce usable high-quality capabilities. The model is related to TriBITS, a build, integration and testing system, which serves as a strong foundation for this lifecycle model, and aspects of this lifecycle model are ingrained in the TriBITS system. Here, we advocate three to four phases or maturity levels that address the appropriate handling of many issues associated with the transition from research to production software. The goals of this lifecycle model are to better communicate maturity levels with customers and to help to identify and promote Software Engineering (SE) practices that will help to improve productivity and produce better software. An important collection of software in this domain is Trilinos, which is used as the motivation and the initial target for this lifecycle model. However, many other related and similar CSE (and non-CSE) software projects can also make good use of this lifecycle model, especially those that use the TriBITS system. Indeed this lifecycle process, if followed, will enable large-scale sustainable integration of many complex CSE software efforts across several institutions.
Computer Aided Chemical Engineering
We present a methodology for optimally locating disinfectant booster stations for response to contamination events in water distribution systems. A stochastic programming problem considering uncertainty in both the location and time of the contamination event is formulated resulting in an extensive form that is equivalent to the weighted maximum coverage problem. Although the original full-space problem is intractably large, we show a series of reductions that reduce the size of the problem by five orders of magnitude and allow solutions of the optimal placement problem for realistically sized water network models. © 2012 Elsevier B.V.
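Since the extensive form above is equivalent to weighted maximum coverage, the short sketch below illustrates that coverage structure with a greedy heuristic on made-up scenario data; the actual work solves the placement problem as a stochastic mixed-integer program rather than greedily.

```python
# Weighted maximum coverage sketch for booster station placement: each candidate
# station "covers" a set of contamination scenarios with some weight (e.g.,
# population protected). Scenario sets and weights are toy data; the paper
# solves the equivalent problem as a stochastic mixed-integer program.

candidates = {
    "B1": {"s1", "s2"},
    "B2": {"s2", "s3", "s4"},
    "B3": {"s4", "s5"},
}
weight = {"s1": 10, "s2": 4, "s3": 7, "s4": 2, "s5": 9}

def greedy_cover(budget):
    chosen, covered = [], set()
    for _ in range(budget):
        gain = lambda b: sum(weight[s] for s in candidates[b] - covered)
        best = max(candidates, key=gain)
        if gain(best) == 0:
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen, sum(weight[s] for s in covered)

print(greedy_cover(budget=2))
```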
Central European Journal of Mathematics
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Scientific Programming
PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of the underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.
Scientific Programming
An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering. © 2012 - IOS Press and the authors. All rights reserved.
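The approach above relies on C++ templates and operator overloading to propagate additional quantities (such as derivatives) through an unchanged calculation. The Python snippet below is only an analogous illustration of that operator-overloading idea using a simple dual-number type; it is not the Trilinos implementation.

```python
# Analogue of operator-overloading-based embedded analysis: a dual-number type
# propagates first derivatives through an unchanged residual calculation. The
# Trilinos packages do this in C++ via templates; this Python version only
# illustrates the idea.

class Dual:
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__

def residual(x):
    # The "simulation code" is written once, generically in its argument type.
    return 3.0 * x * x + 2.0 * x + 1.0

x = Dual(2.0, 1.0)            # seed dx/dx = 1
r = residual(x)
print(r.value, r.deriv)       # value 17.0, derivative 6*x + 2 = 14.0
```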
ASME International Mechanical Engineering Congress and Exposition, Proceedings (IMECE)
Modal-based methods for structural health monitoring require the identification of characteristic frequencies associated with a structure's primary modes of failure. A major difficulty is the extraction of damage-related frequency shifts from the large set of often benign frequency shifts observed experimentally. In this study, we apply peridynamics in combination with modal analysis for the prediction of characteristic frequency shifts throughout the damage evolution process. Peridynamics, a nonlocal extension of continuum mechanics, is unique in its ability to capture progressive material damage. The application of modal analysis to peridynamic models enables the tracking of structural modes and characteristic frequencies over the course of a simulation. Shifts in characteristic frequencies resulting from evolving structural damage can then be isolated and utilized in the analysis of frequency responses observed experimentally. We present a methodology for quasi-static peridynamic analyses, including the solution of the eigenvalue problem for identification of structural modes. Repeated solution of the eigenvalue problem over the course of a transient simulation yields a data set from which critical shifts in modal frequencies can be isolated. The application of peridynamics to modal analysis is demonstrated on the benchmark problem of a simply-supported beam. The computed natural frequencies of an undamaged beam are found to agree well with the classical local solution. Analyses in the presence of cracks of various lengths are shown to reveal frequency shifts associated with structural damage. Copyright © 2012 by ASME.
1st IEEE Symposium on Large-Scale Data Analysis and Visualization 2011, LDAV 2011 - Proceedings
Experts agree that the exascale machine will comprise processors that contain many cores, which in turn will necessitate a much higher degree of concurrency. Software will require a minimum of 1,000 times more concurrency. Most parallel analysis and visualization algorithms today work by partitioning data and running mostly serial algorithms concurrently on each data partition. Although this approach lends itself well to the concurrency of current high-performance computing, it does not exhibit the appropriate pervasive parallelism required for exascale computing. The data partitions are too small and the overhead of the threads is too large to make effective use of all the cores in an extreme-scale machine. This paper introduces a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme-scale machines. We demonstrate the use of this system on a GPU processor, which we feel is the best analog to an exascale node that we have available today. © 2011 IEEE.
Materials Research Society Symposium Proceedings
The structures, energies, and energy levels of a comprehensive set of simple intrinsic point defects in aluminum arsenide are predicted using density functional theory (DFT). The calculations incorporate explicit and rigorous treatment of charged supercell boundary conditions. The predicted defect energy levels, computed as total energy differences, do not suffer from the DFT band gap problem, spanning the experimental gap despite the Kohn-Sham eigenvalue gap being much smaller than experiment. Defects in AlAs exhibit a surprising complexity - with a greater range of charge states, bistabilities, and multiple negative-U systems - that would be impossible to resolve with experiment alone. The simulation results can be used to populate defect physics models in III-V semiconductor device simulations with reliable quantities in those cases where experimental data is lacking, as in AlAs. © 2011 Materials Research Society.
Collection of Technical Papers - AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference
This paper explores various frameworks to quantify and propagate sources of epistemic and aleatoric uncertainty within the context of decision making for assessing system performance relative to design margins of a complex mechanical system. If sufficient data is available for characterizing aleatoric-type uncertainties, probabilistic methods are commonly used for computing response distribution statistics based on input probability distribution specifications. Conversely, for epistemic uncertainties, data is generally too sparse to support objective probabilistic input descriptions, leading to either subjective probabilistic descriptions (e.g., assumed priors in Bayesian analysis) or non-probabilistic methods based on interval specifications. Among the techniques examined in this work are (1) Interval analysis, (2) Dempster-Shafer Theory of Evidence, (3) a second-order probability (SOP) analysis in which the aleatory and epistemic variables are treated separately, and a nested iteration is performed, typically sampling epistemic variables on the outer loop, then sampling over aleatory variables on the inner loop and (4) a Bayesian approach where plausible prior distributions describing the epistemic variable are created and updated using available experimental data. This paper compares the results and the information provided by different methods to enable decision making in the context of performance assessment when epistemic uncertainty is considered.
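The sketch below shows the nested sampling loop of item (3), second-order probability: epistemic variables sampled on an outer loop, aleatory variables on an inner loop. The distributions, intervals, and response function are hypothetical placeholders, not those of the study.

```python
# Second-order probability sketch: sample epistemic variables on an outer loop
# (here, an interval-valued parameter), aleatory variables on an inner loop, and
# collect a family of response statistics. The response function and the
# distributions are hypothetical placeholders.

import random

def response(a, e):
    return a * a + e          # stand-in for an expensive simulation

def sop(n_outer=20, n_inner=500, seed=0):
    rng = random.Random(seed)
    outer_stats = []
    for _ in range(n_outer):                      # epistemic (outer) loop
        e = rng.uniform(0.5, 1.5)                 # interval-specified epistemic input
        samples = [response(rng.gauss(0.0, 1.0), e) for _ in range(n_inner)]  # aleatory loop
        mean = sum(samples) / n_inner
        p95 = sorted(samples)[int(0.95 * n_inner)]
        outer_stats.append((e, mean, p95))
    # The spread of these statistics over the outer loop reflects epistemic uncertainty.
    return outer_stats

for e, mean, p95 in sop()[:3]:
    print(f"e={e:.2f}  mean={mean:.2f}  95th percentile={p95:.2f}")
```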
Collection of Technical Papers - AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference
This paper discusses the handling and treatment of uncertainties corresponding to relatively few data samples in experimental characterization of random quantities. The importance of this topic extends beyond experimental uncertainty to situations where the derived experimental information is used for model validation or calibration. With very sparse data it is not practical to have a goal of accurately estimating the underlying variability distribution (probability density function, PDF). Rather, a pragmatic goal is that the uncertainty representation should be conservative so as to bound a desired percentage of the actual PDF, say 95% included probability, with reasonable reliability. A second, opposing objective is that the representation not be overly conservative; that it minimally over-estimate the random-variable range corresponding to the desired percentage of the actual PDF. The performance of a variety of uncertainty representation techniques is tested and characterized in this paper according to these two opposing objectives. An initial set of test problems and results is presented here from a larger study currently underway.
SC'11 - Proceedings of the 2011 High Performance Computing Networking, Storage and Analysis Companion, Co-located with SC'11
With the increasing levels of parallelism in a compute node, it is important to exploit multiple levels of parallelism even within a single node. We present ShyLU (pronounced "Shy-loo", for Scalable Hybrid LU), a "hybrid-hybrid" solver for general sparse linear systems that is hybrid in two ways: first, it combines direct and iterative methods, with the iterative method based on approximate Schur complements; second, the solver uses two levels of parallelism via hybrid programming (MPI+threads). Our solver is useful both in shared-memory environments and on large parallel computers with distributed memory (as a subdomain solver). We compare the robustness of ShyLU against other algebraic preconditioners. ShyLU scales well up to 192 cores for a given problem size. We compare the flat MPI performance of ShyLU against a hybrid implementation and conclude that on present multicore nodes flat MPI is better. However, for future manycore machines (48 or more cores), hybrid/hierarchical algorithms and implementations are important for sustained performance. Copyright is held by the author/owner(s).
ACM International Conference Proceeding Series
Approximate counting [18] is useful for data stream and database summarization. It can help in many settings that allow only one pass over the data, want low memory usage, and can accept some relative error. Approximate counters use fewer bits; we focus on 8-bit counters, but our results are general. These small counters represent a sparse sequence of larger numbers. Counters are incremented probabilistically based on the spacing between the numbers they represent. Our contributions are a customized distribution of counter values and efficient strategies for deciding when to increment them. At run-time, users may independently select the spacing (accuracy) of the approximate counter for small, medium, and large values. We allow the user to select the maximum number to count up to, and our algorithm will select the exponential base of the spacing. These choices provide additional flexibility over both classic and Csurös's [4] floating-point approximate counting. They also provide additional structure, a useful schema for users, over Kruskal and Greenberg [13]. We describe two new and efficient strategies for incrementing approximate counters: use a deterministic countdown or sample from a geometric distribution. In Csurös's scheme, all increments are powers of two, so random bits rather than full random numbers can be used. We also provide the option to use powers of two but retain flexibility. We show when each strategy is fastest in our implementation. © 2011 ACM.
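A short Morris-style approximate counter makes the probabilistic-increment idea concrete. The 8-bit register and adjustable base follow the abstract, but the code is a generic illustration only; it does not reproduce the authors' customized value distribution or increment strategies.

```python
# Morris-style approximate counter: an 8-bit register represents a sparse,
# exponentially spaced sequence of counts and is incremented probabilistically
# based on the spacing. Generic illustration only.

import random

class ApproxCounter:
    def __init__(self, base=1.1, seed=None):
        self.base = base          # spacing (accuracy) between represented values
        self.register = 0         # fits in 8 bits for modest bases and ranges
        self.rng = random.Random(seed)

    def represented(self, r=None):
        r = self.register if r is None else r
        return (self.base ** r - 1.0) / (self.base - 1.0)

    def increment(self):
        # Bump the register with probability 1 / (gap to the next value), so the
        # expected represented value grows by one per call.
        gap = self.represented(self.register + 1) - self.represented(self.register)
        if self.rng.random() < 1.0 / gap:
            self.register = min(self.register + 1, 255)

c = ApproxCounter(seed=42)
for _ in range(100_000):
    c.increment()
print(c.register, round(c.represented()))   # estimate should be near 100,000
```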
20th AIAA Computational Fluid Dynamics Conference 2011
We analyze the artificial dissipation introduced by a streamline-upwind Petrov-Galerkin finite element method and consider its effect on the conservation of total enthalpy for the Euler and laminar Navier-Stokes equations. We also consider the chemically reacting case. We demonstrate that in general, total enthalpy is not conserved for the important special case of the steady-state Euler equations. A modification to the artificial dissipation is proposed and shown to significantly improve the conservation of total enthalpy.
11AIChE - 2011 AIChE Annual Meeting, Conference Proceedings
This presentation will discuss progress towards developing a large-scale parallel CFD capability using stabilized finite element formulations to simulate turbulent reacting flow and heat transfer in light water nuclear reactors (LWRs). Numerical simulation plays a critical role in the design, certification, and operation of LWRs. The Consortium for Advanced Simulation of Light Water Reactors is a U.S. Department of Energy Innovation Hub that is developing a virtual reactor toolkit that will incorporate science-based models, state-of-the-art numerical methods, modern computational science and engineering practices, and uncertainty quantification (UQ) and validation against operating pressurized water reactors. It will couple state-of-the-art fuel performance, neutronics, thermal-hydraulics (T-H), and structural models with existing tools for systems and safety analysis and will be designed for implementation on both today's leadership-class computers and next-generation advanced architecture platforms. We will first describe the finite element discretization utilizing PSPG, SUPG, and discontinuity capturing stabilization. We will then discuss our initial turbulence modeling formulations (LES and URANS) and the scalable fully implicit, fully coupled solution methods that are used to solve the challenging systems. These include globalized Newton-Krylov methods for solving the nonlinear systems of equations and preconditioned Krylov techniques. The preconditioners are based on fully-coupled algebraic multigrid and approximate block factorization preconditioners. We will discuss how these methods provide a powerful integration path for multiscale coupling to the neutronics and structures applications. Initial results on scalability will be presented. Finally, we will comment on our use of embedded technology and how this capability impacts the application of implicit methods, sensitivity analysis and UQ.
AAAI Fall Symposium - Technical Report
Inference techniques play a central role in many cognitive systems. They transform low-level observations of the environment into high-level, actionable knowledge which then gets used by mechanisms that drive action, problem-solving, and learning. This paper presents an initial effort at combining results from AI and psychology into a pragmatic and scalable computational reasoning system. Our approach combines a numeric notion of plausibility with first-order logic to produce an incremental inference engine that is guided by heuristics derived from the psychological literature. We illustrate core ideas with detailed examples and discuss the advantages of the approach with respect to cognitive systems.
PDSW'11 - Proceedings of the 6th Parallel Data Storage Workshop, Co-located with SC'11
The increasing fidelity of scientific simulations as they scale towards exascale sizes is straining the proven IO techniques championed throughout terascale computing. Chief among the successful IO techniques is the idea of collective IO, where processes coordinate and exchange data prior to writing to storage in an effort to reduce the number of small, independent IO operations. Although collective IO works well for efficiently creating a data set in canonical order, 3-D domain decompositions prove troublesome due to the amount of data exchanged prior to writing to storage. When each process has a tiny piece of a 3-D simulation space rather than a complete 'pencil' or 'plane' (2-D or 1-D domain decompositions, respectively), the communication overhead to rearrange the data can dwarf the time spent actually writing to storage [27]. Our approach seeks to transparently increase scalability and performance while maintaining both the IO routines in the application and the final data format in the storage system. Accomplishing this leverages both the Nessie [23] RPC framework and a staging area with staging services. Through these tools, we employ a variety of data processing operations prior to invoking the native API to write data to storage, yielding as much as a 3X performance improvement over the native calls. © 2011 ACM.
Abstract not provided.
SIGMETRICS Performance Evaluation Review
Abstract not provided.
Abstract not provided.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a theoretical manual for selected algorithms implemented within the DAKOTA software. It is not intended as a comprehensive theoretical treatment, since a number of existing texts cover general optimization theory, statistical analysis, and other introductory topics. Rather, this manual is intended to summarize a set of DAKOTA-related research publications in the areas of surrogate-based optimization, uncertainty quantification, and optimization under uncertainty that provide the foundation for many of DAKOTA's iterative analysis capabilities.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
International Journal for Numerical Methods in Engineering
Abstract not provided.
Abstract not provided.
Physical Review B
Abstract not provided.
Proposed for publication in Computers & Fluids.
Abstract not provided.
Abstract not provided.
The objective of the U.S. Department of Energy Office of Nuclear Energy Advanced Modeling and Simulation Waste Integrated Performance and Safety Codes (NEAMS Waste IPSC) is to provide an integrated suite of computational modeling and simulation (M&S) capabilities to quantitatively assess the long-term performance of waste forms in the engineered and geologic environments of a radioactive-waste storage facility or disposal repository. Achieving the objective of modeling the performance of a disposal scenario requires describing processes involved in waste form degradation and radionuclide release at the subcontinuum scale, beginning with mechanistic descriptions of chemical reactions and chemical kinetics at the atomic scale, and upscaling into effective, validated constitutive models for input to high-fidelity continuum scale codes for coupled multiphysics simulations of release and transport. Verification and validation (V&V) is required throughout the system to establish evidence-based metrics for the level of confidence in M&S codes and capabilities, including at the subcontinuum scale and the constitutive models they inform or generate. This report outlines the nature of the V&V challenge at the subcontinuum scale, an approach to incorporate V&V concepts into subcontinuum scale M&S, and a plan to incrementally incorporate effective V&V into subcontinuum scale M&S destined for use in the NEAMS Waste IPSC workflow to meet requirements of quantitative confidence in the constitutive models informed by subcontinuum scale phenomena.
International Journal of Computational Science and Engineering (IJCSE)
Abstract not provided.
Abstract not provided.
Abstract not provided.
This is a companion publication to the paper 'A Matrix-Free Trust-Region SQP Algorithm for Equality Constrained Optimization' [11]. In [11], we develop and analyze a trust-region sequential quadratic programming (SQP) method that supports the matrix-free (iterative, inexact) solution of linear systems. In this report, we document the numerical behavior of the algorithm applied to a variety of equality constrained optimization problems, with constraints given by partial differential equations (PDEs).
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Nature
Abstract not provided.
Abstract not provided.
Parallel Processing Letters
Proceedings - Symposium on the High Performance Interconnects, Hot Interconnects
Low latency collective communications are key to application scalability. As systems grow larger, minimizing collective communication time becomes increasingly challenging. Offload is an effective technique for accelerating collective operations; however, algorithms for collective communication constantly evolve, so flexible implementations are critical. This paper presents triggered operations, a semantic building block that allows the key components of collective communications to be offloaded while allowing the host-side software to define the algorithm. Simulations are used to demonstrate the performance improvements achievable through the offload of MPI-Allreduce using these building blocks. © 2011 IEEE.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Peridynamics is a nonlocal extension of classical continuum mechanics. The discrete peridynamic model has the same computational structure as a molecular dynamics model. This document provides a brief overview of the peridynamic model of a continuum, then discusses how the peridynamic model is discretized within LAMMPS. An example problem is also included.
LIME is a small software package for creating multiphysics simulation codes. The name was formed as an acronym denoting 'Lightweight Integrating Multiphysics Environment for coupling codes.' LIME is intended to be especially useful when separate computer codes (which may be written in any standard computer language) already exist to solve different parts of a multiphysics problem. LIME provides the key high-level software (written in C++), a well defined approach (with example templates), and interface requirements to enable the assembly of multiple physics codes into a single coupled-multiphysics simulation code. In this report we introduce important software design characteristics of LIME, describe key components of a typical multiphysics application that might be created using LIME, and provide basic examples of its use - including the customized software that must be written by a user. We also describe the types of modifications that may be needed to individual physics codes in order for them to be incorporated into a LIME-based multiphysics application.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Scientific Programming
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Discoveries driven by scientific computing frequently come from workflows that use persistent storage as a staging area for data between operations. With bandwidth falling progressively further behind data sizes as we continue towards exascale, eliminating persistent storage through techniques like data staging will not only enable these workflows to continue online, but also enable more interactive workflows, reducing the time to scientific discovery. Data staging has been shown to be an effective way for applications running on high-end computing platforms to offload expensive I/O operations and to manage the tremendous amounts of data they produce. This data staging approach, however, lacks the ACID-style guarantees that traditional straight-to-disk methods provide. Distributed transactions are a proven way to add ACID properties to data movements; however, distributed transactions follow 1xN data movement semantics, whereas our highly parallel HPC environments employ MxN data movement semantics. In this paper we present a novel protocol that extends distributed transaction terminology to include MxN semantics, which allows our data staging areas to benefit from ACID properties. We show that with our protocol we can provide resilient data staging with a limited performance penalty over current data staging implementations.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This document reports on the research of Kenneth Letendre, the recipient of a Sandia Graduate Research Fellowship at the University of New Mexico. Warfare is an extreme form of intergroup competition in which individuals make extreme sacrifices for the benefit of their nation or other group to which they belong. Among animals, limited, non-lethal competition is the norm. It is not fully understood what factors lead to warfare. We studied the global variation in the frequency of civil conflict among countries of the world, and its positive association with variation in the intensity of infectious disease. We demonstrated that the burden of human infectious disease importantly predicts the frequency of civil conflict and tested a causal model for this association based on the parasite-stress theory of sociality. We also investigated the organization of social foraging by colonies of harvester ants in the genus Pogonomyrmex, using both field studies and computer models.
Proceedings of SPIE - The International Society for Optical Engineering
The electronic structure and optical properties of Ge-core/Si-shell nanocrystal or quantum dot (QD) are investigated using the atomistic tight binding method as implemented in NEMO3D. The thermionic lifetime that governs the hole leakage mechanism in the Ge/Si QD based laser, as a function of the Ge core size and strain, is also calculated by capturing the bound and extended eigenstates, well below the band edges. We also analyzed the effect of core size and strain on optical properties such as transition energies and transition rates between electron and hole states. Finally, a quantitative and qualitative analysis of the leakage current due to the hole leakage through the Ge-core/Si-shell QD laser, at different temperatures and Ge core sizes, is presented. © 2011 SPIE.
We investigate Bayesian techniques that can be used to reconstruct field variables from partial observations. In particular, we target fields that exhibit spatial structures with a large spectrum of lengthscales. Contemporary methods typically describe the field on a grid and estimate structures which can be resolved by it. In contrast, we address the reconstruction of grid-resolved structures as well as estimation of statistical summaries of subgrid structures, which are smaller than the grid resolution. We perform this in two different ways (a) via a physical (phenomenological), parameterized subgrid model that summarizes the impact of the unresolved scales at the coarse level and (b) via multiscale finite elements, where specially designed prolongation and restriction operators establish the interscale link between the same problem defined on a coarse and fine mesh. The estimation problem is posed as a Bayesian inverse problem. Dimensionality reduction is performed by projecting the field to be inferred on a suitable orthogonal basis set, viz. the Karhunen-Loeve expansion of a multiGaussian. We first demonstrate our techniques on the reconstruction of a binary medium consisting of a matrix with embedded inclusions, which are too small to be grid-resolved. The reconstruction is performed using an adaptive Markov chain Monte Carlo method. We find that the posterior distributions of the inferred parameters are approximately Gaussian. We exploit this finding to reconstruct a permeability field with long, but narrow embedded fractures (which are too fine to be grid-resolved) using scalable ensemble Kalman filters; this also allows us to address larger grids. Ensemble Kalman filtering is then used to estimate the values of hydraulic conductivity and specific yield in a model of the High Plains Aquifer in Kansas. Strong conditioning of the spatial structure of the parameters and the non-linear aspects of the water table aquifer create difficulty for the ensemble Kalman filter. We conclude with a demonstration of the use of multiscale stochastic finite elements to reconstruct permeability fields. This method, though computationally intensive, is general and can be used for multiscale inference in cases where a subgrid model cannot be constructed.
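For the dimensionality-reduction step mentioned above, the sketch below builds a Karhunen-Loeve (KL) expansion of a Gaussian field from the eigendecomposition of an assumed squared-exponential covariance on a 1-D grid; the covariance model, correlation length, and truncation level are illustrative choices, not those of the study.

```python
# Karhunen-Loeve (KL) expansion sketch on a 1-D grid: eigendecompose an assumed
# squared-exponential covariance and represent the Gaussian field with a small
# number of modes. Covariance model, correlation length, and truncation are
# illustrative only.

import numpy as np

n, corr_len, n_modes = 200, 0.2, 10
x = np.linspace(0.0, 1.0, n)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / corr_len ** 2)  # covariance matrix

eigval, eigvec = np.linalg.eigh(C)          # eigenvalues in ascending order
idx = np.argsort(eigval)[::-1][:n_modes]    # keep the largest n_modes
lam, phi = eigval[idx], eigvec[:, idx]

rng = np.random.default_rng(0)
xi = rng.standard_normal(n_modes)           # KL coefficients: the parameters to infer
field = phi @ (np.sqrt(lam) * xi)           # truncated KL realization of the field

print("captured variance fraction:", lam.sum() / eigval.sum())
```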
Molecular dynamics simulation (MD) is an invaluable tool for studying problems sensitive to atomic-scale physics such as structural transitions, discontinuous interfaces, non-equilibrium dynamics, and elastic-plastic deformation. In order to apply this method to modeling of ramp-compression experiments, several challenges must be overcome: accuracy of interatomic potentials, length- and time-scales, and extraction of continuum quantities. We have completed a 3-year LDRD project with the goal of developing molecular dynamics simulation capabilities for modeling the response of materials to ramp compression. The techniques we have developed fall into three categories: (i) molecular dynamics methods, (ii) interatomic potentials, and (iii) calculation of continuum variables. Highlights include the development of an accurate interatomic potential describing shock melting of beryllium, a scaling technique for modeling slow ramp-compression experiments using fast ramp MD simulations, and a technique for extracting plastic strain from MD simulations. All of these methods have been implemented in Sandia's LAMMPS MD code, ensuring their widespread availability to dynamic materials research at Sandia and elsewhere.
Abstract not provided.
Inverse Problems
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report describes the laboratory directed research and development work to model relevant areas of the brain that associate multi-modal information for long-term storage for the purpose of creating a more effective, and more automated, association mechanism to support rapid decision making. Using the biology and functionality of the hippocampus as an analogy or inspiration, we have developed an artificial neural network architecture to associate k-tuples (paired associates) of multimodal input records. The architecture is composed of coupled unimodal self-organizing neural modules that learn generalizations of unimodal components of the input record. Cross modal associations, stored as a higher-order tensor, are learned incrementally as these generalizations form. Graph algorithms are then applied to the tensor to extract multi-modal association networks formed during learning. Doing so yields a novel approach to data mining for knowledge discovery. This report describes the neurobiological inspiration, architecture, and operational characteristics of our model, and also provides a real world terrorist network example to illustrate the model's functionality.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
In the ACS Data Analytics Project (also known as 'YumYum'), a supercomputer is modeled as a graph of components and dependencies, jobs and faults are simulated, and component fault rates are estimated using the graph structure and job pass/fail outcomes. This report documents the successful completion of all SNL deliverables and tasks, describes the software written by SNL for the project, and presents the data it generates. Readers should understand what the software tools are, how they fit together, and how to use them to reproduce the presented data and additional experiments as desired. The SNL YumYum tools provide the novel simulation and inference capabilities desired by ACS. SNL also developed and implemented a new algorithm, which provides faster estimates, at finer component granularity, on arbitrary directed acyclic graphs.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This paper proposes a methodology for providing robustness and resilience for a highly threaded distributed- and shared-memory environment based on well-defined inputs and outputs to lightweight tasks. These inputs and outputs form a failure 'barrier', allowing tasks to be restarted or duplicated as necessary. These barriers must be expanded based on task behavior, such as communication between tasks, but do not prohibit any given behavior. One trend in high-performance computing codes is a move toward self-contained functions that mimic functional programming. Software designers are moving toward a design model in which core functions are specified in side-effect-free or low-side-effect ways, with well-defined inputs and outputs. This provides the ability to copy the inputs to wherever they need to be - whether that's the other side of the PCI bus or the other side of the network - do work on that input using local memory, and then copy the outputs back (as needed). This design pattern is popular among new distributed threading environment designs. Such designs include the Barcelona STARS system, distributed OpenMP systems, the Habanero-C and Habanero-Java systems from Vivek Sarkar at Rice University, the HPX/ParalleX model from LSU, as well as our own Scalable Parallel Runtime effort (SPR) and the Trilinos stateless kernels. This design pattern is also shared by CUDA and several OpenMP extensions for GPU-type accelerators (e.g. the PGI OpenMP extensions).
Abstract not provided.
Abstract not provided.
Abstract not provided.
Scientific Programming
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Scientific Programming
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report summarizes research performed for the Nuclear Energy Advanced Modeling and Simulation (NEAMS) Subcontinuum and Upscaling Task. The work conducted focused on developing a roadmap to include molecular scale, mechanistic information in continuum-scale models of nuclear waste glass dissolution. This information is derived from molecular-scale modeling efforts that are validated through comparison with experimental data. In addition to developing a master plan to incorporate a subcontinuum mechanistic understanding of glass dissolution into continuum models, methods were developed to generate constitutive dissolution rate expressions from quantum calculations, force field models were selected to generate multicomponent glass structures and gel layers, classical molecular modeling was used to study diffusion through nanopores analogous to those in the interfacial gel layer, and a micro-continuum model (KμC) was developed to study coupled diffusion and reaction at the glass-gel-solution interface.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report summarizes research conducted through the Sandia National Laboratories Robust Automated Knowledge Capture Laboratory Directed Research and Development project. The objective of this project was to advance scientific understanding of the influence of individual cognitive attributes on decision making. The project has developed a quantitative model known as RumRunner that has proven effective in predicting the propensity of an individual to shift strategies on the basis of task and experience related parameters. Three separate studies are described which have validated the basic RumRunner model. This work provides a basis for better understanding human decision making in high-consequence national security applications, and in particular, the individual characteristics that underlie adaptive thinking.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Historically, MPI implementations have had to choose between eager messaging protocols that require buffering and rendezvous protocols that sacrifice overlap and strong independent progress in some scenarios. The typical choice is to use an eager protocol for short messages and switch to a rendezvous protocol for long messages. If overlap and progress are desired, some implementations offer the option of using a thread. We propose an approach that leverages triggered operations to implement a long message rendezvous protocol that provides strong progress guarantees. The results indicate that a triggered operation based rendezvous can achieve better overlap than a traditional rendezvous implementation and less wasted bandwidth than an eager long protocol. © 2011 Springer-Verlag Berlin Heidelberg.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Concern is beginning to grow in the high-performance computing (HPC) community regarding the reliability guarantees of future large-scale systems. Disk-based coordinated checkpoint/restart has been the dominant fault tolerance mechanism in HPC systems for the last 30 years. Checkpoint performance is so fundamental to scalability that nearly all capability applications have custom checkpoint strategies to minimize state and reduce checkpoint time. One well-known optimization to traditional checkpoint/restart is incremental checkpointing, which has a number of known limitations. To address these limitations, we introduce libhashckpt, a hybrid incremental checkpointing solution that uses both page protection and hashing on GPUs to determine changes in application data with very low overhead. Using real capability workloads, we show the merit of this technique for a certain class of HPC applications. © 2011 Springer-Verlag Berlin Heidelberg.
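The sketch below shows the hash-based change detection that underlies incremental checkpointing: hash fixed-size blocks and write only the blocks whose hashes changed since the previous checkpoint. It is a paradigm illustration only; libhashckpt additionally combines page protection with GPU hashing, neither of which is reproduced here.

```python
# Hash-based incremental checkpointing sketch: only blocks whose hashes changed
# since the previous checkpoint are written. libhashckpt additionally uses page
# protection and performs the hashing on GPUs; neither is reproduced here.

import hashlib

BLOCK = 4096

def block_hashes(buf):
    return [hashlib.sha256(buf[i:i + BLOCK]).digest() for i in range(0, len(buf), BLOCK)]

def incremental_checkpoint(buf, prev_hashes):
    """Return (dirty_blocks, new_hashes); dirty_blocks maps block index -> bytes."""
    new_hashes = block_hashes(buf)
    dirty = {i: bytes(buf[i * BLOCK:(i + 1) * BLOCK])
             for i, h in enumerate(new_hashes)
             if prev_hashes is None or h != prev_hashes[i]}
    return dirty, new_hashes

state = bytearray(16 * BLOCK)
dirty, hashes = incremental_checkpoint(state, None)      # first checkpoint: all blocks
state[3 * BLOCK + 7] = 0xFF                               # application modifies one block
dirty, hashes = incremental_checkpoint(state, hashes)     # second checkpoint: one block
print("blocks written:", sorted(dirty))
```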
Physical Review E - Statistical, Nonlinear, and Soft Matter Physics
The purpose of this paper is to derive the energy and momentum conservation laws of the peridynamic nonlocal continuum theory using the principles of classical statistical mechanics. The peridynamic laws allow the consideration of discontinuous motion, or deformation, by relying on integral operators. These operators sum forces and power expenditures separated by a finite distance and so represent nonlocal interaction. The integral operators replace the differential divergence operators conventionally used, thereby obviating special treatment at points of discontinuity. The derivation presented employs a general multibody interatomic potential, avoiding the standard assumption of a pairwise decomposition. The integral operators are also expressed in terms of a stress tensor and heat flux vector under the assumption that these fields are differentiable, demonstrating that the classical continuum energy and momentum conservation laws are consequences of the more general peridynamic laws. An important conclusion is that nonlocal interaction is intrinsic to continuum conservation laws when derived using the principles of statistical mechanics. © 2011 American Physical Society.
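For reference, the familiar bond-based special case of the peridynamic linear momentum balance is written below, with the integral operator over the horizon replacing the divergence of the stress tensor; the paper itself derives the general laws from a multibody interatomic potential rather than this pairwise form.

```latex
% Bond-based special case of the peridynamic linear momentum balance:
% the integral over the horizon H_x replaces the divergence of the stress tensor.
\rho(\mathbf{x})\,\ddot{\mathbf{u}}(\mathbf{x},t)
  = \int_{H_{\mathbf{x}}}
      \mathbf{f}\big(\mathbf{u}(\mathbf{x}',t)-\mathbf{u}(\mathbf{x},t),\,
                     \mathbf{x}'-\mathbf{x}\big)\, dV_{\mathbf{x}'}
    + \mathbf{b}(\mathbf{x},t)
```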
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
In this paper, we present scheduling algorithms that simultaneously support guaranteed starting times and favor jobs with system-desired traits. To achieve the first of these goals, our algorithms keep a profile with potential starting times for every unfinished job and never move these starting times later, just as in Conservative Backfilling. To achieve the second, they exploit previously unrecognized flexibility in the handling of holes opened in this profile when jobs finish early. We find that, with one choice of job selection function, our algorithms can consistently yield a lower average waiting time than Conservative Backfilling while still providing a guaranteed start time to each job as it arrives. In fact, in most cases, the algorithms give a lower average waiting time than the more aggressive EASY backfilling algorithm, which does not provide guaranteed start times. Alternately, with a different choice of job selection function, our algorithms can focus the benefit on the widest submitted jobs, the reason for the existence of parallel systems. In this case, these jobs experience significantly lower waiting time than Conservative Backfilling with minimal impact on other jobs. © 2011 Springer-Verlag.
Parallel Computing
We describe a parallel library written with message-passing (MPI) calls that allows algorithms to be expressed in the MapReduce paradigm. This means the calling program does not need to include explicit parallel code, but instead provides "map" and "reduce" functions that operate independently on elements of a data set distributed across processors. The library performs the needed data movement between processors. We describe how typical MapReduce functionality can be implemented in an MPI context, and also in an out-of-core manner for data sets that do not fit within the aggregate memory of a parallel machine. Our motivation for creating this library was to enable graph algorithms to be written as MapReduce operations, allowing processing of terabyte-scale data sets on traditional MPI-based clusters. We outline MapReduce versions of several such algorithms: vertex ranking via PageRank, triangle finding, connected-component identification, Luby's algorithm for maximal independent sets, and single-source shortest-path calculation. To test the algorithms on arbitrarily large artificial graphs, we generate randomized R-MAT matrices in parallel; a MapReduce version of this operation is also described. Performance and scalability results for the various algorithms are presented for graphs of varying size on a distributed-memory cluster. For some cases, we compare the results with non-MapReduce algorithms, different machines, and different MapReduce software, namely Hadoop. Our open-source library is written in C++, is callable from C++, C, Fortran, or scripting languages such as Python, and can run on any parallel platform that supports MPI. © 2011 Elsevier B.V. All rights reserved.
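As a concrete illustration of how a graph computation decomposes into this paradigm, the toy sketch below expresses one PageRank iteration as map, collate (group by key), and reduce steps. It is single-process Python with generic names and keeps the adjacency list in memory, so it mirrors only the structure of the operations, not the library's C++/MPI interface.

# Toy illustration of one PageRank iteration in map/collate/reduce form
# (single-process; generic names, not the MPI library's API).
from collections import defaultdict

def map_contributions(graph, ranks):
    """Map: each vertex emits (neighbor, share-of-rank) key-value pairs."""
    for u, neighbors in graph.items():
        for v in neighbors:
            yield v, ranks[u] / len(neighbors)

def collate(pairs):
    """Collate: group values by key, as the library does across processors."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_rank(contributions, n, damping=0.85):
    """Reduce: combine the contributions received by one vertex."""
    return (1.0 - damping) / n + damping * sum(contributions)

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = {v: 1.0 / len(graph) for v in graph}
grouped = collate(map_contributions(graph, ranks))
ranks = {v: reduce_rank(grouped.get(v, []), len(graph)) for v in graph}
print(ranks)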
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Operations Research Letters
Abstract not provided.
This document summarizes research performed under the SNL LDRD entitled "Computational Mechanics for Geosystems Management to Support the Energy and Natural Resources Mission." The main accomplishment was the development of a foundational SNL capability for computational thermal, chemical, fluid, and solid mechanics analysis of geosystems. The code was developed within the SNL Sierra software system. This report summarizes the capabilities of the simulation code and the supporting research and development conducted under this LDRD. The main goal of this project was the development of a foundational capability for coupled thermal, hydrological, mechanical, and chemical (THMC) simulation of heterogeneous geosystems utilizing massively parallel processing. To address these coupled problems, the project integrated research in numerical mathematics and algorithms for chemically reactive multiphase systems with computer science research in adaptive coupled solution control and framework architecture. This report summarizes and demonstrates the capabilities that were developed, together with the supporting research underlying the models. Key accomplishments are: (1) a general capability for modeling nonisothermal, multiphase, multicomponent flow in heterogeneous porous geologic materials; (2) a general capability to model multiphase reactive transport of species in heterogeneous porous media; (3) constitutive models for describing real, general geomaterials under multiphase conditions utilizing laboratory data; (4) a general capability to couple nonisothermal reactive flow with geomechanics (THMC); (5) phase-behavior thermodynamics for the CO2-H2O-NaCl system, with a general implementation that supports other fluid mixtures and adaptive look-up tables that make the thermodynamic capability available to other simulators; (6) a capability for statistical modeling of heterogeneity in geologic materials; and (7) a simulator that utilizes unstructured grids on parallel-processing computers.
Many of the most important and hardest-to-solve problems related to the synthesis, performance, and aging of materials involve diffusion through the material or along surfaces and interfaces. These diffusion processes are driven by motions at the atomic scale, but traditional atomistic simulation methods such as molecular dynamics are limited to very short timescales on the order of the atomic vibration period (less than a picosecond), while macroscale diffusion takes place over timescales many orders of magnitude larger. We have completed an LDRD project with the goal of developing and implementing new simulation tools to overcome this timescale problem. In particular, we have focused on two main classes of methods: accelerated molecular dynamics methods that seek to extend the timescale attainable in atomistic simulations, and so-called 'equation-free' methods that combine a fine scale atomistic description of a system with a slower, coarse scale description in order to project the system forward over long times.
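One common equation-free construction, coarse projective integration, captures this idea in a few lines: run the fine-scale model for a short burst, estimate the time derivative of the coarse variables from the end of the burst, and use that derivative to project the coarse state over a much larger step. The Python sketch below is schematic; the "fine model" is a placeholder two-variable ODE standing in for an atomistic simulation, and the step sizes are arbitrary assumptions.

# Schematic coarse projective integration (an "equation-free" construction).
# The fine model here is a placeholder ODE with one slow and one fast variable,
# not an atomistic simulator; step sizes are illustrative assumptions.
import numpy as np

def fine_step(x, dt_fine):
    """Placeholder fine-scale model: fast relaxation onto a slowly decaying mode."""
    slow, fast = x
    return np.array([slow - dt_fine * 0.01 * slow,             # slow decay
                     fast - dt_fine * 10.0 * (fast - slow)])   # fast relaxation

def projective_step(x, dt_fine, n_fine, dt_coarse):
    """Run a short fine-scale burst, estimate the coarse derivative, project forward."""
    burst = [x]
    for _ in range(n_fine):
        burst.append(fine_step(burst[-1], dt_fine))
    slope = (burst[-1] - burst[-2]) / dt_fine      # estimated coarse time derivative
    return burst[-1] + dt_coarse * slope           # large projective jump

x = np.array([1.0, 0.0])
for _ in range(100):                               # covers time far beyond the fine bursts
    x = projective_step(x, dt_fine=0.01, n_fine=20, dt_coarse=1.0)
print(x)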
The literature on how to implement systematic and comprehensive processes for modern V&V/UQ (VU) within large computational simulation projects is currently sparse. Important design requirements have been identified in order to construct a viable 'system' of processes. The significant processes that are needed include discovery, accumulation, and assessment. A preliminary design is presented for a VU Discovery process that accounts for an important subset of the requirements. The design uses a hierarchical approach to set context and a series of place-holders that identify the evidence and artifacts that need to be created in order to tell the VU story and to perform assessments. The hierarchy incorporates VU elements from a Predictive Capability Maturity Model and uses questionnaires to define critical issues in VU. The place-holders organize VU data within a central repository that serves as the official VU record of the project. A review process ensures that those who will contribute to the record have agreed to provide the evidence identified by the Discovery process. VU expertise is an essential part of this process and ensures that the roadmap provided by the Discovery process is adequate. Both the requirements and the design were developed to support the Nuclear Energy Advanced Modeling and Simulation Waste project, which is developing a set of advanced codes for simulating the performance of nuclear waste storage sites. The Waste project served as an example to keep the design of the VU Discovery process grounded in practicalities. However, the system is represented abstractly so that it can be applied to other M&S projects.
The class of discontinuous Petrov-Galerkin finite element methods (DPG) proposed by L. Demkowicz and J. Gopalakrishnan guarantees the optimality of the solution in an energy norm and produces a symmetric positive definite stiffness matrix, among other desirable properties. In this paper, we describe a toolbox, implemented atop Sandia's Trilinos library, for rapid development of solvers for DPG methods. We use this toolbox to develop solvers for the Poisson and Stokes problems.
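As a reminder of where the symmetric positive definite structure comes from, the ideal DPG method can be written as residual minimization in the dual norm of the test space; the notation below is generic and is not tied to the toolbox's interfaces.

% Ideal DPG as residual minimization (generic notation). With b(u, v) the
% variational form, \ell(v) the load, and (\cdot,\cdot)_V the test inner product:
u_h \;=\; \operatorname*{arg\,min}_{w_h \in U_h} \tfrac{1}{2}\,\lVert \ell - B w_h \rVert_{V'}^2,
\qquad (B w)(v) := b(w, v).
% Introducing the trial-to-test operator T : U \to V by (T w, v)_V = b(w, v) for
% all v in V, the discrete problem becomes
b(u_h, T\,\delta u_h) \;=\; \ell(T\,\delta u_h) \qquad \forall\, \delta u_h \in U_h,
% and the stiffness entries b(\varphi_j, T\varphi_i) = (T\varphi_j, T\varphi_i)_V
% form a Gram matrix of the optimal test functions, which is why the stiffness
% matrix is symmetric positive definite.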