Publications


Enabling power measurement and control on Astra: The first petascale Arm supercomputer

Concurrency and Computation: Practice and Experience

Grant, Ryan E.; Hammond, Simon D.; Laros, James H.; Levenhagen, Michael J.; Olivier, Stephen L.; Pedretti, Kevin P.; Ward, Harry L.; Younge, Andrew J.

Astra, deployed in 2018, was the first petascale supercomputer to utilize processors based on the ARM instruction set. The system was also the first under Sandia's Vanguard program, which seeks to provide an evaluation vehicle for novel technologies that, with refinement, could be utilized in demanding, large-scale HPC environments. In addition to ARM, several other important first-of-a-kind developments were used in the machine, including new approaches to cooling the datacenter and machine. This article documents our experiences building a power measurement and control infrastructure for Astra. While this is often beyond the control of users today, the accurate measurement, cataloging, and evaluation of power, as our experiences show, is critical to the successful deployment of a large-scale platform. While such systems exist in part for other architectures, Astra required new development to support the novel Marvell ThunderX2 processor used in its compute nodes. In addition to documenting the measurement of power during system bring-up and for subsequent ongoing routine use, we present results associated with controlling the power usage of the processor, an area of progressively greater interest as data centers and supercomputing sites look to improve compute/energy efficiency and find additional sources for full-system optimization.


What can simulation test beds teach us about social science? Results of the ground truth program

Computational and Mathematical Organization Theory

Naugle, Asmeret B.; Krofcheck, Daniel J.; Warrender, Christina E.; Lakkaraju, Kiran L.; Swiler, Laura P.; Verzi, Stephen J.; Emery, Ben; Murdock, Jaimie; Bernard, Michael L.; Romero, Vicente J.

The ground truth program used simulations as test beds for social science research methods. The simulations had known ground truth and were capable of producing large amounts of data. This allowed research teams to run experiments and ask questions of these simulations similar to social scientists studying real-world systems, and enabled robust evaluation of their causal inference, prediction, and prescription capabilities. We tested three hypotheses about research effectiveness using data from the ground truth program, specifically looking at the influence of complexity, causal understanding, and data collection on performance. We found some evidence that system complexity and causal understanding influenced research performance, but no evidence that data availability contributed. The ground truth program may be the first robust coupling of simulation test beds with an experimental framework capable of teasing out factors that determine the success of social science research.


A Novel Partitioned Approach for Reduced Order Model—Finite Element Model (ROM-FEM) and ROM-ROM Coupling

Earth and Space 2022

de Castro, Amy G.; Kuberry, Paul A.; Kalashnikova, Irina; Bochev, Pavel B.

Partitioned methods allow one to build a simulation capability for coupled problems by reusing existing single-component codes. In so doing, partitioned methods can shorten code development and validation times for multiphysics and multiscale applications. In this work, we consider a scenario in which one or more of the “codes” being coupled are projection-based reduced order models (ROMs), introduced to lower the computational cost associated with a particular component. We simulate this scenario by considering a model interface problem that is discretized independently on two non-overlapping subdomains. We then formulate a partitioned scheme for this problem that allows the coupling of a ROM “code” for one of the subdomains with a finite element model (FEM) or ROM “code” for the other subdomain. The ROM “codes” are constructed by performing proper orthogonal decomposition (POD) on a snapshot ensemble to obtain a low-dimensional reduced order basis, followed by a Galerkin projection onto this basis. The ROM and/or FEM “codes” on each subdomain are then coupled using a Lagrange multiplier representing the interface flux. To partition the resulting monolithic problem, we first eliminate the flux through a dual Schur complement. Application of an explicit time integration scheme to the transformed monolithic problem decouples the subdomain equations, allowing their independent solution for the next time step. We show numerical results that demonstrate the proposed method’s efficacy in achieving both ROM-FEM and ROM-ROM coupling.
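The POD-Galerkin construction described in the abstract can be illustrated with a minimal, self-contained NumPy sketch. The toy 1D diffusion problem and parameterized load below are our own choices for illustration; the paper's interface formulation and Lagrange multiplier coupling are omitted.

```python
import numpy as np

# Full-order "code": a 1D diffusion stiffness matrix (stand-in for a FEM model).
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b0 = np.ones(n)
b1 = np.linspace(0.0, 1.0, n)

# Snapshot ensemble: full-order solves for a few load parameters mu.
snapshots = np.column_stack(
    [np.linalg.solve(A, b0 + mu * b1) for mu in (0.0, 0.5, 1.0, 1.5)]
)

# POD: leading left singular vectors of the snapshot matrix form the reduced basis.
Phi = np.linalg.svd(snapshots, full_matrices=False)[0][:, :2]

# Galerkin projection: reduced operator and load for a new parameter mu = 0.75.
mu = 0.75
A_r = Phi.T @ A @ Phi
b_r = Phi.T @ (b0 + mu * b1)
u_rom = Phi @ np.linalg.solve(A_r, b_r)   # lift the reduced solution back to full space
u_fem = np.linalg.solve(A, b0 + mu * b1)
rel_err = np.linalg.norm(u_rom - u_fem) / np.linalg.norm(u_fem)
print(rel_err)
```

Because the solution manifold of this toy problem is exactly two-dimensional, a rank-2 POD basis reproduces the full-order solution to round-off; for realistic problems the basis size trades accuracy against cost.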


The DPG Method for the Convection-Reaction Problem, Revisited

Computational Methods in Applied Mathematics

Demkowicz, Leszek F.; Roberts, Nathan V.; Muñoz-Matute, Judit

We study both conforming and non-conforming versions of the practical DPG method for the convection-reaction problem. We determine that the most common approach for DPG stability analysis (construction of a local Fortin operator) is infeasible for the convection-reaction problem. We then develop a line of argument based on a direct proof of discrete stability; we find that employing a polynomial enrichment for the test space does not suffice for this purpose, motivating the introduction of a (two-element) subgrid mesh. The argument combines mathematical analysis with numerical experiments.


Feedback density and causal complexity of simulation model structure

Journal of Simulation

Naugle, Asmeret B.; Verzi, Stephen J.; Lakkaraju, Kiran L.; Swiler, Laura P.; Warrender, Christina E.; Bernard, Michael L.; Romero, Vicente J.

Measures of simulation model complexity generally focus on outputs; we propose measuring the complexity of a model’s causal structure to gain insight into its fundamental character. This article introduces tools for measuring causal complexity. First, we introduce a method for developing a model’s causal structure diagram, which characterises the causal interactions present in the code. Causal structure diagrams facilitate comparison of simulation models, including those from different paradigms. Next, we develop metrics for evaluating a model’s causal complexity using its causal structure diagram. We discuss cyclomatic complexity as a measure of the intricacy of causal structure and introduce two new metrics that incorporate the concept of feedback, a fundamental component of causal structure. The first new metric introduced here is feedback density, a measure of the cycle-based interconnectedness of causal structure. The second metric combines cyclomatic complexity and feedback density into a comprehensive causal complexity measure. Finally, we demonstrate these complexity metrics on simulation models from multiple paradigms and discuss potential uses and interpretations. These tools enable direct comparison of models across paradigms and provide a mechanism for measuring and discussing complexity based on a model’s fundamental assumptions and design.
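The two ingredients named above, cyclomatic complexity and a cycle-based feedback measure, can be sketched on a toy causal graph. The definitions below are illustrative reconstructions (cyclomatic complexity as E - N + 2P, feedback density as the fraction of edges lying on some cycle), not necessarily the article's exact formulas.

```python
def reachable(graph, start, target):
    """Iterative DFS: can `target` be reached from `start`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False

def cyclomatic_complexity(nodes, edges, components=1):
    """Classic McCabe measure: E - N + 2P."""
    return len(edges) - len(nodes) + 2 * components

def feedback_density(graph, edges):
    """Fraction of edges that participate in at least one feedback loop:
    edge (u, v) lies on a cycle iff u is reachable from v."""
    on_cycle = [e for e in edges if reachable(graph, e[1], e[0])]
    return len(on_cycle) / len(edges)

# Toy causal structure: A -> B -> C -> A is a feedback loop; C -> D is not.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
nodes = {"A", "B", "C", "D"}
print(cyclomatic_complexity(nodes, edges))   # 4 - 4 + 2 = 2
print(feedback_density(graph, edges))        # 3 of 4 edges lie on the loop -> 0.75
```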


Combining Spike Time Dependent Plasticity (STDP) and Backpropagation (BP) for Robust and Data Efficient Spiking Neural Networks (SNN)

Wang, Felix W.; Teeter, Corinne M.

National security applications require artificial neural networks (ANNs) that consume less power, are fast and dynamic online learners, are fault tolerant, and can learn from unlabeled and imbalanced data. We explore whether two fundamentally different, traditional learning algorithms from artificial intelligence and the biological brain can be merged. We tackle this problem from two directions. First, we start from a theoretical point of view and show that the spike time dependent plasticity (STDP) learning curve observed in biological networks can be derived using the mathematical framework of backpropagation through time. Second, we show that transmission delays, as observed in biological networks, improve the ability of spiking networks to perform classification when trained using a backpropagation of error (BP) method. These results provide evidence that STDP could be compatible with a BP learning rule. Combining these learning algorithms will likely lead to networks more capable of meeting our national security missions.


Combining DPG in space with DPG time-marching scheme for the transient advection–reaction equation

Computer Methods in Applied Mechanics and Engineering

Muñoz-Matute, Judit; Demkowicz, Leszek; Roberts, Nathan V.

In this article, we present a general methodology to combine the Discontinuous Petrov–Galerkin (DPG) method in space and time in the context of methods of lines for transient advection–reaction problems. We first introduce a semidiscretization in space with a DPG method, redefining the ideas of optimal testing and practicality of the method in this context. Then, we apply the recently developed DPG-based time-marching scheme, which is of exponential type, to the resulting system of Ordinary Differential Equations (ODEs). We also discuss how to efficiently compute the action of the exponential of the matrix coming from the space semidiscretization without assembling the full matrix. Finally, we verify the proposed method for 1D+time advection–reaction problems, showing optimal convergence rates for smooth solutions and more stable results for linear conservation laws compared to classical exponential integrators.
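The computational kernel mentioned above, applying the exponential of the semidiscretization matrix to a vector without assembling the matrix, can be sketched with a truncated Taylor series. This is an illustrative matrix-free approach under our own assumptions, not the authors' scheme, which would use a more robust algorithm in practice.

```python
import numpy as np

def expmv(apply_A, v, h=1.0, tol=1e-12, max_terms=100):
    """Approximate exp(h*A) @ v using only matrix-vector products with A.

    apply_A: callable returning A @ x, so A never has to be assembled.
    """
    result = v.copy()
    term = v.copy()
    for k in range(1, max_terms):
        term = (h / k) * apply_A(term)     # term = (h*A)^k v / k!
        result = result + term
        if np.linalg.norm(term) < tol * np.linalg.norm(result):
            break
    return result

# Matrix-free operator for a small skew-symmetric A (pure rotation dynamics).
A = np.array([[0.0, -1.0], [1.0, 0.0]])
v = np.array([1.0, 0.0])
u = expmv(lambda x: A @ x, v, h=np.pi / 2)
# exp(A * pi/2) rotates by 90 degrees, so (1, 0) maps to (0, 1).
print(u)
```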


Nonlocal kernel network (NKN): A stable and resolution-independent deep neural network

Journal of Computational Physics

You, Huaiqian; Yu, Yue; D'Elia, Marta D.; Gao, Tian; Silling, Stewart A.

Neural operators [1–5] have recently become popular tools for designing solution maps between function spaces in the form of neural networks. Unlike classical scientific machine learning approaches, which learn parameters of a known partial differential equation (PDE) for a single instance of the input parameters at a fixed resolution, neural operators approximate the solution map of a family of PDEs [6,7]. Despite their success, the use of neural operators has so far been restricted to relatively shallow neural networks and confined to learning hidden governing laws. In this work, we propose a novel nonlocal neural operator, which we refer to as the nonlocal kernel network (NKN), that is resolution independent, characterized by deep neural networks, and capable of handling a variety of tasks such as learning governing equations and classifying images. Our NKN stems from the interpretation of the neural network as a discrete nonlocal diffusion reaction equation that, in the limit of infinite layers, is equivalent to a parabolic nonlocal equation, whose stability is analyzed via nonlocal vector calculus. The resemblance with integral forms of neural operators allows NKNs to capture long-range dependencies in the feature space, while the continuous treatment of node-to-node interactions makes NKNs resolution independent. The resemblance with neural ODEs, reinterpreted in a nonlocal sense, and the stable network dynamics between layers allow for generalization of NKN's optimal parameters from shallow to deep networks. This fact enables the use of shallow-to-deep initialization techniques [8]. Our tests show that NKNs outperform baseline methods in both learning governing equations and image classification tasks and generalize well to different resolutions and depths.


Development of Single Photon Sources in GaN

Mounce, Andrew M.; Wang, George W.; Schultz, Peter A.; Titze, Michael T.; Campbell, DeAnna M.; Lu, Ping L.; Henshaw, Jacob D.

The recent discovery of bright, room-temperature single photon emitters (SPEs) in GaN offers an appealing alternative to diamond's best single photon emitters, given the widespread use and technological maturity of III-nitrides for optoelectronics (e.g., blue LEDs, lasers) and high-speed, high-power electronics. This discovery opens the door to on-chip and on-demand single photon sources integrated with detectors and electronics. Currently, little is known about the underlying defect structure, nor is there a sense of how such an emitter might be controllably created. A detailed understanding of the origin of the SPEs in GaN and a path to deterministically introduce them is required. In this project, we develop new experimental capabilities to investigate single photon emission from GaN nanowires and from both GaN and AlN wafers. We ion implant our wafers using our focused ion beam nanoimplantation capabilities at Sandia to go beyond typical broad-beam implantation and create single photon emitting defects with nanometer precision. We have created light emitting sources using Li+ and He+, but single photon emission has yet to be demonstrated. In parallel, we calculate the energy levels of defects and transition metal substitutions in GaN to gain a better understanding of the sources of single photon emission in GaN and AlN. The combined experimental and theoretical capabilities developed throughout this project will enable further investigation into the origins of single photon emission from defects in GaN, AlN, and other wide bandgap semiconductors.


A fractional model for anomalous diffusion with increased variability: Analysis, algorithms and applications to interface problems

Numerical Methods for Partial Differential Equations

D'Elia, Marta D.; Glusa, Christian A.

Fractional equations have become the model of choice in several applications where heterogeneities at the microstructure result in anomalous diffusive behavior at the macroscale. In this work we introduce a new fractional operator characterized by a doubly-variable fractional order and possibly truncated interactions. Under certain conditions on the model parameters and on the regularity of the fractional order we show that the corresponding Poisson problem is well-posed. We also introduce a finite element discretization and describe an efficient implementation of the finite-element matrix assembly in the case of piecewise constant fractional order. Through several numerical tests, we illustrate the improved descriptive power of this new operator across media interfaces. Furthermore, we present one-dimensional and two-dimensional h-convergence results that show that the variable-order model has the same convergence behavior as the constant-order model.


Conflicting Information and Compliance With COVID-19 Behavioral Recommendations

Naugle, Asmeret B.; Rothganger, Fredrick R.; Verzi, Stephen J.; Doyle, Casey L.

The prevalence of COVID-19 is shaped by behavioral responses to recommendations and warnings. Available information on the disease determines the population’s perception of danger and thus its behavior; this information changes dynamically, and different sources may report conflicting information. We study the feedback between disease, information, and stay-at-home behavior using a hybrid agent-based-system dynamics model that incorporates evolving trust in sources of information. We use this model to investigate how divergent reporting and conflicting information can alter the trajectory of a public health crisis. The model shows that divergent reporting not only alters disease prevalence over time, but also increases polarization of the population’s behaviors and trust in different sources of information.


Monotonic Gaussian Process for Physics-Constrained Machine Learning With Materials Science Applications

Journal of Computing and Information Science in Engineering

Tran, Anh; Maupin, Kathryn A.; Rodgers, Theron R.

Physics-constrained machine learning is emerging as an important topic in the field of machine learning for physics. One of the most significant advantages of incorporating physics constraints into machine learning methods is that the resulting model requires significantly less data to train. By incorporating physical rules into the machine learning formulation itself, the predictions are expected to be physically plausible. The Gaussian process (GP) is perhaps one of the most common methods in machine learning for small datasets. In this paper, we investigate the possibility of constraining a GP formulation with monotonicity on three different materials datasets: one experimental and two computational. The monotonic GP is compared against the regular GP, where a significant reduction in the posterior variance is observed. The monotonic GP is strictly monotonic in the interpolation regime, but in the extrapolation regime, the monotonic effect starts fading away as one goes beyond the training dataset. Imposing monotonicity on the GP comes at a small accuracy cost, compared to the regular GP. The monotonic GP is perhaps most useful in applications where data are scarce and noisy, and monotonicity is supported by strong physical evidence.
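For background, a regular GP posterior (the baseline the monotonic GP is compared against) can be sketched in a few lines of NumPy. The toy monotone dataset is our own, and the monotonicity constraint itself, typically imposed via virtual derivative observations, is omitted; the sketch does reproduce the abstract's extrapolation observation that posterior variance grows beyond the training data.

```python
import numpy as np

def rbf_kernel(A, B, length=1.0, var=1.0):
    """Squared-exponential covariance between two sets of 1D inputs."""
    d = A[:, None] - B[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Mean and pointwise variance of a GP regression posterior at Xs."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xs, X)
    Kss = rbf_kernel(Xs, Xs)
    mean = Ks @ np.linalg.solve(K, y)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Monotone training data (a hypothetical stress-strain-like curve).
X = np.linspace(0.0, 2.0, 8)
y = np.tanh(X)
Xs = np.linspace(0.0, 3.0, 31)
mean, var = gp_posterior(X, y, Xs)
# Variance is small inside the data range and grows in extrapolation.
print(var[0], var[-1])
```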


Processing Particle Data Flows with SmartNICs

Liu, Jianshen L.; Maltzahn, Carlos M.; Curry, Matthew L.; Ulmer, Craig D.

Many distributed applications implement complex data flows and need a flexible mechanism for routing data between producers and consumers. Recent advances in programmable network interface cards, or SmartNICs, represent an opportunity to offload data-flow tasks into the network fabric, thereby freeing the hosts to perform other work. System architects in this space face multiple questions about the best way to leverage SmartNICs as processing elements in data flows. In this paper, we advocate the use of Apache Arrow as a foundation for implementing data-flow tasks on SmartNICs. We report on our experiences adapting a partitioning algorithm for particle data to Apache Arrow and measure the on-card processing performance for the BlueField-2 SmartNIC. Our experiments confirm that the BlueField-2’s (de)compression hardware can have a significant impact on in-transit workflows where data must be unpacked, processed, and repacked.
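The data-flow task studied above, partitioning particle data between producers and consumers, can be illustrated with a plain-Python spatial binning sketch. The record layout is hypothetical, and the paper's Apache Arrow and BlueField-2 specifics are omitted.

```python
def partition_particles(particles, n_bins, domain=(0.0, 1.0)):
    """Bucket particle records by x-coordinate into n_bins spatial partitions,
    the kind of routing step that could be offloaded to a SmartNIC."""
    lo, hi = domain
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for p in particles:
        # Clamp to the last bin so x == hi does not index out of range.
        idx = min(int((p["x"] - lo) / width), n_bins - 1)
        bins[idx].append(p)
    return bins

# Hypothetical particle records with a scalar position in [0, 1).
particles = [{"id": i, "x": (i * 0.37) % 1.0} for i in range(10)]
bins = partition_particles(particles, n_bins=4)
print([len(b) for b in bins])
```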


Entropy and its Relationship with Statistics

Lehoucq, Richard B.; Mayer, Carolyn D.; Tucker, James D.

The purpose of our report is to discuss the notion of entropy and its relationship with statistics. Our goal is to provide a manner in which you can think about entropy, its central role within information theory, and its relationship with statistics. We review various relationships between information theory and statistics; nearly all are well known but unfortunately are often not recognized. Entropy quantifies the "average amount of surprise" in a random variable and lies at the heart of information theory, which studies the transmission, processing, extraction, and utilization of information. For us, data is information. What is the distinction between information theory and statistics? Information theorists work with probability distributions, whereas statisticians work with samples. In so many words, information theory using samples is the practice of statistics. Acknowledgements. We thank Danny Dunlavy, Carlos Llosa, Oscar Lopez, Arvind Prasadan, Gary Saavedra, and Jeremy Wendt for helpful discussions along the way. Our report was supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
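The report's central quantity, entropy as "average amount of surprise," and one standard bridge to statistics can be sketched numerically. The definitions are the textbook ones; the particular distributions are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum p_i log2 p_i, in bits: average surprise."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def cross_entropy(p, q):
    """Average surprise when events follow p but we model them with q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# A fair coin carries 1 bit of surprise per toss; a biased coin carries less.
print(entropy([0.5, 0.5]))
print(entropy([0.9, 0.1]))

# KL divergence = cross entropy - entropy >= 0: the penalty for modeling
# data from p with the wrong distribution q, which links entropy to
# likelihood-based statistics.
p, q = [0.5, 0.5], [0.9, 0.1]
kl = cross_entropy(p, q) - entropy(p)
print(kl)
```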


Viability of S3 Object Storage for the ASC Program at Sandia

Kordenbrock, Todd H.; Templet, Gary J.; Ulmer, Craig D.; Widener, Patrick M.

Recent efforts at Sandia such as DataSEA are creating search engines that enable analysts to query the institution’s massive archive of simulation and experiment data. The benefit of this work is that analysts will be able to retrieve all historical information about a system component that the institution has amassed over the years and make better-informed decisions in current work. As DataSEA gains momentum, it faces multiple technical challenges relating to capacity storage. From a raw capacity perspective, data producers will rapidly overwhelm the system with massive amounts of data. From an accessibility perspective, analysts will expect to be able to retrieve any portion of the bulk data, from any system on the enterprise network. Sandia’s Institutional Computing is mitigating storage problems at the enterprise level by procuring new capacity storage systems that can be accessed from anywhere on the enterprise network. These systems use the Simple Storage Service (S3) API for data transfers. While S3 uses objects instead of files, users can access it from their desktops or Sandia’s high-performance computing (HPC) platforms. S3 is particularly well suited for bulk storage in DataSEA, as datasets can be decomposed into objects that can be referenced and retrieved individually, as needed by an analyst. In this report we describe our experiences working with S3 storage and provide information about how developers can leverage Sandia’s current systems. We present performance results from two sets of experiments. First, we measure S3 throughput when exchanging data between four different HPC platforms and two different enterprise S3 storage systems on the Sandia Restricted Network (SRN). Second, we measure the performance of S3 when communicating with a custom-built Ceph storage system that was constructed from HPC components.
Overall, while S3 storage is significantly slower than traditional HPC storage, it provides significant accessibility benefits that will be valuable for archiving and exploiting historical data. There are multiple opportunities that arise from this work, including enhancing DataSEA to leverage S3 for bulk storage and adding native S3 support to Sandia’s IOSS library.


Comparison of exponential integrators and traditional time integration schemes for the shallow water equations

Applied Numerical Mathematics

Brachet, Matthieu; Debreu, Laurent; Eldred, Christopher

The time integration scheme is probably one of the most fundamental choices in the development of an ocean model. In this paper, we investigate several time integration schemes applied to the shallow water equations. This set of equations is accurate enough for the modeling of a shallow ocean and is also relevant to study because it is the one solved for the barotropic (i.e., vertically averaged) component of a three-dimensional ocean model. We analyze different time stepping algorithms for the linearized shallow water equations. High-order explicit schemes are accurate, but the time step is constrained by the Courant-Friedrichs-Lewy stability condition. Implicit schemes can be unconditionally stable but, in practice, lack accuracy when used with large time steps. In this paper we propose a detailed comparison of such classical schemes with exponential integrators. The accuracy and the computational costs are analyzed in different configurations.
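The comparison above can be illustrated on a toy linear oscillatory system, a stand-in of our own choosing rather than the paper's ocean-model operators: an explicit scheme accumulates stability-driven error, while a single exponential-integrator step is exact for a linear problem up to round-off.

```python
import numpy as np

# Toy linear system u_t = A u with skew-symmetric A: fast oscillation,
# loosely mimicking the wave dynamics of the linearized shallow water equations.
A = np.array([[0.0, -5.0], [5.0, 0.0]])
u0 = np.array([1.0, 0.0])
T = 1.0

def forward_euler(A, u0, n_steps):
    """Classical explicit scheme: stable only for small enough steps."""
    dt = T / n_steps
    u = u0.copy()
    for _ in range(n_steps):
        u = u + dt * (A @ u)
    return u

def exponential_step(A, u0):
    """One exponential-integrator step u(T) = exp(T A) u0, via
    eigendecomposition of the small dense A."""
    w, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(w * T)) @ np.linalg.inv(V) @ u0).real

# Exact solution of this rotation system at time T.
exact = np.array([np.cos(5.0), np.sin(5.0)])
err_euler = np.linalg.norm(forward_euler(A, u0, 1000) - exact)
err_expo = np.linalg.norm(exponential_step(A, u0) - exact)
print(err_euler, err_expo)
```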


Embedded pairs for optimal explicit strong stability preserving Runge–Kutta methods

Journal of Computational and Applied Mathematics

Fekete, Imre; Conde, Sidafa; Shadid, John N.

We construct a family of embedded pairs for optimal explicit strong stability preserving Runge–Kutta methods of order 2≤p≤4 to be used to obtain numerical solutions of spatially discretized hyperbolic PDEs. In this construction, the goals include the non-defective property, a large stability region, and small error values as defined in Dekker and Verwer (1984) and Kennedy et al. (2000). The new family of embedded pairs offers the ability for strong stability preserving (SSP) methods to adapt by varying the step size. Through several numerical experiments, we assess the overall effectiveness in terms of work versus precision while also taking into consideration accuracy and stability.
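The idea of an embedded pair, a second solution of lower order computed from the same stages to estimate the local error and adapt the step size, can be sketched with the classical two-stage, second-order SSP method. This is an illustrative pair of our own, not the optimized embeddings constructed in the paper.

```python
import math

def ssprk22_embedded(f, u, t, dt):
    """One step of the 2-stage, 2nd-order SSP RK method with an embedded
    1st-order (forward Euler) solution reused for error estimation."""
    u1 = u + dt * f(t, u)                         # Euler stage = embedded method
    u2 = 0.5 * u + 0.5 * (u1 + dt * f(t + dt, u1))
    return u2, abs(u2 - u1)                       # high-order solution, error estimate

def integrate(f, u0, t0, t_end, tol=1e-6):
    """Adaptive integration: accept steps whose error estimate meets tol."""
    t, u, dt = t0, u0, 1e-2
    while t < t_end:
        dt = min(dt, t_end - t)
        u_new, err = ssprk22_embedded(f, u, t, dt)
        if err <= tol:                            # accept the step
            t, u = t + dt, u_new
        # Standard controller for an order-2/1 embedded pair.
        dt *= min(2.0, max(0.2, 0.9 * (tol / max(err, 1e-16)) ** 0.5))
    return u

# u' = -u, u(0) = 1, so u(1) = exp(-1).
u = integrate(lambda t, u: -u, 1.0, 0.0, 1.0)
print(abs(u - math.exp(-1)))
```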


Understanding Phase and Interfacial Effects of Spall Fracture in Additively Manufactured Ti-5Al-5V-5Mo-3Cr

Branch, Brittany A.; Ruggles, Timothy R.; Miers, John C.; Massey, Caroline E.; Moore, David G.; Brown, Nathan B.; Duwal, Sakun D.; Silling, Stewart A.; Mitchell, John A.; Specht, Paul E.

Additively manufactured Ti-5Al-5V-5Mo-3Cr (Ti-5553) is being considered as an AM repair material for engineering applications because of its superior strength properties compared to other titanium alloys. Here, we describe the failure mechanisms observed through computed tomography, electron backscatter diffraction (EBSD), and scanning electron microscopy (SEM) of spall damage as a result of tensile failure in as-built and annealed Ti-5553. We also investigate the phase stability in native powder, as-built, and annealed Ti-5553 through diamond anvil cell (DAC) and ramp compression experiments. We then explore the effect of tensile loading on a sample containing an interface between a Ti-6Al-4V (Ti-64) baseplate and an additively manufactured Ti-5553 layer. Post-mortem materials characterization showed that spallation occurred in regions of initial porosity and that the interface provides a nucleation site for spall damage below the spall strength of Ti-5553. Preliminary peridynamics modeling of the dynamic experiments is described. Finally, we discuss further development of Stochastic Parallel PARticle Kinetic Simulator (SPPARKS) Monte Carlo (MC) capabilities to include the integration of alpha (α)-phase and microstructural simulations for this multiphase titanium alloy.


Super-Resolution Approaches in Three-Dimensions for Classification and Screening of Commercial-Off-The-Shelf Components

Polonsky, Andrew P.; Martinez, Carianne M.; Appleby, Catherine A.; Bernard, Sylvain R.; Griego, J.J.M.; Noell, Philip N.; Pathare, Priya R.

X-ray computed tomography is generally a primary step in the characterization of defective electronic components, but is too slow to screen large lots of components. Super-resolution imaging approaches, in which higher-resolution data is inferred from lower-resolution images, have the potential to substantially reduce collection times for data volumes accessible via x-ray computed tomography. Here we seek to advance existing two-dimensional super-resolution approaches directly to three-dimensional computed tomography data. Multiple scan resolutions spanning half an order of magnitude were collected for four classes of commercial electronic components to serve as training data for a deep-learning super-resolution network. A modular Python framework for three-dimensional super-resolution of computed tomography data has been developed and trained over multiple classes of electronic components. Initial training and testing demonstrate the promise of these approaches, which have the potential for more than an order of magnitude reduction in collection time for electronic component screening.


Demonstrate multi-turbine simulation with hybrid-structured / unstructured-moving-grid software stack running primarily on GPUs and propose improvements for successful KPP-2

Bidadi, Shreyas B.; Brazell, Michael B.; Brunhart-Lupo, Nicholas B.; Henry de Frahan, Marc T.; Lee, Dong H.; Hu, Jonathan J.; Melvin, Jeremy M.; Mullowney, Paul M.; Vijayakumar, Ganesh V.; Moser, Robert D.; Rood, Jon R.; Sakievich, Philip S.; Sharma, Ashesh S.; Williams, Alan B.; Sprague, Michael A.

The goal of the ExaWind project is to enable predictive simulations of wind farms composed of many megawatt-scale turbines situated in complex terrain. Predictive simulations will require computational fluid dynamics (CFD) simulations in which the mesh resolves the geometry of the turbines, captures the thin boundary layers, and captures the rotation and large deflections of the blades. Whereas such simulations for a single turbine are arguably petascale class, multi-turbine wind farm simulations will require exascale-class resources.


ATHENA: Analytical Tool for Heterogeneous Neuromorphic Architectures

Cardwell, Suma G.; Plagge, Mark P.; Hughes, Clayton H.; Rothganger, Fredrick R.; Agarwal, Sapan A.; Feinberg, Benjamin F.; Awad, Amro A.; McFarland, John M.; Parker, Luke G.

The ASC program seeks to use machine learning to improve efficiencies in its stockpile stewardship mission. Moreover, there is a growing market for technologies dedicated to accelerating AI workloads. Many of these emerging architectures promise to provide savings in energy efficiency, area, and latency when compared to traditional CPUs for these types of applications — neuromorphic analog and digital technologies provide both low-power and configurable acceleration of challenging artificial intelligence (AI) algorithms. If designed into a heterogeneous system with other accelerators and conventional compute nodes, these technologies have the potential to augment the capabilities of traditional High Performance Computing (HPC) platforms [5]. This expanded computation space requires not only a new approach to physics simulation, but the ability to evaluate and analyze next-generation architectures specialized for AI/ML workloads in both traditional HPC and embedded ND applications. Developing this capability will enable ASC to understand how this hardware performs in both HPC and ND environments, improve our ability to port our applications, guide the development of computing hardware, and inform vendor interactions, leading them toward solutions that address ASC’s unique requirements.


Electron dynamics in extended systems within real-time time-dependent density-functional theory

MRS communications

Kononov, Alina K.; Lee, Cheng-Wei L.; Pereira dos Santos, Tatiane P.; Robinson, Brian R.; Yao, Yifan Y.; Yao, Yi Y.; Andrade, Xavier A.; Baczewski, Andrew D.; Constantinescu, Emil C.; Correa, Alfredo C.; Kanai, Yosuke K.; Modine, N.A.; Schleife, Andre S.

Due to a beneficial balance of computational cost and accuracy, real-time time-dependent density-functional theory has emerged as a promising first-principles framework to describe real-time electron dynamics. Here we discuss recent implementations of this approach, in particular in the context of complex, extended systems. Results include an analysis of the computational cost associated with numerical propagation and with the use of absorbing boundary conditions. We extensively explore the shortcomings in describing electron-electron scattering in real time and compare to many-body perturbation theory. Modern improvements of the description of exchange and correlation are reviewed. In this work, we specifically focus on the Qb@ll code, which we have mainly used for these types of simulations over the last several years, and we conclude by pointing to further progress needed going forward.


Microstructure-Sensitive Uncertainty Quantification for Crystal Plasticity Finite Element Constitutive Models Using Stochastic Collocation Methods

Frontiers in Materials

Tran, Anh; Wildey, Tim; Lim, Hojun L.

Uncertainty quantification (UQ) plays a major role in verification and validation for computational engineering models and simulations, and establishes trust in the predictive capability of computational models. In the materials science and engineering context, where the process-structure-property-performance linkage is well known to be the only roadmap from manufacturing to engineering performance, numerous integrated computational materials engineering (ICME) models have been developed across a wide spectrum of length- and time-scales to relieve the burden of resource-intensive experiments. Within the structure-property linkage, crystal plasticity finite element method (CPFEM) models have been widely used, since they are one of the few ICME toolboxes that allow numerical predictions, providing the bridge from microstructure to material properties and performance. Several constitutive models have been proposed in the last few decades to capture the mechanics and plasticity behavior of materials. While some UQ studies have been performed, the robustness and uncertainty of these constitutive models have not been rigorously established. In this work, we apply a stochastic collocation (SC) method, which is mathematically rigorous and widely used in the field of UQ, to quantify the uncertainty of the three most commonly used constitutive models in CPFEM, namely phenomenological models (with and without twinning) and dislocation-density-based constitutive models, for three different crystal structures: face-centered cubic (fcc) copper (Cu), body-centered cubic (bcc) tungsten (W), and hexagonal close-packed (hcp) magnesium (Mg). Our numerical results not only quantify the uncertainty of these constitutive models in the stress-strain curve, but also analyze the global sensitivity of the underlying constitutive parameters with respect to the initial yield behavior, which may be helpful for robust constitutive model calibration in the future.
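A minimal sketch of the stochastic collocation idea, using a hypothetical one-parameter stand-in for a CPFEM response rather than the authors' actual constitutive models:

```python
import numpy as np

def model(tau0):
    # Hypothetical stand-in for a CPFEM response, e.g. initial yield stress
    # as a function of one uncertain constitutive parameter tau0 (MPa).
    return 2.0 * tau0 + 0.1 * tau0 ** 2

# Uncertain parameter: tau0 ~ Uniform(40, 60).
a, b = 40.0, 60.0
nodes, weights = np.polynomial.legendre.leggauss(5)  # 5 collocation nodes on [-1, 1]
tau = 0.5 * (b - a) * nodes + 0.5 * (b + a)          # map nodes to [a, b]
w = 0.5 * weights                                    # weights for the uniform density

# Non-intrusive UQ: the "code" is evaluated only at the collocation nodes.
y = model(tau)
mean = np.sum(w * y)
var = np.sum(w * (y - mean) ** 2)
```

For a real CPFEM study each `model(tau)` evaluation would be a full finite element solve; the appeal of SC is that a handful of well-chosen nodes yields spectrally accurate statistics for smooth responses.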

More Details

Unified Language Frontend for Physics-Informed AI/ML

Kelley, Brian M.; Rajamanickam, Sivasankaran R.

Artificial intelligence and machine learning (AI/ML) are becoming important tools for scientific modeling and simulation, as in several other fields such as image analysis and natural language processing. ML techniques can leverage the computing power available in modern systems and reduce the human effort needed to configure experiments, interpret and visualize results, draw conclusions from huge quantities of raw data, and build surrogates for physics-based models. Domain scientists in fields like fluid dynamics, microelectronics, and chemistry can automate many of their most difficult and repetitive tasks, or improve design times by using faster ML surrogates. However, modern ML and traditional scientific high-performance computing (HPC) tend to use completely different software ecosystems. While ML frameworks like PyTorch and TensorFlow provide Python APIs, most HPC applications and libraries are written in C++. Direct interoperability between the two languages is possible but tedious and error-prone. In this work, we show that a compiler-based approach can bridge the gap between ML frameworks and scientific software with less developer effort and better efficiency. We use the MLIR (multi-level intermediate representation) ecosystem to compile a pre-trained convolutional neural network (CNN) in PyTorch to freestanding C++ source code in the Kokkos programming model. Kokkos is a programming model widely used in HPC to write portable, shared-memory parallel code that can natively target a variety of CPU and GPU architectures. Our compiler-generated source code can be directly integrated into any Kokkos-based application with no dependencies on Python or cross-language interfaces.
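A framework-free sketch of the kind of kernel such a compiler pipeline must emit: a direct convolution loop with a fused ReLU, which in the generated Kokkos C++ would become a `parallel_for` loop nest (shapes and values here are hypothetical, not taken from the paper):

```python
import numpy as np

def conv2d_relu(x, k):
    # Direct "valid" 2D convolution (cross-correlation, as in ML frameworks)
    # with a fused ReLU -- the loop nest a compiler would lower to parallel
    # kernels in the generated C++.
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return np.maximum(out, 0.0)  # fused activation

# Tiny example: 3x3 input, 2x2 kernel of ones.
out = conv2d_relu(np.arange(9.0).reshape(3, 3), np.ones((2, 2)))
```

Compiling away the framework leaves exactly this kind of self-contained arithmetic, which is why the generated code needs no Python runtime.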

More Details

Lossless Quantum Hard-Drive Memory Using Parity-Time Symmetry

Chatterjee, Eric N.; Soh, Daniel B.; Young, Steve M.

We theoretically studied the feasibility of building a long-term read-write quantum memory using the principle of parity-time (PT) symmetry, which has already been demonstrated for classical systems. The design consisted of a two-resonator system. Although both resonators would feature intrinsic loss, the goal was to apply a driving signal to one of the resonators such that it would become an amplifying subsystem, with a gain rate equal and opposite to the loss rate of the lossy resonator. Consequently, the loss and gain probabilities in the overall system would cancel out, yielding a closed quantum system. Upon performing detailed calculations on the impact of a driving signal on a lossy resonator, our results demonstrated that an amplifying resonator is physically infeasible, thus forestalling the possibility of PT-symmetric quantum storage. Our finding serves to significantly narrow down future research into designing a viable quantum hard drive.
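For context, the standard two-mode PT-symmetric model (textbook form, not reproduced from the paper) makes the balanced gain/loss idea concrete:

```latex
H = \begin{pmatrix} \omega_0 - i\gamma & \kappa \\ \kappa & \omega_0 + i\gamma \end{pmatrix},
\qquad
\omega_\pm = \omega_0 \pm \sqrt{\kappa^2 - \gamma^2}.
```

For coupling $\kappa > \gamma$ the eigenfrequencies are real (the PT-unbroken phase) and loss and gain balance; the paper's conclusion is that the amplifying resonator required to realize the $+i\gamma$ term cannot be physically realized at the quantum level.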

More Details

Mathematical Foundations for Nonlocal Interface Problems: Multiscale Simulations of Heterogeneous Materials (Final LDRD Report)

D'Elia, Marta D.; Bochev, Pavel B.; Foster, John E.; Glusa, Christian A.; Gulian, Mamikon G.; Gunzburger, Max G.; Trageser, Jeremy T.; Kuhlman, Kristopher L.; Martinez, Mario A.; Najm, H.N.; Silling, Stewart A.; Tupek, Michael T.; Xu, Xiao X.

Nonlocal models provide a much-needed predictive capability for important Sandia mission applications, ranging from fracture mechanics for nuclear components to subsurface flow for nuclear waste disposal, where traditional partial differential equation (PDE) models fail to capture effects due to long-range forces at the microscale and mesoscale. However, utilization of this capability is seriously compromised by the lack of a rigorous nonlocal interface theory, required for both application and efficient solution of nonlocal models. To unlock the full potential of nonlocal modeling, we developed a mathematically rigorous and physically consistent interface theory and demonstrated its scope in mission-relevant exemplar problems.

More Details

Large-Scale Atomistic Simulations [Slides]

Moore, Stan G.

This report investigates the free expansion of aluminum. The take-home message: the physically realistic SNAP machine-learning potential captures liquid-vapor coexistence behavior for the free expansion of aluminum at a level not generally accessible to hydrocodes.

More Details

GDSA Framework Development and Process Model Integration FY2022

Mariner, Paul M.; Debusschere, Bert D.; Fukuyama, David E.; Harvey, Jacob H.; LaForce, Tara; Leone, Rosemary C.; Perry, Frank V.; Swiler, Laura P.; TACONI, ANNA M.

The Spent Fuel and Waste Science and Technology (SFWST) Campaign of the U.S. Department of Energy (DOE) Office of Nuclear Energy (NE), Office of Spent Fuel & Waste Disposition (SFWD) is conducting research and development (R&D) on geologic disposal of spent nuclear fuel (SNF) and high-level nuclear waste (HLW). A high priority for SFWST disposal R&D is disposal system modeling (Sassani et al. 2021). The SFWST Geologic Disposal Safety Assessment (GDSA) work package is charged with developing a disposal system modeling and analysis capability for evaluating generic disposal system performance for nuclear waste in geologic media. This report describes fiscal year (FY) 2022 advances of the Geologic Disposal Safety Assessment (GDSA) performance assessment (PA) development groups of the SFWST Campaign. The common mission of these groups is to develop a geologic disposal system modeling capability for nuclear waste that can be used to assess probabilistically the performance of generic disposal options and generic sites. The modeling capability under development is called GDSA Framework (pa.sandia.gov). GDSA Framework is a coordinated set of codes and databases designed for probabilistically simulating the release and transport of disposed radionuclides from a repository to the biosphere for post-closure performance assessment. Primary components of GDSA Framework include PFLOTRAN to simulate the major features, events, and processes (FEPs) over time, Dakota to propagate uncertainty and analyze sensitivities, meshing codes to define the domain, and various other software for rendering properties, processing data, and visualizing results.

More Details

Revealing conductivity of p-type delta layer systems for novel computing applications

Mamaluy, Denis M.; Mendez Granado, Juan P.

This project uses a quantum simulation technique to reveal the true conducting properties of novel atomic precision advanced manufacturing materials. With Moore's law approaching the limit of scaling for the CMOS technology, it is crucial to provide the best computing power and resources to National Security missions. Atomic precision advanced manufacturing-based computing systems can become the key to the design, use, and security of modern weapon systems, critical infrastructure, and communications. We will utilize the state-of-the-art computational methodology to create a predictive simulator for p-type atomic precision advanced manufacturing systems, which may also find applications in counterfeit detection and anti-tamper.

More Details

First-principles simulation of light-ion microscopy of graphene

2D Materials

Kononov, Alina K.; Olmstead, Alexandra L.; Baczewski, Andrew D.; Schleife, Andre S.

The extreme sensitivity of 2D materials to defects and nanostructure requires precise imaging techniques to verify the presence of desirable features and the absence of undesirable ones in the atomic geometry. Helium-ion beams have emerged as a promising materials imaging tool, achieving up to 20 times higher resolution and 10 times larger depth-of-field than conventional or environmental scanning electron microscopes. Here, we offer first-principles theoretical insights to advance ion-beam imaging of atomically thin materials by performing real-time time-dependent density functional theory simulations of single impacts of 10–200 keV light ions in free-standing graphene. We predict that detecting electrons emitted from the back of the material (the side from which the ion exits) would result in up to three times higher signal and up to five times higher contrast images, making 2D materials especially compelling targets for ion-beam microscopy. This predicted superiority of exit-side emission likely arises from anisotropic kinetic emission. The charge induced in the graphene equilibrates on a sub-fs time scale, leading to only slight disturbances in the carbon lattice that are unlikely to damage the atomic structure for any of the beam parameters investigated here.

More Details

Sensitivity analysis of generic deep geologic repository with focus on spatial heterogeneity induced by stochastic fracture network generation

Advances in Water Resources

Brooks, Dusty M.; Swiler, Laura P.; Stein, Emily S.; Mariner, Paul M.; Basurto, Eduardo B.; Portone, Teresa P.; Eckert, Aubrey C.; Leone, Rosemary C.

The Geologic Disposal Safety Assessment Framework, developed by the United States Department of Energy, is a state-of-the-art simulation software toolkit for probabilistic post-closure performance assessment of systems for deep geologic disposal of nuclear waste. This paper presents a generic reference case and shows how it is being used to develop and demonstrate performance assessment methods within the Geologic Disposal Safety Assessment Framework that mitigate some of the challenges posed by high uncertainty and limited computational resources. Variance-based global sensitivity analysis is applied to assess the effects of spatial heterogeneity using graph-based summary measures for scalar and time-varying quantities of interest. Behavior of the system with respect to spatial heterogeneity is further investigated using ratios of water fluxes. This analysis shows that spatial heterogeneity is a dominant uncertainty in predictions of repository performance, and that it can be identified in global sensitivity analysis using proxy variables derived from graph descriptions of discrete fracture networks. New quantities of interest defined using water fluxes proved useful for better understanding overall system behavior.
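A hedged sketch of variance-based (Sobol') first-order sensitivity indices via the Saltelli pick-freeze estimator, applied to a toy analytic model rather than the GDSA repository model:

```python
import numpy as np

def sobol_first_order(model, dim, n, rng):
    # Saltelli pick-freeze estimator: S_i = E[y_B (y_ABi - y_A)] / Var(Y),
    # where AB_i is sample matrix A with column i replaced from B.
    A = rng.random((n, dim))
    B = rng.random((n, dim))
    yA, yB = model(A), model(B)
    var = np.var(np.concatenate([yA, yB]))
    S = np.empty(dim)
    for i in range(dim):
        ABi = A.copy()
        ABi[:, i] = B[:, i]
        S[i] = np.mean(yB * (model(ABi) - yA)) / var
    return S

# Toy model Y = 2*X1 + X2 with X ~ U(0,1) iid: analytically S1 = 0.8, S2 = 0.2.
rng = np.random.default_rng(0)
S = sobol_first_order(lambda X: 2 * X[:, 0] + X[:, 1], dim=2, n=200_000, rng=rng)
```

In a repository assessment each model evaluation is a full flow-and-transport simulation, so the sampling cost of such estimators is exactly the computational-resource challenge the paper discusses.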

More Details

Composing preconditioners for multiphysics PDE systems with applications to Generalized MHD

Tuminaro, Raymond S.; Crockatt, Michael M.; Robinson, Allen C.

New patch smoothers or relaxation techniques are developed for solving linear matrix equations coming from systems of discretized partial differential equations (PDEs). One key linear solver challenge for many PDE systems arises when the resulting discretization matrix has a near null space that has a large dimension, which can occur in generalized magnetohydrodynamic (GMHD) systems. Patch-based relaxation is highly effective for problems when the null space can be spanned by a basis of locally supported vectors. The patch-based relaxation methods that we develop can be used either within an algebraic multigrid (AMG) hierarchy or as stand-alone preconditioners. These patch-based relaxation techniques are a form of well-known overlapping Schwarz methods where the computational domain is covered with a series of overlapping sub-domains (or patches). Patch relaxation then corresponds to solving a set of independent linear systems associated with each patch. In the context of GMHD, we also reformulate the underlying discrete representation used to generate a suitable set of matrix equations. In general, deriving a discretization that accurately approximates the curl operator and the Hall term while also producing linear systems with physically meaningful near null space properties can be challenging. Unfortunately, many natural discretization choices lead to a near null space that includes non-physical oscillatory modes and where it is not possible to span the near null space with a minimal set of locally supported basis vectors. Further discretization research is needed to understand the resulting trade-offs between accuracy, stability, and ease in solving the associated linear systems.
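A minimal sketch of patch-based multiplicative Schwarz relaxation on a 1D Poisson model problem (illustrative only; the paper's setting is GMHD systems, where patches are chosen so that locally supported basis vectors span the near null space):

```python
import numpy as np

def poisson_matrix(n):
    # 1D Poisson model problem: tridiagonal (-1, 2, -1).
    return (np.diag(2.0 * np.ones(n))
            - np.diag(np.ones(n - 1), 1)
            - np.diag(np.ones(n - 1), -1))

def schwarz_sweep(A, b, x, patch_size, step):
    # Multiplicative overlapping Schwarz: solve each patch system exactly,
    # refreshing the residual before every local solve.
    n = len(b)
    for start in range(0, n, step):
        idx = np.arange(start, min(start + patch_size, n))
        r = (b - A @ x)[idx]
        x[idx] = x[idx] + np.linalg.solve(A[np.ix_(idx, idx)], r)
    return x

n = 24
A, b = poisson_matrix(n), np.ones(n)
x = np.zeros(n)
for _ in range(5000):                     # used here as a stand-alone solver
    x = schwarz_sweep(A, b, x, patch_size=8, step=4)
    if np.linalg.norm(b - A @ x) < 1e-10:
        break
```

In practice a few such sweeps serve as the smoother inside an AMG hierarchy rather than being iterated to convergence as done above.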

More Details

Thermodynamically consistent versions of approximations used in modelling moist air

Quarterly Journal of the Royal Meteorological Society

Eldred, Christopher; Guba, Oksana G.; Taylor, Mark A.

Some existing approaches to modelling the thermodynamics of moist air make approximations that break thermodynamic consistency, such that the resulting thermodynamics does not obey the first and second laws or has other inconsistencies. Recently, an approach to avoid such inconsistency has been suggested: the use of thermodynamic potentials in terms of their natural variables, from which all thermodynamic quantities and relationships (equations of state) are derived. In this article, we develop this approach for unapproximated moist-air thermodynamics and two widely used approximations: the constant-κ approximation and the dry heat capacities approximation. The (consistent) constant-κ approximation is particularly attractive because it leads to, with the appropriate choice of thermodynamic variable, adiabatic dynamics that depend only on total mass and are independent of the breakdown between water forms. Additionally, a wide variety of material from different sources in the literature on thermodynamics in atmospheric modelling is brought together. It is hoped that this article provides a comprehensive reference for the use of thermodynamic potentials in atmospheric modelling, especially for the three systems considered here.
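As a generic illustration of the potential-based approach (standard thermodynamics, not the paper's specific formulation): once a Gibbs-type potential is given in its natural variables, every other quantity follows by differentiation, so the derived relationships cannot contradict the first and second laws,

```latex
g = g(T, p, q_v, q_l, q_i)
\;\Rightarrow\;
\eta = -\frac{\partial g}{\partial T},\qquad
\alpha = \frac{\partial g}{\partial p},\qquad
\mu_k = \frac{\partial g}{\partial q_k},
```

with specific entropy $\eta$, specific volume $\alpha$, chemical potentials $\mu_k$ for the water-species mass fractions $q_v, q_l, q_i$, and internal energy recovered as $u = g + T\eta - p\alpha$.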

More Details

Metrics for Intercomparison of Remapping Algorithms (MIRA) protocol applied to Earth system models

Geoscientific Model Development

Mahadevan, Vijay S.; Guerra, Jorge E.; Jiao, Xiangmin; Kuberry, Paul A.; Li, Yipeng; Ullrich, Paul; Marsico, David; Jacob, Robert; Bochev, Pavel B.; Jones, Philip

Strongly coupled nonlinear phenomena such as those described by Earth system models (ESMs) are composed of multiple component models with independent mesh topologies and scalable numerical solvers. A common operation in ESMs is to remap or interpolate component solution fields defined on their computational mesh to another mesh with a different combinatorial structure and decomposition, e.g., from the atmosphere to the ocean, during the temporal integration of the coupled system. Several remapping schemes are currently in use or available for ESMs. However, a unified approach to compare the properties of these different schemes has not been attempted previously. We present a rigorous methodology for the evaluation and intercomparison of remapping methods through an independently implemented suite of metrics that measure the ability of a method to adhere to constraints such as grid independence, monotonicity, global conservation, and local extrema or feature preservation. A comprehensive set of numerical evaluations is conducted based on a progression of scalar fields from idealized and smooth to more general climate data with strong discontinuities and strict bounds. We examine four remapping algorithms with distinct design approaches, namely ESMF Regrid, TempestRemap, generalized moving least squares (GMLS) with post-processing filters, and WLS-ENOR. By repeated iterative application of the high-order remapping methods to the test fields, we verify the accuracy of each scheme in terms of their observed convergence order for smooth data and determine the bounded error propagation using challenging, realistic field data on both uniform and regionally refined mesh cases. In addition to retaining high-order accuracy under idealized conditions, the methods also demonstrate robust remapping performance when dealing with non-smooth data. The traditional L2-minimization approaches used in ESMF and TempestRemap fail to maintain monotonicity, in contrast to the stable recovery achieved through the nonlinear filters used in both the meshless GMLS and hybrid mesh-based WLS-ENOR schemes. Local feature preservation analysis indicates that high-order methods perform better than low-order dissipative schemes for all test cases. The behavior of these remappers remains consistent when applied on regionally refined meshes, indicating mesh-invariant implementations. The MIRA intercomparison protocol proposed in this paper and the detailed comparison of the four algorithms demonstrate that the new schemes, namely GMLS and WLS-ENOR, are competitive compared to standard conservative minimization methods requiring computation of mesh intersections. The work presented in this paper provides a foundation that can be extended to include complex field definitions, realistic mesh topologies, and spectral element discretizations, thereby allowing for a more complete analysis of production-ready remapping packages.
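Two of the constraint checks in such a protocol can be sketched directly on a remap operator viewed as a matrix R (toy 2×2 weights here; real schemes build R from mesh intersections or least-squares reconstructions):

```python
import numpy as np

def is_conservative(R, src_areas, tgt_areas, tol=1e-12):
    # Global conservation: the integral of the remapped field equals the
    # integral of the source field, i.e. tgt_areas^T R == src_areas^T.
    return bool(np.allclose(tgt_areas @ R, src_areas, atol=tol))

def violates_monotonicity(R, f_src):
    # Monotone remap: every target value stays within the source bounds.
    f_tgt = R @ f_src
    return bool(f_tgt.min() < f_src.min() or f_tgt.max() > f_src.max())

# A positive averaging operator is monotone; an extrapolating one is not.
R_good = np.array([[0.5, 0.5], [0.25, 0.75]])
R_bad = np.array([[1.2, -0.2], [0.0, 1.0]])
```

High-order accuracy generally requires negative weights somewhere in R, which is exactly why unfiltered L2-minimizing schemes can violate the monotonicity check.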

More Details

Dynamics Informed Optimization for Resilient Energy Systems

Arguello, Bryan A.; Stewart, Nathan; Hoffman, Matthew J.; Nicholson, Bethany L.; Garrett, Richard A.; Moog, Emily R.

Optimal mitigation planning for highly disruptive contingencies to a transmission-level power system requires optimization with dynamic power system constraints, due to the key role of dynamics in system stability under major perturbations. We formulate a generalized disjunctive program to determine optimal grid component hardening choices for protecting against major failures, with differential algebraic constraints representing system dynamics (specifically, differential equations representing generator and load behavior and algebraic equations representing instantaneous power balance over the transmission system). We optionally allow stochastic optimal pre-positioning across all considered failure scenarios, and optimal emergency control within each scenario. This novel formulation allows, for the first time, analysis of the resilience interdependencies of mitigation planning, preventive control, and emergency control. Using all three strategies in concert is particularly effective at maintaining robust power system operation under severe contingencies, as we demonstrate on the Western System Coordinating Council (WSCC) 9-bus test system using synthetic multi-device outage scenarios. Towards integrating our modeling framework with real threats and more realistic power systems, we explore applying hybrid dynamics to power systems; we apply this approach to basic RL circuits, with the ultimate goal of using the methodology to model protective tripping schemes in the grid. Finally, we survey mitigation techniques for high-altitude electromagnetic pulse (HEMP) threats and describe a GIS application developed to create threat scenarios in a grid with geographic detail.

More Details

An introduction to developing GitLab/Jacamar runner analyst centric workflows at Sandia

Robinson, Allen C.; Swan, Matthew S.; Harvey, Evan C.; Klein, Brandon T.; Lawson, Gary L.; Milewicz, Reed M.; Pedretti, Kevin P.; Schmitz, Mark E.; Warnock, Scott A.

This document provides basic background information and initial enabling guidance for computational analysts who wish to develop and utilize GitOps practices, via GitLab/Jacamar-runner-based workflows, within the Common Engineering Environment (CEE) and High Performance Computing (HPC) environments at Sandia National Laboratories.

More Details

Deployment of Multifidelity Uncertainty Quantification for Thermal Battery Assessment Part I: Algorithms and Single Cell Results

Eldred, Michael S.; Adams, Brian M.; Geraci, Gianluca G.; Portone, Teresa P.; Ridgway, Elliott M.; Stephens, John A.; Wildey, Timothy M.

This report documents the results of an FY22 ASC V&V level 2 milestone demonstrating new algorithms for multifidelity uncertainty quantification. Part I of the report describes the algorithms, studies their performance on a simple model problem, and then deploys the methods to a thermal battery example from the open literature. Part II (restricted distribution) applies the multifidelity UQ methods to specific thermal batteries of interest to the NNSA/ASC program.

More Details

Neuromorphic Information Processing by Optical Media

Leonard, Francois L.; Fuller, Elliot J.; Teeter, Corinne M.; Vineyard, Craig M.

Classification of features in a scene typically requires conversion of the incoming photonic field into the electronic domain. Recently, an alternative approach has emerged whereby passive structured materials can perform classification tasks by directly using free-space propagation and diffraction of light. In this manuscript, we present a theoretical and computational study of such systems and establish the basic features that govern their performance. We show that system architecture, material structure, and input light field are intertwined and need to be co-designed to maximize classification accuracy. Our simulations show that a single-layer metasurface can achieve classification accuracy better than conventional linear classifiers, with an order of magnitude fewer diffractive features than previously reported. For a wavelength λ, single-layer metasurfaces of size 100λ × 100λ with aperture density λ⁻² achieve ~96% testing accuracy on the MNIST dataset, for an optimized distance ~100λ to the output plane. This is enabled by an intrinsic nonlinearity in photodetection, despite the use of linear optical metamaterials. Furthermore, we find that once the system is optimized, the number of diffractive features is the main determinant of classification performance. The slow asymptotic scaling with the number of apertures suggests a reason why such systems may benefit from multiple-layer designs. Finally, we show a trade-off between the number of apertures and fabrication noise.
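The intrinsic photodetection nonlinearity mentioned above can be sketched in a few lines: propagation through a linear optical system followed by intensity detection is a nonlinear map, even though the optics itself is linear (matrix and sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical linear transfer matrix of a diffractive system (complex field).
T = rng.normal(size=(10, 4)) + 1j * rng.normal(size=(10, 4))

def detect(x):
    # Photodetectors measure intensity |field|^2, not the field amplitude.
    return np.abs(T @ x) ** 2

x = rng.normal(size=4)
# Doubling the input field quadruples the detected signal rather than
# doubling it, so the end-to-end classifier is nonlinear.
```

This quadratic readout is what lets a purely linear metamaterial outperform conventional linear classifiers.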

More Details

Sensitivity Analyses for Monte Carlo Sampling-Based Particle Simulations

Bond, Stephen D.; Franke, Brian C.; Lehoucq, Richard B.; McKinley, Scott M.

Computational design-based optimization is a well-used tool in science and engineering. Our report documents the successful use of a particle sensitivity analysis for design-based optimization within Monte Carlo sampling-based particle simulation, a capability that was previously unavailable. Such a capability enables the particle simulation communities to go beyond forward simulation, and promises to reduce the burden on overworked analysts by getting more done with less computation.

More Details

Quantum-Accurate Multiscale Modeling of Shock Hugoniots, Ramp Compression Paths, Structural and Magnetic Phase Transitions, and Transport Properties in Highly Compressed Metals

Wood, Mitchell A.; Nikolov, Svetoslav V.; Rohskopf, Andrew D.; Desjarlais, Michael P.; Cangi, Attila C.; Tranchida, Julien T.

Fully characterizing high energy density (HED) phenomena using pulsed power facilities (Z machine) and coherent light sources is possible only with complementary numerical modeling for design, diagnostic development, and data interpretation. The exercise of creating numerical tests that match experimental conditions builds critical insight that is crucial for the development of a strong fundamental understanding of the physics behind HED phenomena and for the design of next generation pulsed power facilities. The persistence of electron correlation in HED materials, arising from Coulomb interactions and the Pauli exclusion principle, is one of the greatest challenges for accurate numerical modeling and has hitherto impeded our ability to model HED phenomena across multiple length and time scales at sufficient accuracy. An exemplar is a ferromagnetic material like iron: while familiar and widely used, we lack a simulation capability to characterize the interplay of structure and magnetic effects that govern material strength, kinetics of phase transitions, and other transport properties. Herein we construct and demonstrate a molecular-spin dynamics (MSD) simulation capability for iron from ambient to Earth-core conditions; all software advances are open source and presently available for broad usage. These methods are multi-scale in nature; direct comparisons between high fidelity density functional theory (DFT) and linear-scaling MSD simulations are made throughout this work, with advancements made to MSD allowing electronic structure changes to be reflected in the classical dynamics. Main takeaways of the project include insight into the role of magnetic spins in mechanical properties and thermal conductivity, development of accurate interatomic potentials paired with spin Hamiltonians, and characterization of the high pressure melt boundary, which is of critical importance to planetary modeling efforts.

More Details

Multi-fidelity information fusion and resource allocation

Jakeman, John D.; Eldred, Michael S.; Geraci, Gianluca G.; Seidl, Daniel T.; Smith, Thomas M.; Gorodetsky, Alex A.; Pham, Trung P.; Narayan, Akil N.; Zeng, Xiaoshu Z.; Ghanem, Roger G.

This project created and demonstrated a framework for the efficient and accurate prediction of complex systems with only a limited amount of highly trusted data. These next generation computational multi-fidelity tools fuse multiple information sources of varying cost and accuracy to reduce the computational and experimental resources needed for designing and assessing complex multi-physics/scale/component systems. These tools have already been used to substantially improve the computational efficiency of simulation aided modeling activities from assessing thermal battery performance to predicting material deformation. This report summarizes the work carried out during a two year LDRD project. Specifically we present our technical accomplishments; project outputs such as publications, presentations and professional leadership activities; and the project’s legacy.

More Details

Model-Form Epistemic Uncertainty Quantification for Modeling with Differential Equations: Application to Epidemiology

Acquesta, Erin A.; Portone, Teresa P.; Dandekar, Raj D.; Rackauckas, Chris R.; Bandy, Rileigh J.; Huerta, Jose G.; Dytzel, India L.

Modeling real-world phenomena to any degree of accuracy is a challenge that the scientific research community has navigated since its foundation. Lack of information and limited computational and observational resources necessitate modeling assumptions which, when invalid, lead to model-form error (MFE). The work reported herein explored a novel method to represent model-form uncertainty (MFU) that combines Bayesian statistics with the emerging field of universal differential equations (UDEs). The fundamental principle behind UDEs is simple: use known equational forms that govern a dynamical system when you have them; then incorporate data-driven approaches – in this case neural networks (NNs) – embedded within the governing equations to learn the interacting terms that were underrepresented. Utilizing epidemiology as our motivating exemplar, this report will highlight the challenges of modeling novel infectious diseases while introducing ways to incorporate NN approximations to MFE. Prior to embarking on a Bayesian calibration, we first explored methods to augment the standard (non-Bayesian) UDE training procedure to account for uncertainty and increase robustness of training. In addition, it is often the case that uncertainty in observations is significant; this may be due to randomness or lack of precision in the measurement process. This uncertainty typically manifests as “noisy” observations which deviate from a true underlying signal. To account for such variability, the NN approximation to MFE is endowed with a probabilistic representation and is updated using available observational data in a Bayesian framework. By representing the MFU explicitly and deploying an embedded, data-driven model, this approach enables an agile, expressive, and interpretable method for representing MFU. 
In this report we will provide evidence that Bayesian UDEs show promise as a novel framework for any science-based, data-driven MFU representation; while emphasizing that significant advances must be made in the calibration of Bayesian NNs to ensure a robust calibration procedure.
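A structural sketch of a UDE in the epidemiological setting: known SIR mechanics plus a neural-network correction term standing in for model-form error (untrained random weights here; the report learns such terms from data and calibrates them in a Bayesian framework):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = 0.01 * rng.normal(size=(8, 2)), np.zeros(8)
W2, b2 = 0.01 * rng.normal(size=(1, 8)), np.zeros(1)

def nn(x):
    # Small neural network standing in for the unknown model-form error term
    # (weights are random placeholders, not trained values).
    return (W2 @ np.tanh(W1 @ x + b1) + b2)[0]

def rhs(state, beta=0.3, gamma=0.1):
    S, I, R = state
    correction = nn(np.array([S, I]))      # data-driven term inside the ODE
    dS = -beta * S * I + correction
    dI = beta * S * I - gamma * I - correction
    dR = gamma * I
    return np.array([dS, dI, dR])

# Forward-Euler integration of the UDE; total population is conserved exactly
# because the correction cancels between the S and I equations.
state = np.array([0.99, 0.01, 0.0])
for _ in range(1000):
    state = state + 0.01 * rhs(state)
```

Embedding the network inside the governing equations, rather than replacing them, is what keeps the known mechanistic structure while letting data fill in the misspecified interaction terms.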

More Details

Accelerating Multiscale Materials Modeling with Machine Learning

Modine, N.A.; Stephens, John A.; Swiler, Laura P.; Thompson, Aidan P.; Vogel, Dayton J.; Cangi, Attila C.; Feilder, Lenz F.; Rajamanickam, Sivasankaran R.

The focus of this project is to accelerate and transform the workflow of multiscale materials modeling by developing an integrated toolchain seamlessly combining DFT, SNAP, LAMMPS (shown in Figure 1-1), and a machine-learning (ML) model that more efficiently extracts information from a smaller set of first-principles calculations. Our ML model enables us to accelerate first-principles data generation by interpolating existing high fidelity data, and to extend the simulation scale by extrapolating high fidelity data (10² atoms) to the mesoscale (10⁴ atoms). It encodes the underlying physics of atomic interactions on the microscopic scale by adapting a variety of ML techniques, such as deep neural networks (DNNs) and graph neural networks (GNNs). We developed a new surrogate model for density functional theory using deep neural networks. The developed ML surrogate is demonstrated in a workflow to generate accurate band energies, total energies, and density of the 298 K and 933 K aluminum systems. Furthermore, the models can be used to predict the quantities of interest for systems with more atoms than the training data set. We have demonstrated that the ML model can be used to compute the quantities of interest for systems with 100,000 Al atoms. When compared with the 2,000-atom Al system, the new surrogate model is as accurate as DFT but three orders of magnitude faster. We also explored optimal experimental design techniques to choose the training data, and novel graph neural networks to train on smaller data sets. These are promising methods that should be explored in the future.

More Details

Differential geometric approaches to momentum-based formulations for fluids [Slides]

Eldred, Christopher

This SAND report documents CIS Late Start LDRD Project 22-0311, "Differential geometric approaches to momentum-based formulations for fluids". The project primarily developed geometric mechanics formulations for momentum-based descriptions of nonrelativistic fluids, utilizing a differential geometry/exterior calculus treatment of momentum and a space+time splitting. Specifically, the full suite of geometric mechanics formulations (variational/Lagrangian, Lie-Poisson Hamiltonian, and Curl-Form Hamiltonian) was developed in terms of exterior calculus using vector-bundle valued differential forms. This was done for a fairly general version of semi-direct product theory sufficient to cover a wide range of both neutral and charged fluid models, including compressible Euler, magnetohydrodynamics, and Euler-Maxwell. As a secondary goal, this project also explored the connection between geometric mechanics formulations and the more traditional Godunov form (a hyperbolic system of conservation laws). Unfortunately, this stage did not produce anything particularly interesting, due to unforeseen technical difficulties. There are two publications related to this work currently in preparation, and this work will be presented at SIAM CSE 23, at which the PI is organizing a mini-symposium on geometric mechanics formulations and structure-preserving discretizations for fluids. The logical next step is to utilize the exterior calculus based understanding of momentum, coupled with geometric mechanics formulations, to develop (novel) structure-preserving discretizations of momentum. This is the main subject of a successful FY23 CIS LDRD, "Structure-preserving discretizations for momentum-based formulations of fluids".

More Details

Towards Z-Next: The Integration of Theory, Experiments, and Computational Simulation in a Bayesian Data Assimilation Framework

Maupin, Kathryn A.; Tran, Anh; Lewis, William L.; Knapp, Patrick K.; Joseph, V.R.; Wu, C.F.J.; Glinsky, Michael G.; Valaitis, Sonata V.

Making reliable predictions in the presence of uncertainty is critical to high-consequence modeling and simulation activities, such as those encountered at Sandia National Laboratories. Surrogate or reduced-order models are often used to mitigate the expense of performing quality uncertainty analyses with high-fidelity, physics-based codes. However, phenomenological surrogate models do not always adhere to important physics and system properties. This project develops surrogate models that integrate physical theory with experimental data through a maximally-informative framework that accounts for the many uncertainties present in computational modeling problems. Correlations between relevant outputs are preserved through the use of multi-output or co-predictive surrogate models; known physical properties (specifically monotonicity) are also preserved; and unknown physics and phenomena are detected using a causal analysis. By endowing surrogate models with key properties of the physical system being studied, their predictive power is arguably enhanced, allowing for reliable simulations and analyses at a reduced computational cost.

More Details

Combining Physics and Machine Learning for the Next Generation of Molecular Simulation

Rackers, Joshua R.

Simulating molecules and atomic systems at quantum accuracy is a grand challenge for science in the 21st century. Quantum-accurate simulations would enable the design of new medicines and the discovery of new materials. The defining problem in this challenge is that quantum calculations on large molecules, like proteins or DNA, are fundamentally impossible with current algorithms. In this work, we explore a range of different methods that aim to make large, quantum-accurate simulations possible. We show that using advanced classical models, we can accurately simulate ion channels, an important biomolecular system. We show how advanced classical models can be implemented in an exascale-ready software package. Lastly, we show how machine learning can learn the laws of quantum mechanics from data and enable quantum electronic structure calculations on thousands of atoms, a feat that is impossible for current algorithms. Altogether, this work shows that by combining advances in physics models, computing, and machine learning, we are moving closer to the reality of accurately simulating our molecular world.

More Details

AI-enhanced Codesign for Next-Generation Neuromorphic Circuits and Systems

Cardwell, Suma G.; Smith, John D.; Crowder, Douglas C.

This report details work that was completed to address the Fiscal Year 2022 Advanced Science and Technology (AS&T) Laboratory Directed Research and Development (LDRD) call for “AI-enhanced Co-Design of Next Generation Microelectronics.” This project required concurrent contributions from the fields of 1) materials science, 2) devices and circuits, 3) physics of computing, and 4) algorithms and system architectures. During this project, we developed AI-enhanced circuit design methods that relied on reinforcement learning and evolutionary algorithms. The AI-enhanced design methods were tested on neuromorphic circuit design problems that have real-world applications related to Sandia’s mission needs. The developed methods enable the design of circuits, including circuits that are built from emerging devices, and they were also extended to enable novel device discovery. We expect that these AI-enhanced design methods will accelerate progress towards developing next-generation, high-performance neuromorphic computing systems.

More Details

Using ultrasonic attenuation in cortical bone to infer distributions on pore size

Applied Mathematical Modelling

White, Rebekah D.; Alexanderian, A.; Yousefian, O.; Karbalaeisadegh, Y.; Bekele-Maxwell, K.; Kasali, A.; Banks, H.T.; Talmant, M.; Grimal, Q.; Muller, M.

In this work we infer the underlying distribution on pore radius in human cortical bone samples using ultrasonic attenuation data. We first discuss how to formulate polydisperse attenuation models using a probabilistic approach and the Waterman–Truell model for scattering attenuation. We then compare the forward predictions for total attenuation in polydisperse samples from the Independent Scattering Approximation and the higher-order Waterman–Truell models. Following this, we formulate an inverse problem under the Prohorov Metric Framework, coupled with variational regularization to stabilize the inverse problem. We then use experimental attenuation data taken from human cadaver samples and solve inverse problems resulting in nonparametric estimates of the probability density function on pore radius. We compare these estimates to the “true” microstructure of the bone samples determined via microCT imaging. We find that our methodology allows us to reliably estimate the underlying microstructure of the bone from attenuation data.

More Details

Progress in Modeling the 2019 Extended Magnetically Insulated Transmission Line (MITL) and Courtyard Environment Trial at HERMES-III

Cartwright, Keith C.; Pointon, Tim P.; Powell, Troy C.; Grabowski, Theodore C.; Shields, Sidney S.; Sirajuddin, David S.; Jensen, Daniel S.; Renk, Timothy J.; Cyr, Eric C.; Stafford, David S.; Swan, Matthew S.; Mitra, Sudeep M.; McDoniel, William M.; Moore, Christopher H.

This report documents the progress made in simulating the HERMES-III Magnetically Insulated Transmission Line (MITL) and courtyard with EMPIRE and ITS. This study focuses on shots taken during June and July of 2019 with the new MITL extension. Several of these shots included dose mapping of the courtyard: 11132, 11133, 11134, 11135, 11136, and 11146. This report focuses on these shots because they provided full data return from the MITL electrical diagnostics and the radiation dose sensors in the courtyard. The comparison starts with improving the processing of the incoming voltage into the EMPIRE simulation from the experiment. The currents are then compared at several locations along the MITL. The simulation results of the electrons impacting the anode are shown. The electron impact energies and angles are then handed off to ITS, which calculates the dose on the faceplate and at locations in the courtyard; these results are compared to experimental measurements. ITS also calculates the photons and electrons that are injected into the courtyard, and these quantities are then used by EMPIRE to calculate the photon and electron transport in the courtyard. The details of the algorithms used to perform the courtyard simulations are presented, as well as qualitative comparisons of the electric field, magnetic field, and conductivity in the courtyard. Because of the computational burden of these calculations, the pressure in the courtyard was reduced to lessen the computational load. The computational performance is presented, along with suggestions on how to improve both the computational and the algorithmic performance. Some of the algorithmic changes would reduce the accuracy of the models, and a detailed comparison of these changes is left for a future study. In addition to the list of code improvements, a list of suggested experimental improvements is given to increase the quality of the data return.

More Details

Resilience Enhancements through Deep Learning Yields

Eydenberg, Michael S.; Batsch-Smith, Lisa B.; Bice, Charles T.; Blakely, Logan; Bynum, Michael L.; Boukouvala, Fani B.; Castillo, Anya C.; Haddad, Joshua H.; Hart, William E.; Jalving, Jordan H.; Kilwein, Zachary A.; Laird, Carl D.; Skolfield, Joshua K.

This report documents the Resilience Enhancements through Deep Learning Yields (REDLY) project, a three-year effort to improve electrical grid resilience by developing scalable methods for system operators to protect the grid against threats leading to interrupted service or physical damage. The computational complexity and uncertain nature of current real-world contingency analysis present significant barriers to automated, real-time monitoring. While there has been a significant push to explore the use of accurate, high-performance machine learning (ML) model surrogates to address this gap, their reliability is unclear when deployed in high-consequence applications such as power grid systems. Contemporary optimization techniques used to validate surrogate performance can exploit ML model prediction errors, which necessitates the verification of worst-case performance for the models.

More Details

Improving Predictive Capability in REHEDS Simulations with Fast, Accurate, and Consistent Non-Equilibrium Material Properties

Hansen, Stephanie B.; Baczewski, Andrew D.; Gomez, T.A.; Hentschel, T.W.; Jennings, Christopher A.; Kononov, Alina K.; Nagayama, Taisuke N.; Adler, Kelsey A.; Cangi, A.C.; Cochrane, Kyle C.; Schleife, A. &.

Predictive design of REHEDS experiments with radiation-hydrodynamic simulations requires knowledge of material properties (e.g. equations of state (EOS), transport coefficients, and radiation physics). Interpreting experimental results requires accurate models of diagnostic observables (e.g. detailed emission, absorption, and scattering spectra). In conditions of Local Thermodynamic Equilibrium (LTE), these material properties and observables can be pre-computed with relatively high accuracy and subsequently tabulated on simple temperature-density grids for fast look-up by simulations. When radiation and electron temperatures fall out of equilibrium, however, non-LTE effects can profoundly change material properties and diagnostic signatures. Accurately and efficiently incorporating these non-LTE effects has been a longstanding challenge for simulations. At present, most simulations include non-LTE effects by invoking highly simplified inline models. These inline non-LTE models are both much slower than table look-up and significantly less accurate than the detailed models used to populate LTE tables and diagnose experimental data through post-processing or inversion. Because inline non-LTE models are slow, designers avoid them whenever possible, which leads to known inaccuracies from using tabular LTE. Because inline models are simple, they are inconsistent with tabular data from detailed models, leading to ill-known inaccuracies, and they cannot generate detailed synthetic diagnostics suitable for direct comparisons with experimental data. This project addresses the challenge of generating and utilizing efficient, accurate, and consistent non-equilibrium material data along three complementary but relatively independent research lines. 
First, we have developed a relatively fast and accurate non-LTE average-atom model based on density functional theory (DFT) that provides a complete set of EOS, transport, and radiative data, and have rigorously tested it against more sophisticated first-principles multi-atom DFT models, including time-dependent DFT. Next, we have developed a tabular scheme and interpolation methods that compactly capture non-LTE effects for use in simulations and have implemented these tables in the GORGON magneto-hydrodynamic (MHD) code. Finally, we have developed post-processing tools that use detailed tabulated non-LTE data to directly predict experimental observables from simulation output.

More Details

Adaptive Space-Time Methods for Large Scale Optimal Design

DiPietro, Kelsey L.; Ridzal, Denis R.; Morales, Diana M.

When modeling complex physical systems with advanced dynamics, such as shocks and singularities, many classic methods for solving partial differential equations can return inaccurate or unusable results. One way to resolve these complex dynamics is through r-adaptive refinement methods, in which a fixed number of mesh points are shifted to areas of high interest. The mesh refinement map can be found through the solution of the Monge-Ampère equation, a highly nonlinear partial differential equation. Due to its nonlinearity, the numerical solution of the Monge-Ampère equation is nontrivial and has previously required computationally expensive methods. In this report, we detail our novel optimization-based, multigrid-enabled solver for a low-order finite element approximation of the Monge-Ampère equation. This fast and scalable solver makes r-adaptive meshing more readily available for problems related to large-scale optimal design. Beyond mesh adaptivity, our report discusses additional applications where our fast solver for the Monge-Ampère equation could be easily applied.
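As a much-simplified illustration of r-adaptive meshing, the 1D analogue of the Monge-Ampère approach reduces to equidistributing a monitor function: mesh points are placed so each cell carries the same monitor mass. The sketch below is illustrative only; it does not use the report's optimization-based multigrid solver, and the monitor function and point counts are hypothetical.

```python
import numpy as np

# Monitor function: large where the solution has a sharp feature,
# so mesh points should cluster there (hypothetical example problem).
def monitor(x):
    return 1.0 + 50.0 * np.exp(-200.0 * (x - 0.5) ** 2)

n = 41                                  # number of mesh points
xi = np.linspace(0.0, 1.0, n)           # uniform computational coordinates
x_fine = np.linspace(0.0, 1.0, 2001)    # fine grid for quadrature
M = monitor(x_fine)

# Equidistribution: choose x(xi) so the integral of M is equal between
# consecutive mesh points (invert the normalized cumulative integral).
cumM = np.concatenate(([0.0], np.cumsum(0.5 * (M[1:] + M[:-1]) * np.diff(x_fine))))
cumM /= cumM[-1]
x_adapted = np.interp(xi, cumM, x_fine)

h = np.diff(x_adapted)                  # cell sizes of the adapted mesh
print(h.min(), h.max())                 # much smaller near the feature at x = 0.5
```

The same equidistribution principle, generalized to multiple dimensions, is what leads to the Monge-Ampère equation.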

More Details

Fluid-Kinetic Coupling: Advanced Discretizations for Simulations on Emerging Heterogeneous Architectures (LDRD FY20-0643)

Roberts, Nathan V.; Bond, Stephen D.; Miller, Sean A.; Cyr, Eric C.

Plasma physics simulations are vital for a host of Sandia mission concerns, for fundamental science, and for clean energy in the form of fusion power. Sandia's most mature plasma physics simulation capabilities come in the form of particle-in-cell (PIC) models and magnetohydrodynamics (MHD) models. MHD models for a plasma work well in denser plasma regimes, when there is enough material that the plasma approximates a fluid. PIC models, on the other hand, work well in lower-density regimes, in which the number of particles to simulate remains manageable; error in PIC scales as the inverse square root of the number of particles, making high-accuracy simulations expensive. Real-world applications, however, almost always involve a transition region between the high-density regimes where MHD is appropriate and the low-density regimes for PIC. In such a transition region, a direct discretization of Vlasov is appropriate. Such discretizations come with their own computational costs, however; the phase-space mesh for Vlasov can involve up to six dimensions (seven if time is included), and applying appropriate homogeneous boundary conditions in velocity space requires meshing a substantial padding region to ensure that the distribution remains sufficiently close to zero at the velocity boundaries. Moreover, for collisional plasmas, the right-hand side of the Vlasov equation is a collision operator, which is non-local in velocity space and which may dominate the cost of the Vlasov solver. The present LDRD project endeavors to develop modern, foundational tools for the development of continuum-kinetic Vlasov solvers, using the discontinuous Petrov-Galerkin (DPG) methodology for discretization of Vlasov, and machine-learning (ML) models to enable efficient evaluation of collision operators. DPG affords several key advantages.
First, it has a built-in, robust error indicator, allowing us to adapt the mesh in a very natural way, enabling a coarse velocity-space mesh near the homogeneous boundaries, and a fine mesh where the solution has fine features. Second, it is an inherently high-order, high-intensity method, requiring extra local computations to determine so-called optimal test functions, which makes it particularly suited to modern hardware in which floating-point throughput is increasing at a faster rate than memory bandwidth. Finally, DPG is a residual-minimizing method, which enables high-accuracy computation: in typical cases, the method delivers something very close to the $L^2$ projection of the exact solution. Meanwhile, the ML-based collision model we adopt affords a cost structure that scales as the square root of a standard direct evaluation. Moreover, we design our model to conserve mass, momentum, and energy by construction, and our approach to training is highly flexible, in that it can incorporate not only synthetic data from direct-simulation Monte Carlo (DSMC) codes, but also experimental data. We have developed two DPG formulations for Vlasov-Poisson: a time-marching, backward-Euler discretization and a space-time discretization. We have conducted a number of numerical experiments to verify the approach in a 1D1V setting. In this report, we detail these formulations and experiments. We also summarize some new theoretical results developed as part of this project (published as papers previously): some new analysis of DPG for the convection-reaction problem (of which the Vlasov equation is an instance), a new exponential integrator for DPG, and some numerical exploration of various DPG-based time-marching approaches to the heat equation. As part of this work, we have contributed extensively to the Camellia open-source library; we also describe the new capabilities and their usage. 
We have also developed a well-documented methodology for single-species collision operators, which we applied to argon and demonstrated with numerical experiments. We summarize those results here, as well as describing at a high level a design extending the methodology to multi-species operators. We have released a new open-source library, MLC, under a BSD license; we include a summary of its capabilities as well.
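The PIC cost driver mentioned above, accuracy improving only with the square root of the particle count, is just the standard Monte Carlo sampling rate. A minimal sketch (not a PIC code) illustrating that each tenfold increase in samples cuts the error by only about sqrt(10):

```python
import numpy as np

# Monte Carlo sampling error decays like 1/sqrt(N): the reason PIC needs
# many particles for high accuracy (illustrative sketch, not a PIC code).
rng = np.random.default_rng(2)
errors = []
for n in (10**3, 10**4, 10**5, 10**6):
    # Estimate the (zero) mean of a standard normal with n samples,
    # averaged over 20 independent trials.
    trials = [abs(rng.standard_normal(n).mean()) for _ in range(20)]
    errors.append(float(np.mean(trials)))
ratios = [errors[i] / errors[i + 1] for i in range(3)]
print(ratios)  # each 10x in samples cuts error by roughly sqrt(10) ~ 3.2x
```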

More Details

Modeling Analog Tile-Based Accelerators Using SST

Feinberg, Benjamin F.; Agarwal, Sapan A.; Plagge, Mark P.; Rothganger, Fredrick R.; Cardwell, Suma G.; Hughes, Clayton H.

Analog computing has been widely proposed to improve the energy efficiency of multiple important workloads, including neural network operations and other linear algebra kernels. To properly evaluate analog computing and explore more complex workloads, such as systems consisting of multiple analog data paths, system-level simulations are required. Moreover, prior work on system architectures for analog computing often relies on custom simulators, creating significant additional design effort and complicating comparisons between different systems. To remedy these issues, this report describes the design and implementation of a flexible tile-based analog accelerator element for the Structural Simulation Toolkit (SST). The element focuses heavily on the tile controller—an often neglected aspect of prior work—that is sufficiently versatile to simulate a wide range of different tile operations, including neural network layers, signal processing kernels, and generic linear algebra operations, without major constraints. The tile model also interoperates with existing SST memory and network models to reduce the overall development load and enable future simulation of heterogeneous systems with both conventional digital logic and analog compute tiles. Finally, both the tile and array models are designed to easily support future extensions as new analog operations and applications that can benefit from analog computing are developed.

More Details

Global Sensitivity Analysis Using the Ultra‐Low Resolution Energy Exascale Earth System Model

Journal of Advances in Modeling Earth Systems

Kalashnikova, Irina; Peterson, Kara J.; Powell, Amy J.; Jakeman, John D.; Roesler, Erika L.

For decades, Arctic temperatures have increased twice as fast as average global temperatures. As a first step towards quantifying parametric uncertainty in Arctic climate, we performed a variance-based global sensitivity analysis (GSA) using a fully-coupled, ultra-low resolution (ULR) configuration of version 1 of the U.S. Department of Energy’s Energy Exascale Earth System Model (E3SMv1). Specifically, we quantified the sensitivity of six quantities of interest (QOIs), which characterize changes in Arctic climate over a 75 year period, to uncertainties in nine model parameters spanning the sea ice, atmosphere and ocean components of E3SMv1. Sensitivity indices for each QOI were computed with a Gaussian process emulator using 139 random realizations of the random parameters and fixed pre-industrial forcing. Uncertainties in the atmospheric parameters in the CLUBB (Cloud Layers Unified by Binormals) scheme were found to have the most impact on sea ice status and the larger Arctic climate. Our results demonstrate the importance of conducting sensitivity analyses with fully coupled climate models. The ULR configuration makes such studies computationally feasible today due to its low computational cost. When advances in computational power and modeling algorithms enable the tractable use of higher-resolution models, our results will provide a baseline that can quantify the impact of model resolution on the accuracy of sensitivity indices. Moreover, the confidence intervals provided by our study, which we used to quantify the impact of the number of model evaluations on the accuracy of sensitivity estimates, have the potential to inform the computational resources needed for future sensitivity studies.
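A minimal sketch of variance-based sensitivity analysis on a cheap toy model (not E3SMv1, and using direct Monte Carlo pick-freeze estimation rather than the Gaussian process emulator the study relies on; the model and parameter ranges below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: one QOI driven by three uniform parameters; x0 dominates.
def model(x):
    return 4.0 * x[:, 0] + 0.5 * x[:, 1] + 0.1 * x[:, 2] ** 2

n, d = 20000, 3
A = rng.uniform(0.0, 1.0, (n, d))
B = rng.uniform(0.0, 1.0, (n, d))
fA, fB = model(A), model(B)
var = np.var(np.concatenate([fA, fB]))

# First-order Sobol index per parameter via the pick-freeze estimator:
# keep column i from A, take the remaining columns from B.
S = []
for i in range(d):
    ABi = B.copy()
    ABi[:, i] = A[:, i]
    fABi = model(ABi)
    S.append(float(np.mean(fA * (fABi - fB)) / var))

print(S)  # x0 should carry most of the variance
```

For an additive model like this one, the first-order indices sum to roughly one; interactions would show up as a gap between that sum and one.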

More Details

Neural-network based collision operators for the Boltzmann equation

Journal of Computational Physics

Roberts, Nathan V.; Bond, Stephen D.; Cyr, Eric C.; Miller, Sean T.

Kinetic gas dynamics in rarefied and moderate-density regimes have complex behavior associated with collisional processes. These processes are generally defined by convolution integrals over a high-dimensional space (as in the Boltzmann operator), or require evaluating complex auxiliary variables (as in Rosenbluth potentials in Fokker-Planck operators) that are challenging to implement and computationally expensive to evaluate. In this work, we develop a data-driven neural network model that augments a simple and inexpensive BGK collision operator with a machine-learned correction term, which improves the fidelity of the simple operator with a small overhead to overall runtime. The composite collision operator has a tunable fidelity and, in this work, is trained using and tested against a direct-simulation Monte-Carlo (DSMC) collision operator.
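The BGK operator that the learned correction augments relaxes the distribution toward a local Maxwellian while conserving mass, momentum, and energy. A minimal 0D-in-space sketch on a 1D velocity grid, with the machine-learned correction term omitted and all parameters hypothetical:

```python
import numpy as np

# 0D-in-space BGK relaxation on a 1D velocity grid (illustrative only;
# the paper augments BGK with a machine-learned correction, omitted here).
v = np.linspace(-20.0, 20.0, 1001)
dv = v[1] - v[0]

# Initial non-equilibrium distribution: two drifting beams.
f = np.exp(-(v - 2.0) ** 2) + 0.5 * np.exp(-(v + 3.0) ** 2)

def moments(f):
    n = np.sum(f) * dv                      # number density
    u = np.sum(f * v) * dv / n              # bulk velocity
    T = np.sum(f * (v - u) ** 2) * dv / n   # temperature (kB = m = 1)
    return n, u, T

def maxwellian(n, u, T):
    return n / np.sqrt(2.0 * np.pi * T) * np.exp(-((v - u) ** 2) / (2.0 * T))

# df/dt = (M[f] - f) / tau, explicit Euler for 20 relaxation times.
tau, dt = 1.0, 0.01
n0, u0, T0 = moments(f)
for _ in range(2000):
    f += (dt / tau) * (maxwellian(*moments(f)) - f)

n1, u1, T1 = moments(f)
print(abs(n1 - n0), abs(u1 - u0), abs(T1 - T0))  # conserved to round-off
```

Because the target Maxwellian is rebuilt from the current moments at every step, the collisional invariants are preserved by construction; the composite operator in the paper enforces the same property for the learned correction.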

More Details

Numerical simulation of a relativistic magnetron using a fluid electron model

Physics of Plasmas

Roberds, Nicholas R.; Cartwright, Keith C.; Sandoval, Andrew J.; Beckwith, Kristian B.; Cyr, Eric C.; Glines, Forrest W.

An approach to numerically modeling relativistic magnetrons, in which the electrons are represented with a relativistic fluid, is described. A principal effect in the operation of a magnetron is space-charge-limited (SCL) emission of electrons from the cathode. We have developed an approximate SCL emission boundary condition for the fluid electron model. This boundary condition prescribes the flux of electrons as a function of the normal component of the electric field on the boundary. We show the results of a benchmarking activity that applies the fluid SCL boundary condition to the one-dimensional Child–Langmuir diode problem and a canonical two-dimensional diode problem. Simulation results for a two-dimensional A6 magnetron are then presented. Computed bunching of the electron cloud occurs and coincides with significant microwave power generation. Numerical convergence of the solution is considered. Sharp gradients in the solution quantities at the diocotron resonance, spanning an interval of three to four grid cells in the most well-resolved case, are present and likely affect convergence.
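For reference, the one-dimensional Child–Langmuir law behind the benchmark diode problem gives the space-charge-limited current density of a planar, non-relativistic vacuum diode. The voltage and gap below are hypothetical, not values from the A6 magnetron study:

```python
import math

# Physical constants (SI)
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
QE   = 1.602176634e-19    # elementary charge, C
ME   = 9.1093837015e-31   # electron mass, kg

def child_langmuir_j(voltage, gap):
    """Space-charge-limited current density (A/m^2) for a planar,
    non-relativistic vacuum diode: J = (4*eps0/9) sqrt(2e/m) V^(3/2) / d^2."""
    return (4.0 * EPS0 / 9.0) * math.sqrt(2.0 * QE / ME) * voltage**1.5 / gap**2

# Hypothetical diode: 500 kV across a 1 cm anode-cathode gap.
J = child_langmuir_j(5.0e5, 0.01)
print(f"{J:.3e} A/m^2")  # on the order of 8e6 A/m^2
```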

More Details

Combining DPG in space with DPG time-marching scheme for the transient advection–reaction equation

Computer Methods in Applied Mechanics and Engineering

Roberts, Nathan V.; Muñoz-Matute, Judit M.; Demkowicz, Leszek D.

In this article, we present a general methodology to combine the Discontinuous Petrov–Galerkin (DPG) method in space and time in the context of methods of lines for transient advection–reaction problems. We first introduce a semidiscretization in space with a DPG method, redefining the ideas of optimal testing and practicality of the method in this context. Then, we apply the recently developed DPG-based time-marching scheme, which is of exponential type, to the resulting system of Ordinary Differential Equations (ODEs). Further, we discuss how to efficiently compute the action of the exponential of the matrix coming from the space semidiscretization without assembling the full matrix. Finally, we verify the proposed method for 1D+time advection–reaction problems, showing optimal convergence rates for smooth solutions and more stable results for linear conservation laws compared to classical exponential integrators.

More Details

Uncertainty and Sensitivity Analysis Methods and Applications in the GDSA Framework (FY2022)

Swiler, Laura P.; Basurto, Eduardo B.; Brooks, Dusty M.; Eckert, Aubrey C.; Leone, Rosemary C.; Mariner, Paul M.; Portone, Teresa P.; Smith, Mariah L.

The Spent Fuel and Waste Science and Technology (SFWST) Campaign of the U.S. Department of Energy (DOE) Office of Nuclear Energy (NE), Office of Fuel Cycle Technology (FCT) is conducting research and development (R&D) on geologic disposal of spent nuclear fuel (SNF) and high-level nuclear waste (HLW). Two high priorities for SFWST disposal R&D are design concept development and disposal system modeling. These priorities are directly addressed in the SFWST Geologic Disposal Safety Assessment (GDSA) control account, which is charged with developing a geologic repository system modeling and analysis capability, and the associated software, GDSA Framework, for evaluating disposal system performance for nuclear waste in geologic media. GDSA Framework is supported by the SFWST Campaign and its predecessor, the Used Fuel Disposition (UFD) campaign.

More Details

Islet: interpolation semi-Lagrangian element-based transport

Geoscientific Model Development (Online)

Bradley, Andrew M.; Bosler, Peter A.; Guba, Oksana G.

Advection of trace species, or tracers, also called tracer transport, in models of the atmosphere and other physical domains is an important and potentially computationally expensive part of a model's dynamical core. Semi-Lagrangian (SL) advection methods are efficient because they permit a time step much larger than the advective stability limit for explicit Eulerian methods, without requiring the solution of a globally coupled system of equations as implicit Eulerian methods do. Thus, to reduce the computational expense of tracer transport, dynamical cores often use SL methods to advect tracers. The class of interpolation semi-Lagrangian (ISL) methods contains potentially extremely efficient SL methods. We describe a finite-element ISL transport method that we call the interpolation semi-Lagrangian element-based transport (Islet) method, for use with, for example, atmosphere models discretized using the spectral element method. The Islet method uses three grids that share an element grid: a dynamics grid supporting, for example, the Gauss–Legendre–Lobatto basis of degree three; a physics parameterizations grid with a configurable number of finite-volume subcells per element; and a tracer grid supporting the use of Islet bases, with the particular basis again configurable. This method provides extremely accurate tracer transport and excellent diagnostic values in a number of verification problems.
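The basic ISL idea, tracing grid points back along characteristics and interpolating the old field at the departure points, can be sketched in 1D. This toy uses linear interpolation on a periodic grid (the Islet method's finite-element bases and three-grid structure are not reproduced) and remains stable at a CFL number well above the explicit Eulerian limit:

```python
import numpy as np

# Periodic 1D advection u_t + a u_x = 0 via interpolation semi-Lagrangian
# stepping: trace each grid point back along its characteristic and
# interpolate the old field at the departure point. Stable for CFL > 1.
n = 200
x = np.linspace(0.0, 1.0, n, endpoint=False)
dx = 1.0 / n
a, dt = 1.0, 3.5 * dx                  # CFL = 3.5, beyond the Eulerian limit
u = np.exp(-200.0 * (x - 0.3) ** 2)    # initial Gaussian bump
mass0 = u.sum()

steps = 100
for _ in range(steps):
    xd = (x - a * dt) % 1.0            # departure points, periodic wrap
    s = xd / dx
    j = np.floor(s).astype(int)
    w = s - j                          # linear interpolation weights
    u = (1.0 - w) * u[j % n] + w * u[(j + 1) % n]

# The bump should have translated by a*steps*dt (mod 1), slightly smeared
# by the linear interpolation.
shift = (a * steps * dt) % 1.0
peak = x[np.argmax(u)]
print(peak, (0.3 + shift) % 1.0)
```

With constant-coefficient advection the interpolation weights are uniform, so this toy conserves total tracer mass to round-off; higher-order bases, as in Islet, reduce the smearing.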

More Details

Accurate Compression of Tabulated Chemistry Models with Partition of Unity Networks

Combustion Science and Technology

Armstrong, Elizabeth A.; Hansen, Michael A.; Knaus, Robert C.; Trask, Nathaniel A.; Hewson, John C.; Sutherland, James C.

Tabulated chemistry models are widely used to simulate large-scale turbulent fires in applications including energy generation and fire safety. Tabulation via piecewise Cartesian interpolation suffers from the curse-of-dimensionality, leading to a prohibitive exponential growth in parameters and memory usage as more dimensions are considered. Artificial neural networks (ANNs) have attracted attention for constructing surrogates for chemistry models due to their ability to perform high-dimensional approximation. However, due to well-known pathologies regarding the realization of suboptimal local minima during training, in practice they do not converge and provide unreliable accuracy. Partition of unity networks (POUnets) are a recently introduced family of ANNs which preserve notions of convergence while performing high-dimensional approximation, discovering a mesh-free partition of space which may be used to perform optimal polynomial approximation. In this work, we assess their performance with respect to accuracy and model complexity in reconstructing unstructured flamelet data representative of nonadiabatic pool fire models. Our results show that POUnets can provide the desirable accuracy of classical spline-based interpolants with the low memory footprint of traditional ANNs while converging faster to significantly lower errors than ANNs. For example, we observe POUnets obtaining target accuracies in two dimensions with 40 to 50 times less memory and roughly double the compression in three dimensions. We also address the practical matter of efficiently training accurate POUnets by studying convergence over key hyperparameters, the impact of partition/basis formulation, and the sensitivity to initialization.
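A classical partition-of-unity approximation with fixed partitions conveys the core idea: normalized bump functions partition the domain, and a cheap local polynomial fit in each partition is blended into a global approximant. POUnets instead learn the partitions during training, which this sketch (with hypothetical partition widths and target function) does not attempt:

```python
import numpy as np

# Classical partition-of-unity approximation with *fixed* Gaussian
# partitions and local linear fits (POUnets learn the partitions instead;
# the target function and widths here are hypothetical).
x = np.linspace(0.0, 1.0, 400)
f = np.sin(2.0 * np.pi * x) + 0.3 * x          # smooth 1D "table" to compress

centers = np.linspace(0.0, 1.0, 12)
g = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * 0.04 ** 2))
phi = g / g.sum(axis=1, keepdims=True)          # partitions sum to one

# Weighted linear least squares in each partition, blended by phi.
A = np.column_stack([np.ones_like(x), x])
approx = np.zeros_like(x)
for k in range(len(centers)):
    sw = np.sqrt(phi[:, k])
    coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * f, rcond=None)
    approx += phi[:, k] * (A @ coef)

err = np.max(np.abs(approx - f))
print(err)  # small residual from the local linear bias
```

The memory footprint is just the partition parameters plus two polynomial coefficients per partition, which is the compression property the paper exploits in higher dimensions.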

More Details

PyApprox: Enabling efficient model analysis

Jakeman, John D.

PyApprox is a Python-based one-stop shop for probabilistic analysis of scientific numerical models. Easy-to-use and extendable tools are provided for constructing surrogates, sensitivity analysis, Bayesian inference, experimental design, and forward uncertainty quantification. The algorithms implemented represent the most popular methods for model analysis developed over the past two decades, including recent advances in multi-fidelity approaches that use multiple model discretizations and/or simplified physics to significantly reduce the computational cost of various types of analyses. Simple interfaces are provided for the most commonly used algorithms to limit a user's need to tune the various hyper-parameters of each algorithm. However, more advanced workflows that require customization of hyper-parameters are also supported. An extensive set of benchmarks from the literature is also provided to facilitate the easy comparison of different algorithms for a wide range of model analyses. This paper introduces PyApprox and its various features, and presents results demonstrating the utility of PyApprox on a benchmark problem modeling the advection of a tracer in ground water.

More Details

Toward efficient polynomial preconditioning for GMRES

Numerical Linear Algebra with Applications

Loe, Jennifer A.; Morgan, Ronald B.

We present a polynomial preconditioner for solving large systems of linear equations. The polynomial is derived from the minimum residual polynomial (the GMRES polynomial) and is more straightforward to compute and implement than many previous polynomial preconditioners. Our current implementation of this polynomial using its roots is naturally more stable than previous methods of computing the same polynomial. We implement further stability control using added roots, and this allows for high degree polynomials. We discuss the effectiveness and challenges of root-adding and give an additional check for stability. In this article, we study the polynomial preconditioner applied to GMRES; however, it could be used with any Krylov solver. This polynomial preconditioning algorithm can dramatically improve convergence for some problems, especially for difficult problems, and can reduce dot products by an even greater margin.
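Applying such a preconditioner via the roots of the residual polynomial can be sketched as follows. The diagonal test matrix is illustrative, and its eigenvalues are used as the roots so that the polynomial exactly inverts A; in practice the roots would come from an initial GMRES cycle (e.g. as harmonic Ritz values), with the root ordering and added-roots stabilization described above:

```python
import numpy as np

def apply_poly_prec(matvec, roots, v):
    """Apply p(A)v, where p is defined by A p(A) = I - pi(A) and
    pi(z) = prod_i (1 - z / theta_i) is the residual polynomial with
    roots theta_i, using a telescoping product form."""
    y = np.zeros_like(v)
    prod = v.copy()
    for theta in roots:
        y += prod / theta
        prod -= matvec(prod) / theta
    return y

# Diagonal SPD test matrix; with its eigenvalues as the roots, pi(A) = 0,
# so p(A) = A^{-1} exactly and the check below holds to rounding error.
d = np.array([1.0, 2.0, 5.0, 10.0])
A = np.diag(d)
b = np.ones(4)
x = apply_poly_prec(lambda w: A @ w, d, b)
print(np.allclose(x, b / d))
```

The factored form only needs one matrix-vector product per root, which is what makes high-degree polynomials practical.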

More Details

Permutation-adapted complete and independent basis for atomic cluster expansion descriptors

Goff, James M.; Sievers, Charles S.; Wood, Mitchell A.; Thompson, Aidan P.

In many recent applications, particularly in the field of atom-centered descriptors for interatomic potentials, tensor products of spherical harmonics have been used to characterize complex atomic environments. When coupled with a radial basis, the atomic cluster expansion (ACE) basis is obtained. However, symmetrization with respect to both rotation and permutation results in an overcomplete set of ACE descriptors with linear dependencies occurring within blocks of functions corresponding to particular generalized Wigner symbols. All practical applications of ACE employ semi-numerical constructions to generate a complete, fully independent basis. While computationally tractable, the resultant basis cannot be expressed analytically, is susceptible to numerical instability, and thus has limited reproducibility. Here we present a procedure for generating explicit analytic expressions for a complete and independent set of ACE descriptors. The procedure uses a coupling scheme that is maximally symmetric with respect to permutation of the atoms, exposing the permutational symmetries of the generalized Wigner symbols, and yields a permutation-adapted rotationally and permutationally invariant basis (PA-RPI ACE). Theoretical support for the approach is presented, as well as numerical evidence of completeness and independence. A summary of explicit enumeration of PA-RPI functions up to rank 6 and polynomial degree 32 is provided. The PA-RPI blocks corresponding to particular generalized Wigner symbols may be either larger or smaller than the corresponding blocks in the simpler rotationally invariant basis. Finally, we demonstrate that basis functions of high polynomial degree persist under strong regularization, indicating the importance of not restricting the maximum degree of basis functions in ACE models a priori.

More Details

Graph-Based Similarity Metrics for Comparing Simulation Model Causal Structures

Naugle, Asmeret B.; Swiler, Laura P.; Lakkaraju, Kiran L.; Verzi, Stephen J.; Warrender, Christina E.; Romero, Vicente J.

The causal structure of a simulation is a major determinant of both its character and behavior, yet most methods we use to compare simulations focus only on simulation outputs. We introduce a method that combines graphical representation with information theoretic metrics to quantitatively compare the causal structures of models. The method applies to agent-based simulations as well as system dynamics models and facilitates comparison within and between types. Comparing models based on their causal structures can illuminate differences in assumptions made by the models, allowing modelers to (1) better situate their models in the context of existing work, including highlighting novelty, (2) explicitly compare conceptual theory and assumptions to simulated theory and assumptions, and (3) investigate potential causal drivers of divergent behavior between models. We demonstrate the method by comparing two epidemiology models at different levels of aggregation.
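As a much-simplified stand-in for the method, each model's causal structure can be represented as a set of directed edges and scored for structural overlap; the SIR/SEIR edge sets below are illustrative, and Jaccard similarity replaces the paper's information-theoretic metrics:

```python
# Each model's causal structure as a set of directed (cause, effect) links.
sir_edges = {("S", "I"), ("I", "R"), ("I", "I")}
seir_edges = {("S", "E"), ("E", "I"), ("I", "R"), ("I", "I")}

def jaccard(edges_a, edges_b):
    """Fraction of causal links shared by the two model structures."""
    return len(edges_a & edges_b) / len(edges_a | edges_b)

print(jaccard(sir_edges, seir_edges))  # 2 shared of 5 distinct links -> 0.4
```

Even this crude score surfaces an assumption difference: the SEIR model inserts an exposed compartment between S and I, so the S-to-I link present in SIR is absent from SEIR.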

More Details

Selective amorphization of SiGe in Si/SiGe nanostructures via high energy Si+ implant

Journal of Applied Physics

Turner, Emily M.; Campbell, Quinn C.; Avci, Ibrahim A.; Weber, William J.; Lu, Ping L.; Wang, George T.; Jones, Kevin S.

The selective amorphization of SiGe in Si/SiGe nanostructures via a 1 MeV Si+ implant was investigated, resulting in single-crystal Si nanowires (NWs) and quantum dots (QDs) encapsulated in amorphous SiGe fins and pillars, respectively. The Si NWs and QDs are formed during high-temperature dry oxidation of single-crystal Si/SiGe heterostructure fins and pillars, during which Ge diffuses along the nanostructure sidewalls and encapsulates the Si layers. The fins and pillars were then subjected to a 3 × 10¹⁵ ions/cm² 1 MeV Si+ implant, resulting in the amorphization of SiGe, while leaving the encapsulated Si crystalline for larger, 65-nm wide NWs and QDs. Interestingly, the 26-nm diameter Si QDs amorphize, while the 28-nm wide NWs remain crystalline during the same high energy ion implant. This result suggests that the Si/SiGe pillars have a lower threshold for Si-induced amorphization compared to their Si/SiGe fin counterparts. However, Monte Carlo simulations of ion implantation into the Si/SiGe nanostructures reveal similar predicted levels of displacements per cm³. Molecular dynamics simulations suggest that the total stress magnitude in Si QDs encapsulated in crystalline SiGe is higher than the total stress magnitude in Si NWs, which may lead to greater crystalline instability in the QDs during ion implant. The potential lower amorphization threshold of QDs compared to NWs is of special importance to applications that require robust QD devices in a variety of radiation environments.

More Details

Electrostatic Relativistic Fluid Models of Electron Emission in a Warm Diode

IEEE International Conference on Plasma Science (ICOPS)

Hamlin, Nathaniel D.; Smith, Thomas M.; Roberds, Nicholas R.; Glines, Forrest W.; Beckwith, Kristian B.

A semi-analytic fluid model has been developed for characterizing relativistic electron emission across a warm diode gap. Here we demonstrate the use of this model in (i) verifying multi-fluid codes in modeling compressible relativistic electron flows (the EMPIRE-Fluid code is used as an example; see also Ref. 1), (ii) elucidating key physics mechanisms characterizing the influence of compressibility and relativistic injection speed of the electron flow, and (iii) characterizing the regimes over which a fluid model recovers physically reasonable solutions.

More Details

Adaptive experimental design for multi-fidelity surrogate modeling of multi-disciplinary systems

International Journal for Numerical Methods in Engineering

Jakeman, John D.; Friedman, Sam; Eldred, Michael S.; Tamellini, Lorenzo; Gorodetsky, Alex A.; Allaire, Doug

We present an adaptive algorithm for constructing surrogate models of multi-disciplinary systems composed of a set of coupled components. With this goal we introduce “coupling” variables with a priori unknown distributions that allow surrogates of each component to be built independently. Once built, the surrogates of the components are combined to form an integrated-surrogate that can be used to predict system-level quantities of interest at a fraction of the cost of the original model. The error in the integrated-surrogate is greedily minimized using an experimental design procedure that allocates the amount of training data, used to construct each component-surrogate, based on the contribution of those surrogates to the error of the integrated-surrogate. The multi-fidelity procedure presented is a generalization of multi-index stochastic collocation that can leverage ensembles of models of varying cost and accuracy, for one or more components, to reduce the computational cost of constructing the integrated-surrogate. Extensive numerical results demonstrate that, for a fixed computational budget, our algorithm is able to produce surrogates that are orders of magnitude more accurate than methods that treat the integrated system as a black-box.

More Details

Scalable algorithms for physics-informed neural and graph networks

Data-Centric Engineering

Shukla, Khemraj; Xu, Mengjia; Trask, Nathaniel A.; Karniadakis, George E.

Physics-informed machine learning (PIML) has emerged as a promising new approach for simulating complex physical and biological systems that are governed by complex multiscale processes for which some data are also available. In some instances, the objective is to discover part of the hidden physics from the available data, and PIML has been shown to be particularly effective for such problems for which conventional methods may fail. Unlike commercial machine learning where training of deep neural networks requires big data, in PIML big data are not available. Instead, we can train such networks from additional information obtained by employing the physical laws and evaluating them at random points in the space-time domain. Such PIML integrates multimodality and multifidelity data with mathematical models, and implements them using neural networks or graph networks. Here, we review some of the prevailing trends in embedding physics into machine learning, using physics-informed neural networks (PINNs) based primarily on feed-forward neural networks and automatic differentiation. For more complex systems or systems of systems and unstructured data, graph neural networks (GNNs) present some distinct advantages, and here we review how physics-informed learning can be accomplished with GNNs based on graph exterior calculus to construct differential operators; we refer to these architectures as physics-informed graph networks (PIGNs). We present representative examples for both forward and inverse problems and discuss what advances are needed to scale up PINNs, PIGNs and more broadly GNNs for large-scale engineering problems.

More Details

Monolithic Multigrid for a Reduced-Quadrature Discretization of Poroelasticity

SIAM Journal on Scientific Computing

Adler, James A.; He, Yunhui H.; Hu, Xiaozhe H.; MacLachlan, Scott M.; Ohm, Peter B.

Advanced finite-element discretizations and preconditioners for models of poroelasticity have attracted significant attention in recent years. The equations of poroelasticity offer significant challenges in both areas, due to the potentially strong coupling between unknowns in the system, saddle-point structure, and the need to account for wide ranges of parameter values, including limiting behavior such as incompressible elasticity. This paper was motivated by an attempt to develop monolithic multigrid preconditioners for the discretization developed in [C. Rodrigo et al., Comput. Methods Appl. Mech. Engrg., 341 (2018), pp. 467-484]; we show here why this is a difficult task and, as a result, we modify the discretization in [Rodrigo et al.] through the use of a reduced-quadrature approximation, yielding a more “solver-friendly” discretization. Local Fourier analysis is used to optimize parameters in the resulting monolithic multigrid method, allowing a fair comparison between the performance and costs of methods based on Vanka and Braess--Sarazin relaxation. Further, numerical results are presented to validate the local Fourier analysis predictions and demonstrate efficiency of the algorithms. Finally, a comparison to existing block-factorization preconditioners is also given.

More Details

An optimization-based approach to parameter learning for fractional type nonlocal models

Computers and Mathematics with Applications

Burkovska, Olena; Glusa, Christian A.; D'Elia, Marta D.

Nonlocal operators of fractional type are a popular modeling choice for applications that do not adhere to classical diffusive behavior; however, one major challenge in nonlocal simulations is the selection of model parameters. In this work we propose an optimization-based approach to parameter identification for fractional models with an optional truncation radius. We formulate the inference problem as an optimal control problem where the objective is to minimize the discrepancy between observed data and an approximate solution of the model, and the control variables are the fractional order and the truncation length. For the numerical solution of the minimization problem we propose a gradient-based approach, where we enhance the numerical performance by an approximation of the bilinear form of the state equation and its derivative with respect to the fractional order. Several numerical tests in one and two dimensions illustrate the theoretical results and show the robustness and applicability of our method.

More Details

Electronic structure of intrinsic defects in c-gallium nitride: Density functional theory study without the jellium approximation

Physical Review B

Edwards, Arthur H.; Schultz, Peter A.; Dobzynski, Richard M.

We report the first nonjellium, systematic, density functional theory (DFT) study of intrinsic and extrinsic defects and defect levels in zinc-blende (cubic) gallium nitride. We use the local moment counter charge (LMCC) method, the standard Perdew-Burke-Ernzerhof (PBE) exchange-correlation potential, and two pseudopotentials, where the Ga 3d orbitals are either in the core (d0) or explicitly in the valence set (d10). We studied 64, 216, 512, and 1000 atom supercells, and demonstrated convergence to the infinite limit, crucial for delineating deep from shallow states near band edges, and for demonstrating the elimination of finite cell-size errors. Contrary to common claims, we find that exact exchange is not required to obtain defect levels across the experimental band gap. As was true in silicon, silicon carbide, and gallium arsenide, the extremal LMCC defect levels of the aggregate of defects yield an effective LMCC defect band gap that is within 10% of the experimental gap (3.3 eV) for both pseudopotentials. We demonstrate that the gallium vacancy is more complicated than previously reported. There is dramatic metastability: a nearest-neighbor nitrogen atom shifts into the gallium site, forming an antisite, nitrogen vacancy pair, which is more stable than the simple vacancy for positive charge states. Our assessment of the d0 and d10 pseudopotentials yields minimal differences in defect structures and defect levels. The better agreement of the d0 lattice constant with experiment suggests that the more computationally economical d0 pseudopotentials are sufficient to achieve the fidelity possible within the physical accuracy of DFT, and thereby enable calculations in larger supercells necessary to demonstrate convergence with respect to finite size supercell errors.

More Details

Physics-assisted generative adversarial network for X-ray tomography

Optics Express

Guo, Zhen G.; Song, Jung K.; Barbastathis, George B.; Vaughan, Courtenay T.; Larson, Kurt L.; Alpert, Bradley K.; Levine, Zachary L.; Glinsky, Michael E.

X-ray tomography is capable of imaging the interior of objects in three dimensions non-invasively, with applications in biomedical imaging, materials science, electronic inspection, and other fields. The reconstruction process can be an ill-conditioned inverse problem, requiring regularization to obtain satisfactory results. Recently, deep learning has been adopted for tomographic reconstruction. Unlike iterative algorithms which require a distribution that is known a priori, deep reconstruction networks can learn a prior distribution through sampling the training distributions. In this work, we develop a Physics-assisted Generative Adversarial Network (PGAN), a two-step algorithm for tomographic reconstruction. In contrast to previous efforts, our PGAN utilizes maximum-likelihood estimates derived from the measurements to regularize the reconstruction with both known physics and the learned prior. Compared with methods with less physics assisting in training, PGAN can reduce the photon requirement with limited projection angles to achieve a given error rate. The advantages of using a physics-assisted learned prior in X-ray tomography may further enable low-photon nanoscale imaging.

More Details

The Portals 4.3 Network Programming Interface

Schonbein, William W.; Barrett, Brian W.; Brightwell, Ronald B.; Grant, Ryan G.; Hemmert, Karl S.; Pedretti, Kevin P.; Underwood, Keith U.; Riesen, Rolf R.; Hoefler, Torsten H.; Barbe, Mathieu B.; Filho, Luiz H.; Ratchov, Alexandre R.; Maccabe, Arthur B.

This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaptation of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

More Details

Asymptotic preserving methods for fluid electron-fluid models in the large magnetic field limit with mathematically guaranteed properties (Final Report)

Tomas, Ignacio T.; Shadid, John N.; Maier, Matthias M.; Salgado, Abner S.

The current manuscript is a final report on the activities carried out under Project LDRD-CIS #226834. In scientific terms, the work reported in this manuscript is a continuation of the efforts started with Project LDRD-express #223796, whose final report of activities is SAND2021-11481; see [83]. In this section we briefly explain what pre-existing developments motivated the current body of work and provide an overview of the activities developed with the funds provided. The overarching goal of the current Project LDRD-CIS #226834 and the previous Project LDRD-express #223796 is the development of numerical methods with mathematically guaranteed properties in order to solve the Euler-Maxwell system of plasma physics and generalizations thereof. Even though Project #223796 laid out general foundations of space and time discretization of the Euler-Maxwell system, overall, it was focused on the development of numerical schemes for purely electrostatic fluid-plasma models. In particular, the project developed a family of schemes with mathematically guaranteed robustness in order to solve the Euler-Poisson model. This model is an asymptotic limit in which only the electrostatic response of the plasma is considered. Its primary feature is the presence of a non-local force, the electrostatic force, which introduces effects with infinite-speed propagation into the problem. Even though instantaneous propagation of perturbations may be considered nonphysical, there are plenty of physical regimes of technical interest where such an approximation is perfectly valid.

More Details

QSCOUT Progress Report, June 2022 [Quantum Scientific Computing Open User Testbed]

Clark, Susan M.; Norris, Haley R.; Landahl, Andrew J.; Yale, Christopher G.; Lobser, Daniel L.; Van Der Wall, Jay W.; Revelle, Melissa R.

Quantum information processing has reached an inflection point, transitioning from proof-of-principle scientific experiments to small, noisy quantum processors. To accelerate this process and eventually move to fault-tolerant quantum computing, it is necessary to provide the scientific community with access to whitebox testbed systems. The Quantum Scientific Computing Open User Testbed (QSCOUT) provides scientists unique access to an innovative system to help advance quantum computing science.

More Details

Theory of the metastable injection-bleached E3c center in GaAs

Physical Review B

Schultz, Peter A.; Hjalmarson, Harold P.

The E3 transition in irradiated GaAs observed in deep level transient spectroscopy (DLTS) was recently discovered in Laplace-DLTS to encompass three distinct components. The component designated E3c was found to be metastable, reversibly bleached under minority carrier (hole) injection, with an introduction rate dependent upon Si doping density. It is shown through first-principles modeling that the E3c must be the intimate Si-vacancy pair, best described as a Si atom sitting in a divacancy, Sivv. The bleached metastable state is enabled by a double site-shifting mechanism: upon recharging, the defect undergoes a second site shift rather than returning to its original E3c-active configuration by reversing the first site shift. Identification of this defect offers insights into the short-time annealing kinetics in irradiated GaAs.

More Details

A Taxonomy of Small Markovian Errors

PRX Quantum

Blume-Kohout, Robin J.; da Silva, Marcus P.; Nielsen, Erik N.; Proctor, Timothy J.; Rudinger, Kenneth M.; Sarovar, Mohan S.; Young, Kevin C.

Errors in quantum logic gates are usually modeled by quantum process matrices (CPTP maps). But process matrices can be opaque and unwieldy. We show how to transform the process matrix of a gate into an error generator that represents the same information more usefully. We construct a basis of simple and physically intuitive elementary error generators, classify them, and show how to represent the error generator of any gate as a mixture of elementary error generators with various rates. Finally, we show how to build a large variety of reduced models for gate errors by combining elementary error generators and/or entire subsectors of generator space. We conclude with a few examples of reduced models, including one with just 9N² parameters that describes almost all commonly predicted errors on an N-qubit processor.
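The basic transformation can be sketched for a single-qubit gate in the Pauli transfer matrix representation; the over-rotated X gate below is an illustrative noise model chosen for this sketch, not an example from the article:

```python
import numpy as np
from scipy.linalg import logm

def ptm_rx(theta):
    """Pauli transfer matrix of an X rotation, basis ordered (I, X, Y, Z)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, c, -s],
                     [0.0, 0.0, s, c]])

eps = 0.01                          # small over-rotation angle (assumed)
G_ideal = ptm_rx(np.pi / 2)
G_noisy = ptm_rx(np.pi / 2 + eps)

# Error generator L = log(G_noisy G_ideal^{-1}); for a pure over-rotation
# it is a single Hamiltonian-type elementary generator with rate eps.
L = np.real(logm(G_noisy @ np.linalg.inv(G_ideal)))
print(L[3, 2])  # ~ eps
```

Reading the rate off a single generator entry is what makes the representation more interpretable than the raw process matrix.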

More Details

A Stochastic Reduced-Order Model for Statistical Microstructure Descriptors Evolution

Journal of Computing and Information Science in Engineering

Tran, Anh; Sun, Jing S.; Liu, Dehao L.; Wang, Yan W.; Wildey, Timothy M.

Integrated computational materials engineering (ICME) models have been a crucial building block for modern materials development, relieving heavy reliance on experiments and significantly accelerating the materials design process. However, ICME models are also computationally expensive, particularly with respect to time integration for dynamics, which hinders the ability to study statistical ensembles and thermodynamic properties of large systems for long time scales. To alleviate the computational bottleneck, we propose to model the evolution of statistical microstructure descriptors as a continuous-time stochastic process using a non-linear Langevin equation, where the probability density function (PDF) of the statistical microstructure descriptors, which are also the quantities of interests (QoIs), is modeled by the Fokker–Planck equation. In this work, we discuss how to calibrate the drift and diffusion terms of the Fokker–Planck equation from the theoretical and computational perspectives. The calibrated Fokker–Planck equation can be used as a stochastic reduced-order model to simulate the evolution of the PDF of the statistical microstructure descriptors. Considering statistical microstructure descriptors in the microstructure evolution as QoIs, we demonstrate our proposed methodology in three ICME models: kinetic Monte Carlo, phase field, and molecular dynamics simulations.
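A minimal version of such a stochastic reduced-order model is a Langevin equation integrated by Euler-Maruyama over an ensemble of paths; the linear drift and constant diffusion below are illustrative placeholders for terms that would be calibrated from ICME data through the Fokker-Planck equation:

```python
import numpy as np

# Euler-Maruyama integration of dx = -k (x - mu) dt + sigma dW;
# k, mu, sigma stand in for calibrated drift/diffusion parameters.
rng = np.random.default_rng(0)
k, mu, sigma = 1.0, 0.3, 0.5
dt, n_steps, n_paths = 0.01, 1000, 10_000

x = np.zeros(n_paths)
for _ in range(n_steps):
    x += -k * (x - mu) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

# The ensemble approximates the stationary PDF of the descriptor:
# mean -> mu, variance -> sigma^2 / (2 k) = 0.125.
print(x.mean(), x.var())
```

Evolving an ensemble of cheap scalar paths in place of the full ICME dynamics is exactly the cost reduction the reduced-order model targets.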

More Details

Strain-tuning of transport gaps and semiconductor-to-conductor phase transition in twinned graphene

Acta Materialia

Mendez Granado, Juan P.

We show, through the use of the Landauer-Büttiker (LB) formalism and a tight-binding (TB) model, that the transport gap of twinned graphene can be tuned through the application of a uniaxial strain in the direction normal to the twin boundary. Remarkably, we find that the transport gap Egap bears a square-root dependence on the control parameter ϵx – ϵc, where ϵx is the applied uniaxial strain and ϵc ~ 19% is a critical strain. We interpret this dependence as evidence of criticality underlying a continuous phase transition, with ϵx – ϵc playing the role of control parameter and the transport gap Egap playing the role of order parameter. For ϵx < ϵc, the transport gap is non-zero and the material is a semiconductor, whereas for ϵx > ϵc the transport gap closes to zero and the material becomes a conductor, which evinces a semiconductor-to-conductor phase transition. The computed critical exponent of 1/2 places the transition in the mean-field universality class, which enables far-reaching analogies with other systems in the same class.
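The square-root scaling can be recovered from gap-versus-strain data by a log-log fit; the synthetic data below assume the stated exponent, with an illustrative amplitude and strain grid:

```python
import numpy as np

# Synthetic gap data obeying E_gap = A * sqrt(eps_c - eps) below the
# critical strain (amplitude A and the strain grid are illustrative).
eps_c, A = 0.19, 2.0
eps = np.linspace(0.10, 0.18, 20)
e_gap = A * np.sqrt(eps_c - eps)

# The critical exponent is the slope of log(E_gap) vs log(eps_c - eps).
slope, _ = np.polyfit(np.log(eps_c - eps), np.log(e_gap), 1)
print(slope)  # 0.5, the mean-field order-parameter exponent
```

The same fit applied to computed or measured gaps is a standard way to check which universality class a transition falls in.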

More Details

Entangling-gate error from coherently displaced motional modes of trapped ions

Physical Review A

Ruzic, Brandon R.; Barrick, Todd A.; Hunker, Jeffrey D.; Law, Ryan L.; McFarland, Brian M.; McGuinness, Hayden J.; Parazzoli, L.P.; Sterk, Jonathan D.; Van Der Wall, Jay W.; Stick, Daniel L.

Entangling gates in trapped-ion quantum computers are most often applied to stationary ions with initial motional distributions that are thermal and close to the ground state, while those demonstrations that involve transport generally use sympathetic cooling to reinitialize the motional state prior to applying a gate. Future systems with more ions, however, will face greater nonthermal excitation due to increased amounts of ion transport and exacerbated by longer operational times and variations over the trap array. In addition, pregate sympathetic cooling may be limited due to time costs and laser access constraints. In this paper, we analyze the impact of such coherent motional excitation on entangling-gate error by performing simulations of Mølmer-Sørensen (MS) gates on a pair of trapped-ion qubits with both thermal and coherent excitation present in a shared motional mode at the start of the gate. Here, we quantify how a small amount of coherent displacement erodes gate performance in the presence of experimental noise, and we demonstrate that adjusting the relative phase between the initial coherent displacement and the displacement induced by the gate or using Walsh modulation can suppress this error. We then use experimental data from transported ions to analyze the impact of coherent displacement on MS-gate error under realistic conditions.

More Details

Surrogate modeling for efficiently, accurately and conservatively estimating measures of risk

Reliability Engineering and System Safety

Jakeman, John D.; Kouri, Drew P.; Huerta, Jose G.

We present a surrogate modeling framework for conservatively estimating measures of risk from limited realizations of an expensive physical experiment or computational simulation. Risk measures combine objective probabilities with the subjective values of a decision maker to quantify anticipated outcomes. Given a set of samples, we construct a surrogate model that produces estimates of risk measures that are always greater than their empirical approximations obtained from the training data. These surrogate models limit over-confidence in reliability and safety assessments and produce estimates of risk measures that converge much faster to the true value than purely sample-based estimates. We first detail the construction of conservative surrogate models that can be tailored to a stakeholder's risk preferences and then present an approach, based on stochastic orders, for constructing surrogate models that are conservative with respect to families of risk measures. Our surrogate models include biases that permit them to conservatively estimate the target risk measures. We provide theoretical results that show that these biases decay at the same rate as the L2 error in the surrogate model. Numerical demonstrations confirm that risk-adapted surrogate models do indeed overestimate the target risk measures while converging at the expected rate.
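The conservative-bias idea can be sketched by shifting a least-squares surrogate up by its largest training residual, so that it never under-predicts the training data; the quadratic surrogate and noisy exponential data below are illustrative, and the paper's risk-adapted, stochastic-order construction is considerably more refined:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-1.0, 1.0, 40))
y = np.exp(x) + 0.05 * rng.standard_normal(40)   # expensive-model samples

coef = np.polyfit(x, y, deg=2)                   # plain least-squares fit
shift = np.max(y - np.polyval(coef, x))          # largest under-prediction
surrogate = np.polyval(coef, x) + shift          # conservatively biased up

# The shifted surrogate dominates the data, so sample-based estimates of
# monotone risk measures (e.g. CVaR) computed from it are conservative.
print(bool(np.all(surrogate >= y - 1e-12)))
```

The constant shift here plays the role of the bias term in the paper, which instead decays at the same rate as the surrogate's L2 error.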

More Details

CrossSim Inference Manual v2.0

Xiao, Tianyao X.; Bennett, Christopher H.; Feinberg, Benjamin F.; Marinella, Matthew J.; Agarwal, Sapan A.

Neural networks are largely based on matrix computations. During forward inference, the most heavily used compute kernel is the matrix-vector multiplication (MVM): Wx. Inference is a first frontier for the deployment of next-generation hardware for neural network applications, as it is more readily deployed in edge devices, such as mobile devices or embedded processors with size, weight, and power constraints. Inference is also easier to implement in analog systems than training, which has more stringent device requirements. The main processing kernel used during inference is the MVM.
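A dense layer's forward pass makes the role of the MVM concrete; the weights and input below are arbitrary illustrative values:

```python
import numpy as np

W1 = np.array([[1.0, 2.0],
               [3.0, 4.0]])
W2 = np.array([[1.0, -1.0]])
x = np.array([1.0, 1.0])

h = np.maximum(W1 @ x, 0.0)   # MVM followed by ReLU
y = W2 @ h                    # final MVM produces the output
print(h, y)  # [3. 7.] [-4.]
```

In an analog accelerator each `W @ x` maps onto a single crossbar read, which is why the MVM dominates the hardware design for inference.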

More Details

A primal–dual algorithm for risk minimization

Mathematical Programming

Kouri, Drew P.; Surowiec, Thomas M.

In this paper, we develop an algorithm to efficiently solve risk-averse optimization problems posed in reflexive Banach space. Such problems often arise in many practical applications as, e.g., optimization problems constrained by partial differential equations with uncertain inputs. Unfortunately, for many popular risk models including the coherent risk measures, the resulting risk-averse objective function is nonsmooth. This lack of differentiability complicates the numerical approximation of the objective function as well as the numerical solution of the optimization problem. To address these challenges, we propose a primal–dual algorithm for solving large-scale nonsmooth risk-averse optimization problems. This algorithm is motivated by the classical method of multipliers and by epigraphical regularization of risk measures. As a result, the algorithm solves a sequence of smooth optimization problems using derivative-based methods. We prove convergence of the algorithm even when the subproblems are solved inexactly and conclude with numerical examples demonstrating the efficiency of our method.

More Details

Mixed precision s-step Lanczos and conjugate gradient algorithms

Numerical Linear Algebra with Applications

Carson, Erin; Gergelits, Tomáš; Yamazaki, Ichitaro Y.

Compared to the classical Lanczos algorithm, the s-step Lanczos variant has the potential to improve performance by asymptotically decreasing the synchronization cost per iteration. However, this comes at a price; despite being mathematically equivalent, the s-step variant may behave quite differently in finite precision, potentially exhibiting greater loss of accuracy and slower convergence relative to the classical algorithm. It has previously been shown that the errors in the s-step version follow the same structure as the errors in the classical algorithm, but are amplified by a factor depending on the square of the condition number of the s-step Krylov bases computed in each outer loop. As the condition number of these s-step bases grows (in some cases very quickly) with s, this limits the s values that can be chosen and thus can limit the attainable performance. In this work, we show that if a select few computations in s-step Lanczos are performed in double the working precision, the error terms then depend only linearly on the conditioning of the s-step bases. This has the potential for drastically improving the numerical behavior of the algorithm with little impact on per-iteration performance. Our numerical experiments demonstrate the improved numerical behavior possible with the mixed precision approach, and also show that this improved behavior extends to mixed precision s-step CG. We present preliminary performance results on NVIDIA V100 GPUs that show that the overhead of extra precision is minimal if one uses precisions implemented in hardware.
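The conditioning issue that motivates the mixed precision fix is easy to observe directly; the diagonal test matrix and monomial s-step basis below are illustrative (practical implementations use better-conditioned polynomial bases, e.g. Newton or Chebyshev, but the growth trend with s is the same):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = np.diag(np.linspace(1.0, 100.0, n))   # illustrative SPD spectrum
v = rng.standard_normal(n)

def krylov_cond(s):
    """Condition number of the normalized monomial basis [v, Av, ..., A^s v]."""
    V = np.empty((n, s + 1))
    w = v.copy()
    for j in range(s + 1):
        V[:, j] = w / np.linalg.norm(w)
        w = A @ w
    return np.linalg.cond(V)

for s in (2, 4, 8):
    print(s, krylov_cond(s))   # grows rapidly with s
```

Since the classical error bound scales with the square of this condition number, the mixed precision result, which reduces the dependence to linear, directly enlarges the usable range of s.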

More Details

Low-order preconditioning of the Stokes equations

Numerical Linear Algebra with Applications

Voronin, Alexey; He, Yunhui; MacLachlan, Scott; Olson, Luke N.; Tuminaro, Raymond S.

A well-known strategy for building effective preconditioners for higher-order discretizations of some PDEs, such as Poisson's equation, is to leverage effective preconditioners for their low-order analogs. In this work, we show that high-quality preconditioners can also be derived for the Taylor–Hood discretization of the Stokes equations in much the same manner. In particular, we investigate the use of geometric multigrid based on a low-order discretization of the Stokes operator as a preconditioner for the higher-order Taylor–Hood discretization of the Stokes system. We utilize local Fourier analysis to optimize the damping parameters for Vanka and Braess–Sarazin relaxation schemes and to achieve robust convergence. These results are then verified and compared against the measured multigrid performance. While geometric multigrid can be applied directly to the higher-order system, our ultimate motivation is to apply algebraic multigrid within solvers for the Taylor–Hood system via the low-order discretization, which will be considered in a companion paper.
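The low-order-preconditions-high-order idea can be sketched in one dimension: a fourth-order finite-difference Laplacian A = L + L²/12 (SPD by construction, since its eigenvalues are λ + λ²/12 > 0) is preconditioned by the standard second-order operator L inside conjugate gradients. This is a finite-difference toy, not the Stokes/Taylor–Hood setting of the paper; all names and sizes are illustrative.

```python
import numpy as np

def lap2(n):
    # standard second-order 1D Laplacian (Dirichlet), stencil (-1, 2, -1)
    return (np.diag(2.0 * np.ones(n))
            - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))

def pcg(A, b, Minv=None, tol=1e-8, maxit=2000):
    # preconditioned conjugate gradients; returns solution and iteration count
    x = np.zeros_like(b)
    r = b.copy()
    z = Minv(r) if Minv else r.copy()
    p = z.copy()
    rz = r @ z
    for k in range(1, maxit + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        z = Minv(r) if Minv else r.copy()
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

n = 256
L = lap2(n)
A = L + (L @ L) / 12.0       # fourth-order long-stencil Laplacian, SPD
b = np.random.default_rng(0).standard_normal(n)
x_plain, it_plain = pcg(A, b)                                      # no preconditioner
x_prec, it_prec = pcg(A, b, Minv=lambda r: np.linalg.solve(L, r))  # low-order preconditioner
print(it_plain, it_prec)
```

Because L⁻¹A has eigenvalues 1 + λ/12 ∈ (1, 4/3), the preconditioned iteration count is bounded independently of n, while unpreconditioned CG degrades as the mesh is refined; the same spectral-equivalence reasoning underlies low-order preconditioning of higher-order discretizations generally.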

More Details

A Hybrid Method for Tensor Decompositions that Leverages Stochastic and Deterministic Optimization

Myers, Jeremy M.; Dunlavy, Daniel D.

In this paper, we propose a hybrid method that uses stochastic and deterministic search to compute the maximum likelihood estimator of a low-rank count tensor with Poisson loss via state-of-the-art local methods. Our approach is inspired by Simulated Annealing for global optimization and allows for fine-grain parameter tuning as well as adaptive updates to algorithm parameters. We present numerical results that indicate our hybrid approach can compute better approximations to the maximum likelihood estimator with less computation than the state-of-the-art methods by themselves.
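The general shape of such a hybrid, stochastic jumps accepted by a Metropolis criterion with deterministic local refinement of each candidate, can be sketched on a toy 1D objective. This is a generic simulated-annealing-flavored hybrid for illustration, not the authors' tensor-decomposition algorithm; the objective, cooling schedule, and parameters are all hypothetical.

```python
import numpy as np

def f(x):
    # toy multimodal objective with global minimum near x = -1.30
    return x**4 - 3.0 * x**2 + x

def fp(x):
    return 4.0 * x**3 - 6.0 * x + 1.0

def local_refine(x, lr=0.01, iters=200):
    # deterministic local search: plain gradient descent from x
    for _ in range(iters):
        x -= lr * fp(x)
    return x

def hybrid_minimize(n_outer=50, T0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x = local_refine(rng.uniform(-2.0, 2.0))
    best_x, best_f = x, f(x)
    T = T0
    for _ in range(n_outer):
        # stochastic jump, then deterministic polish of the candidate
        cand = local_refine(x + rng.normal(scale=1.0))
        # Metropolis acceptance: always take improvements, sometimes take worse moves
        if f(cand) < f(x) or rng.random() < np.exp(-(f(cand) - f(x)) / T):
            x = cand
        if f(x) < best_f:
            best_x, best_f = x, f(x)
        T *= 0.9                # geometric cooling schedule
    return best_x, best_f

print(hybrid_minimize())
```

The stochastic jumps let the search escape the shallow local minimum near x ≈ 1.13, while the deterministic refinement drives each candidate to high accuracy, the same division of labor the abstract describes between global stochastic search and state-of-the-art local methods.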

More Details

In Their Shoes: Persona-Based Approaches to Software Quality Practice Incentivization

Computing in Science and Engineering

Mundt, Miranda R.; Milewicz, Reed M.; Raybourn, Elaine M.

Many teams struggle to adapt and right-size software engineering best practices for quality assurance to fit their context. Introducing software quality is not usually framed in a way that motivates teams to take action, thus resulting in it becoming a “check the box for compliance” activity instead of a cultural practice that values software quality and the effort to achieve it. When and how can we provide effective incentives for software teams to adopt and integrate meaningful and enduring software quality practices? Here, we explored this question through a persona-based ideation exercise at the 2021 Collegeville Workshop on Scientific Software in which we created three unique personas that represent different scientific software developer perspectives.

More Details

The Ground Truth Program: Simulations as Test Beds for Social Science Research Methods.

Computational and Mathematical Organization Theory

Naugle, Asmeret B.; Russell, Adam R.; Lakkaraju, Kiran L.; Swiler, Laura P.; Verzi, Stephen J.; Romero, Vicente J.

Social systems are uniquely complex and difficult to study, but understanding them is vital to solving the world’s problems. The Ground Truth program developed a new way of testing the research methods that attempt to understand and leverage the Human Domain and its associated complexities. The program developed simulations of social systems as virtual world test beds. Not only were these simulations able to produce data on future states of the system under various circumstances and scenarios, but their causal ground truth was also explicitly known. Research teams studied these virtual worlds, facilitating deep validation of causal inference, prediction, and prescription methods. The Ground Truth program model provides a way to test and validate research methods to an extent previously impossible, and to study the intricacies and interactions of different components of research.

More Details

An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

IEEE Transactions on Circuits and Systems I: Regular Papers

Xiao, T.P.; Feinberg, Benjamin F.; Bennett, Christopher H.; Agrawal, Vineet; Saxena, Prashant; Prabhakar, Venkatraman; Ramkumar, Krishnaswamy; Medu, Harsha; Raghavan, Vijay; Chettuvetty, Ramesh; Agarwal, Sapan A.; Marinella, Matthew J.

We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, matching the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high on/off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.
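The interaction between a near-zero-skewed weight distribution and conductance-proportional programming error can be sketched with a simple Monte Carlo model. The noise model below (per-cell error scaling with conductance magnitude plus a small floor) is a hypothetical stand-in for illustration only, not the measured SONOS error characteristics from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 512
W = rng.laplace(scale=0.05, size=(n, n))   # NN-like weights, sharply peaked near zero
x = rng.standard_normal(n)

def analog_matvec(W, x, err_frac=0.02, err_floor=1e-4):
    # hypothetical programming-error model: per-cell Gaussian noise whose
    # standard deviation scales with conductance magnitude, plus a floor
    W_prog = W + rng.normal(scale=err_frac * np.abs(W) + err_floor)
    return W_prog @ x

y_ideal = W @ x
y_analog = analog_matvec(W, x)
rel_err = np.linalg.norm(y_analog - y_ideal) / np.linalg.norm(y_ideal)
print(rel_err)
```

Because most weights are near zero, most cells sit at low conductance where (under this model) the absolute error is small, so the matrix-vector product stays accurate, which is the qualitative effect the abstract attributes to subthreshold operation.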

More Details

Sensitivity Analysis for Solutions to Heterogeneous Nonlocal Systems. Theoretical and Numerical Studies

Journal of Peridynamics and Nonlocal Modeling

Buczkowski, Nicole E.; Foss, Mikil D.; Parks, Michael L.; Radu, Petronela R.

The paper presents a collection of results on continuous dependence for solutions to nonlocal problems under perturbations of data and system parameters. The integral operators appearing in the systems capture interactions via heterogeneous kernels that exhibit different types of weak singularities, space dependence, and even regions of zero interaction. Here, the stability results showcase explicit bounds involving the measure of the domain and of the interaction collar size, the nonlocal Poincaré constant, and other parameters. In the nonlinear setting, the bounds quantify in different Lp norms the sensitivity of solutions under different nonlinearity profiles. The results are validated by numerical simulations showcasing discontinuous solutions, varying horizons of interaction, and symmetric and heterogeneous kernels.

More Details

Self-Induced Curvature in an Internally Loaded Peridynamic Fiber

Silling, Stewart A.

A straight fiber with nonlocal forces that are independent of bond strain is considered. These internal loads can either stabilize or destabilize the straight configuration. Transverse waves with long wavelength have unstable dispersion properties for certain combinations of nonlocal kernels and internal loads. When these unstable waves occur, deformation of the straight fiber into a circular arc can lower its potential energy in equilibrium. The equilibrium value of the radius of curvature is computed explicitly.

More Details

Quantitative Performance Assessment of Proxy Apps and Parents (Report for ECP Proxy App Project Milestone ADCD-504-28)

Cook, Jeanine C.; Aaziz, Omar R.; Chen, Si C.; Godoy, William F.; Powell, Amy J.; Watson, Gregory W.; Vaughan, Courtenay T.; Wildani, Avani W.

The ECP Proxy Application Project has an annual milestone to assess the state of ECP proxy applications and their role in the overall ECP ecosystem. Our FY22 March/April milestone (ADCD-504-28) proposed to: assess the fidelity of proxy applications compared to their respective parents in terms of kernel and I/O behavior, and predictability. Similarity techniques will be applied for quantitative comparison of proxy/parent kernel behavior. MACSio evaluation will continue, and support for OpenPMD backends will be explored. The execution time predictability of proxy apps with respect to their parents will be explored through a carefully designed scaling study and code comparisons. Note that in this FY we also have quantitative assessment milestones that are due in September and are therefore not included in the description above or in this report; another report on those deliverables will be generated and submitted upon their completion. To satisfy this milestone, the following specific tasks were completed:

- Study the ability of MACSio to represent I/O workloads of adaptive mesh codes.
- Re-define the performance counter groups for contemporary Intel and IBM platforms to better match specific hardware components and to better align across platforms, making cross-platform comparison more accurate.
- Perform a cosine similarity study based on the new performance counter groups on the Intel and IBM P9 platforms.
- Perform detailed analysis of performance counter data to accurately average and align the data, maintaining phases across all executions, and develop methods to reduce the set of collected performance counters used in cosine similarity analysis.
- Apply a quantitative similarity comparison between proxy and parent CPU kernels.
- Perform scaling studies to understand how accurately parent performance can be predicted from the respective proxy application.

This report presents highlights of these efforts.
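The kernel-similarity comparison described above reduces to cosine similarity between performance-counter vectors collected for proxy and parent applications. A minimal sketch, where the counter groupings and values are made up for illustration, not measured data:

```python
import numpy as np

def cosine_similarity(a, b):
    # cosine of the angle between two counter vectors: 1.0 means identical
    # direction (same relative counter mix), values near 0 mean dissimilar
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# hypothetical normalized counter-group vectors (e.g., cache, branch,
# FLOP, and memory-bandwidth groups) for a proxy, its parent, and an
# unrelated application
proxy  = np.array([0.82, 0.10, 0.05, 0.03])
parent = np.array([0.78, 0.12, 0.06, 0.04])
other  = np.array([0.10, 0.60, 0.25, 0.05])

print(cosine_similarity(proxy, parent))   # close to 1
print(cosine_similarity(proxy, other))    # much lower
```

Grouping raw counters into hardware-component categories before computing similarity, as the milestone describes, keeps the comparison meaningful across platforms whose raw counter sets differ.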

More Details

Kokkos 3: Programming Model Extensions for the Exascale Era

IEEE Transactions on Parallel and Distributed Systems

Trott, Christian R.; Lebrun-Grandie, Damien; Arndt, Daniel; Ciesko, Jan; Dang, Vinh Q.; Ellingwood, Nathan D.; Gayatri, Rahulkumar; Harvey, Evan C.; Hollman, Daisy S.; Ibanez, Dan; Liber, Nevin; Madsen, Jonathan; Miles, Jeff; Poliakoff, David Z.; Powell, Amy J.; Rajamanickam, Sivasankaran R.; Simberg, Mikael; Sunderland, Dan; Turcksin, Bruno; Wilke, Jeremiah

As the push towards exascale hardware has increased the diversity of system architectures, performance portability has become a critical aspect for scientific software. We describe the Kokkos Performance Portable Programming Model that allows developers to write single source applications for diverse high-performance computing architectures. Kokkos provides key abstractions for both the compute and memory hierarchy of modern hardware. We describe the novel abstractions that have been added to Kokkos version 3 such as hierarchical parallelism, containers, task graphs, and arbitrary-sized atomic operations to prepare for exascale era architectures. We demonstrate the performance of these new features with reproducible benchmarks on CPUs and GPUs.

More Details