Publications

Results 201–250 of 9,998

Inelastic peridynamic model for molecular crystal particles

Computational Particle Mechanics

Silling, Stewart; Barr, Christopher M.; Cooper, Marcia; Lechman, Jeremy B.; Bufford, Daniel C.

The peridynamic theory of solid mechanics is applied to modeling the deformation and fracture of micrometer-sized particles made of organic crystalline material. A new peridynamic material model is proposed to reproduce the elastic–plastic response, creep, and fracture that are observed in experiments. The model is implemented in a three-dimensional, meshless Lagrangian simulation code. In the small deformation, elastic regime, the model agrees well with classical Hertzian contact analysis for a sphere compressed between rigid plates. Under higher load, material and geometrical nonlinearity is predicted, leading to fracture. The material parameters for the energetic material CL-20 are evaluated from nanoindentation test data on the cyclic compression and failure of micrometer-sized grains.
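
As a point of reference for the Hertzian comparison mentioned in the abstract, the classical force-indentation relation for an elastic sphere against a rigid flat is F = (4/3) E* sqrt(R) delta^(3/2), with E* = E / (1 - nu^2). The sketch below is only an illustration of that textbook formula, not the paper's peridynamic model; the function names and the numerical values (radius, modulus, Poisson ratio) are hypothetical placeholders and are not CL-20 data.

```python
import math


def hertz_force(delta, radius, youngs_modulus, poisson_ratio):
    """Hertzian force at one sphere/rigid-flat contact for indentation delta."""
    # With a rigid plate, only the sphere deforms, so the effective
    # modulus reduces to E / (1 - nu^2).
    e_star = youngs_modulus / (1.0 - poisson_ratio ** 2)
    return (4.0 / 3.0) * e_star * math.sqrt(radius) * delta ** 1.5


def platen_force(total_approach, radius, youngs_modulus, poisson_ratio):
    """Force on a sphere squeezed between two rigid parallel plates.

    The sphere has two identical contacts, so each one takes half of the
    total plate approach.
    """
    return hertz_force(total_approach / 2.0, radius, youngs_modulus, poisson_ratio)


# Hypothetical inputs chosen only to exercise the functions (SI units).
example_force = platen_force(
    total_approach=2.0e-9, radius=1.0e-6, youngs_modulus=10.0e9, poisson_ratio=0.3
)
print(f"Illustrative platen force: {example_force:.3e} N")
```

The relation holds only in the small-deformation elastic regime; the plasticity, creep, and fracture described in the abstract fall outside it.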

Al-alkyls as acceptor dopant precursors for atomic-scale devices

Journal of Physics Condensed Matter

Owen, J.H.G.; Campbell, Quinn; Santini, R.; Ivie, Jeffrey A.; Baczewski, Andrew D.; Schmucker, Scott W.; Bussmann, Ezra; Misra, Shashank; Randall, J.N.

Atomically precise ultradoping of silicon is possible with atomic resists, area-selective surface chemistry, and a limited set of hydride and halide precursor molecules, in a process known as atomic precision advanced manufacturing (APAM). It is desirable to expand this set of precursors to include dopants with organic functional groups, and here we consider aluminium alkyls to expand the applicability of APAM. We explore the impurity content and selectivity that result from using trimethyl aluminium and triethyl aluminium precursors on Si(001) to ultradope with aluminium through a hydrogen mask. Comparison of the methylated and ethylated precursors helps us understand the impact of hydrocarbon ligand selection on incorporation surface chemistry. Combining scanning tunneling microscopy and density functional theory calculations, we assess the limitations of both classes of precursor and extract general principles relevant to each.

Neuromorphic Graph Algorithms

Parekh, Ojas D.; Wang, Yipu; Ho, Yang; Phillips, Cynthia A.; Pinar, Ali P.; Aimone, James B.; Severa, William M.

Graph algorithms enable myriad large-scale applications including cybersecurity, social network analysis, resource allocation, and routing. The scalability of current graph algorithm implementations on conventional computing architectures is hampered by the demise of Moore’s law. We present a theoretical framework for designing and assessing the performance of graph algorithms executing in networks of spiking artificial neurons. Although spiking neural networks (SNNs) are capable of general-purpose computation, few algorithmic results with rigorous asymptotic performance analysis are known. SNNs are exceptionally well-motivated practically, as neuromorphic computing systems with 100 million spiking neurons are available, and systems with a billion neurons are anticipated in the next few years. Beyond massive parallelism and scalability, neuromorphic computing systems offer energy consumption orders of magnitude lower than conventional high-performance computing systems. We employ our framework to design and analyze new spiking algorithms for shortest path and dynamic programming problems. Our neuromorphic algorithms are message-passing algorithms relying critically on data movement for computation. For fair and rigorous comparison with conventional algorithms and architectures, which is challenging but paramount, we develop new models of data-movement in conventional computing architectures. This allows us to prove polynomial-factor advantages, even when we assume an SNN consisting of a simple grid-like network of neurons. To the best of our knowledge, this is one of the first examples of a rigorous asymptotic computational advantage for neuromorphic computing.
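
To make the idea of a spiking shortest-path computation concrete, here is a minimal sketch of one well-known delay-coded scheme. It is not taken from the paper and should not be read as the authors' algorithm. It assumes one neuron per graph vertex, one synapse per directed edge with an integer delay equal to the edge length, and neurons that fire only on the first spike they receive, so that a neuron's first firing time equals its shortest-path distance from the source. The function name and the input format are hypothetical.

```python
def spiking_shortest_paths(num_neurons, edges, source):
    """Simulate delay-coded spiking on a directed graph, one time step at a time.

    edges: iterable of (u, v, delay) with positive integer delays.
    Returns a dict mapping each reachable neuron to its first spike time.
    """
    edges = list(edges)
    outgoing = {u: [] for u in range(num_neurons)}
    for u, v, delay in edges:
        if delay < 1:
            raise ValueError("delays must be positive integers")
        outgoing[u].append((v, delay))

    first_spike = {}              # neuron -> time step of its only spike
    arrivals = {0: {source}}      # time step -> neurons receiving a spike then
    horizon = sum(delay for _, _, delay in edges)  # upper bound on any distance

    for t in range(horizon + 1):
        for neuron in arrivals.pop(t, ()):
            if neuron in first_spike:
                continue          # already fired; later spikes are ignored
            first_spike[neuron] = t
            for target, delay in outgoing[neuron]:
                if target not in first_spike:
                    arrivals.setdefault(t + delay, set()).add(target)
    return first_spike


# Small hypothetical graph used only to exercise the function.
example_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
print(spiking_shortest_paths(4, example_edges, source=0))
```

On conventional hardware this loop simply replays the spike timing sequentially; the advantages claimed in the abstract concern neuromorphic hardware, where all neurons update in parallel.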

Harnessing exascale for whole wind farm high-fidelity simulations to improve wind farm efficiency

Crozier, Paul; Adcock, Christiane; Ananthan, Shreyas; Berger-Vergiat, Luc; Brazell, Michael; Brunhart-Lupo, Nicholas; Henry De Frahan, Marc T.; Hu, Jonathan J.; Knaus, Robert C.; Melvin, Jeremy; Moser, Bob; Mullowney, Paul; Rood, Jon; Sharma, Ashesh; Thomas, Stephen; Vijayakumar, Ganesh; Williams, Alan B.; Wilson, Robert; Yamazaki, Ichitaro; Sprague, Michael A.

Abstract not provided.

CSRI Summer Proceedings 2021

Smith, J.D.; Galvan, Edgar

The Computer Science Research Institute (CSRI) brings university faculty and students to Sandia National Laboratories for focused collaborative research on Department of Energy (DOE) computer and computational science problems. The institute provides an opportunity for university researchers to learn about problems in computer and computational science at DOE laboratories, and help transfer results of their research to programs at the labs. Some specific CSRI research interest areas are: scalable solvers, optimization, algebraic preconditioners, graph-based, discrete, and combinatorial algorithms, uncertainty estimation, validation and verification methods, mesh generation, dynamic load-balancing, virus and other malicious-code defense, visualization, scalable cluster computers, beyond Moore’s Law computing, exascale computing tools and application design, reduced order and multiscale modeling, parallel input/output, and theoretical computer science. The CSRI Summer Program is organized by CSRI and includes a weekly seminar series and the publication of a summer proceedings.

Analysis and mitigation of parasitic resistance effects for analog in-memory neural network acceleration

Semiconductor Science and Technology

Xiao, Tianyao P.; Feinberg, Benjamin; Rohan, Jacob N.; Bennett, Christopher; Agarwal, Sapan; Marinella, Matthew

To support the increasing demands for efficient deep neural network processing, accelerators based on analog in-memory computation of matrix multiplication have recently gained significant attention for reducing the energy of neural network inference. However, analog processing within memory arrays must contend with the issue of parasitic voltage drops across the metal interconnects, which distort the results of the computation and limit the array size. This work analyzes how parasitic resistance affects the end-to-end inference accuracy of state-of-the-art convolutional neural networks, and comprehensively studies how various design decisions at the device, circuit, architecture, and algorithm levels affect the system's sensitivity to parasitic resistance effects. A set of guidelines is provided for how to design analog accelerator hardware that is intrinsically robust to parasitic resistance, without any explicit compensation or re-training of the network parameters.

Dakota, A Multilevel Parallel Object-Oriented Framework for Design Optimization, Parameter Estimation, Uncertainty Quantification, and Sensitivity Analysis: Version 6.15 User's Manual

Adams, Brian M.; Bohnhoff, William J.; Dalbey, Keith R.; Ebeida, Mohamed S.; Eddy, John P.; Eldred, Michael S.; Hooper, Russell W.; Hough, Patricia D.; Hu, Kenneth T.; Jakeman, John D.; Khalil, Mohammad; Maupin, Kathryn A.; Monschke, Jason A.; Ridgway, Elliott M.; Rushdi, Ahmad A.; Seidl, Daniel T.; Stephens, John A.; Winokur, Justin G.

The Dakota toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the Dakota toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers.

srMO-BO-3GP: A sequential regularized multi-objective Bayesian optimization for constrained design applications using an uncertain Pareto classifier

Journal of Mechanical Design

Foulk, James W.; Eldred, Michael; Mccann, Scott; Wang, Yan

Bayesian optimization (BO) is an efficient and flexible global optimization framework that is applicable to a very wide range of engineering applications. To leverage the capability of the classical BO, many extensions, including multi-objective, multi-fidelity, parallelization, and latent-variable modeling, have been proposed to address the limitations of the classical BO framework. In this work, we propose a novel multi-objective BO formalism, called srMO-BO-3GP, to solve multi-objective optimization problems in a sequential setting. Three different Gaussian processes (GPs) are stacked together, where each of the GPs is assigned a different task. The first GP is used to approximate a single objective computed from the multi-objective definition, the second GP is used to learn the unknown constraints, and the third one is used to learn the uncertain Pareto frontier. At each iteration, a multi-objective augmented Tchebycheff function is adopted to convert the multi-objective problem to a single-objective one, where regularization with a ridge term is also introduced to smooth the single-objective function. Finally, we couple the third GP along with the classical BO framework to explore the convergence and diversity of the Pareto frontier by the acquisition function for exploitation and exploration. The proposed framework is demonstrated using several numerical benchmark functions, as well as a thermomechanical finite element model for flip-chip package design optimization.
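
The scalarization step mentioned in the abstract can be illustrated with the standard augmented Tchebycheff function, which combines the worst weighted deviation from an ideal point with a small weighted-sum term. The sketch below uses one common textbook form and is an assumption about the general technique, not the paper's exact formulation: the ridge regularization described in the abstract is omitted, and the function name, the `rho` default, and the sample values are hypothetical.

```python
def augmented_tchebycheff(objectives, weights, ideal, rho=0.05):
    """Collapse several objective values into one scalar (to be minimized).

    objectives: objective values f_i(x) at a candidate design x
    weights:    non-negative weights w_i, one per objective
    ideal:      reference (ideal) point z_i, one per objective
    rho:        small positive factor on the weighted-sum augmentation
    """
    if not (len(objectives) == len(weights) == len(ideal)):
        raise ValueError("objectives, weights, and ideal must have equal length")
    terms = [w * abs(f - z) for f, w, z in zip(objectives, weights, ideal)]
    # The max term drives the search toward the Pareto front; the sum term
    # breaks ties so that weakly Pareto-optimal points are not preferred.
    return max(terms) + rho * sum(terms)


# Hypothetical two-objective example with equal weights.
print(augmented_tchebycheff([2.0, 1.0], [0.5, 0.5], [0.0, 0.0], rho=0.1))
```

Varying the weight vector from one iteration to the next is the usual way such a scalarization is used to reach different regions of the Pareto frontier.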

Computational Offload with BlueField Smart NICs

Karamati, Sara; Young, Jeffrey; Conte, Tom; Hemmert, Karl S.; Grant, Ryan; Hughes, Clayton; Vuduc, Rich

The recent introduction of a new generation of "smart NICs" has provided new accelerator platforms that include CPU cores or reconfigurable fabric in addition to traditional networking hardware and packet offloading capabilities. While there are currently several proposals for using these smartNICs for low-latency, in-line packet processing operations, there remains a gap in knowledge as to how they might be used as computational accelerators for traditional high-performance applications. This work aims to look at benchmarks and mini-applications to evaluate possible benefits of using a smartNIC as a compute accelerator for HPC applications. We investigate NVIDIA's current-generation BlueField-2 card, which includes eight Arm CPUs along with a small amount of storage, and we test the networking and data movement performance of these cards compared to a standard Intel server host. We then detail how two different applications, YASK and miniMD, can be modified to make more efficient use of the BlueField-2 device with a focus on overlapping computation and communication for operations like neighbor building and halo exchanges. Our results show that while the overall compute performance of these devices is limited, using them with a modified miniMD algorithm allows for potential speedups of 5 to 20% over the host CPU baseline with no loss in simulation accuracy.

An introduction to neuromorphic computing and its potential impact for unattended ground sensors

Hill, Aaron; Vineyard, Craig M.

Neuromorphic computers are hardware systems that mimic the brain’s computational process phenomenology. This is in contrast to neural network accelerators, such as the Google TPU or the Intel Neural Compute Stick, which seek to accelerate the fundamental computation and data flows of neural network models used in the field of machine learning. Neuromorphic computers emulate the integrate and fire neuron dynamics of the brain to achieve a spiking communication architecture for computation. While neural networks are brain-inspired, they drastically oversimplify the brain’s computation model. Neuromorphic architectures are closer to the true computation model of the brain (albeit, still simplified). Neuromorphic computing models herald a 1000x power improvement over conventional CPU architectures. Sandia National Labs is a major contributor to the research community on neuromorphic systems by performing design analysis, evaluation, and algorithm development for neuromorphic computers. Space-based remote sensing development has been a focused target of funding for exploratory research into neuromorphic systems for their potential advantage in that program area; SNL has led some of these efforts. Recently, neuromorphic application evaluation has reached the NA-22 program area. This same exploratory research and algorithm development should penetrate the unattended ground sensor space for SNL’s mission partners and program areas. Neuromorphic computing paradigms offer a distinct advantage for the SWaP-constrained embedded systems of our diverse sponsor-driven program areas.

Controlled Formation of Stacked Si Quantum Dots in Vertical SiGe Nanowires

Nano Letters

Turner, Emily M.; Campbell, Quinn; Pizarro, Joaquin; Yang, Hongbin; Sapkota, Keshab R.; Lu, Ping; Baczewski, Andrew D.; Wang, George T.; Jones, Kevin S.

We demonstrate the ability to fabricate vertically stacked Si quantum dots (QDs) within SiGe nanowires with QD diameters down to 2 nm. These QDs are formed during high-temperature dry oxidation of Si/SiGe heterostructure pillars, during which Ge diffuses along the pillars' sidewalls and encapsulates the Si layers. Continued oxidation results in QDs with sizes dependent on oxidation time. The formation of a Ge-rich shell that encapsulates the Si QDs is observed, a configuration which is confirmed to be thermodynamically favorable with molecular dynamics and density functional theory. The type-II band alignment of the Si dot/SiGe pillar suggests that charge trapping on the Si QDs is possible, and electron energy loss spectra show that a conduction band offset of at least 200 meV is maintained for even the smallest Si QDs. Our approach is compatible with current Si-based manufacturing processes, offering a new avenue for realizing Si QD devices.

An optimization-based strategy for peridynamic-FEM coupling and for the prescription of nonlocal boundary conditions

D'Elia, Marta; Bochev, Pavel B.; Perego, Mauro; Trageser, Jeremy; Littlewood, David J.

We develop and analyze an optimization-based method for the coupling of a static peridynamic (PD) model and a static classical elasticity model. The approach formulates the coupling as a control problem in which the states are the solutions of the PD and classical equations, the objective is to minimize their mismatch on an overlap of the PD and classical domains, and the controls are virtual volume constraints and boundary conditions applied at the local-nonlocal interface. Our numerical tests performed on three-dimensional geometries illustrate the consistency and accuracy of our method, its numerical convergence, and its applicability to realistic engineering geometries. We demonstrate the coupling strategy as a means to reduce computational expense by confining the nonlocal model to a subdomain of interest, and as a means to transmit local (e.g., traction) boundary conditions applied at a surface to a nonlocal model in the bulk of the domain.

A-SST Initial Specification

Rodrigues, Arun; Hammond, Simon; Hemmert, Karl S.; Hughes, Clayton; Kenny, Joseph; Voskuilen, Gwendolyn R.

The U.S. Army Research Office (ARO), in partnership with IARPA, is investigating innovative, efficient, and scalable computer architectures that are capable of executing next-generation large scale data-analytic applications. These applications are increasingly sparse, unstructured, non-local, and heterogeneous. Under the Advanced Graphic Intelligence Logical computing Environment (AGILE) program, Performer teams will be asked to design computer architectures to meet the future needs of the DoD and the Intelligence Community (IC). This design effort will require flexible, scalable, and detailed simulation to assess the performance, efficiency, and validity of their designs. To support AGILE, Sandia National Labs will be providing the AGILE-enhanced Structural Simulation Toolkit (A-SST). This toolkit is a computer architecture simulation framework designed to support fast, parallel, and multi-scale simulation of novel architectures. This document describes the A-SST framework, some of its library of simulation models, and how it may be used by AGILE Performers.
