Publications

Results 2801–2825 of 9,998

Search results

Computing with spikes: The advantage of fine-grained timing

Neural Computation

Verzi, Stephen J.; Rothganger, Fredrick R.; Parekh, Ojas D.; Quach, Tu T.; Miner, Nadine E.; Vineyard, Craig M.; James, Conrad D.; Aimone, James B.

Neural-inspired spike-based computing machines often claim to achieve considerable advantages in terms of energy and time efficiency by using spikes for computation and communication. However, fundamental questions about spike-based computation remain unanswered. For instance, how much advantage do spike-based approaches have over conventional methods, and under what circumstances does spike-based computing provide a comparative advantage? Simply implementing existing algorithms using spikes as the medium of computation and communication is not guaranteed to yield an advantage. Here, we demonstrate that spike-based communication and computation within algorithms can increase throughput, and they can decrease energy cost in some cases. We present several spiking algorithms, including sorting a set of numbers in ascending/descending order, as well as finding the maximum or minimum or median of a set of numbers. We also provide an example application: a spiking median-filtering approach for image processing providing a low-energy, parallel implementation. The algorithms and analyses presented here demonstrate that spiking algorithms can provide performance advantages and offer efficient computation of fundamental operations useful in more complex algorithms.
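
As a rough illustration of how spike timing alone can order numbers, the sketch below assumes a simple time-to-first-spike encoding: a neuron holding the non-negative integer v fires once at time step v, so the firing order is the sorted order, and a counter that reaches half the population marks the median. This is not the authors' algorithm; the function names `spike_schedule`, `spike_sort`, and `spike_median` are hypothetical.

```python
from collections import defaultdict


def spike_schedule(values):
    """Map each time step to the neurons that fire at that step.

    Assumed encoding (not from the paper): neuron i holds the
    non-negative integer values[i] and fires once at that time step.
    """
    schedule = defaultdict(list)
    for neuron, value in enumerate(values):
        schedule[value].append(neuron)
    return schedule


def spike_sort(values):
    """Return the values in ascending order by replaying spikes in time."""
    schedule = spike_schedule(values)
    order = []
    for t in range(max(values, default=-1) + 1):
        # Every neuron scheduled for step t fires now.
        order.extend(values[n] for n in schedule.get(t, []))
    return order


def spike_median(values):
    """Return the (lower) median: the step at which half the neurons have fired."""
    schedule = spike_schedule(values)
    threshold = (len(values) + 1) // 2
    fired = 0
    for t in range(max(values, default=-1) + 1):
        fired += len(schedule.get(t, []))
        if fired >= threshold:
            return t
    raise ValueError("spike_median requires at least one value")
```

In this toy model the running time grows with the largest value rather than with the number of comparisons, which hints at why timing resolution matters for spike-based approaches.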

Analysis of Microgrid Locations Benefitting Community Resilience for Puerto Rico

Jeffers, Robert; Staid, Andrea; Baca, Michael J.; Currie, Frank M.; Fogleman, William E.; Derosa, Sean; Wachtel, Amanda; Outkin, Alexander V.

An analysis of microgrids to increase resilience was conducted for the island of Puerto Rico. Critical infrastructure throughout the island was mapped to the key services provided by those sectors to help inform primary and secondary service sources during a major disruption to the electrical grid. Additionally, a resilience metric of burden was developed to quantify community resilience, and a related baseline resilience figure was calculated for the area. To improve resilience, Sandia performed an analysis of where clusters of critical infrastructure are located and used these suggested resilience node locations to create a portfolio of 159 microgrid options throughout Puerto Rico. The team then calculated the impact of these microgrids on the region's ability to provide critical services during an outage, and compared this impact to high-level estimates of cost for each microgrid to generate a set of efficient microgrid portfolios costing in the range of 218-917M dollars. This analysis is a refinement of the analysis delivered on June 01, 2018.

A survey of MPI usage in the US exascale computing project

Concurrency and Computation. Practice and Experience

Bernholdt, David E.; Boehm, Swen; Bosilca, George; Venkata, Manjunath G.; Grant, Ryan; Naughton, Thomas; Pritchard, Howard P.; Schulz, Martin; Vallee, Geoffroy R.

The Exascale Computing Project (ECP) is currently the primary effort in the United States focused on developing “exascale” levels of computing capabilities, including hardware, software, and applications. To obtain a more thorough understanding of how the software projects under the ECP are using, and planning to use, the Message Passing Interface (MPI), and to help guide the work of our own project within the ECP, we created a survey. Of the 97 ECP projects active at the time the survey was distributed, we received 77 responses, 56 of which reported that their projects were using MPI. This paper reports the results of that survey for the benefit of the broader community of MPI developers.

Topology Optimization for Nonlinear Transient Applications Using a Minimally Invasive Approach (LDRD Final Report)

Robbins, Joshua

The purpose of this project was to devise, implement, and demonstrate a method that can use Sandia's existing analysis codes (e.g., Sierra, Alegra, the CTH hydro code) with minimal modification to generate objective function gradients for optimization-based design in transient, non-linear, coupled-physics applications. The approach uses a Moving Least Squares representation of the geometry to substantially reduce the number of geometric degrees of freedom. A Multiple-Program Multiple-Data computing model is then used to compute objective gradients via finite differencing. Details of the formulation and implementation are provided, and example applications are presented that show effectiveness and scalability of the approach.
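
The gradient-by-finite-differencing idea can be sketched as follows, assuming a forward-difference formula and a cheap stand-in objective. Each perturbed evaluation is independent of the others, which is what makes it natural to farm them out to separate program instances; here a thread pool merely stands in for that. The names `objective` and `fd_gradient` are hypothetical, and nothing here reflects the actual Sandia codes.

```python
from concurrent.futures import ThreadPoolExecutor


def objective(x):
    """Hypothetical stand-in for an expensive transient simulation."""
    return sum((xi - 1.0) ** 2 for xi in x)


def fd_gradient(f, x, h=1e-6, workers=4):
    """Forward-difference gradient of f at x.

    One baseline evaluation plus one perturbed evaluation per design
    variable; the perturbed evaluations run concurrently because they
    do not depend on each other.
    """
    def perturbed(i):
        xp = list(x)
        xp[i] += h  # perturb only the i-th design variable
        return f(xp)

    f0 = f(x)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        shifted = list(pool.map(perturbed, range(len(x))))
    return [(fi - f0) / h for fi in shifted]
```

The cost is one extra analysis run per design variable, which is why reducing the number of geometric degrees of freedom matters for this kind of approach.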

Sierra/SolidMechanics 4.50 Theory Manual

Merewether, Mark T.; Plews, Julia A.; Crane, Nathan K.; De Frias, Gabriel J.; San LeSan; Littlewood, David J.; Mosby, Matthew D.; Pierson, Kendall H.; Porter, Vicki L.; Shelton, Timothy R.; Thomas, Jesse D.; Tupek, Michael R.; Veilleux, Michael G.; Xavier, Patrick G.; Manktelow, Kevin; Clutz, Christopher C.

Presented in this document are the theoretical aspects of capabilities contained in the Sierra / SM code. This manuscript serves as an ideal starting point for understanding the theoretical foundations of the code. For a comprehensive study of these capabilities, the reader is encouraged to explore the many references to scientific articles and textbooks contained in this manual. It is important to point out that some capabilities are still in development and may not be presented in this document. Further updates to this manuscript will be made as these capabilities come closer to production level.

Using simulation to examine the effect of MPI message matching costs on application performance

ACM International Conference Proceeding Series

Levy, Scott L.N.; Ferreira, Kurt

Attaining high performance with MPI applications requires efficient message matching to minimize message processing overheads and the latency these overheads introduce into application communication. In this paper, we use a validated simulation-based approach to examine the relationship between MPI message matching performance and application time-to-solution. Specifically, we examine how the performance of several important HPC workloads is affected by the time required for matching. Our analysis yields several important contributions: (i) the performance of current workloads is unlikely to be significantly affected by MPI matching unless match queue operations get much slower or match queues get much longer; (ii) match queue designs that provide sublinear performance as a function of queue length are unlikely to yield much benefit unless match queue lengths increase dramatically; and (iii) we provide guidance on how long the mean time per match attempt may be without significantly affecting application performance. The results and analysis in this paper provide valuable guidance on the design and development of MPI message match queues.

Adjoint-based Calibration of Plasticity Model Parameters from Digital Image Correlation Data

Granzow, Brian N.; Seidl, D.T.

Parameter estimation for mechanical models of plastic deformation utilized in nuclear weapons systems is a laborious process for both experimentalists and constitutive modelers and is critical to producing meaningful numerical predictions. In this work we derive an adjoint-based optimization approach for a stabilized, large-deformation J2 plasticity model that is considerably more computationally efficient but no less accurate than current state-of-the-art methods. Unlike most approaches to model calibration, we drive the inversion procedure with full-field deformation data that can be experimentally measured through established digital image or volume correlation techniques. We present numerical results for two- and three-dimensional model problems and comment on various directions of future research.

Exploring Applications of Random Walks on Spiking Neural Algorithms

Reeder, Leah; Hill, Aaron; Aimone, James B.; Severa, William M.

Neuromorphic computing holds considerable promise for the future of computing because of its energy-efficient and scalable implementation. Here we extend a neural algorithm that is able to solve the diffusion equation PDE by implementing random walks on neuromorphic hardware. Additionally, we introduce four random walk applications that use this spiking neural algorithm. The four applications currently implemented are: generating a random walk to replicate an image, finding a path between two nodes, finding triangles in a graph, and partitioning a graph into two sections. We then made these four applications available in software through a graphical user interface (GUI).
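
The link between random walks and the diffusion equation can be illustrated with a plain software sketch, assuming unbiased unit steps on a one-dimensional lattice: the spread of many independent walkers grows like that of a diffusing density, with variance equal to the number of steps. This is ordinary Monte Carlo in Python, not the spiking algorithm from the report, and the names `walk_positions` and `variance` are hypothetical.

```python
import random


def walk_positions(walkers, steps, seed=0):
    """Final positions of independent, unbiased 1-D random walkers."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    positions = []
    for _ in range(walkers):
        x = 0
        for _ in range(steps):
            x += rng.choice((-1, 1))  # step left or right with equal probability
        positions.append(x)
    return positions


def variance(samples):
    """Population variance of a list of numbers."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)
```

For unit steps per unit time, the diffusion coefficient is 1/2, so the variance after n steps should be close to n; a histogram of the positions approximates the corresponding Gaussian solution.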

ECP STPR04 Milestone 5 Report

Trott, Christian R.

This report documents the completion of milestone STPRO4-5 Kokkos interoperability with general SIMD types to force vectorization on ATS-1. The Kokkos team worked with application developers to enable the utilization of SIMD intrinsics, which allowed up to 3.7x improvement of the affected kernels on ATS-1 in a proxy application. SIMD types are now deployed in the production code base.

Kokkos R&D: Remote Memory Spaces WBS STPR 04 Milestone 7

Trott, Christian R.

This report documents the completion of milestone STPRO4-7 Kokkos R&D: Remote Memory Spaces for One-Sided Halo-Exchange. The goal of this milestone was to develop and deploy an initial capability to support PGAS-like communication models integrated into Kokkos via Remote Memory Spaces. The team developed semantic requirements for Remote Memory Spaces and implemented a prototype library leveraging four different communication libraries: libQUO, SHMEM, MPI-OneSided and NVSHMEM. In conjunction with ADCD02-COPA, the Remote Memory Space capability was used in ExaMiniMD, a Molecular Dynamics Proxy Application, to explore the current state of the technology and its usability. The obtained results demonstrate that usability is very good, allowing a significant simplification of communication routines, but performance is still lacking.

STPR 04 Milestone 6 Report

Trott, Christian R.; Ibanez-Granados, Daniel A.; Ellingwood, Nathan D.; Bova, Steven W.; Labreche, Duane A.

This report documents the completion of milestone STPRO4-6 Kokkos Support for ASC applications and libraries. The team provided consultation and support for numerous ASC code projects including Sandia's SPARC, EMPIRE, Aria, GEMMA, Alexa, Trilinos, LAMMPS and nimbleSM. Over the year more than 350 Kokkos GitHub issues were resolved, with over 220 requiring fixes and enhancements to the code base. Resolving these requests, many of them issued by ASC code teams, provided applications with the necessary capabilities in Kokkos to be successful.

WBS STPR 04 Milestone 4 Report

Sunderland, Daniel; Hoemmen, Mark F.; Trott, Christian R.

This report documents the completion of milestone STPRO4-4 Kokkos back-ends research, collaborations, development, optimization, and documentation. The Kokkos team updated its existing backend to support the software stack and hardware of DOE's Sierra, Summit and Astra machines. They also collaborated with ECP PathForward vendors on developing backends for possible exa-scale architectures. Furthermore, the team ramped up its engagement with the ISO/C++ committee to accelerate the adoption of features important for the HPC community into the C++ standard.

WBS STPR 04 Milestone 4 Report

Trott, Christian R.; Sunderland, Daniel; Hoemmen, Mark F.

This report documents the completion of milestone STPRO4-4 Kokkos back-ends research, collaborations, development, optimization, and documentation. The Kokkos team updated its existing backend to support the software stack and hardware of DOE's Sierra, Summit and Astra machines. They also collaborated with ECP PathForward vendors on developing backends for possible exa-scale architectures. Furthermore, the team ramped up its engagement with the ISO/C++ committee to accelerate the adoption of features important for the HPC community into the C++ standard.

ASC ATDM Level 2 Milestone #6358: Assess Status of Next Generation Components and Physics Models in EMPIRE

Bettencourt, Matthew T.; Kramer, Richard M.J.; Cartwright, Keith; Phillips, Edward; Ober, Curtis C.; Pawlowski, Roger; Swan, Matthew S.; Tezaur, Irina K.; Phipps, Eric T.; Conde, Sidafa; Cyr, Eric C.; Ulmer, Craig; Kordenbrock, Todd; Levy, Scott L.N.; Templet, Gary J.; Hu, Jonathan J.; Lin, Paul T.; Glusa, Christian; Siefert, Christopher; Glass, Micheal W.

This report documents the outcome from the ASC ATDM Level 2 Milestone 6358: Assess Status of Next Generation Components and Physics Models in EMPIRE. This Milestone is an assessment of the EMPIRE (ElectroMagnetic Plasma In Realistic Environments) application and three software components. The assessment focuses on the electromagnetic and electrostatic particle-in-cell solutions for EMPIRE and its associated solver, time integration, and checkpoint-restart components. This information provides a clear understanding of the current status of the EMPIRE application and will help to guide future work in FY19 in order to ready the application for the ASC ATDM L1 Milestone in FY20. It is clear from this assessment that performance of the linear solver will have to be a focus in FY19.

Characterizing MPI matching via trace-based simulation

Parallel Computing

Ferreira, Kurt; Levy, Scott L.N.; Foulk, James W.; Grant, Ryan

With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads and microbenchmarks, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
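
To make the posted and unexpected queues concrete, here is a minimal sketch of list-based matching that records a search length for every lookup. It assumes a single communicator, matches on (source, tag) only, and uses `ANY` as a stand-in for the MPI wildcards; the class `MatchQueues` and its methods are hypothetical and do not model any particular MPI library or the simulator described in the paper.

```python
ANY = -1  # stand-in for MPI_ANY_SOURCE / MPI_ANY_TAG in this sketch


class MatchQueues:
    """Toy model of MPI posted-receive and unexpected-message queues."""

    def __init__(self):
        self.posted = []          # receives waiting for a message
        self.unexpected = []      # messages waiting for a receive
        self.search_lengths = []  # entries inspected per lookup

    @staticmethod
    def _matches(recv, msg):
        recv_source, recv_tag = recv
        msg_source, msg_tag = msg
        return recv_source in (ANY, msg_source) and recv_tag in (ANY, msg_tag)

    def _search(self, queue, predicate):
        # Linear scan in arrival order; the first match wins and is removed.
        for depth, item in enumerate(queue, start=1):
            if predicate(item):
                self.search_lengths.append(depth)
                queue.pop(depth - 1)
                return item
        self.search_lengths.append(len(queue))  # a miss walks the whole queue
        return None

    def post_recv(self, source, tag):
        """Post a receive: check unexpected messages first, else enqueue it."""
        recv = (source, tag)
        hit = self._search(self.unexpected, lambda m: self._matches(recv, m))
        if hit is None:
            self.posted.append(recv)
        return hit

    def arrive(self, source, tag):
        """Handle an incoming message: check posted receives, else enqueue it."""
        msg = (source, tag)
        hit = self._search(self.posted, lambda r: self._matches(r, msg))
        if hit is None:
            self.unexpected.append(msg)
        return hit
```

Replaying a recorded sequence of `post_recv` and `arrive` calls through such a model yields search-length statistics without touching the running application, which is the general appeal of a trace-driven approach.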

Recent Diagnostic Platform Accomplishments for Studying Vacuum Power Flow Physics at the Sandia Z Accelerator

Laity, George R.; Aragon, Carlos; Bennett, Nichelle L.; Bliss, David E.; Foulk, James W.; Fierro, Andrew S.; Gomez, Matthew R.; Hess, Mark H.; Hutsel, Brian T.; Jennings, Christopher A.; Johnston, Mark D.; Kossow, Michael R.; Lamppa, Derek C.; Martin, Matthew R.; Patel, Sonal G.; Porwitzky, Andrew J.; Robinson, Allen C.; Rose, David; Vandevender, Pace; Waisman, Eduardo M.; Webb, Timothy J.; Welch, Dale; Rochau, G.A.; Savage, Mark E.; Stygar, William; White, William M.; Sinars, Daniel; Cuneo, Michael E.

Abstract not provided.

Optimal Design and Control of Qubits

Von Winckel, Gregory

Research interest in developing computing systems that represent logic states using quantum mechanical observables has only increased in the few decades since its inception. While quantum computers with Josephson-junction-based qubits have become commercially available in the last three years, there is also a significant research initiative to develop scalable quantum computers with so-called donor qubits. B.E. Kane first published on a device implementation of a silicon-based quantum computer in 1998, which sparked a wave of follow-on advances due to the attractive nature of silicon-based computing [7]. Nearly all commercial computing systems using classical binary logic are fabricated using a silicon substrate, and it is inarguably the most mature material system for semiconductor devices, so that coupling classical and quantum bits on a single substrate is possible. The process of growing and processing silicon crystals into wafers is extremely robust and leads to minimal impurities or structural defects.

Born Qualified Grand Challenge LDRD Final Report

Roach, Robert A.; Argibay, Nicolas; Allen, Kyle; Balch, Dorian K.; Beghini, Lauren L.; Bishop, Joseph E.; Boyce, Brad L.; Brown, Judith A.; Burchard, Ross L.; Chandross, Michael E.; Cook, Adam; Diantonio, Christopher; Dressler, Amber D.; Forrest, Eric C.; Ford, Kurtis; Ivanoff, Thomas; Jared, Bradley H.; Johnson, Kyle L.; Kammler, Daniel; Koepke, Joshua R.; Kustas, Andrew B.; Lavin, Judith M.; Leathe, Nicholas S.; Lester, Brian T.; Madison, Jonathan D.; Mani, Seethambal; Martinez, Mario J.; Moser, Daniel R.; Rodgers, Theron M.; Seidl, D.T.; Brown-Shaklee, Harlan J.; Stanford, Joshua; Stender, Michael; Sugar, Joshua D.; Swiler, Laura P.; Taylor, Samantha; Trembacki, Bradley L.

This SAND report fulfills the final report requirement for the Born Qualified Grand Challenge LDRD. Born Qualified was funded from FY16-FY18 with a total budget of ~$13M over the 3 years of funding. Overall 70+ staff, Post Docs, and students supported this project over its lifetime. The driver for Born Qualified was using Additive Manufacturing (AM) to change the qualification paradigm for low volume, high value, high consequence, complex parts that are common in high-risk industries such as ND, defense, energy, aerospace, and medical. AM offers the opportunity to transform design, manufacturing, and qualification with its unique capabilities. AM is a disruptive technology, allowing the capability to simultaneously create part and material while tightly controlling and monitoring the manufacturing process at the voxel level, with the inherent flexibility and agility in printing layer-by-layer. AM enables the possibility of measuring critical material and part parameters during manufacturing, thus changing the way we collect data, assess performance, and accept or qualify parts. It provides an opportunity to shift from the current iterative design-build-test qualification paradigm using traditional manufacturing processes to design-by-predictivity where requirements are addressed concurrently and rapidly. The new qualification paradigm driven by AM provides the opportunity to predict performance probabilistically, to optimally control the manufacturing process, and to implement accelerated cycles of learning. Exploiting these capabilities to realize a new uncertainty quantification-driven qualification that is rapid, flexible, and practical is the focus of this effort.
