Publications Search

A Test Platform to Characterize Emerging Nonvolatile Memories for Computing

Agarwal, Sapan; Wilson, Donald E.; Gilbert, Nad; Spear, Matthew E.; Short, Jesse C.; Bennett, Christopher; Wahby, William; Kim, Joshua; Jacobs-Gedrim, Robin B.; Xiao, Tianyao P.; Marinella, Matthew

Abstract not provided.

TYPE Conference Presentation YEAR 2024

TCAM-SSD: A Framework for Search-Based Computing in Solid-State Drives

Wong, Ryan; Kim, Nikita; Higgs, Kevin; Ipek, Engin; Agarwal, Sapan; Ghose, Saugata; Feinberg, Benjamin

Abstract not provided.

TYPE Conference Presentation YEAR 2024

Biological Dynamics Enabling Training of Binary Recurrent Networks

2024 IEEE Neuro Inspired Computational Elements Conference, NICE 2024 - Proceedings

Foulk, James W.; Agarwal, Sapan; Xiao, Tianyao P.; Hays, Park E.; Musuvathy, Srideep S.

Neuromorphic computing systems have been used for the processing of spatiotemporal video-like data, requiring the use of recurrent networks, while attempting to minimize power consumption by utilizing binary activation functions. However, previous work on binary activation networks has primarily focused on training of feed-forward networks due to difficulties in training recurrent binary networks. Spiking neural networks, however, have been successfully trained in recurrent networks, despite the fact that they operate with binary communication. Intrigued by this discrepancy, we design a generalized leaky integrate-and-fire neuron which can be deconstructed to a binary activation unit, allowing us to investigate the minimal dynamics from a spiking network that are required to allow binary activation networks to be trained. We find that a subthreshold integrative membrane potential is the only requirement to allow an otherwise standard binary activation unit to be trained in a recurrent network. Investigating the trained networks further, we find that these stateful binary networks learn a soft reset mechanism by recurrent weights, allowing them to approximate the explicit reset of spiking networks.
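
As a rough illustration of the "stateful binary unit" idea described in this abstract, the sketch below integrates inputs into a leaky membrane potential and thresholds it to a binary output, with no explicit reset. The function name, the `leak` and `threshold` parameters, and the update rule are assumptions made for illustration; they are not taken from the paper.

```python
def run_stateful_binary_unit(inputs, leak=0.9, threshold=1.0):
    """Integrate inputs into a leaky potential and emit binary outputs.

    Hypothetical sketch: the potential persists between steps (the
    subthreshold integrative state) and is never explicitly reset.
    """
    potential = 0.0
    outputs = []
    for x in inputs:
        # Leaky integration: decay the previous state, then add the input.
        potential = leak * potential + x
        # Binary activation on the integrated state.
        outputs.append(1 if potential >= threshold else 0)
    return outputs


if __name__ == "__main__":
    print(run_stateful_binary_unit([0.5, 0.5, 0.5], leak=1.0))
```

In a recurrent network, a negative self-connection could play the role of the learned soft reset the abstract mentions; that part is not modeled here.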

TYPE Conference Paper YEAR 2024

Bayesian Neural Network Implemented by Dynamically Programmable Noise in Vanadium Oxide

Oh, Sangheon; Xiao, Tianyao P.; Bennett, Christopher; Weiss, Alex J.; Bishop, Sean R.; Finnegan, Patrick S.; Fuller, Elliot J.; Agarwal, Sapan; Talin, Albert A.

Abstract not provided.

TYPE Conference Presentation YEAR 2023

Explainable machine learning for hydrogen diffusion in metals and random binary alloys

Physical Review Materials

Lu, Grace M.; Witman, Matthew D.; Agarwal, Sapan; Stavila, Vitalie; Trinkle, Dallas R.

Hydrogen diffusion in metals and alloys plays an important role in the discovery of new materials for fuel cell and energy storage technology. While analytic models use hand-selected features that have clear physical ties to hydrogen diffusion, they often lack accuracy when making quantitative predictions. Machine learning models are capable of making accurate predictions, but their inner workings are obscured, rendering it unclear which physical features are truly important. To develop interpretable machine learning models to predict the activation energies of hydrogen diffusion in metals and random binary alloys, we create a database for physical and chemical properties of the species and use it to fit six machine learning models. Our models achieve root-mean-squared errors between 98 and 119 meV on the testing data and accurately predict that elemental Ru has a large activation energy, while elemental Cr and Fe have small activation energies. By analyzing the feature importances of these fitted models, we identify relevant physical properties for predicting hydrogen diffusivity. While metrics for measuring the individual feature importances for machine learning models exist, correlations between the features lead to disagreement between models and limit the conclusions that can be drawn. Instead, grouped feature importances, formed by combining the features via their correlations, agree across the six models and reveal that the two groups containing the packing factor and electronic specific heat are particularly significant for predicting hydrogen diffusion in metals and random binary alloys. This framework allows us to interpret machine learning models and enables rapid screening of new materials with the desired rates of hydrogen diffusion.

TYPE Journal Article YEAR 2023

Towards Pareto optimal high entropy hydrides via data-driven materials discovery

Journal of Materials Chemistry A

Witman, Matthew D.; Ling, Sanliang; Wadge, Matthew; Bouzidi, Anis; Pineda-Romero, Nayely; Clulow, Rebecca; Ek, Gustav; Chames, Jeffery M.; Allendorf, Emily J.; Agarwal, Sapan; Allendorf, Mark; Walker, Gavin S.; Grant, David M.; Sahlberg, Martin; Zlotea, Claudia; Stavila, Vitalie

The ability to rapidly screen material performance in the vast space of high entropy alloys is of critical importance to efficiently identify optimal hydride candidates for various use cases. Given the prohibitive complexity of first principles simulations and large-scale sampling required to rigorously predict hydrogen equilibrium in these systems, we turn to compositional machine learning models as the most feasible approach to screen on the order of tens of thousands of candidate equimolar high entropy alloys (HEAs). Critically, we show that machine learning models can predict hydride thermodynamics and capacities with reasonable accuracy (e.g. a mean absolute error in desorption enthalpy prediction of ∼5 kJ mol$_{\mathrm{H}_2}^{-1}$) and that explainability analyses capture the competing trade-offs that arise from feature interdependence. We can therefore elucidate the multi-dimensional Pareto optimal set of materials, i.e., where two or more competing objective properties can't be simultaneously improved by another material. This provides rapid and efficient down-selection of the highest priority candidates for more time-consuming density functional theory investigations and experimental validation. Various targets were selected from the predicted Pareto front (with saturation capacities approaching two hydrogen per metal and desorption enthalpy less than 60 kJ mol$_{\mathrm{H}_2}^{-1}$) and were experimentally synthesized, characterized, and tested amongst an international collaboration group to validate the proposed novel hydrides. Additional top-predicted candidates are suggested to the community for future synthesis efforts, and we conclude with an outlook on improving the current approach for the next generation of computational HEA hydride discovery efforts.
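
The Pareto optimal set mentioned in this abstract can be illustrated with a minimal sketch: a candidate is kept when no other candidate is at least as good in every objective and strictly better in at least one. The function names, the convention that larger values are better, and the sample numbers are assumptions for illustration only and do not come from the paper.

```python
def dominates(a, b):
    """Return True if a dominates b (larger is better in every objective)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (objective_1, objective_2) scores, not real materials data.
    candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
    print(pareto_front(candidates))
```

An objective that should be minimized (for example, an enthalpy target) can be negated before calling `pareto_front`. This brute-force comparison is quadratic in the number of candidates, which is typically manageable for tens of thousands of points but is not tuned for larger sets.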

TYPE Journal Article YEAR 2023

Parallel Matrix Multiplication Using Voltage-Controlled Magnetic Anisotropy Domain Wall Logic

IEEE Journal on Exploratory Solid-State Computational Devices and Circuits

Zogbi, Nicholas; Liu, Samuel; Bennett, Christopher; Agarwal, Sapan; Marinella, Matthew J.; Incorvia, Jean A.C.; Xiao, Tianyao P.

The domain wall-magnetic tunnel junction (DW-MTJ) is a versatile device that can simultaneously store data and perform computations. These three-terminal devices are promising for digital logic due to their nonvolatility, low-energy operation, and radiation hardness. Here, we augment the DW-MTJ logic gate with voltage-controlled magnetic anisotropy (VCMA) to improve the reliability of logical concatenation in the presence of realistic process variations. VCMA creates potential wells that allow for reliable and repeatable localization of domain walls (DWs). The DW-MTJ logic gate supports different fanouts, allowing for multiple inputs and outputs for a single device without affecting the area. We simulate a systolic array of DW-MTJ multiply-accumulate (MAC) units with 4-bit and 8-bit precision, which uses the nonvolatility of DW-MTJ logic gates to enable fine-grained pipelining and high parallelism. The DW-MTJ systolic array provides comparable throughput and efficiency to state-of-the-art CMOS systolic arrays while being radiation-hard. These results improve the feasibility of using DW-based processors, especially for extreme-environment applications such as space.

TYPE Journal Article YEAR 2023

Enabling High-Speed, High-Resolution Space-based Focal Plane Arrays with Analog In-Memory Computing Presentation Slides

Xiao, Tianyao P.; Wahby, William; Bennett, Christopher; Hays, Park E.; Agrawal, Vineet; Marinella, Matthew; Agarwal, Sapan

Abstract not provided.

TYPE Conference Presentation YEAR 2023

ATHENA: An Analytical Analog Neuromorphic Hardware Estimation Tool

Plagge, Mark; Cardwell, Suma G.; Hughes, Clayton; Agarwal, Sapan

Abstract not provided.

TYPE Conference Paper YEAR 2022

Subthreshold operation of SONOS analog memory to enable accurate low-power neural network inference

Xiao, Tianyao P.; Bennett, Christopher; Feinberg, Benjamin; Marinella, Matthew; Agarwal, Sapan

Abstract not provided.

TYPE Conference Presentation YEAR 2022

ATHENA: Enabling Codesign for Next-Generation AI/ML Architectures

Plagge, Mark; Feinberg, Benjamin; Rothganger, Fredrick R.; Agarwal, Sapan; Hughes, Clayton; Cardwell, Suma G.

Abstract not provided.

TYPE Conference Presentation YEAR 2022

ATHENA: Analytical Tool for Heterogeneous Neuromorphic Architectures

Cardwell, Suma G.; Plagge, Mark; Hughes, Clayton; Rothganger, Fredrick R.; Agarwal, Sapan; Feinberg, Benjamin; Awad, Amro; Mcfarland, John; Parker, Luke

The ASC program seeks to use machine learning to improve efficiencies in its stockpile stewardship mission. Moreover, there is a growing market for technologies dedicated to accelerating AI workloads. Many of these emerging architectures promise to provide savings in energy efficiency, area, and latency when compared to traditional CPUs for these types of applications — neuromorphic analog and digital technologies provide both low-power and configurable acceleration of challenging artificial intelligence (AI) algorithms. If designed into a heterogeneous system with other accelerators and conventional compute nodes, these technologies have the potential to augment the capabilities of traditional High Performance Computing (HPC) platforms [5]. This expanded computation space requires not only a new approach to physics simulation, but the ability to evaluate and analyze next-generation architectures specialized for AI/ML workloads in both traditional HPC and embedded ND applications. Developing this capability will enable ASC to understand how this hardware performs in both HPC and ND environments, improve our ability to port our applications, guide the development of computing hardware, and inform vendor interactions, leading them toward solutions that address ASC’s unique requirements.

TYPE SAND Report YEAR 2022

Probabilistic Nanomagnetic Memories for Uncertain and Robust Machine Learning

Bennett, Christopher; Xiao, Tianyao P.; Liu, Samuel; Humphrey, Leonard; Incorvia, Jean A.; Debusschere, Bert; Ries, Daniel; Agarwal, Sapan

This project evaluated the use of emerging spintronic memory devices for robust and efficient variational inference schemes. Variational inference (VI) schemes, which constrain the distribution for each weight to be a Gaussian distribution with a mean and standard deviation, are a tractable method for calculating posterior distributions of weights in a Bayesian neural network such that this neural network can also be trained using the powerful backpropagation algorithm. Our project focuses on domain-wall magnetic tunnel junctions (DW-MTJs), a powerful multi-functional spintronic synapse design that can achieve low power switching while also opening the pathway towards repeatable, analog operation using fabricated notches. Our initial efforts to employ DW-MTJs as an all-in-one stochastic synapse with both a mean and standard deviation didn’t end up meeting the quality metrics for hardware-friendly VI. In the future, new device stacks and methods for expressive anisotropy modification may make this idea still possible. However, as a fallback that immediately satisfies our requirements, we invented and detailed how the combination of a DW-MTJ synapse encoding the mean and a probabilistic Bayes-MTJ device, programmed via a ferroelectric or ionically modifiable layer, can robustly and expressively implement VI. This design includes a physics-informed small circuit model that was scaled up to perform and demonstrate rigorous uncertainty quantification applications, up to and including small convolutional networks on a grayscale image classification task, and larger (Residual) networks implementing multi-channel image classification. Lastly, as these results and ideas all depend upon the idea of an inference application where weights (spintronic memory states) remain non-volatile, the retention of these synapses for the notched case was further interrogated. These investigations revealed and emphasized the importance of both notch geometry and anisotropy modification in order to further enhance the endurance of written spintronic states. In the near future, these results will be mapped to effective predictions for room temperature and elevated operation DW-MTJ memory retention, and experimentally verified when devices become available.

TYPE SAND Report YEAR 2022

Modeling Analog Tile-Based Accelerators Using SST

Feinberg, Benjamin; Agarwal, Sapan; Plagge, Mark; Rothganger, Fredrick R.; Cardwell, Suma G.; Hughes, Clayton

Analog computing has been widely proposed to improve the energy efficiency of multiple important workloads, including neural network operations and other linear algebra kernels. To properly evaluate analog computing and explore more complex workloads such as systems consisting of multiple analog data paths, system level simulations are required. Moreover, prior work on system architectures for analog computing often relies on custom simulators, creating significant additional design effort and complicating comparisons between different systems. To remedy these issues, this report describes the design and implementation of a flexible tile-based analog accelerator element for the Structural Simulation Toolkit (SST). The element focuses heavily on the tile controller, an often neglected aspect of prior work, which is sufficiently versatile to simulate a wide range of different tile operations including neural network layers, signal processing kernels, and generic linear algebra operations without major constraints. The tile model also interoperates with existing SST memory and network models to reduce the overall development load and enable future simulation of heterogeneous systems with both conventional digital logic and analog compute tiles. Finally, both the tile and array models are designed to easily support future extensions as new analog operations and applications that can benefit from analog computing are developed.

TYPE SAND Report YEAR 2022

Single Event Upset and Total Ionizing Dose Response of 12LP FinFET Digital Circuits

Spear, Matthew; Wallace, Trace; Wilson, Donald E.; Solano, Jose; Irumva, Gedeon; Esqueda, Ivan S.; Barnaby, Hugh J.; Clark, Lawrence; Brunhaver, John; Turowski, Marek; Mikkola, Esko; Hughart, David R.; Young, Joshua M.; Manuel, Jack; Agarwal, Sapan; Vaandrager, Bastiaan L.; Vizkelethy, Gyorgy; Gutierrez, Amos; Trippe, James; King, Michael P.; Bielejec, Edward S.; Marinella, Matthew

Abstract not provided.

TYPE Conference Poster YEAR 2022

Single Event Upset and Total Ionizing Dose Response of 12LP FinFET Digital Circuits

Spear, Matthew; Wallace, Trace; Wilson, Donald E.; Solano, Jose; Irumva, Gedeon; Esqueda, Ivan S.; Barnaby, Hugh J.; Clark, Lawrence; Brunhaver, John; Turowski, Marek; Mikkola, Esko; Hughart, David R.; Young, Joshua M.; Manuel, Jack; Agarwal, Sapan; Vaandrager, Bastiaan L.; Vizkelethy, Gyorgy; King, Michael P.; Marinella, Matthew

Abstract not provided.

TYPE Conference Proceeding YEAR 2022

CrossSim Inference Manual v2.0

Xiao, Tianyao P.; Bennett, Christopher; Feinberg, Benjamin; Marinella, Matthew; Agarwal, Sapan

Neural networks are largely based on matrix computations. During forward inference, the most heavily used compute kernel is the matrix-vector multiplication (MVM): $W\vec{x}$. Inference is a first frontier for the deployment of next-generation hardware for neural network applications, as it is more readily deployed in edge devices, such as mobile devices or embedded processors with size, weight, and power constraints. Inference is also easier to implement in analog systems than training, which has more stringent device requirements. The main processing kernel used during inference is the MVM.
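
For readers unfamiliar with the kernel named in this abstract, the sketch below computes an MVM $W\vec{x}$ in plain Python, where each output element is the dot product of one row of $W$ with $\vec{x}$. The function name `matvec` is hypothetical and this is an ideal digital reference only; it does not use or represent the CrossSim API or any analog hardware behavior.

```python
def matvec(W, x):
    """Return the matrix-vector product W x for a list-of-rows matrix W."""
    if any(len(row) != len(x) for row in W):
        raise ValueError("every row of W must have the same length as x")
    # Each output element is the dot product of one weight row with x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


if __name__ == "__main__":
    W = [[1, 2], [3, 4]]
    x = [1, 1]
    print(matvec(W, x))
```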

TYPE Other Report YEAR 2022

Characterization of Memory Devices for Energy Efficient Analog In-Memory Neural Computing at the Edge

Marinella, Matthew; Xiao, Tianyao P.; Bennett, Christopher; Wahby, William; Jacobs-Gedrim, Robin B.; Hughart, David R.; Fuller, Elliot J.; Talin, Albert A.; Agarwal, Sapan

Abstract not provided.

TYPE Conference Presentation YEAR 2022

An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

IEEE Transactions on Circuits and Systems I: Regular Papers

Xiao, Tianyao P.; Feinberg, Benjamin; Bennett, Christopher; Agrawal, Vineet; Saxena, Prashant; Prabhakar, Venkatraman; Ramkumar, Krishnaswamy; Medu, Harsha; Raghavan, Vijay; Chettuvetty, Ramesh; Agarwal, Sapan; Marinella, Matthew

We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.

TYPE Journal Article YEAR 2022

Ionic floating-gate memory for neuromorphic computing

Fuller, Elliot J.; Keene, S.T.; Melianas, A.; Wang, Z.; Agarwal, Sapan; Tuchman, Y.; James, Conrad D.; Marinella, Matthew; Yang, J.J.; Salleo, A.; Talin, Albert A.

Abstract not provided.

TYPE Conference Poster YEAR 2022

Single-Event Effects Induced by Heavy Ions in SONOS Charge Trapping Memory Arrays

IEEE Transactions on Nuclear Science

Xiao, Tianyao P.; Bennett, Christopher; Agarwal, Sapan; Hughart, David R.; Barnaby, Hugh J.; Puchner, Helmut; Talin, Albert A.; Marinella, Matthew

We investigate the sensitivity of silicon-oxide-nitride-oxide-silicon (SONOS) charge trapping memory technology to heavy-ion induced single-event effects. Threshold voltage ($V_T$) statistics were collected across multiple test chips that contained in total 18 Mb of 40-nm SONOS memory arrays. The arrays were irradiated with Kr and Ar ion beams, and the changes in their $V_T$ distributions were analyzed as a function of linear energy transfer (LET), beam fluence, and operating temperature. We observe that heavy ion irradiation induces a tail of disturbed devices in the 'program' state distribution, which has also been seen in the response of floating-gate (FG) flash cells. However, the $V_T$ distribution of SONOS cells lacks a distinct secondary peak, which is generally attributed to direct ion strikes to the gate-stack of FG cells. This property, combined with the observed change in the $V_T$ distribution with LET, suggests that SONOS cells are not particularly sensitive to direct ion strikes but cells in the proximity of an ion's absorption can still experience a $V_T$ shift. These results shed new light on the physical mechanisms underlying the $V_T$ shift induced by a single heavy ion in scaled charge trap memory.

TYPE Conference Presentation YEAR 2022

Domain wall-magnetic tunnel junction synapses for Bayesian neural networks

Liu, Samuel; Xiao, Tianyao P.; Agarwal, Sapan; Debusschere, Bert; Bennett, Christopher; Incorvia, Jean A.

Abstract not provided.

TYPE Presentation YEAR 2022

Achieving Accurate In-Memory Neural Network Inference with Highly Overlapping Nonvolatile Memory State Distributions

Marinella, Matthew; Xiao, Tianyao P.; Feinberg, Benjamin; Bennett, Christopher; Agrawal, Vineet; Puchner, Helmut; Agarwal, Sapan

Abstract not provided.

TYPE Conference Presentation YEAR 2022

Energy Efficient Deep Neural Network Processing: Digital CMOS Limits and Prospects for Analog In-Memory Computing

Marinella, Matthew; Xiao, Tianyao P.; Bennett, Christopher; Feinberg, Benjamin; Wahby, William; Jacobs-Gedrim, Robin B.; Agrawal, Vineet; Puchner, Helmut; Barnaby, Hugh; Agarwal, Sapan

Abstract not provided.

TYPE Conference Presentation YEAR 2022

Eris: Fault Injection and Tracking Framework for Reliability Analysis of Open-Source Hardware

Proceedings - 2022 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2022

Nema, Shubham; Kirschner, Justin; Adak, Debpratim; Agarwal, Sapan; Feinberg, Benjamin; Rodrigues, Arun; Marinella, Matthew; Awad, Amro

As transistors have been scaled over the past decade, modern systems have become increasingly susceptible to faults. Increased transistor densities and lower capacitances make a particle strike more likely to cause an upset. At the same time, complex computer systems are increasingly integrated into safety-critical systems such as autonomous vehicles. These two trends make the study of system reliability and fault tolerance essential for modern systems. To analyze and improve system reliability early in the design process, new tools are needed for RTL fault analysis. This paper proposes Eris, a novel framework to identify vulnerable components in hardware designs through fault injection and fault propagation tracking. Eris builds on ESSENT - a fast C/C++ RTL simulation framework - to provide fault injection, fault tracking, and control-flow deviation detection capabilities for RTL designs. To demonstrate Eris' capabilities, we analyze the reliability of the open source Rocket Chip SoC by randomly injecting faults during thousands of runs on four microbenchmarks. As part of this analysis, we measure the sensitivity of different hardware structures to faults based on the likelihood of a random fault causing silent data corruption, unrecoverable data errors, program crashes, and program hangs. We detect control flow deviations and determine whether or not they are benign. Additionally, using Eris' novel fault-tracking capabilities, we are able to find 78% more vulnerable components in the same number of simulations compared to RTL-based fault injection techniques without these capabilities. We will release Eris as an open-source tool to aid future research into processor reliability and hardening.

TYPE Conference Paper YEAR 2022
