Publications

Results 1–25 of 149

Search results

Explainable machine learning for hydrogen diffusion in metals and random binary alloys

Physical Review Materials

Lu, Grace M.; Witman, Matthew; Agarwal, Sapan A.; Stavila, Vitalie S.; Trinkle, Dallas R.

Hydrogen diffusion in metals and alloys plays an important role in the discovery of new materials for fuel cell and energy storage technology. While analytic models use hand-selected features that have clear physical ties to hydrogen diffusion, they often lack accuracy when making quantitative predictions. Machine learning models are capable of making accurate predictions, but their inner workings are obscured, rendering it unclear which physical features are truly important. To develop interpretable machine learning models to predict the activation energies of hydrogen diffusion in metals and random binary alloys, we create a database for physical and chemical properties of the species and use it to fit six machine learning models. Our models achieve root-mean-squared errors between 98 and 119 meV on the testing data and accurately predict that elemental Ru has a large activation energy, while elemental Cr and Fe have small activation energies. By analyzing the feature importances of these fitted models, we identify relevant physical properties for predicting hydrogen diffusivity. While metrics for measuring the individual feature importances for machine learning models exist, correlations between the features lead to disagreement between models and limit the conclusions that can be drawn. Instead, grouped feature importances, formed by combining the features via their correlations, agree across the six models and reveal that the two groups containing the packing factor and electronic specific heat are particularly significant for predicting hydrogen diffusion in metals and random binary alloys. This framework allows us to interpret machine learning models and enables rapid screening of new materials with the desired rates of hydrogen diffusion.

Towards Pareto optimal high entropy hydrides via data-driven materials discovery

Journal of Materials Chemistry A

Witman, Matthew; Ling, Sanliang; Wadge, Matthew; Bouzidi, Anis; Pineda-Romero, Nayely; Clulow, Rebecca; Ek, Gustav; Chames, Jeffery M.; Allendorf, Emily J.; Agarwal, Sapan A.; Allendorf, Mark D.; Walker, Gavin S.; Grant, David M.; Sahlberg, Martin; Zlotea, Claudia; Stavila, Vitalie S.

The ability to rapidly screen material performance in the vast space of high entropy alloys is of critical importance to efficiently identify optimal hydride candidates for various use cases. Given the prohibitive complexity of first principles simulations and large-scale sampling required to rigorously predict hydrogen equilibrium in these systems, we turn to compositional machine learning models as the most feasible approach to screen on the order of tens of thousands of candidate equimolar high entropy alloys (HEAs). Critically, we show that machine learning models can predict hydride thermodynamics and capacities with reasonable accuracy (e.g. a mean absolute error in desorption enthalpy prediction of ∼5 kJ mol$_{\mathrm{H_2}}^{-1}$) and that explainability analyses capture the competing trade-offs that arise from feature interdependence. We can therefore elucidate the multi-dimensional Pareto optimal set of materials, i.e., where two or more competing objective properties can't be simultaneously improved by another material. This provides rapid and efficient down-selection of the highest priority candidates for more time-consuming density functional theory investigations and experimental validation. Various targets were selected from the predicted Pareto front (with saturation capacities approaching two hydrogen per metal and desorption enthalpy less than 60 kJ mol$_{\mathrm{H_2}}^{-1}$) and were experimentally synthesized, characterized, and tested amongst an international collaboration group to validate the proposed novel hydrides. Additional top-predicted candidates are suggested to the community for future synthesis efforts, and we conclude with an outlook on improving the current approach for the next generation of computational HEA hydride discovery efforts.
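
As a rough illustration of the Pareto optimal set described in the abstract above, the following sketch filters a list of candidates down to those not dominated by any other candidate. It is not the authors' code: the function name `pareto_front`, the two objectives (maximize capacity, minimize desorption enthalpy), and all numeric values are assumptions chosen only for illustration.

```python
from typing import List, Tuple

# Each candidate is (capacity in H per metal, desorption enthalpy in kJ/mol H2).
# Assumption for this sketch: higher capacity is better, lower enthalpy is better.
Candidate = Tuple[float, float]


def pareto_front(points: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    front = []
    for i, (cap_i, dh_i) in enumerate(points):
        dominated = any(
            # j is at least as good in both objectives ...
            cap_j >= cap_i and dh_j <= dh_i
            # ... and strictly better in at least one of them.
            and (cap_j > cap_i or dh_j < dh_i)
            for j, (cap_j, dh_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((cap_i, dh_i))
    return front


# Hypothetical values, not taken from the paper.
candidates = [(1.9, 55.0), (1.5, 40.0), (1.4, 58.0), (2.0, 70.0)]
front = pareto_front(candidates)
print(front)
```

Here the candidate `(1.4, 58.0)` is dropped because `(1.9, 55.0)` is better in both objectives; the other three each win on at least one objective and so remain on the front.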

Parallel Matrix Multiplication Using Voltage-Controlled Magnetic Anisotropy Domain Wall Logic

IEEE Journal on Exploratory Solid-State Computational Devices and Circuits

Zogbi, Nicholas; Liu, Samuel; Bennett, Christopher H.; Agarwal, Sapan A.; Marinella, Matthew J.; Incorvia, Jean A.C.; Xiao, Tianyao X.

The domain wall-magnetic tunnel junction (DW-MTJ) is a versatile device that can simultaneously store data and perform computations. These three-terminal devices are promising for digital logic due to their nonvolatility, low-energy operation, and radiation hardness. Here, we augment the DW-MTJ logic gate with voltage-controlled magnetic anisotropy (VCMA) to improve the reliability of logical concatenation in the presence of realistic process variations. VCMA creates potential wells that allow for reliable and repeatable localization of domain walls (DWs). The DW-MTJ logic gate supports different fanouts, allowing for multiple inputs and outputs for a single device without affecting the area. We simulate a systolic array of DW-MTJ multiply-accumulate (MAC) units with 4-bit and 8-bit precision, which uses the nonvolatility of DW-MTJ logic gates to enable fine-grained pipelining and high parallelism. The DW-MTJ systolic array provides comparable throughput and efficiency to state-of-the-art CMOS systolic arrays while being radiation-hard. These results improve the feasibility of using DW-based processors, especially for extreme-environment applications such as space.

ATHENA: Analytical Tool for Heterogeneous Neuromorphic Architectures

Cardwell, Suma G.; Plagge, Mark P.; Hughes, Clayton H.; Rothganger, Fredrick R.; Agarwal, Sapan A.; Feinberg, Benjamin F.; Awad, Amro; Mcfarland, John; Parker, Luke G.

The ASC program seeks to use machine learning to improve efficiencies in its stockpile stewardship mission. Moreover, there is a growing market for technologies dedicated to accelerating AI workloads. Many of these emerging architectures promise to provide savings in energy efficiency, area, and latency when compared to traditional CPUs for these types of applications — neuromorphic analog and digital technologies provide both low-power and configurable acceleration of challenging artificial intelligence (AI) algorithms. If designed into a heterogeneous system with other accelerators and conventional compute nodes, these technologies have the potential to augment the capabilities of traditional High Performance Computing (HPC) platforms [5]. This expanded computation space requires not only a new approach to physics simulation, but the ability to evaluate and analyze next-generation architectures specialized for AI/ML workloads in both traditional HPC and embedded ND applications. Developing this capability will enable ASC to understand how this hardware performs in both HPC and ND environments, improve our ability to port our applications, guide the development of computing hardware, and inform vendor interactions, leading them toward solutions that address ASC’s unique requirements.

Probabilistic Nanomagnetic Memories for Uncertain and Robust Machine Learning

Bennett, Christopher H.; Xiao, Tianyao X.; Liu, Samuel; Humphrey, Leonard; Incorvia, Jean A.; Debusschere, Bert D.; Ries, Daniel R.; Agarwal, Sapan A.

This project evaluated the use of emerging spintronic memory devices for robust and efficient variational inference schemes. Variational inference (VI) schemes, which constrain the distribution for each weight to be a Gaussian distribution with a mean and standard deviation, are a tractable method for calculating posterior distributions of weights in a Bayesian neural network such that this neural network can also be trained using the powerful backpropagation algorithm. Our project focuses on domain-wall magnetic tunnel junctions (DW-MTJs), a powerful multi-functional spintronic synapse design that can achieve low-power switching while also opening the pathway towards repeatable, analog operation using fabricated notches. Our initial efforts to employ DW-MTJs as an all-in-one stochastic synapse with both a mean and standard deviation did not meet the quality metrics for hardware-friendly VI. In the future, new device stacks and methods for expressive anisotropy modification may still make this idea possible. However, as a fallback that immediately satisfies our requirements, we invented and detailed how the combination of a DW-MTJ synapse encoding the mean and a probabilistic Bayes-MTJ device, programmed via a ferroelectric or ionically modifiable layer, can robustly and expressively implement VI. This design includes a physics-informed small circuit model that was scaled up to perform and demonstrate rigorous uncertainty quantification applications, up to and including small convolutional networks on a grayscale image classification task, and larger (Residual) networks implementing multi-channel image classification. Lastly, as these results and ideas all depend upon an inference application where weights (spintronic memory states) remain non-volatile, the retention of these synapses for the notched case was further interrogated. These investigations revealed and emphasized the importance of both notch geometry and anisotropy modification in order to further enhance the endurance of written spintronic states. In the near future, these results will be mapped to effective predictions for room temperature and elevated operation DW-MTJ memory retention, and experimentally verified when devices become available.

Modeling Analog Tile-Based Accelerators Using SST

Feinberg, Benjamin F.; Agarwal, Sapan A.; Plagge, Mark P.; Rothganger, Fredrick R.; Cardwell, Suma G.; Hughes, Clayton H.

Analog computing has been widely proposed to improve the energy efficiency of multiple important workloads, including neural network operations and other linear algebra kernels. To properly evaluate analog computing and explore more complex workloads such as systems consisting of multiple analog data paths, system level simulations are required. Moreover, prior work on system architectures for analog computing often relies on custom simulators, creating significant additional design effort and complicating comparisons between different systems. To remedy these issues, this report describes the design and implementation of a flexible tile-based analog accelerator element for the Structural Simulation Toolkit (SST). The element focuses heavily on the tile controller, an often neglected aspect of prior work, that is sufficiently versatile to simulate a wide range of different tile operations including neural network layers, signal processing kernels, and generic linear algebra operations without major constraints. The tile model also interoperates with existing SST memory and network models to reduce the overall development load and enable future simulation of heterogeneous systems with both conventional digital logic and analog compute tiles. Finally, both the tile and array models are designed to easily support future extensions as new analog operations and applications that can benefit from analog computing are developed.

Single Event Upset and Total Ionizing Dose Response of 12LP FinFET Digital Circuits

Spear, Matthew; Wallace, Trace; Wilson, Donald; Solano, Jose; Irumva, Gedeon; Esqueda, Ivan S.; Barnaby, Hugh J.; Clark, Lawrence T.; Brunhaver, John; Turowski, Marek; Mikkola, Esko; Hughart, David R.; Young, Joshua M.; Manuel, Jack E.; Agarwal, Sapan A.; Vaandrager, Bastiaan L.; Vizkelethy, Gyorgy V.; King, Michael P.; Marinella, Matthew J.

Abstract not provided.

Single Event Upset and Total Ionizing Dose Response of 12LP FinFET Digital Circuits

Spear, Matthew; Wallace, Trace; Wilson, Donald; Solano, Jose; Irumva, Gedeon; Esqueda, Ivan S.; Barnaby, Hugh J.; Clark, Lawrence T.; Brunhaver, John; Turowski, Marek; Mikkola, Esko; Hughart, David R.; Young, Joshua M.; Manuel, Jack E.; Agarwal, Sapan A.; Vaandrager, Bastiaan L.; Vizkelethy, Gyorgy V.; Gutierrez, Amos; Trippe, James M.; King, Michael P.; Bielejec, Edward S.; Marinella, Matthew J.

Abstract not provided.

CrossSim Inference Manual v2.0

Xiao, Tianyao X.; Bennett, Christopher H.; Feinberg, Benjamin F.; Marinella, Matthew J.; Agarwal, Sapan A.

Neural networks are largely based on matrix computations. During forward inference, the most heavily used compute kernel is the matrix-vector multiplication (MVM): $W \vec{x} $. Inference is a first frontier for the deployment of next-generation hardware for neural network applications, as it is more readily deployed in edge devices, such as mobile devices or embedded processors with size, weight, and power constraints. Inference is also easier to implement in analog systems than training, which has more stringent device requirements. The main processing kernel used during inference is the MVM.
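
To make the MVM kernel $W \vec{x}$ mentioned above concrete, here is a minimal pure-Python sketch of one ideal (noise-free) matrix-vector multiplication. It does not use or represent the CrossSim API; the helper name `mvm` and the example weights and inputs are hypothetical.

```python
from typing import List


def mvm(W: List[List[float]], x: List[float]) -> List[float]:
    """Compute y = W x, one dot product per row of W."""
    # Every row of W must have the same length as the input vector.
    if any(len(row) != len(x) for row in W):
        raise ValueError("each row of W must match the length of x")
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


# Hypothetical 3x2 weight matrix and 2-element input vector.
W = [[1.0, 2.0],
     [0.0, -1.0],
     [0.5, 0.5]]
x = [2.0, 3.0]

y = mvm(W, x)
print(y)  # [8.0, -3.0, 2.5]
```

In an analog accelerator the same result would be produced by the physics of the memory array rather than by an explicit loop, which is why device non-idealities show up directly as errors in `y`.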

An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

IEEE Transactions on Circuits and Systems I: Regular Papers

Xiao, Tianyao X.; Feinberg, Benjamin F.; Bennett, Christopher H.; Agrawal, Vineet; Saxena, Prashant; Prabhakar, Venkatraman; Ramkumar, Krishnaswamy; Medu, Harsha; Raghavan, Vijay; Chettuvetty, Ramesh; Agarwal, Sapan A.; Marinella, Matthew J.

We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.

Single-Event Effects Induced by Heavy Ions in SONOS Charge Trapping Memory Arrays

IEEE Transactions on Nuclear Science

Xiao, Tianyao X.; Bennett, Christopher H.; Agarwal, Sapan A.; Hughart, David R.; Barnaby, Hugh J.; Puchner, Helmut; Talin, A.A.; Marinella, Matthew J.

We investigate the sensitivity of silicon-oxide-nitride-oxide-silicon (SONOS) charge trapping memory technology to heavy-ion induced single-event effects. Threshold voltage ($V_T$) statistics were collected across multiple test chips that contained in total 18 Mb of 40-nm SONOS memory arrays. The arrays were irradiated with Kr and Ar ion beams, and the changes in their $V_T$ distributions were analyzed as a function of linear energy transfer (LET), beam fluence, and operating temperature. We observe that heavy ion irradiation induces a tail of disturbed devices in the 'program' state distribution, which has also been seen in the response of floating-gate (FG) flash cells. However, the $V_T$ distribution of SONOS cells lacks a distinct secondary peak, which is generally attributed to direct ion strikes to the gate-stack of FG cells. This property, combined with the observed change in the $V_T$ distribution with LET, suggests that SONOS cells are not particularly sensitive to direct ion strikes but cells in the proximity of an ion's absorption can still experience a $V_T$ shift. These results shed new light on the physical mechanisms underlying the $V_T$ shift induced by a single heavy ion in scaled charge trap memory.

Analog Neural Network Inference Accuracy in One-Selector One-Resistor Memory Arrays

Proceedings - 2022 IEEE International Conference on Rebooting Computing, ICRC 2022

Xiao, Tianyao X.; Bennett, Christopher H.; Wilson, Donald; Feinberg, Benjamin F.; Agarwal, Sapan A.; Marinella, Matthew J.

Non-volatile memory arrays require select devices to ensure accurate programming. The one-selector one-resistor (1S1R) array, where a two-terminal nonlinear select device is placed in series with a resistive memory element, is attractive due to its high-density data storage; however, the effect of the nonlinear select device on the accuracy of analog in-memory computing has not been explored. This work evaluates the impact of select and memory device properties on the results of analog matrix-vector multiplications. We integrate nonlinear circuit simulations into CrossSim and perform end-to-end neural network inference simulations to study how the select device affects the accuracy of neural network inference. We propose an adjustment to the input voltage that can effectively compensate for the electrical load of the select device. Our results show that for deep residual networks trained on CIFAR-10, a compensation that is uniform across all devices in the system can mitigate these effects over a wide range of values for the select device I-V steepness and memory device On/Off ratio. A realistic I-V curve steepness of 60 mV/dec can yield an accuracy on CIFAR-10 that is within 0.44% of the floating-point accuracy.

Self-correcting Flip-flops for Triple Modular Redundant Logic in a 12-nm Technology

Proceedings - IEEE International Symposium on Circuits and Systems

Clark, Lawrence T.; Duvnjak, Alen; Young-Sciortino, Clifford; Cannon, Matthew J.; Brunhaver, John; Agarwal, Sapan A.; Wilson, Donald; Barnaby, Hugh; Marinella, Matthew J.

Area efficient self-correcting flip-flops for use with triple modular redundant (TMR) soft-error hardened logic are implemented in a 12-nm finFET process technology. The TMR flip-flop slave latches self-correct in the clock low phase using Muller C-elements in the latch feedback. These C-elements are driven by the two redundant stored values and not by the slave latch itself, saving area over a similar implementation using majority gate feedback. These flip-flops are implemented as large shift-register arrays on a test chip and have been experimentally tested for their soft-error mitigation in static and dynamic modes of operation using heavy ions and protons. We show how high clock skew can result in susceptibility to soft-errors in the dynamic mode, and explain the potential failure mechanism.
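
The self-correction idea in the abstract above can be sketched with a purely behavioral model. This is not the circuit from the paper: it assumes an idealized, non-inverting Muller C-element and three stored copies of one bit, and the names `c_element` and `self_correct` are hypothetical.

```python
from typing import Tuple


def c_element(a: int, b: int, prev: int) -> int:
    """Idealized Muller C-element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


def self_correct(state: Tuple[int, int, int]) -> Tuple[int, int, int]:
    """Update each stored copy from a C-element driven by the other two copies."""
    a, b, c = state
    return (
        c_element(b, c, a),  # copy A is driven by B and C
        c_element(a, c, b),  # copy B is driven by A and C
        c_element(a, b, c),  # copy C is driven by A and B
    )


# All three copies hold 1, then a single upset flips the middle copy to 0.
upset_state = (1, 0, 1)
corrected = self_correct(upset_state)
print(corrected)  # (1, 1, 1)
```

The two unaffected copies disagree with the upset copy but agree with each other, so the upset copy is overwritten while the other two simply hold their values. A single upset is therefore repaired without a majority gate in the feedback path.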

Achieving Accurate In-Memory Neural Network Inference with Highly Overlapping Nonvolatile Memory State Distributions

6th IEEE Electron Devices Technology and Manufacturing Conference, EDTM 2022

Marinella, Matthew J.; Xiao, Tianyao X.; Feinberg, Benjamin F.; Bennett, Christopher H.; Agrawal, Vineet; Puchner, Helmut; Agarwal, Sapan A.

Analog in-memory computing is a method to improve the efficiency of deep neural network inference by orders of magnitude, by utilizing analog properties of a nonvolatile memory. This places new requirements on the memory device, which physically represents neural net weights as analog states. By carefully considering the algorithm implications when mapping weights to physical states, it is possible to achieve precision very close to that of a digital accelerator using a 40nm embedded SONOS.

Eris: Fault Injection and Tracking Framework for Reliability Analysis of Open-Source Hardware

Proceedings - 2022 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2022

Nema, Shubham; Kirschner, Justin; Adak, Debpratim; Agarwal, Sapan A.; Feinberg, Benjamin F.; Rodrigues, Arun; Marinella, Matthew J.; Awad, Amro

As transistors have been scaled over the past decade, modern systems have become increasingly susceptible to faults. Increased transistor densities and lower capacitances make a particle strike more likely to cause an upset. At the same time, complex computer systems are increasingly integrated into safety-critical systems such as autonomous vehicles. These two trends make the study of system reliability and fault tolerance essential for modern systems. To analyze and improve system reliability early in the design process, new tools are needed for RTL fault analysis. This paper proposes Eris, a novel framework to identify vulnerable components in hardware designs through fault-injection and fault propagation tracking. Eris builds on ESSENT - a fast C/C++ RTL simulation framework - to provide fault injection, fault tracking, and control-flow deviation detection capabilities for RTL designs. To demonstrate Eris' capabilities, we analyze the reliability of the open source Rocket Chip SoC by randomly injecting faults during thousands of runs on four microbenchmarks. As part of this analysis we measure the sensitivity of different hardware structures to faults based on the likelihood of a random fault causing silent data corruption, unrecoverable data errors, program crashes, and program hangs. We detect control flow deviations and determine whether or not they are benign. Additionally, using Eris' novel fault-tracking capabilities we are able to find 78% more vulnerable components in the same number of simulations compared to RTL-based fault injection techniques without these capabilities. We will release Eris as an open-source tool to aid future research into processor reliability and hardening.
