Publications

Results 51–100 of 376

Search results

Filament-Free Bulk Resistive Memory Enables Deterministic Analogue Switching

Advanced Materials

Talin, Albert A.; Fuller, Elliot J.; Li, Yiyang; Marinella, Matthew; Sugar, Joshua D.; Bennett, Christopher; Bartsch, Michael S.; Horton, Robert D.; Yoo, Sangmin; Ashby, David S.; Lu, Edwin

Digital computing is nearing its physical limits as computing needs and energy consumption rapidly increase. Analogue-memory-based neuromorphic computing can be orders of magnitude more energy efficient at data-intensive tasks like deep neural networks, but has been limited by the inaccurate and unpredictable switching of analogue resistive memory. Filamentary resistive random access memory (RRAM) suffers from stochastic switching due to the random kinetic motion of discrete defects in the nanometer-sized filament. In this work, this stochasticity is overcome by incorporating a solid electrolyte interlayer, in this case, yttria-stabilized zirconia (YSZ), toward eliminating filaments. Filament-free, bulk-RRAM cells instead store analogue states using the bulk point defect concentration, yielding predictable switching because the statistical ensemble behavior of oxygen vacancy defects is deterministic even when individual defects are stochastic. Both experiments and modeling show bulk-RRAM devices using TiO2-X switching layers and YSZ electrolytes yield deterministic and linear analogue switching for efficient inference and training. Bulk-RRAM solves many outstanding issues with memristor unpredictability that have inhibited commercialization, and can, therefore, enable unprecedented new applications for energy-efficient neuromorphic computing. Beyond RRAM, this work shows how harnessing bulk point defects in ionic materials can be used to engineer deterministic nanoelectronic materials and devices.

Analog architectures for neural network acceleration based on non-volatile memory

Applied Physics Reviews

Xiao, Tianyao P.; Bennett, Christopher; Feinberg, Benjamin; Agarwal, Sapan; Marinella, Matthew

Analog hardware accelerators, which perform computation within a dense memory array, have the potential to overcome the major bottlenecks faced by digital hardware for data-heavy workloads such as deep learning. Exploiting the intrinsic computational advantages of memory arrays, however, has proven to be challenging principally due to the overhead imposed by the peripheral circuitry and due to the non-ideal properties of memory devices that play the role of the synapse. We review the existing implementations of these accelerators for deep supervised learning, organizing our discussion around the different levels of the accelerator design hierarchy, with an emphasis on circuits and architecture. We explore and consolidate the various approaches that have been proposed to address the critical challenges faced by analog accelerators, for both neural network inference and training, and highlight the key design trade-offs underlying these techniques.

PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM

IEEE Transactions on Computers

Ankit, Aayush; El Hajj, Izzat; Agarwal, Sapan; Marinella, Matthew; Foltin, Martin; Strachan, John P.; Milojicic, Dejan; Hwu, Wen M.; Roy, Kaushik

The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training: both digital and hybrid digital-analog using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at performing matrix-vector multiplication operations that are prevalent in training. However, they still suffer from inefficiency due to the use of serial reads and writes for performing the weight gradient and update step. A few works have demonstrated the possibility of performing outer products in crossbars, which can be used to realize the weight gradient and update step without the use of serial reads and writes. However, these works have been limited to low precision operations which are not sufficient for typical training workloads. Moreover, they have been confined to a limited set of training algorithms for fully-connected layers only. To address these limitations, we propose a bit-slicing technique for enhancing the precision of ReRAM-based outer products, which is substantially different from bit-slicing for matrix-vector multiplication only. We incorporate this technique into a crossbar architecture with three variants catered to different training algorithms. To evaluate our design on different types of layers in neural networks (fully-connected, convolutional, etc.) and training algorithms, we develop PANTHER, an ISA-programmable training accelerator with compiler support. Our design can also be integrated into other accelerators in the literature to enhance their efficiency. Our evaluation shows that PANTHER achieves up to 8.02×, 54.21×, and 103× energy reductions as well as 7.16×, 4.02×, and 16× execution time reductions compared to digital accelerators, ReRAM-based accelerators, and GPUs, respectively.

Three Artificial Spintronic Leaky Integrate-and-Fire Neurons

SPIN

Brigner, Wesley H.; Hu, Xuan; Hassan, Naimul; Jiang-Wei, Lucian; Bennett, Christopher; Akinola, Otitoaleke; Pasquale, Massimo; Marinella, Matthew; Incorvia, Jean A.C.; Friedman, Joseph S.

Due to their nonvolatility and intrinsic current integration capabilities, spintronic devices that rely on domain wall (DW) motion through a free ferromagnetic track have garnered significant interest in the field of neuromorphic computing. Although a number of such devices have already been proposed, they require the use of external circuitry to implement several important neuronal behaviors. As such, they are likely to result in either a decrease in energy efficiency, an increase in fabrication complexity, or even both. To resolve this issue, we have proposed three individual neurons that are capable of performing these functionalities without the use of any external circuitry. To implement leaking, the first neuron uses a dipolar coupling field, the second uses an anisotropy gradient and the third uses shape variations of the DW track.

Maximized lateral inhibition in paired magnetic domain wall racetracks for neuromorphic computing

Nanotechnology

Cui, Can; Akinola, Otitoaleke G.; Hassan, Naimul; Bennett, Christopher; Marinella, Matthew; Friedman, Joseph S.; Incorvia, Jean A.C.

Lateral inhibition is an important functionality in neuromorphic computing, modeled after the biological neuron behavior in which a firing neuron deactivates its neighbors belonging to the same layer and prevents them from firing. In most neuromorphic hardware platforms, lateral inhibition is implemented by external circuitry, thereby decreasing the energy efficiency and increasing the area overhead of such systems. Recently, the domain wall-magnetic tunnel junction (DW-MTJ) artificial neuron was demonstrated in modeling to be intrinsically inhibitory. Without peripheral circuitry, lateral inhibition in DW-MTJ neurons results from magnetostatic interaction between neighboring neuron cells. However, the lateral inhibition mechanism in DW-MTJ neurons has not been studied thoroughly, leading to weak inhibition only in very closely spaced devices. This work approaches these problems by modeling current- and field-driven DW motion in a pair of adjacent DW-MTJ neurons. We maximize the magnitude of lateral inhibition by tuning the magnetic interaction between the neurons. The results are explained by current-driven DW velocity characteristics in response to an external magnetic field and quantified by an analytical model. Dependence of lateral inhibition strength on device parameters is also studied. Finally, lateral inhibition behavior in an array of 1000 DW-MTJ neurons is demonstrated. Our results provide a guideline for the optimization of lateral inhibition implementation in DW-MTJ neurons. With strong lateral inhibition achieved, a path towards competitive learning algorithms such as winner-take-all is made possible on such neuromorphic devices.

Device-aware inference operations in SONOS nonvolatile memory arrays

IEEE International Reliability Physics Symposium Proceedings

Bennett, Christopher; Xiao, Tianyao P.; Dellana, Ryan; Feinberg, Benjamin; Agarwal, Sapan; Marinella, Matthew; Agrawal, Vineet; Prabhakar, Venkatraman; Ramkumar, Krishnaswamy; Hinh, Long; Saha, Swatilekha; Raghavan, Vijay; Chettuvetty, Ramesh

Non-volatile memory arrays can deploy pre-trained neural network models for edge inference. However, these systems are affected by device-level noise and retention issues. Here, we examine damage caused by these effects, introduce a mitigation strategy, and demonstrate its use in a fabricated array of SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) devices. On MNIST, fashion-MNIST, and CIFAR-10 tasks, our approach increases resilience to synaptic noise and drift. We also show strong performance can be realized with ADCs of 5-8 bits of precision.

Plasticity-enhanced domain-wall MTJ neural networks for energy-efficient online learning

Proceedings - IEEE International Symposium on Circuits and Systems

Bennett, Christopher; Xiao, Tianyao P.; Cui, Can; Hassan, Naimul; Akinola, Otitoaleke G.; Incorvia, Jean A.C.; Velasquez, Alvaro; Friedman, Joseph S.; Marinella, Matthew

Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 µJ even for large tasks used typically in machine learning.

Lateral inhibition in magnetic domain wall racetrack arrays for neuromorphic computing

Proceedings of SPIE - The International Society for Optical Engineering

Cui, Can; Akinola, Otitoaleke G.; Hassan, Naimul; Bennett, Christopher; Marinella, Matthew; Friedman, Joseph S.; Incorvia, Jean A.C.

Neuromorphic computing captures the quintessential neural behaviors of the brain and is a promising candidate for beyond-von Neumann computer architectures, featuring low power consumption and high parallelism. The neuronal lateral inhibition feature, closely associated with the biological receptive field, is crucial to neuronal competition in the nervous system as well as its neuromorphic hardware counterpart. The domain wall-magnetic tunnel junction (DW-MTJ) neuron is an emerging spintronic artificial neuron device exhibiting intrinsic lateral inhibition. This work discusses the lateral inhibition mechanism of the DW-MTJ neuron and shows by micromagnetic simulation that lateral inhibition is efficiently enhanced by the Dzyaloshinskii-Moriya interaction (DMI).

Energy and Performance Benchmarking of a Domain Wall-Magnetic Tunnel Junction Multibit Adder

IEEE Journal on Exploratory Solid-State Computational Devices and Circuits

Xiao, Tianyao P.; Bennett, Christopher; Hu, Xuan; Feinberg, Benjamin; Jacobs-Gedrim, Robin B.; Agarwal, Sapan; Brunhaver, John S.; Friedman, Joseph S.; Incorvia, Jean A.C.; Marinella, Matthew

The domain-wall (DW)-magnetic tunnel junction (MTJ) device implements universal Boolean logic in a manner that is naturally compact and cascadable. However, an evaluation of the energy efficiency of this emerging technology for standard logic applications is still lacking. In this article, we use a previously developed compact model to construct and benchmark a 32-bit adder entirely from DW-MTJ devices that communicates with DW-MTJ registers. The results of this large-scale design and simulation indicate that while the energy cost of systems driven by spin-transfer torque (STT) DW motion is significantly higher than previously predicted, the same concept using spin-orbit torque (SOT) switching benefits from an improvement in the energy per operation by multiple orders of magnitude, attaining competitive energy values relative to a comparable CMOS subprocessor component. This result clarifies the path toward practical implementations of an all-magnetic processor system.

Shape-Based Magnetic Domain Wall Drift for an Artificial Spintronic Leaky Integrate-and-Fire Neuron

IEEE Transactions on Electron Devices

Brigner, Wesley H.; Hassan, Naimul; Jiang-Wei, Lucian; Hu, Xuan; Saha, Diptish; Bennett, Christopher; Marinella, Matthew; Incorvia, Jean A.C.; Garcia-Sanchez, Felipe; Friedman, Joseph S.

Spintronic devices based on domain wall (DW) motion through ferromagnetic nanowire tracks have received great interest as components of neuromorphic information processing systems. Previous proposals for spintronic artificial neurons required external stimuli to perform the leaking functionality, one of the three fundamental functions of a leaky integrate-and-fire (LIF) neuron. The use of this external magnetic field or electrical current stimulus results in either a decrease in energy efficiency or an increase in fabrication complexity. In this article, we modify the shape of previously demonstrated three-terminal magnetic tunnel junction neurons to perform the leaking operation without any external stimuli. The trapezoidal structure causes a shape-based DW drift, thus intrinsically providing the leaking functionality with no hardware cost. This LIF neuron, therefore, promises to advance the development of spintronic neural network crossbar arrays.

Comparison of Radiation Effects in Custom- and Commercially-Fabricated Resistive Memory Devices

IEEE Transactions on Nuclear Science

Holt, Joshua S.; Alamgir, Zahiruddin; Beckmann, Karsten; Suguitan, Nadia; Russell, Sierra; Iler, Evan; Bakhru, Hassaram; Bielejec, Edward S.; Jacobs-Gedrim, Robin B.; Hughart, David R.; Marinella, Matthew; Yang-Scharlotta, Jean; Cady, Nathaniel C.

The radiation response of TaOx-based RRAM devices fabricated in academic (Set A) and industrial (Set B) settings was compared. Ionization damage from a ⁶⁰Co gamma source did not cause any changes in device resistance for either device type, up to 45 Mrad(Si). Displacement damage from a heavy ion beam caused the Set B devices in the high resistance state to decrease in resistance at 1 × 10²¹ oxygen displacements per cm³; meanwhile, the Set A devices did not exhibit any decrease in resistance due to displacement damage. Both types of devices exhibited an increase in resistance around 3 × 10²² oxygen displacements per cm³, possibly due to damage at the oxide/metal interfaces. These extremely high levels of damage represent near-total atomic disruption, and if this level of damage were ever reached, other circuit elements would likely fail before the RRAM devices in this study. Overall, both sets of devices were much more resistant to radiation effects than other devices reported in the literature. Displacement damage effects were only observed in the Set A devices once the displacement-induced oxygen vacancies surpassed the intrinsic vacancy concentration in the devices, suggesting that high oxygen vacancy concentration played a role in the devices’ high tolerance to displacement damage.

Three-terminal magnetic tunnel junction synapse circuits showing spike-timing-dependent plasticity

Journal of Physics D: Applied Physics

Akinola, Otitoaleke; Hu, Xuan; Bennett, Christopher; Marinella, Matthew; Friedman, Joseph S.; Incorvia, Jean A.C.

There have been recent efforts towards the development of biologically-inspired neuromorphic devices and architecture. Here, we show a synapse circuit that is designed to perform spike-timing-dependent plasticity and that works with the leaky integrate-and-fire neuron in a neuromorphic computing architecture. The circuit consists of a three-terminal magnetic tunnel junction with a mobile domain wall between two low-pass filters and has been modeled in SPICE. The results show that the current flowing through the synapse is highly correlated to the timing delay between the pre-synaptic and post-synaptic neurons. Using micromagnetic simulations, we show that introducing notches along the length of the domain wall track pins the domain wall at each successive notch to properly respond to the timing between the input and output current pulses of the circuit, producing a multi-state resistance representing synaptic weights. We show in SPICE that a notch-free ideal magnetic device also shows spike-timing-dependent plasticity in response to the circuit current. This work is key progress towards making more bio-realistic artificial synapses with multiple weights, which can be trained online with a promise of CMOS compatibility and energy efficiency.

Redox transistors for neuromorphic computing

IBM Journal of Research and Development

Talin, Albert A.; Fuller, Elliot J.; Bennett, Christopher; Marinella, Matthew; Li, Yiyang

Efficiency bottlenecks inherent to conventional computing in executing neural algorithms have spurred the development of novel devices capable of “in-memory” computing. Commonly known as “memristors,” a variety of device concepts including conducting bridge, vacancy filament, phase change, and other types have been proposed as promising elements in artificial neural networks for executing inference and learning algorithms. In this article, we review the recent advances in memristor technology for neuromorphic computing and discuss strategies for addressing the most significant performance challenges, including nonlinearity, high read/write currents, and endurance. As an alternative to two-terminal memristors, we introduce the three-terminal electrochemical memory based on the redox transistor (RT), which uses a gate to tune the redox state of the channel. Decoupling the “read” and “write” operations using a third terminal and storage of information as a charge-compensated redox reaction in the bulk of the transistor enables high-density information storage. These properties enable low-energy operation without compromising analog performance and nonvolatility. Finally, we discuss the RT operating mechanisms using organic and inorganic materials, approaches for array integration, and prospects for achieving the device density and switching speeds necessary to make electrochemical memory competitive with established digital technology.
