Energy Efficient Neuromorphic Algorithm Training with Analog Memory Arrays
Abstract not provided.
IEEE International Reliability Physics Symposium Proceedings
Scaling arrays of non-volatile memory devices from academic demonstrations to reliable, manufacturable systems requires a better understanding of variability at array and wafer-scale levels. CrossSim models the accuracy of neural networks implemented on an analog resistive memory accelerator using the cycle-to-cycle variability of a single device. In this work, we extend this modeling tool to account for device-to-device variation in a realistic way, and evaluate the impact of this reliability issue in the context of neuromorphic online learning tasks.
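The distinction between cycle-to-cycle and device-to-device variability can be illustrated with a short sketch. This is not CrossSim's API or its device model: the function names, the lognormal gain distribution, and the additive Gaussian write noise are all assumptions chosen for illustration. The point is only that device-to-device variation is drawn once per device and stays fixed, while cycle-to-cycle noise is redrawn on every write.

```python
import numpy as np

def make_device_gains(shape, d2d_sigma, rng):
    """Draw one fixed update gain per device (device-to-device variation).

    Assumption: gains are lognormal with median 1.0, so they stay positive.
    """
    return rng.lognormal(mean=0.0, sigma=d2d_sigma, size=shape)

def noisy_update(g, delta, gain, c2c_sigma, rng, g_min=0.0, g_max=1.0):
    """Apply a requested conductance change with both variability sources.

    gain      : fixed per-device factor (device-to-device)
    c2c_sigma : std. dev. of noise redrawn on every write (cycle-to-cycle)
    """
    noise = rng.normal(0.0, c2c_sigma, size=g.shape)
    # Only devices that are actually written pick up write noise.
    noise = np.where(delta != 0, noise, 0.0)
    return np.clip(g + gain * delta + noise, g_min, g_max)

rng = np.random.default_rng(0)
g = np.full((4, 3), 0.5)                      # normalized conductances
gain = make_device_gains(g.shape, d2d_sigma=0.2, rng=rng)
delta = np.full(g.shape, 0.01)                # requested update per device
g_new = noisy_update(g, delta, gain, c2c_sigma=0.01, rng=rng)
```

Setting `d2d_sigma=0` recovers a single-device model in which every cell responds identically apart from write noise.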
2019 International Symposium on VLSI Technology, Systems and Application, VLSI-TSA 2019
Analog crossbars have the potential to reduce the energy and latency required to train a neural network by three orders of magnitude when compared to an optimized digital ASIC. The crossbar simulator, CrossSim, can be used to model device nonidealities and determine what device properties are needed to create an accurate neural network accelerator. Experimentally measured device statistics are used to simulate neural network training accuracy and compare different classes of devices including TaOx ReRAM, Li1-xCoO2 devices, and conventional floating gate SONOS memories. A technique called 'Periodic Carry' can overcome device nonidealities by using a positional number system while maintaining the benefit of parallel analog matrix operations.
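The positional-number idea behind Periodic Carry can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the base, the number of devices per weight, the device range, and all function names are hypothetical. Each weight is spread over several devices ("digits"); parallel updates touch only the least significant digit, and a periodic carry step moves the accumulated value into the more significant digits without changing the represented weight.

```python
import numpy as np

BASE = 4.0      # hypothetical ratio between adjacent digit weights
N_DIGITS = 3    # hypothetical number of devices per weight
G_MAX = 1.0     # each device stores a value in [-G_MAX, G_MAX]

def weight_value(digits):
    """Combine digit states into weights (digit 0 is least significant)."""
    scales = BASE ** np.arange(digits.shape[0])
    return np.tensordot(scales, digits, axes=1)

def update(digits, delta):
    """Apply a parallel update to the least significant devices only."""
    digits[0] = np.clip(digits[0] + delta, -G_MAX, G_MAX)

def carry(digits):
    """Move accumulated value from each digit into the next one up."""
    for k in range(digits.shape[0] - 1):
        amount = digits[k] / BASE          # value in next digit's units
        # Never push the higher digit outside its device range.
        moved = np.clip(amount, -G_MAX - digits[k + 1], G_MAX - digits[k + 1])
        digits[k + 1] += moved
        digits[k] -= moved * BASE          # represented weight is unchanged

digits = np.zeros((N_DIGITS, 2, 2))        # a 2x2 array of weights
for _ in range(3):
    update(digits, np.full((2, 2), 0.3))
before = weight_value(digits)
carry(digits)
after = weight_value(digits)
```

Without the carry step, the least significant devices would saturate at `G_MAX` after a few more updates; after the carry they are back near zero and can absorb further updates.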
IEEE Transactions on Nuclear Science
The image classification accuracy of a TaOx ReRAM-based neuromorphic computing accelerator is evaluated after intentionally inducing displacement damage up to a fluence of 10^14 2.5-MeV Si ions/cm^2 on the analog devices that are used to store weights. Results are consistent with a radiation-induced oxygen vacancy production mechanism. When the device is in the high-resistance state during heavy ion radiation, the device resistance, linearity, and accuracy after training are only affected by high fluence levels. The findings in this paper are in accordance with the results of previous studies on TaOx-based digital resistive random access memory. When the device is in the low-resistance state during irradiation, no resistance change was detected, but devices with a 4-kΩ inline resistor did show a reduction in accuracy after training at 10^14 2.5-MeV Si ions/cm^2. This indicates that changes in resistance can only be somewhat correlated with changes to devices' analog properties. This paper demonstrates that TaOx devices are radiation tolerant not only for high radiation environment digital memory applications but also when operated in an analog mode suitable for neuromorphic computation and training on new data sets.
IEEE Transactions on Nuclear Science
With growing interest in exploring Jupiter's moons, technologies with >10 Mrad(Si) tolerance are now needed to survive the Jovian environment. Conductive-bridging random access memory (CBRAM) is a nonvolatile memory that has shown a high tolerance to total ionizing dose (TID). However, it is not well understood how CBRAM behaves in an energetic ion environment where displacement damage (DD) effects may also be an issue. In this paper, the response of CBRAM to 100-keV Li, 1-MeV Ta, and 200-keV Si ion irradiations is examined. Ion bombardment was performed with increasing fluence steps until the CBRAM devices failed to hold their programmed state. The TID and DD dose (DDD) at the fluence of failure were calculated and compared across the tested ion species. Results indicate that failures are more highly correlated with TID than DDD. DC cycling tests were performed during 100-keV Li irradiations and evidence was found that the mobile Ag ion supply diminished with increasing fluence. The cycling results, in addition to prior 14-MeV neutron work, suggest that DD may play a role in the eventual failure of a CBRAM device in a combined radiation environment.
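The conversion from a failure fluence to TID and DDD can be sketched with the standard relations dose = fluence × LET and DDD = fluence × NIEL. The function names are illustrative, and the LET and NIEL numbers below are hypothetical placeholders, not values from the paper; real values depend on the ion, its energy, and the target material.

```python
# 1 MeV/mg = 1.602e-13 J / 1e-6 kg = 1.602e-7 Gy = 1.602e-5 rad
RAD_PER_MEV_PER_MG = 1.602e-5

def tid_rad(fluence_per_cm2, let_mev_cm2_per_mg):
    """Total ionizing dose in rad, from fluence and electronic stopping (LET)."""
    return RAD_PER_MEV_PER_MG * let_mev_cm2_per_mg * fluence_per_cm2

def ddd_mev_per_g(fluence_per_cm2, niel_mev_cm2_per_g):
    """Displacement damage dose in MeV/g, from fluence and NIEL."""
    return fluence_per_cm2 * niel_mev_cm2_per_g

# Hypothetical example inputs (placeholders only).
failure_fluence = 1e12   # ions/cm^2
let_value = 1.0          # MeV*cm^2/mg
niel_value = 10.0        # MeV*cm^2/g

tid = tid_rad(failure_fluence, let_value)
ddd = ddd_mev_per_g(failure_fluence, niel_value)
```

Computing both quantities at the failure fluence of each ion species is what allows failures to be compared against TID and DDD separately.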
International High-Level Radioactive Waste Management 2019, IHLRWM 2019
PFLOTRAN is well-established in single-phase reactive transport problems, and current research is expanding its visibility and capability in two-phase subsurface problems. A critical part of the development of simulation software is quality assurance (QA). The purpose of the present work is QA testing to verify the correct implementation and accuracy of two-phase flow models in PFLOTRAN. An important early step in QA is to verify the code against exact solutions from the literature. In this work a series of QA tests on models that have known analytical solutions are conducted using PFLOTRAN. In each case the simulated saturation profile is rigorously shown to converge to the exact analytical solution. These results verify the accuracy of PFLOTRAN for use in a wide variety of two-phase modelling problems with a high degree of nonlinearity in the interaction between phase behavior and fluid flow.
Proceedings of SPIE - The International Society for Optical Engineering
Advances in machine intelligence have sparked interest in hardware accelerators to implement these algorithms, yet embedded electronics have stringent power and area budgets and speed requirements that may limit non-volatile memory (NVM) integration. In this context, the development of fast nanomagnetic neural networks using minimal training data is attractive. Here, we extend an inference-only proposal using the intrinsic physics of domain-wall MTJ (DW-MTJ) neurons for online learning to implement fully unsupervised pattern recognition operation, using winner-take-all networks that contain either random or plastic synapses (weights). Meanwhile, a read-out layer trains in a supervised fashion. We find our proposed design can approach state-of-the-art success on the task relative to competing memristive neural network proposals, while eliminating much of the area and energy overhead that would typically be required to build the neuronal layers with CMOS devices.
Science
Neuromorphic computers could overcome efficiency bottlenecks inherent to conventional computing through parallel programming and readout of artificial neural network weights in a crossbar memory array. However, selective and linear weight updates and <10-nanoampere read currents are required for learning that surpasses conventional computing efficiency. We introduce an ionic floating-gate memory array based on a polymer redox transistor connected to a conductive-bridge memory (CBM). Selective and linear programming of a redox transistor array is executed in parallel by overcoming the bridging threshold voltage of the CBMs. Synaptic weight readout with currents <10 nanoamperes is achieved by diluting the conductive polymer with an insulator to decrease the conductance. The redox transistors endure >1 billion write-read operations and support >1-megahertz write-read frequencies.
IEEE Access
Emerging memory devices, such as resistive crossbars, have the capacity to store large amounts of data in a single array. Acquiring the data stored in large-capacity crossbars in a sequential fashion can become a bottleneck. We present practical methods, based on sparse sampling, to quickly acquire sparse data stored on emerging memory devices that support the basic summation kernel, reducing the acquisition time from linear to sub-linear. The experimental results show that at least an order of magnitude improvement in acquisition time can be achieved when the data are sparse. In addition, we show that the energy cost associated with our approach is competitive with that of the sequential method.
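One way to see how a summation kernel can make acquisition sub-linear for sparse data is adaptive bisection, sketched below. This is a swapped-in illustrative technique (a simple form of group testing), not necessarily the sampling scheme used in the paper. It assumes stored values are non-negative, so a zero sum over a range implies every entry in that range is zero; `sum_query` is a hypothetical stand-in for one analog summation read.

```python
import numpy as np

def acquire_sparse(sum_query, n):
    """Recover a non-negative length-n vector using range-sum reads only.

    sum_query(lo, hi) returns the sum of entries lo..hi-1 in one read.
    Returns the recovered vector and the number of reads used.
    """
    values = np.zeros(n)
    reads = 0
    stack = [(0, n)]
    while stack:
        lo, hi = stack.pop()
        total = sum_query(lo, hi)
        reads += 1
        if total == 0:
            continue                  # whole range is empty, skip it
        if hi - lo == 1:
            values[lo] = total        # isolated a single nonzero entry
            continue
        mid = (lo + hi) // 2
        stack.append((lo, mid))
        stack.append((mid, hi))
    return values, reads

# Toy data: 1024 entries, only 3 of them nonzero.
data = np.zeros(1024)
data[[3, 500, 901]] = [5.0, 2.0, 7.0]
recovered, reads = acquire_sparse(lambda lo, hi: data[lo:hi].sum(), data.size)
```

For k nonzero entries out of n, this scheme needs on the order of k·log2(n) reads, compared with n reads for sequential acquisition.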
IEEE Access
Recently, a Cambrian explosion of novel non-volatile memory (NVM) devices known as memristive devices has inspired efforts to build hardware neural networks that learn like the brain. Early experimental prototypes built simple perceptrons from nanosynapses, and recently, fully-connected multi-layer perceptron (MLP) learning systems have been realized. However, while backpropagating learning systems pair well with high-precision computer memories and achieve state-of-the-art performance, this typically comes with a massive energy budget. For future Internet of Things/peripheral use cases, system energy footprint will be a major constraint, and emerging NVM devices may fill the gap by sacrificing high bit precision for lower energy. In this paper, we contrast the well-known MLP approach with the extreme learning machine (ELM) or NoProp approach, which uses a large layer of random weights to improve the separability of high-dimensional tasks, and is usually considered inferior in a software context. However, we find that when taking the device non-linearity into account, NoProp manages to equal the hardware MLP system in terms of accuracy. While also using a sign-based adaptation of the delta rule for energy savings, we find that NoProp can learn effectively with four to six 'bits' of device analog capacity, while MLP requires eight-bit capacity with the same rule. This may allow the requirements for memristive devices to be relaxed in the context of online learning. By comparing the energy footprint of these systems for several candidate nanosynapses and realistic peripherals, we confirm that memristive NoProp systems save energy compared with MLP systems. Lastly, we show that ELM/NoProp systems can achieve better generalization abilities than nanosynaptic MLP systems when paired with pre-processing layers (which do not require backpropagated error). Collectively, these advantages make such systems worthy of consideration in future accelerators or embedded hardware.
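The ELM/NoProp structure with a sign-based delta rule can be sketched in a few lines. This is a software illustration under assumptions, not the paper's hardware model: the hidden-layer size, the tanh activation, the fixed step size, and the function names are all hypothetical, and device non-linearity and limited analog precision are not modeled. The hidden layer uses fixed random weights; only the read-out layer is trained, and each update moves every read-out weight by a fixed step in the direction given by the sign of the delta-rule term.

```python
import numpy as np

def train_noprop(x, y_onehot, n_hidden=64, step=0.01, epochs=30, seed=0):
    """Train only the read-out layer of a random-hidden-layer network."""
    rng = np.random.default_rng(seed)
    w_in = rng.normal(size=(x.shape[1], n_hidden))   # fixed, never trained
    h = np.tanh(x @ w_in)                            # random feature expansion
    w_out = np.zeros((n_hidden, y_onehot.shape[1]))
    for _ in range(epochs):
        for h_i, y_i in zip(h, y_onehot):
            err = y_i - h_i @ w_out
            # Sign-based delta rule: each weight moves by +step, -step, or 0.
            w_out += step * np.sign(np.outer(h_i, err))
    return w_in, w_out

def predict(x, w_in, w_out):
    """Return the index of the largest read-out activation per sample."""
    return np.argmax(np.tanh(x @ w_in) @ w_out, axis=1)

# Toy two-class problem (illustrative data only).
rng = np.random.default_rng(1)
x = rng.normal(size=(40, 2))
labels = (x[:, 0] > 0).astype(int)
w_in, w_out = train_noprop(x, np.eye(2)[labels])
pred = predict(x, w_in, w_out)
```

Because every update has the same magnitude, the read-out weights stay on a grid of multiples of `step`, which is one way to picture why such a rule can tolerate coarse analog weight resolution.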
Applied Physics. A, Materials Science and Processing
In this study, we demonstrate the creation of electroforming-free TaOx memristive devices using focused ion beam irradiations to locally define conductive filaments in TaOx films. Electrical characterization shows that these irradiations directly create fully functional memristors without the need for electroforming. Finally, ion beam forming of conductive filaments combined with state-of-the-art nano-patterning presents a CMOS-compatible approach to wafer-level fabrication of fully formed and operational memristors.