Publications

Results 1–25 of 37

Designing and modeling analog neural network training accelerators

2019 International Symposium on VLSI Technology, Systems and Application, VLSI-TSA 2019

Agarwal, Sapan A.; Jacobs-Gedrim, Robin B.; Bennett, Christopher H.; Hsia, Alexander W.; Adee, Shane M.; Hughart, David R.; Fuller, Elliot J.; Li, Yiyang; Talin, A.A.; Marinella, Matthew J.

Analog crossbars have the potential to reduce the energy and latency required to train a neural network by three orders of magnitude when compared to an optimized digital ASIC. The crossbar simulator, CrossSim, can be used to model device nonidealities and determine what device properties are needed to create an accurate neural network accelerator. Experimentally measured device statistics are used to simulate neural network training accuracy and compare different classes of devices, including TaOx ReRAM, Li1-xCoO2 devices, and conventional floating-gate SONOS memories. A technique called 'Periodic Carry' can overcome device nonidealities by using a positional number system while maintaining the benefit of parallel analog matrix operations.
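As a rough sketch of the kind of nonideal weight update such simulations model (this is not CrossSim's actual API; the nonlinearity parameter alpha, noise level sigma, and all names here are illustrative assumptions):

```python
import numpy as np

def nonideal_update(w, dw_ideal, alpha=0.5, sigma=0.02, w_min=0.0, w_max=1.0):
    """Apply an ideal update dw_ideal to a weight w with device nonidealities.

    alpha models the asymmetric, state-dependent update nonlinearity and
    sigma the write-noise standard deviation as a fraction of the weight
    range; both values are illustrative, not measured device statistics.
    """
    w_range = w_max - w_min
    # Updates shrink as the device approaches the rail it is moving toward.
    if dw_ideal > 0:
        scale = 1.0 - alpha * (w - w_min) / w_range
    else:
        scale = 1.0 - alpha * (w_max - w) / w_range
    noise = np.random.normal(0.0, sigma * w_range)  # write variability
    return float(np.clip(w + dw_ideal * scale + noise, w_min, w_max))

w = 0.5
for _ in range(100):                       # repeated small training updates
    w = nonideal_update(w, dw_ideal=0.01)
```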


Piecewise empirical model (PEM) of resistive memory for pulsed analog and neuromorphic applications

Journal of Computational Electronics

Niroula, John N.; Agarwal, Sapan A.; Jacobs-Gedrim, Robin B.; Schiek, Richard L.; Hughart, David R.; Hsia, Alexander W.; James, Conrad D.; Marinella, Matthew J.

With the end of Dennard scaling and the ever-increasing need for more efficient, faster computation, resistive switching devices (ReRAM), often referred to as memristors, are a promising candidate for next-generation computer hardware. These devices show particular promise for use in an analog neuromorphic computing accelerator, as they can be tuned to multiple states and be updated like the weights in neuromorphic algorithms. Modeling a ReRAM-based neuromorphic computing accelerator requires a compact model capable of correctly simulating the small weight-update behavior associated with neuromorphic training. These small updates have a nonlinear dependence on the initial state, which has a significant impact on neural network training. Consequently, we propose the piecewise empirical model (PEM), an empirically derived general-purpose compact model that can accurately capture the nonlinearity of an arbitrary two-terminal device to match pulse measurements important for neuromorphic computing applications. By defining the state of the device to be proportional to its current, the model parameters can be extracted from a series of voltage pulses that mimic the behavior of a device in an analog neuromorphic computing accelerator. This allows for a general, accurate, and intuitive compact circuit model that is applicable to different resistance-switching device technologies. In this work, we explain the details of the model, implement it in the circuit simulator Xyce, and give an example of its usage to model a specific Ta/TaOx device.
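The core idea can be sketched outside a circuit simulator: tabulate the measured update per pulse against device state, then interpolate piecewise between the measured points to predict the next update. This is only an illustration, not the Xyce implementation; the table values and names below are hypothetical.

```python
import numpy as np

# Lookup table: normalized device state vs. the conductance change
# produced by one potentiating pulse (values here are made up; in
# practice they come from pulse-sweep measurements of a real device).
states    = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
delta_pot = np.array([0.08, 0.06, 0.04, 0.02, 0.005])

def pulse_update(g):
    """Predict the state after one pulse by piecewise-linear interpolation
    of the empirical update table, capturing the state-dependent
    nonlinearity of small updates."""
    dg = np.interp(g, states, delta_pot)
    return min(g + dg, 1.0)

g = 0.3
for _ in range(5):          # apply five potentiating pulses from state 0.3
    g = pulse_update(g)
print(g)
```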


Achieving ideal accuracies in analog neuromorphic computing using periodic carry

Digest of Technical Papers - Symposium on VLSI Technology

Agarwal, Sapan A.; Jacobs-Gedrim, Robin B.; Hsia, Alexander W.; Hughart, David R.; Fuller, Elliot J.; Talin, A.A.; James, Conrad D.; Plimpton, Steven J.; Marinella, Matthew J.

Analog resistive memories promise to reduce the energy of neural networks by orders of magnitude. However, the write variability and write nonlinearity of current devices prevent neural networks from training to high accuracy. We present a novel periodic carry method that uses a positional number system to overcome these limitations while maintaining the benefit of parallel analog matrix operations. We demonstrate that noisy, nonlinear TaOx devices, which could previously train to only 80% accuracy on MNIST, can now reach 97% accuracy, only 1% away from an ideal numeric accuracy of 98%. On a file-type dataset, the TaOx devices achieve ideal numeric accuracy. In addition, low-noise, linear Li1-xCoO2 devices train to ideal numeric accuracies using periodic carry on both datasets.
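A toy sketch of the positional representation behind periodic carry, assuming a base-4 system spread across three devices (the base, digit count, and carry schedule are illustrative choices, not the paper's parameters):

```python
import numpy as np

BASE = 4                      # illustrative digit base, not from the paper
digits = np.zeros(3)          # digits[0] is the least-significant device

def weight(digits):
    """Effective weight represented positionally across the devices."""
    return sum(d * BASE**i for i, d in enumerate(digits))

def update(digits, dw):
    """All frequent, small training updates hit only the LSB device."""
    digits[0] += dw

def periodic_carry(digits):
    """Periodically move accumulated value to higher-significance devices,
    so each device only needs a limited range and precision."""
    for i in range(len(digits) - 1):
        carry = np.floor(digits[i] / BASE)
        digits[i]   -= carry * BASE   # keep this digit within [0, BASE)
        digits[i+1] += carry          # propagate to the next device

update(digits, 9.0)
periodic_carry(digits)
print(digits, weight(digits))  # [1. 2. 0.] 9.0 -- the value is preserved
```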


Designing an analog crossbar based neuromorphic accelerator

2017 5th Berkeley Symposium on Energy Efficient Electronic Systems, E3S 2017 - Proceedings

Agarwal, Sapan A.; Hsia, Alexander W.; Jacobs-Gedrim, Robin B.; Hughart, David R.; Plimpton, Steven J.; James, Conrad D.; Marinella, Matthew J.

Resistive memory crossbars can dramatically reduce the energy required to perform computations in neural algorithms, by three orders of magnitude when compared to an optimized digital ASIC [1]. For data-intensive applications, the computational energy is dominated by moving data between the processor, SRAM, and DRAM. Analog crossbars overcome this by allowing data to be processed directly at each memory element. Analog crossbars accelerate the three key operations that make up the bulk of the computation in a neural network, as illustrated in Fig. 1: vector-matrix multiplies (VMM), matrix-vector multiplies (MVM), and outer product rank-1 updates (OPU) [2]. For an N×N crossbar, the energy for each operation scales as the number of memory elements, O(N²) [2]. This is because the crossbar performs its entire computation in one step, charging all the capacitances only once; thus the CV² energy of the array scales with the array size. This is fundamentally better than reading or writing a digital memory. Each row of an N×N digital memory must be accessed one at a time, so N columns of length O(N) are charged N times, requiring O(N³) energy to read the memory. Thus an analog crossbar has a fundamental O(N) energy-scaling advantage over a digital system. Furthermore, if the read operation is done at low voltage and is therefore noise limited, the read energy can even be independent of the crossbar size, O(1) [2].
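The scaling argument can be checked with a quick operation count. The sketch below counts capacitor-charging events only, ignoring constant factors and the noise-limited O(1) regime:

```python
def crossbar_energy(n):
    # One parallel step charges each of the N*N array capacitances once.
    return n * n                        # O(N^2)

def digital_memory_energy(n):
    # N sequential row accesses, each charging N bitlines of length O(N).
    return n * (n * n)                  # O(N^3)

for n in (64, 256, 1024):
    # The ratio grows as N: the crossbar's fundamental scaling advantage.
    print(n, digital_memory_energy(n) // crossbar_energy(n))
```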


Resistive memory device requirements for a neural algorithm accelerator

Proceedings of the International Joint Conference on Neural Networks

Agarwal, Sapan A.; Plimpton, Steven J.; Hughart, David R.; Hsia, Alexander W.; Richter, Isaac; Cox, Jonathan A.; James, Conrad D.; Marinella, Matthew J.

Resistive memories enable dramatic energy reductions for neural algorithms. We propose a general-purpose neural architecture that can accelerate many different algorithms and determine the device properties that will be needed to run backpropagation on the neural architecture. To maintain high accuracy, the read noise standard deviation should be less than 5% of the weight range. The write noise standard deviation should be less than 0.4% of the weight range, which corresponds to as much as 300% of a characteristic update (for the datasets tested). Asymmetric nonlinearities in the change in conductance versus pulse cause weight decay and significantly reduce accuracy, while moderate symmetric nonlinearities have no effect. In order to allow for parallel reads and writes, the write current should also be less than 100 nA.
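As a hedged illustration, the quoted thresholds can be expressed as a simple spec check (the parameter names are made up; the numeric limits are those stated in the abstract):

```python
def meets_accelerator_specs(read_noise_std, write_noise_std, write_current,
                            weight_range=1.0):
    """Return True if a device satisfies the thresholds quoted above.

    Noise values are standard deviations in the same units as
    weight_range; write_current is in amperes. Note that 0.4% of the
    weight range may itself be as large as 300% of a characteristic
    update, so fairly noisy writes remain tolerable."""
    return (read_noise_std  <= 0.05  * weight_range and   # < 5% read noise
            write_noise_std <= 0.004 * weight_range and   # < 0.4% write noise
            write_current   <= 100e-9)                    # < 100 nA writes

# Example: 2% read noise, 0.3% write noise on a unit weight range, 50 nA.
print(meets_accelerator_specs(0.02, 0.003, 50e-9))  # True
```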
