An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

Xiao, T.P.; Feinberg, Benjamin F.; Bennett, Christopher H.; Agrawal, Vineet; Saxena, Prashant; Prabhakar, Venkatraman; Ramkumar, Krishnaswamy; Medu, Harsha; Raghavan, Vijay; Chettuvetty, Ramesh; Agarwal, Sapan A.; Marinella, Matthew J.

We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high on/off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.
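The argument that low-error low conductances suit near-zero-skewed weights can be illustrated with a toy calculation. The sketch below is not the paper's simulation: it assumes Laplace-distributed weights, treats the weight magnitude as a normalized conductance, and compares a hypothetical error model whose standard deviation scales with conductance against a hypothetical state-independent error of the same size at full scale. The names `sample_weights`, `rms_dot_error`, and `ALPHA` are illustrative only.

```python
import math
import random


def sample_weights(n, scale, rng):
    """Draw n weights from a Laplace distribution clipped to [-1, 1].

    Assumption: a Laplace shape stands in for a weight distribution
    that is heavily skewed toward near-zero values.
    """
    weights = []
    for _ in range(n):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        w = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        weights.append(max(-1.0, min(1.0, w)))
    return weights


def rms_dot_error(weights, inputs, sigma_fn, rng, trials=200):
    """RMS error of a dot product when each weight gets Gaussian error.

    sigma_fn maps a weight to the standard deviation of its error
    (a hypothetical programming-error model, not measured data).
    """
    exact = sum(w * x for w, x in zip(weights, inputs))
    squared_error = 0.0
    for _ in range(trials):
        noisy = sum(
            (w + rng.gauss(0.0, sigma_fn(w))) * x
            for w, x in zip(weights, inputs)
        )
        squared_error += (noisy - exact) ** 2
    return math.sqrt(squared_error / trials)


rng = random.Random(0)
weights = sample_weights(256, 0.1, rng)
inputs = [rng.random() for _ in range(256)]

ALPHA = 0.05  # assumed error level at full-scale conductance

# Assumed model A: error shrinks with conductance (small weights, small error).
proportional = rms_dot_error(weights, inputs, lambda w: ALPHA * abs(w), rng)
# Assumed model B: the same error at every conductance level.
uniform = rms_dot_error(weights, inputs, lambda w: ALPHA, rng)

print(f"RMS dot-product error, conductance-proportional: {proportional:.4f}")
print(f"RMS dot-product error, state-independent:        {uniform:.4f}")
```

Under these assumptions, the conductance-proportional model yields the smaller dot-product error because most weights sit near zero and contribute little noise. The actual accuracy figures in the abstract come from measured device data, which this sketch does not reproduce.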

SAND Number

SAND2022-0047J

Journal

IEEE Transactions on Circuits and Systems I: Regular Papers
Volume 69, Issue 4, Pages 1480-1493

Date Published

April 1, 2022

Published By

IEEE (United States)

Funding Sponsor

Sandia Laboratory Directed Research & Development (LDRD)

LDRD Project Number

214546

Research Partners

Sandia National Laboratories, New Mexico

Infineon Technologies, North America

ISSN

1558-0806, 1549-8328

DOI

10.1109/TCSI.2021.3134313

Subject

Mathematics and computing

Keywords

SONOS

Charge trap memory

Neuromorphic

Neural network

Analog

In-memory computing

Inference accelerator