Publications

Results 76–100 of 9,998

Search results

Selective amorphization of SiGe in Si/SiGe nanostructures via high energy Si+ implant

Journal of Applied Physics

Turner, Emily M.; Campbell, Quinn C.; Avci, Ibrahim; Weber, William J.; Lu, Ping L.; Wang, George T.; Jones, Kevin S.

The selective amorphization of SiGe in Si/SiGe nanostructures via a 1 MeV Si+ implant was investigated, resulting in single-crystal Si nanowires (NWs) and quantum dots (QDs) encapsulated in amorphous SiGe fins and pillars, respectively. The Si NWs and QDs are formed during high-temperature dry oxidation of single-crystal Si/SiGe heterostructure fins and pillars, during which Ge diffuses along the nanostructure sidewalls and encapsulates the Si layers. The fins and pillars were then subjected to a $3 \times 10^{15}$ ions/cm$^2$ 1 MeV Si+ implant, resulting in the amorphization of SiGe, while leaving the encapsulated Si crystalline for larger, 65-nm wide NWs and QDs. Interestingly, the 26-nm diameter Si QDs amorphize, while the 28-nm wide NWs remain crystalline during the same high energy ion implant. This result suggests that the Si/SiGe pillars have a lower threshold for Si-induced amorphization compared to their Si/SiGe fin counterparts. However, Monte Carlo simulations of ion implantation into the Si/SiGe nanostructures reveal similar predicted levels of displacements per cm$^3$. Molecular dynamics simulations suggest that the total stress magnitude in Si QDs encapsulated in crystalline SiGe is higher than the total stress magnitude in Si NWs, which may lead to greater crystalline instability in the QDs during ion implant. The potential lower amorphization threshold of QDs compared to NWs is of special importance to applications that require robust QD devices in a variety of radiation environments.

Electrostatic Relativistic Fluid Models of Electron Emission in a Warm Diode

IEEE International Conference on Plasma Science (ICOPS)

Hamlin, Nathaniel D.; Smith, Thomas M.; Roberds, Nicholas R.; Laros, James H.; Beckwith, Kristian B.

A semi-analytic fluid model has been developed for characterizing relativistic electron emission across a warm diode gap. Here we demonstrate the use of this model in (i) verifying multi-fluid codes in modeling compressible relativistic electron flows (the EMPIRE-Fluid code is used as an example; see also Ref. 1), (ii) elucidating key physics mechanisms characterizing the influence of compressibility and relativistic injection speed of the electron flow, and (iii) characterizing the regimes over which a fluid model recovers physically reasonable solutions.

Adaptive experimental design for multi-fidelity surrogate modeling of multi-disciplinary systems

International Journal for Numerical Methods in Engineering

Jakeman, John D.; Friedman, Sam; Eldred, Michael S.; Tamellini, Lorenzo; Gorodetsky, Alex A.; Allaire, Doug

We present an adaptive algorithm for constructing surrogate models of multi-disciplinary systems composed of a set of coupled components. With this goal we introduce “coupling” variables with a priori unknown distributions that allow surrogates of each component to be built independently. Once built, the surrogates of the components are combined to form an integrated-surrogate that can be used to predict system-level quantities of interest at a fraction of the cost of the original model. The error in the integrated-surrogate is greedily minimized using an experimental design procedure that allocates the amount of training data, used to construct each component-surrogate, based on the contribution of those surrogates to the error of the integrated-surrogate. The multi-fidelity procedure presented is a generalization of multi-index stochastic collocation that can leverage ensembles of models of varying cost and accuracy, for one or more components, to reduce the computational cost of constructing the integrated-surrogate. Extensive numerical results demonstrate that, for a fixed computational budget, our algorithm is able to produce surrogates that are orders of magnitude more accurate than methods that treat the integrated system as a black-box.

Scalable algorithms for physics-informed neural and graph networks

Data-Centric Engineering

Shukla, Khemraj; Xu, Mengjia; Trask, Nathaniel A.; Karniadakis, George E.

Physics-informed machine learning (PIML) has emerged as a promising new approach for simulating complex physical and biological systems that are governed by complex multiscale processes for which some data are also available. In some instances, the objective is to discover part of the hidden physics from the available data, and PIML has been shown to be particularly effective for such problems for which conventional methods may fail. Unlike commercial machine learning where training of deep neural networks requires big data, in PIML big data are not available. Instead, we can train such networks from additional information obtained by employing the physical laws and evaluating them at random points in the space-time domain. Such PIML integrates multimodality and multifidelity data with mathematical models, and implements them using neural networks or graph networks. Here, we review some of the prevailing trends in embedding physics into machine learning, using physics-informed neural networks (PINNs) based primarily on feed-forward neural networks and automatic differentiation. For more complex systems or systems of systems and unstructured data, graph neural networks (GNNs) present some distinct advantages, and here we review how physics-informed learning can be accomplished with GNNs based on graph exterior calculus to construct differential operators; we refer to these architectures as physics-informed graph networks (PIGNs). We present representative examples for both forward and inverse problems and discuss what advances are needed to scale up PINNs, PIGNs and more broadly GNNs for large-scale engineering problems.

Theory of the metastable injection-bleached E3c center in GaAs

Physical Review. B

Schultz, Peter A.; Hjalmarson, Harold P.

The E3 transition in irradiated GaAs observed in deep level transient spectroscopy (DLTS) was recently discovered in Laplace-DLTS to encompass three distinct components. The component designated E3c was found to be metastable, reversibly bleached under minority carrier (hole) injection, with an introduction rate dependent upon Si doping density. It is shown through first-principles modeling that the E3c must be the intimate Si-vacancy pair, best described as a Si sitting in a divacancy, Si$_{vv}$. The bleached metastable state is enabled by a double site-shifting mechanism: upon recharging, the defect undergoes a second site shift rather than returning to its original E3c-active configuration by reversing the first site shift. Identification of this defect offers insights into the short-time annealing kinetics in irradiated GaAs.

Physics-assisted generative adversarial network for X-ray tomography

Optics Express

Guo, Zhen; Song, Jung K.; Barbastathis, George; Vaughan, Courtenay T.; Larson, Kurt W.; Alpert, Bradley K.; Levine, Zachary H.; Glinsky, Michael E.

X-ray tomography is capable of imaging the interior of objects in three dimensions non-invasively, with applications in biomedical imaging, materials science, electronic inspection, and other fields. The reconstruction process can be an ill-conditioned inverse problem, requiring regularization to obtain satisfactory results. Recently, deep learning has been adopted for tomographic reconstruction. Unlike iterative algorithms which require a distribution that is known a priori, deep reconstruction networks can learn a prior distribution through sampling the training distributions. In this work, we develop a Physics-assisted Generative Adversarial Network (PGAN), a two-step algorithm for tomographic reconstruction. In contrast to previous efforts, our PGAN utilizes maximum-likelihood estimates derived from the measurements to regularize the reconstruction with both known physics and the learned prior. Compared with methods with less physics assisting in training, PGAN can reduce the photon requirement with limited projection angles to achieve a given error rate. The advantages of using a physics-assisted learned prior in X-ray tomography may further enable low-photon nanoscale imaging.

An optimization-based approach to parameter learning for fractional type nonlocal models

Computers and Mathematics with Applications

D'Elia, Marta D.; Glusa, Christian A.; Burkovska, Olena

Nonlocal operators of fractional type are a popular modeling choice for applications that do not adhere to classical diffusive behavior; however, one major challenge in nonlocal simulations is the selection of model parameters. In this work we propose an optimization-based approach to parameter identification for fractional models with an optional truncation radius. We formulate the inference problem as an optimal control problem where the objective is to minimize the discrepancy between observed data and an approximate solution of the model, and the control variables are the fractional order and the truncation length. For the numerical solution of the minimization problem we propose a gradient-based approach, where we enhance the numerical performance by an approximation of the bilinear form of the state equation and its derivative with respect to the fractional order. Several numerical tests in one and two dimensions illustrate the theoretical results and show the robustness and applicability of our method.

Electronic structure of intrinsic defects in c-gallium nitride: Density functional theory study without the jellium approximation

Physical Review. B

Edwards, Arthur H.; Schultz, Peter A.; Dobzynski, Richard M.

Here, we report the first nonjellium, systematic, density functional theory (DFT) study of intrinsic and extrinsic defects and defect levels in zinc-blende (cubic) gallium nitride. We use the local moment counter charge (LMCC) method, the standard Perdew-Burke-Ernzerhof (PBE) exchange-correlation potential, and two pseudopotentials, where the Ga 3$\textit{d}$ orbitals are either in the core ($d^0$) or explicitly in the valence set ($d^{10}$). We studied 64, 216, 512, and 1000 atom supercells, and demonstrated convergence to the infinite limit, crucial for delineating deep from shallow states near band edges, and for demonstrating the elimination of finite cell-size errors. Contrary to common claims, we find that exact exchange is not required to obtain defect levels across the experimental band gap. As was true in silicon, silicon carbide, and gallium arsenide, the extremal LMCC defect levels of the aggregate of defects yield an effective LMCC defect band gap that is within 10% of the experimental gap (3.3 eV) for both pseudopotentials. We demonstrate that the gallium vacancy is more complicated than previously reported. There is dramatic metastability: a nearest-neighbor nitrogen atom shifts into the gallium site, forming an antisite, nitrogen vacancy pair, which is more stable than the simple vacancy for positive charge states. Our assessment of the $d^0$ and $d^{10}$ pseudopotentials yields minimal differences in defect structures and defect levels. The better agreement of the $d^0$ lattice constant with experiment suggests that the more computationally economical $d^0$ pseudopotentials are sufficient to achieve the fidelity possible within the physical accuracy of DFT, and thereby enable calculations in larger supercells necessary to demonstrate convergence with respect to finite size supercell errors.

The Portals 4.3 Network Programming Interface

Schonbein, William W.; Barrett, Brian W.; Brightwell, Ronald B.; Grant, Ryan E.; Hemmert, Karl S.; Laros, James H.; Underwood, Keith; Riesen, Rolf; Hoefler, Torsten; Barbe, Mathieu; Suraty Filho, Luiz H.; Ratchov, Alexandre; Maccabe, Arthur B.

This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

Asymptotic preserving methods for fluid electron-fluid models in the large magnetic field limit with mathematically guaranteed properties (Final Report)

Tomas, Ignacio T.; Shadid, John N.; Maier, Matthias; Salgado, Abner

The current manuscript is a final report on the activities carried out under the Project LDRD-CIS #226834. In scientific terms, the work reported in this manuscript is a continuation of the efforts started with Project LDRD-express #223796 with final report of activities SAND2021-11481, see [83]. In this section we briefly explain what pre-existing developments motivated the current body of work and provide an overview of the activities developed with the funds provided. The overarching goal of the current project LDRD-CIS #226834 and the previous project LDRD-express #223796 is the development of numerical methods with mathematically guaranteed properties in order to solve the Euler-Maxwell system of plasma physics and generalizations thereof. Even though Project #223796 laid out general foundations of space and time discretization of Euler-Maxwell system, overall, it was focused on the development of numerical schemes for purely electrostatic fluid-plasma models. In particular, the project developed a family of schemes with mathematically guaranteed robustness in order to solve the Euler-Poisson model. This model is an asymptotic limit where only electrostatic response of the plasma is considered. Its primary feature is the presence of a non-local force, the electrostatic force, which introduces effects with infinite speed propagation into the problem. Even though instantaneous propagation of perturbations may be considered nonphysical, there are plenty of physical regimes of technical interest where such an approximation is perfectly valid.

Crack nucleation at forging flaws studied by non-local peridynamics simulations

Mathematics and Mechanics of Solids

Rezaul Karim, Mohammad; Narasimhachary, Santosh; Radaelli, Francesco; Amann, Christian; Dayal, Kaushik; Silling, Stewart A.; Germann, Timothy C.

We present a computational study and framework that allows us to study and understand the crack nucleation process from forging flaws. Forging flaws may be present in large steel rotor components commonly used for rotating power generation equipment including gas turbines, electrical generators, and steam turbines. The service life of these components is often limited by crack nucleation and subsequent growth from such forging flaws, which frequently exhibit themselves as non-metallic oxide inclusions. The fatigue crack growth process can be described by established engineering fracture mechanics methods. However, the initial crack nucleation process from a forging flaw is challenging for traditional engineering methods to quantify as it depends on the details of the flaw, including flaw morphology. We adopt the peridynamics method to describe and study this crack nucleation process. For a specific industrial gas turbine rotor steel, we present how we integrate and fit commonly known base material property data such as elastic properties, yield strength, and S-N curves, as well as fatigue crack growth data into a peridynamic model. The obtained model is then utilized in a series of high-performance two-dimensional peridynamic simulations to study the crack nucleation process from forging flaws for ambient and elevated temperatures in a rectangular simulation cell specimen. The simulations reveal an initial local nucleation at multiple small oxide inclusions followed by micro-crack propagation, arrest, coalescence, and eventual emergence of a dominant micro-crack that governs the crack nucleation process. The dependence on temperature and density of oxide inclusions of both the details of the microscopic processes and cycles to crack nucleation is also observed. The results are compared with fatigue experiments performed with specimens containing forging flaws of the same rotor steel.

QSCOUT Progress Report, June 2022 [Quantum Scientific Computing Open User Testbed]

Clark, Susan M.; Norris, Haley R.; Landahl, Andrew J.; Yale, Christopher G.; Lobser, Daniel L.; Van Der Wall, Jay W.; Revelle, Melissa R.

Quantum information processing has reached an inflection point, transitioning from proof-of-principle scientific experiments to small, noisy quantum processors. To accelerate this process and eventually move to fault-tolerant quantum computing, it is necessary to provide the scientific community with access to whitebox testbed systems. The Quantum Scientific Computing Open User Testbed (QSCOUT) provides scientists unique access to an innovative system to help advance quantum computing science.

A Taxonomy of Small Markovian Errors

PRX Quantum

Blume-Kohout, Robin J.; Da Silva, Marcus P.; Nielsen, Erik N.; Proctor, Timothy J.; Rudinger, Kenneth M.; Sarovar, Mohan S.; Young, Kevin C.

Errors in quantum logic gates are usually modeled by quantum process matrices (CPTP maps). But process matrices can be opaque and unwieldy. We show how to transform the process matrix of a gate into an error generator that represents the same information more usefully. We construct a basis of simple and physically intuitive elementary error generators, classify them, and show how to represent the error generator of any gate as a mixture of elementary error generators with various rates. Finally, we show how to build a large variety of reduced models for gate errors by combining elementary error generators and/or entire subsectors of generator space. We conclude with a few examples of reduced models, including one with just $9N^2$ parameters that describes almost all commonly predicted errors on an N-qubit processor.

Combining DPG in space with DPG time-marching scheme for the transient advection-reaction equation

Munoz-Matute, Judit; Demkowicz, Leszek; Roberts, Nathan V.

In this article, we present a general methodology to combine the Discontinuous Petrov-Galerkin (DPG) method in space and time in the context of the method of lines for transient advection-reaction problems. We first introduce a semidiscretization in space with a DPG method, redefining the ideas of optimal testing and practicality of the method in this context. Then, we apply the recently developed DPG-based time-marching scheme, which is of exponential type, to the resulting system of Ordinary Differential Equations (ODEs). We also discuss how to efficiently compute the action of the exponential of the matrix coming from the space semidiscretization without assembling the full matrix. Finally, we verify the proposed method for 1D+time advection-reaction problems, showing optimal convergence rates for smooth solutions and more stable results for linear conservation laws compared to the classical exponential integrators.

Entangling-gate error from coherently displaced motional modes of trapped ions

Physical Review A

Ruzic, Brandon R.; Barrick, Todd A.; Hunker, Jeffrey D.; Law, Ryan L.; McFarland, Brian M.; McGuinness, Hayden J.; Parazzoli, L.P.; Sterk, Jonathan D.; Van Der Wall, Jay W.; Stick, Daniel L.

Entangling gates in trapped-ion quantum computers are most often applied to stationary ions with initial motional distributions that are thermal and close to the ground state, while those demonstrations that involve transport generally use sympathetic cooling to reinitialize the motional state prior to applying a gate. Future systems with more ions, however, will face greater nonthermal excitation due to increased amounts of ion transport, exacerbated by longer operational times and variations over the trap array. In addition, pregate sympathetic cooling may be limited due to time costs and laser access constraints. In this paper, we analyze the impact of such coherent motional excitation on entangling-gate error by performing simulations of Mølmer-Sørensen (MS) gates on a pair of trapped-ion qubits with both thermal and coherent excitation present in a shared motional mode at the start of the gate. We quantify how a small amount of coherent displacement erodes gate performance in the presence of experimental noise, and we demonstrate that adjusting the relative phase between the initial coherent displacement and the displacement induced by the gate or using Walsh modulation can suppress this error. We then use experimental data from transported ions to analyze the impact of coherent displacement on MS-gate error under realistic conditions.

Low-order preconditioning of the Stokes equations

Numerical Linear Algebra with Applications

Voronin, Alexey; He, Yunhui; Maclachlan, Scott; Olson, Luke N.; Tuminaro, Raymond S.

A well-known strategy for building effective preconditioners for higher-order discretizations of some PDEs, such as Poisson's equation, is to leverage effective preconditioners for their low-order analogs. In this work, we show that high-quality preconditioners can also be derived for the Taylor–Hood discretization of the Stokes equations in much the same manner. In particular, we investigate the use of geometric multigrid based on the (Formula presented.) discretization of the Stokes operator as a preconditioner for the (Formula presented.) discretization of the Stokes system. We utilize local Fourier analysis to optimize the damping parameters for Vanka and Braess–Sarazin relaxation schemes and to achieve robust convergence. These results are then verified and compared against the measured multigrid performance. While geometric multigrid can be applied directly to the (Formula presented.) system, our ultimate motivation is to apply algebraic multigrid within solvers for (Formula presented.) systems via the (Formula presented.) discretization, which will be considered in a companion paper.

Surrogate modeling for efficiently, accurately and conservatively estimating measures of risk

Reliability Engineering and System Safety

Jakeman, John D.; Kouri, Drew P.; Huerta, Jose G.

We present a surrogate modeling framework for conservatively estimating measures of risk from limited realizations of an expensive physical experiment or computational simulation. Risk measures combine objective probabilities with the subjective values of a decision maker to quantify anticipated outcomes. Given a set of samples, we construct a surrogate model that produces estimates of risk measures that are always greater than their empirical approximations obtained from the training data. These surrogate models limit over-confidence in reliability and safety assessments and produce estimates of risk measures that converge much faster to the true value than purely sample-based estimates. We first detail the construction of conservative surrogate models that can be tailored to a stakeholder's risk preferences and then present an approach, based on stochastic orders, for constructing surrogate models that are conservative with respect to families of risk measures. Our surrogate models include biases that permit them to conservatively estimate the target risk measures. We provide theoretical results that show that these biases decay at the same rate as the $L^2$ error in the surrogate model. Numerical demonstrations confirm that risk-adapted surrogate models do indeed overestimate the target risk measures while converging at the expected rate.

CrossSim Inference Manual v2.0

Xiao, Tianyao X.; Bennett, Christopher H.; Feinberg, Benjamin F.; Marinella, Matthew J.; Agarwal, Sapan A.

Neural networks are largely based on matrix computations. During forward inference, the most heavily used compute kernel is the matrix-vector multiplication (MVM): $W \vec{x} $. Inference is a first frontier for the deployment of next-generation hardware for neural network applications, as it is more readily deployed in edge devices, such as mobile devices or embedded processors with size, weight, and power constraints. Inference is also easier to implement in analog systems than training, which has more stringent device requirements. The main processing kernel used during inference is the MVM.

A primal–dual algorithm for risk minimization

Mathematical Programming

Kouri, Drew P.; Surowiec, Thomas M.

In this paper, we develop an algorithm to efficiently solve risk-averse optimization problems posed in reflexive Banach space. Such problems often arise in many practical applications as, e.g., optimization problems constrained by partial differential equations with uncertain inputs. Unfortunately, for many popular risk models including the coherent risk measures, the resulting risk-averse objective function is nonsmooth. This lack of differentiability complicates the numerical approximation of the objective function as well as the numerical solution of the optimization problem. To address these challenges, we propose a primal–dual algorithm for solving large-scale nonsmooth risk-averse optimization problems. This algorithm is motivated by the classical method of multipliers and by epigraphical regularization of risk measures. As a result, the algorithm solves a sequence of smooth optimization problems using derivative-based methods. We prove convergence of the algorithm even when the subproblems are solved inexactly and conclude with numerical examples demonstrating the efficiency of our method.

A Hybrid Method for Tensor Decompositions that Leverages Stochastic and Deterministic Optimization

Myers, Jeremy M.; Dunlavy, Daniel D.

In this paper, we propose a hybrid method that uses stochastic and deterministic search to compute the maximum likelihood estimator of a low-rank count tensor with Poisson loss via state-of-the-art local methods. Our approach is inspired by Simulated Annealing for global optimization and allows for fine-grain parameter tuning as well as adaptive updates to algorithm parameters. We present numerical results that indicate our hybrid approach can compute better approximations to the maximum likelihood estimator with less computation than the state-of-the-art methods by themselves.

The Ground Truth Program: Simulations as Test Beds for Social Science Research Methods

Computational and Mathematical Organization Theory

Naugle, Asmeret B.; Russell, Adam; Lakkaraju, Kiran L.; Swiler, Laura P.; Verzi, Stephen J.; Romero, Vicente J.

Social systems are uniquely complex and difficult to study, but understanding them is vital to solving the world’s problems. The Ground Truth program developed a new way of testing the research methods that attempt to understand and leverage the Human Domain and its associated complexities. The program developed simulations of social systems as virtual world test beds. Not only were these simulations able to produce data on future states of the system under various circumstances and scenarios, but their causal ground truth was also explicitly known. Research teams studied these virtual worlds, facilitating deep validation of causal inference, prediction, and prescription methods. The Ground Truth program model provides a way to test and validate research methods to an extent previously impossible, and to study the intricacies and interactions of different components of research.

The strip method for shape derivatives

International Journal for Numerical Methods in Engineering

Hardesty, Sean H.; Antil, Harbir; Kouri, Drew P.; Ridzal, Denis R.

A major challenge in shape optimization is the coupling of finite element method (FEM) codes in a way that facilitates efficient computation of shape derivatives. This is particularly difficult with multiphysics problems involving legacy codes, where the costs of implementing and maintaining shape derivative capabilities are prohibitive. The volume and boundary methods are two approaches to computing shape derivatives. Each has a major drawback: the boundary method is less accurate, while the volume method is more invasive to the FEM code. We introduce the strip method, which computes shape derivatives on a strip adjacent to the boundary. The strip method makes code coupling simple. Like the boundary method, it queries the state and adjoint solutions at quadrature nodes, but requires no knowledge of the FEM code implementations. At the same time, it exhibits the higher accuracy of the volume method. As an added benefit, its computational complexity is comparable to that of the boundary method, that is, it is faster than the volume method. We illustrate the benefits of the strip method with numerical examples.

Self-Induced Curvature in an Internally Loaded Peridynamic Fiber

Silling, Stewart A.

A straight fiber with nonlocal forces that are independent of bond strain is considered. These internal loads can either stabilize or destabilize the straight configuration. Transverse waves with long wavelength have unstable dispersion properties for certain combinations of nonlocal kernels and internal loads. When these unstable waves occur, deformation of the straight fiber into a circular arc can lower its potential energy in equilibrium. The equilibrium value of the radius of curvature is computed explicitly.

An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

IEEE Transactions on Circuits and Systems I: Regular Papers

Xiao, Tianyao X.; Feinberg, Benjamin F.; Bennett, Christopher H.; Agrawal, Vineet; Saxena, Prashant; Prabhakar, Venkatraman; Ramkumar, Krishnaswamy; Medu, Harsha; Raghavan, Vijay; Chettuvetty, Ramesh; Agarwal, Sapan A.; Marinella, Matthew J.

We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a > 10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.
