Proton and Heavy Ion SEE Data on NVIDIA and AMD GPUs
Abstract not provided.
Abstract not provided.
IEEE Radiation Effects Data Workshop
We present the SEU sensitivity and SEL results from proton and heavy ion testing performed on NVIDIA Xavier NX and AMD Ryzen V1605B GPU devices in both static and dynamic operation.
Proceedings - IEEE International Symposium on Circuits and Systems
Area efficient self-correcting flip-flops for use with triple modular redundant (TMR) soft-error hardened logic are implemented in a 12-nm finFET process technology. The TMR flip-flop slave latches self-correct in the clock low phase using Muller C-elements in the latch feedback. These C-elements are driven by the two redundant stored values and not by the slave latch itself, saving area over a similar implementation using majority gate feedback. These flip-flops are implemented as large shift-register arrays on a test chip and have been experimentally tested for their soft-error mitigation in static and dynamic modes of operation using heavy ions and protons. We show how high clock skew can result in susceptibility to soft-errors in the dynamic mode, and explain the potential failure mechanism.
Abstract not provided.
Abstract not provided.
Abstract not provided.
IEEE Transactions on Nuclear Science
Integration-technology feature shrink increases computing-system susceptibility to single-event effects (SEE). While modeling SEE faults will be critical, an integrated processor's scope makes physically correct modeling computationally intractable. Without useful models, presilicon evaluation of fault-tolerance approaches becomes impossible. To incorporate accurate transistor-level effects at a system scope, we present a multiscale simulation framework. Charge collection at the 1) device level determines 2) circuit-level transient duration and state-upset likelihood. Circuit effects, in turn, impact 3) register-transfer-level architecture-state corruption visible at 4) the system level. Thus, the physically accurate effects of SEEs in large-scale systems, executed on a high-performance computing (HPC) simulator, could be used to drive cross-layer radiation hardening by design. We demonstrate the capabilities of this model with two case studies. First, we determine a D flip-flop's sensitivity at the transistor level on 14-nm FinFet technology, validating the model against published cross sections. Second, we track and estimate faults in a microprocessor without interlocked pipelined stages (MIPS) processor for Adams 90% worst case environment in an isotropic space environment.
Conference Proceedings from the International Symposium for Testing and Failure Analysis
Global thinning of integrated circuits is a technique that enables backside failure analysis and radiation testing. Prior work also shows increased thresholds for single-event latchup and upset in thinned devices. We present impacts of global thinning on device performance and reliability of 28 nm node field programmable gate arrays (FPGA). Devices are thinned to values of 50, 10, and 3 microns using a micromachining and polishing method. Lattice damage, in the form of dislocations, extend about 1 micron below the machined surface. The damage layer is removed after polishing with colloidal SiO2 slurry. We create a 2D finite-element model with liner elasticity equations and flip-chip packaged device geometry to show that thinning increases compressive global stress in the Si, while C4 bumps increase stress locally. Measurements of stress using Raman spectroscopy qualitatively agree with our stress model but also reveal the need for more complex structural models to account for nonlinear effects occurring in devices thinned to 3 microns and after temperature cycling to 125 °C. Thermal imaging shows that increased local heating occurs with increased thinning but the maximum temperature difference across the 3-micron die is less than 2 °C. Ring oscillators (ROs) programmed throughout the FPGA fabric slow about 0.5% after thinning compared to full thickness values. Temperature cycling the devices to 125 °C further decreases RO frequency about 0.5%, which we attribute to stress changes in the Si.
Abstract not provided.
Abstract not provided.
Abstract not provided.