We present the SEU sensitivity and SEL results from proton and heavy ion testing performed on NVIDIA Xavier NX and AMD Ryzen V1605B GPU devices in both static and dynamic operation.
Global thinning of integrated circuits is a technique that enables backside failure analysis and radiation testing. Prior work also shows increased thresholds for single-event latchup and upset in thinned devices. We present impacts of global thinning on device performance and reliability of 28 nm node field programmable gate arrays (FPGA). Devices are thinned to values of 50, 10, and 3 microns using a micromachining and polishing method. Lattice damage, in the form of dislocations, extend about 1 micron below the machined surface. The damage layer is removed after polishing with colloidal SiO2 slurry. We create a 2D finite-element model with liner elasticity equations and flip-chip packaged device geometry to show that thinning increases compressive global stress in the Si, while C4 bumps increase stress locally. Measurements of stress using Raman spectroscopy qualitatively agree with our stress model but also reveal the need for more complex structural models to account for nonlinear effects occurring in devices thinned to 3 microns and after temperature cycling to 125 °C. Thermal imaging shows that increased local heating occurs with increased thinning but the maximum temperature difference across the 3-micron die is less than 2 °C. Ring oscillators (ROs) programmed throughout the FPGA fabric slow about 0.5% after thinning compared to full thickness values. Temperature cycling the devices to 125 °C further decreases RO frequency about 0.5%, which we attribute to stress changes in the Si.
This study examines the single-event upset and single-event latch-up susceptibility of the Xilinx 16nm FinFET Zynq UltraScale+ RFSoC FPGA in proton irradiation. Results for SEU in configuration memory, BlockRAM memory, and device SEL are given.
This study examines the single-event upset and single-event latch-up susceptibility of the Xilinx 16nm FinFET Zynq UltraScale+ RFSoC FPGA in proton irradiation. Results for SEU in configuration memory, BlockRAM memory, and device SEL are given.
This study examined high-current events observed in Xilinx Field-Programmable Gate Arrays irradiated with heavy ions. A probable cause and proposed changes to the test methodology to prevent these high-current events is described.
Lee, David S.; Swift, Gary M.; Wirthlin, Michael J.; Draper, Jeffrey
This study describes complications introduced by angular direct ionization events on space error rate predictions. In particular, prevalence of multiple-cell upsets and a breakdown in the application of effective linear energy transfer in modern-scale devices can skew error rates approximated from currently available estimation models. This paper highlights the importance of angular testing and proposes a methodology to extend existing error estimation tools to properly consider angular strikes in modern-scale devices. These techniques are illustrated with test data provided from a modern 28 nm SRAM-based device.
Low-and high-energy proton experimental data and error rate predictions are presented for many bulk Si and SOI circuits from the 20-90 nm technology nodes to quantify how much low-energy protons (LEPs) can contribute to the total on-orbit single-event upset (SEU) rate. Every effort was made to predict LEP error rates that are conservatively high; even secondary protons generated in the spacecraft shielding have been included in the analysis. Across all the environments and circuits investigated, and when operating within 10% of the nominal operating voltage, LEPs were found to increase the total SEU rate to up to 4.3 times as high as it would have been in the absence of LEPs. Therefore, the best approach to account for LEP effects may be to calculate the total error rate from high-energy protons and heavy ions, and then multiply it by a safety margin of 5. If that error rate can be tolerated then our findings suggest that it is justified to waive LEP tests in certain situations. Trends were observed in the LEP angular responses of the circuits tested. Grazing angles were the worst case for the SOI circuits, whereas the worst-case angle was at or near normal incidence for the bulk circuits.
Lee, David S.; Swift, Gary M.; Wirthlin, Michael J.; Draper, Jeffrey
This study describes complications introduced by angular direct ionization events on space error rate predictions. In particular, prevalence of multiple-cell upsets and a breakdown in the application of effective linear energy transfer in modern-scale devices can skew error rates approximated from currently available estimation models. This paper highlights the importance of angular testing and proposes a methodology to extend existing error estimation tools to properly consider angular strikes in modern-scale devices. These techniques are illustrated with test data provided from a modern 28 nm SRAM-based device.
Low-and high-energy proton experimental data and error rate predictions are presented for many bulk Si and SOI circuits from the 20-90 nm technology nodes to quantify how much low-energy protons (LEPs) can contribute to the total on-orbit single-event upset (SEU) rate. Every effort was made to predict LEP error rates that are conservatively high; even secondary protons generated in the spacecraft shielding have been included in the analysis. Across all the environments and circuits investigated, and when operating within 10% of the nominal operating voltage, LEPs were found to increase the total SEU rate to up to 4.3 times as high as it would have been in the absence of LEPs. Therefore, the best approach to account for LEP effects may be to calculate the total error rate from high-energy protons and heavy ions, and then multiply it by a safety margin of 5. If that error rate can be tolerated then our findings suggest that it is justified to waive LEP tests in certain situations. Trends were observed in the LEP angular responses of the circuits tested. Grazing angles were the worst case for the SOI circuits, whereas the worst-case angle was at or near normal incidence for the bulk circuits.
This study examines the single-event response of the Xilinx 20 nm Kintex UltraScale Field-Programmable Gate Array irradiated with heavy ions. Results for single-event latch-up and single-event upset on configuration SRAM cells and Block RAM memories are provided.
Low- and high-energy proton experimental data and error rate predictions are presented for many bulk Si and SOI circuits from the 20-90 nm technology nodes to quantify how much low-energy protons (LEPs) can contribute to the total on-orbit single-event upset (SEU) rate. Every effort was made to predict LEP error rates that are conservatively high; even secondary protons generated in the spacecraft shielding have been included in the analysis. Across all the environments and circuits investigated, and when operating within 10% of the nominal operating voltage, LEPs were found to increase the total SEU rate to up to 4.3 times as high as it would have been in the absence of LEPs. Therefore, the best approach to account for LEP effects may be to calculate the total error rate from high-energy protons and heavy ions, and then multiply it by a safety margin of 5. If that error rate can be tolerated then our findings suggest that it is justified to waive LEP tests in certain situations. Trends were observed in the LEP angular responses of the circuits tested. As a result, grazing angles were the worst case for the SOI circuits, whereas the worst-case angle was at or near normal incidence for the bulk circuits.
Lee, David S.; Wirthlin, Michael; Swift, Gary; Le, Anthony C.
This study examines the single-event response of the Xilinx 28 nm Kintex-7 FPGA irradiated with heavy ions. Results for single-event effects on configuration SRAM cells, user-accessible Flip-Flop cells, and BlockRAM™ memory are provided. This study also describes an unconventional single event latch-up signature observed during testing.
Lee, David S.; Wirthlin, Michael; Swift, Gary; Le, Anthony C.
This study examines the single-event response of the Xilinx 28 nm Kintex-7 FPGA irradiated with heavy ions. Results for single-event effects on configuration SRAM cells, user-accessible Flip-Flop cells, and BlockRAM™ memory are provided. This study also describes an unconventional single event latch-up signature observed during testing.
With the continuing development of more capable data gathering sensors, comes an increased demand on the bandwidth for transmitting larger quantities of data. To help counteract that trend, a study was undertaken to determine appropriate lossy data compression strategies for minimizing their impact on target detection and characterization. The survey of current compression techniques led us to the conclusion that wavelet compression was well suited for this purpose. Wavelet analysis essentially applies a low-pass and high-pass filter to the data, converting the data into the related coefficients that maintain spatial information as well as frequency information. Wavelet compression is achieved by zeroing the coefficients that pertain to the noise in the signal, i.e. the high frequency, low amplitude portion. This approach is well suited for our goal because it reduces the noise in the signal with only minimal impact on the larger, lower frequency target signatures. The resulting coefficients can then be encoded using lossless techniques with higher compression levels because of the lower entropy and significant number of zeros. No significant signal degradation or difficulties in target characterization or detection were observed or measured when wavelet compression was applied to simulated and real data, even when over 80% of the coefficients were zeroed. While the exact level of compression will be data set dependent, for the data sets we studied, compression factors over 10 were found to be satisfactory where conventional lossless techniques achieved levels of less than 3.
This report documents the implementation results of a hardware demonstration utilizing the Serial RapidIO{trademark} and SpaceWire protocols that was funded by Sandia National Laboratories (SNL's) Laboratory Directed Research and Development (LDRD) office. This demonstration was one of the activities in the Modeling and Design of High-Speed Networks for Satellite Applications LDRD. This effort has demonstrated the transport of application layer packets across both RapidIO and SpaceWire networks to a common downlink destination using small topologies comprised of commercial-off-the-shelf and custom devices. The RapidFET and NEX-SRIO debug and verification tools were instrumental in the successful implementation of the RapidIO hardware demonstration. The SpaceWire hardware demonstration successfully demonstrated the transfer and routing of application data packets between multiple nodes and also was able reprogram remote nodes using configuration bitfiles transmitted over the network, a key feature proposed in node-based architectures (NBAs). Although a much larger network (at least 18 to 27 nodes) would be required to fully verify the design for use in a real-world application, this demonstration has shown that both RapidIO and SpaceWire are capable of routing application packets across a network to a common downlink node, illustrating their potential use in real-world NBAs.
This paper presents an overview of algorithms for directing messages through networks of varying topology. These are commonly referred to as routing algorithms in the literature that is presented. In addition to providing background on networking terminology and router basics, the paper explains the issues of deadlock and livelock as they apply to routing. After this, there is a discussion of routing algorithms for both store-and-forward and wormhole-switched networks. The paper covers both algorithms that do and do not adapt to conditions in the network. Techniques targeting structured as well as irregular topologies are discussed. Following this, strategies for routing in the presence of faulty nodes and links in the network are described.
Emerging high-bandwidth, low-latency network technology has made network-based architectures both feasible and potentially desirable for use in satellite payload architectures. The selection of network topology is a critical component when developing these multi-node or multi-point architectures. This study examines network topologies and their effect on overall network performance. Numerous topologies were reviewed against a number of performance, reliability, and cost metrics. This document identifies a handful of good network topologies for satellite applications and the metrics used to justify them as such. Since often multiple topologies will meet the requirements of the satellite payload architecture under development, the choice of network topology is not easy, and in the end the choice of topology is influenced by both the design characteristics and requirements of the overall system and the experience of the developer.