The selective amorphization of SiGe in Si/SiGe nanostructures via a 1 MeV Si+ implant was investigated, resulting in single-crystal Si nanowires (NWs) and quantum dots (QDs) encapsulated in amorphous SiGe fins and pillars, respectively. The Si NWs and QDs are formed during high-temperature dry oxidation of single-crystal Si/SiGe heterostructure fins and pillars, during which Ge diffuses along the nanostructure sidewalls and encapsulates the Si layers. The fins and pillars were then subjected to a 3 × 10¹⁵ ions/cm² 1 MeV Si+ implant, resulting in the amorphization of SiGe, while leaving the encapsulated Si crystalline for larger, 65-nm wide NWs and QDs. Interestingly, the 26-nm diameter Si QDs amorphize, while the 28-nm wide NWs remain crystalline during the same high-energy ion implant. This result suggests that the Si/SiGe pillars have a lower threshold for Si-induced amorphization compared to their Si/SiGe fin counterparts. However, Monte Carlo simulations of ion implantation into the Si/SiGe nanostructures reveal similar predicted levels of displacements per cm³. Molecular dynamics simulations suggest that the total stress magnitude in Si QDs encapsulated in crystalline SiGe is higher than the total stress magnitude in Si NWs, which may lead to greater crystalline instability in the QDs during ion implant. The potential lower amorphization threshold of QDs compared to NWs is of special importance to applications that require robust QD devices in a variety of radiation environments.
A semi-analytic fluid model has been developed for characterizing relativistic electron emission across a warm diode gap. Here we demonstrate the use of this model in (i) verifying multi-fluid codes in modeling compressible relativistic electron flows (the EMPIRE-Fluid code is used as an example; see also Ref. 1), (ii) elucidating key physics mechanisms characterizing the influence of compressibility and relativistic injection speed of the electron flow, and (iii) characterizing the regimes over which a fluid model recovers physically reasonable solutions.
The objective of this milestone was to finish integrating the GenTen tensor software with the combustion application Pele using the Ascent in situ analysis software, in partnership with the ALPINE and Pele teams, and to demonstrate the use of tensor analysis as part of a combustion simulation.
We present an adaptive algorithm for constructing surrogate models of multi-disciplinary systems composed of a set of coupled components. With this goal we introduce “coupling” variables with a priori unknown distributions that allow surrogates of each component to be built independently. Once built, the surrogates of the components are combined to form an integrated-surrogate that can be used to predict system-level quantities of interest at a fraction of the cost of the original model. The error in the integrated-surrogate is greedily minimized using an experimental design procedure that allocates the amount of training data, used to construct each component-surrogate, based on the contribution of those surrogates to the error of the integrated-surrogate. The multi-fidelity procedure presented is a generalization of multi-index stochastic collocation that can leverage ensembles of models of varying cost and accuracy, for one or more components, to reduce the computational cost of constructing the integrated-surrogate. Extensive numerical results demonstrate that, for a fixed computational budget, our algorithm is able to produce surrogates that are orders of magnitude more accurate than methods that treat the integrated system as a black-box.
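To make the greedy data-allocation step concrete, the sketch below is a hypothetical Python outline (not the authors' implementation): each component surrogate is assumed to expose candidate refinements with an estimated reduction in integrated-surrogate error and a training cost, and the budget is spent on whichever refinement buys the most error reduction per unit cost.

```python
def greedy_allocation(components, budget):
    """Illustrative greedy loop (assumed interface): each component exposes
    candidate_refinements() -> [(refinement, est_error_reduction, cost), ...],
    where the error reduction is measured on the integrated surrogate."""
    spent = 0.0
    while spent < budget:
        best, best_gain = None, 0.0
        for comp in components:
            for ref, d_err, cost in comp.candidate_refinements():
                gain = d_err / cost                 # error reduction per unit cost
                if gain > best_gain and spent + cost <= budget:
                    best, best_gain = (comp, ref, cost), gain
        if best is None:                            # no affordable refinement remains
            break
        comp, ref, cost = best
        comp.apply(ref)                             # add training data to that component
        spent += cost
```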
Physics-informed machine learning (PIML) has emerged as a promising new approach for simulating complex physical and biological systems governed by multiscale processes for which some data are also available. In some instances, the objective is to discover part of the hidden physics from the available data, and PIML has been shown to be particularly effective for such problems for which conventional methods may fail. Unlike commercial machine learning, where training of deep neural networks requires big data, in PIML big data are not available. Instead, we can train such networks from additional information obtained by employing the physical laws and evaluating them at random points in the space-time domain. Such PIML integrates multimodality and multifidelity data with mathematical models, and implements them using neural networks or graph networks. Here, we review some of the prevailing trends in embedding physics into machine learning, using physics-informed neural networks (PINNs) based primarily on feed-forward neural networks and automatic differentiation. For more complex systems or systems of systems and unstructured data, graph neural networks (GNNs) present some distinct advantages, and here we review how physics-informed learning can be accomplished with GNNs based on graph exterior calculus to construct differential operators; we refer to these architectures as physics-informed graph networks (PIGNs). We present representative examples for both forward and inverse problems and discuss what advances are needed to scale up PINNs, PIGNs and more broadly GNNs for large-scale engineering problems.
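As a minimal illustration of the PINN idea described above (a sketch only, using PyTorch and a toy 1D Poisson problem rather than any example from the review), the residual of the governing equation is evaluated by automatic differentiation at random collocation points and used as the training loss, so no labeled "big data" are required:

```python
import torch

# Minimal PINN sketch (illustrative assumptions): fit u(x) solving
# u''(x) = -pi^2 sin(pi x) on [0, 1] with u(0) = u(1) = 0, using only the
# PDE residual at random collocation points plus a boundary penalty.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    x = torch.rand(128, 1, requires_grad=True)            # random collocation points
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    f = -(torch.pi ** 2) * torch.sin(torch.pi * x)          # known source term
    pde_loss = ((d2u - f) ** 2).mean()                      # physics residual loss
    xb = torch.tensor([[0.0], [1.0]])
    bc_loss = (net(xb) ** 2).mean()                         # Dirichlet boundary penalty
    loss = pde_loss + bc_loss
    opt.zero_grad(); loss.backward(); opt.step()
```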
The E3 transition in irradiated GaAs observed in deep level transient spectroscopy (DLTS) was recently discovered in Laplace-DLTS to encompass three distinct components. The component designated E3c was found to be metastable, reversibly bleached under minority carrier (hole) injection, with an introduction rate dependent upon Si doping density. It is shown through first-principles modeling that E3c must be the intimate Si-vacancy pair, best described as a Si atom sitting in a divacancy, Si$_{vv}$. The bleached metastable state is enabled by a double site-shifting mechanism: upon recharging, the defect undergoes a second site shift rather than returning to its original E3c-active configuration via reversal of the first site shift. Identification of this defect offers insights into the short-time annealing kinetics in irradiated GaAs.
X-ray tomography is capable of imaging the interior of objects in three dimensions non-invasively, with applications in biomedical imaging, materials science, electronic inspection, and other fields. The reconstruction process can be an ill-conditioned inverse problem, requiring regularization to obtain satisfactory results. Recently, deep learning has been adopted for tomographic reconstruction. Unlike iterative algorithms, which require a distribution that is known a priori, deep reconstruction networks can learn a prior distribution by sampling the training distributions. In this work, we develop a Physics-assisted Generative Adversarial Network (PGAN), a two-step algorithm for tomographic reconstruction. In contrast to previous efforts, our PGAN utilizes maximum-likelihood estimates derived from the measurements to regularize the reconstruction with both known physics and the learned prior. Compared with methods that incorporate less physics in training, PGAN reduces the photon requirement needed to achieve a given error rate with limited projection angles. The advantages of using a physics-assisted learned prior in X-ray tomography may further enable low-photon nanoscale imaging.
Nonlocal operators of fractional type are a popular modeling choice for applications that do not adhere to classical diffusive behavior; however, one major challenge in nonlocal simulations is the selection of model parameters. In this work we propose an optimization-based approach to parameter identification for fractional models with an optional truncation radius. We formulate the inference problem as an optimal control problem where the objective is to minimize the discrepancy between observed data and an approximate solution of the model, and the control variables are the fractional order and the truncation length. For the numerical solution of the minimization problem we propose a gradient-based approach, where we enhance the numerical performance by an approximation of the bilinear form of the state equation and its derivative with respect to the fractional order. Several numerical tests in one and two dimensions illustrate the theoretical results and show the robustness and applicability of our method.
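A highly simplified sketch of this optimization-based identification loop is given below (hypothetical Python; `solve_state` is a placeholder standing in for a solver of the truncated fractional model, and, unlike the paper's gradient-based approach with analytic derivatives of the bilinear form, the sketch simply lets L-BFGS-B approximate gradients by finite differences). The control variables are the fractional order s and the truncation length delta, and the objective is the data misfit:

```python
import numpy as np
from scipy.optimize import minimize

def solve_state(s, delta, x):
    # Placeholder forward model so the script runs end to end; a real study would
    # call a discretized (truncated) fractional diffusion solver here.
    return np.sin(np.pi * x) * (1.0 + 0.1 * s) * np.exp(-delta)

x_obs = np.linspace(0.0, 1.0, 50)
u_obs = solve_state(0.7, 0.3, x_obs)                 # synthetic "observed" data

def misfit(theta):
    s, delta = theta
    u = solve_state(s, delta, x_obs)
    return 0.5 * np.sum((u - u_obs) ** 2)            # discrepancy with observations

result = minimize(misfit, x0=[0.5, 0.5], method="L-BFGS-B",
                  bounds=[(0.05, 0.95), (0.01, 1.0)])
s_opt, delta_opt = result.x                          # recovered order and truncation length
```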
Here, we report the first nonjellium, systematic, density functional theory (DFT) study of intrinsic and extrinsic defects and defect levels in zinc-blende (cubic) gallium nitride. We use the local moment counter charge (LMCC) method, the standard Perdew-Burke-Ernzerhof (PBE) exchange-correlation potential, and two pseudopotentials, where the Ga 3$\textit{d}$ orbitals are either in the core ($d^0$) or explicitly in the valence set ($d^{10}$). We studied 64, 216, 512, and 1000 atom supercells, and demonstrated convergence to the infinite limit, crucial for delineating deep from shallow states near band edges, and for demonstrating the elimination of finite cell-size errors. Contrary to common claims, we find that exact exchange is not required to obtain defect levels across the experimental band gap. As was true in silicon, silicon carbide, and gallium arsenide, the extremal LMCC defect levels of the aggregate of defects yield an effective LMCC defect band gap that is within 10% of the experimental gap (3.3 eV) for both pseudopotentials. We demonstrate that the gallium vacancy is more complicated than previously reported. There is dramatic metastability: a nearest-neighbor nitrogen atom shifts into the gallium site, forming an antisite-nitrogen-vacancy pair, which is more stable than the simple vacancy for positive charge states. Our assessment of the $d^0$ and $d^{10}$ pseudopotentials yields minimal differences in defect structures and defect levels. The better agreement of the $d^0$ lattice constant with experiment suggests that the more computationally economical $d^0$ pseudopotentials are sufficient to achieve the fidelity possible within the physical accuracy of DFT, and thereby enable calculations in larger supercells necessary to demonstrate convergence with respect to finite size supercell errors.
This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaptation of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.
This manuscript is the final report on the activities carried out under Project LDRD-CIS #226834. In scientific terms, the work reported here is a continuation of the efforts started with Project LDRD-express #223796, whose final report of activities is SAND2021-11481; see [83]. In this section we briefly explain what pre-existing developments motivated the current body of work and provide an overview of the activities developed with the funds provided. The overarching goal of the current project LDRD-CIS #226834 and the previous project LDRD-express #223796 is the development of numerical methods with mathematically guaranteed properties in order to solve the Euler-Maxwell system of plasma physics and generalizations thereof. Even though Project #223796 laid out general foundations for the space and time discretization of the Euler-Maxwell system, it was focused overall on the development of numerical schemes for purely electrostatic fluid-plasma models. In particular, the project developed a family of schemes with mathematically guaranteed robustness for solving the Euler-Poisson model. This model is an asymptotic limit in which only the electrostatic response of the plasma is considered. Its primary feature is the presence of a non-local force, the electrostatic force, which introduces infinite-speed propagation effects into the problem. Even though instantaneous propagation of perturbations may be considered nonphysical, there are plenty of physical regimes of technical interest where such an approximation is perfectly valid.
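For reference, a standard textbook form of the Euler-Poisson model mentioned above is written out below (notation assumed rather than quoted from the report): the compressible fluid equations for a single charged species are coupled to a Poisson equation for the electrostatic potential, and it is the global Poisson solve that makes the force nonlocal and the response effectively instantaneous.

```latex
% Standard Euler--Poisson system for one charged species (number density n,
% velocity u, pressure p, charge q, mass m, neutralizing background n_b);
% notation assumed, not taken from the report.
\begin{align*}
  \partial_t n + \nabla\cdot(n\,\mathbf{u}) &= 0, \\
  m\,n\,\big(\partial_t \mathbf{u} + (\mathbf{u}\cdot\nabla)\,\mathbf{u}\big) + \nabla p
    &= -\,q\,n\,\nabla\varphi, \\
  -\,\varepsilon_0\,\Delta\varphi &= q\,(n - n_b).
\end{align*}
% The potential \varphi is obtained from a global Poisson solve, which is the
% source of the infinite-speed response; the full fluid model also carries an
% energy equation and a closure for p.
```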
Rezaul Karim, Mohammad; Narasimhachary, Santosh; Radaelli, Francesco; Amann, Christian; Dayal, Kaushik; Silling, Stewart A.; Germann, Timothy C.
We present a computational study and framework for understanding the crack nucleation process from forging flaws. Forging flaws may be present in large steel rotor components commonly used for rotating power generation equipment, including gas turbines, electrical generators, and steam turbines. The service life of these components is often limited by crack nucleation and subsequent growth from such forging flaws, which frequently exhibit themselves as non-metallic oxide inclusions. The fatigue crack growth process can be described by established engineering fracture mechanics methods. However, the initial crack nucleation process from a forging flaw is challenging for traditional engineering methods to quantify, as it depends on the details of the flaw, including flaw morphology. We adopt the peridynamics method to describe and study this crack nucleation process. For a specific industrial gas turbine rotor steel, we present how we integrate and fit commonly known base material property data, such as elastic properties, yield strength, and S-N curves, as well as fatigue crack growth data, into a peridynamic model. The obtained model is then utilized in a series of high-performance two-dimensional peridynamic simulations to study the crack nucleation process from forging flaws at ambient and elevated temperatures in a rectangular simulation cell specimen. The simulations reveal initial local nucleation at multiple small oxide inclusions followed by micro-crack propagation, arrest, coalescence, and the eventual emergence of a dominant micro-crack that governs the crack nucleation process. The dependence of both the details of the microscopic processes and the cycles to crack nucleation on temperature and on the density of oxide inclusions is also observed. The results are compared with fatigue experiments performed on specimens containing forging flaws of the same rotor steel.
Quantum information processing has reached an inflection point, transitioning from proof-of-principle scientific experiments to small, noisy quantum processors. To accelerate this process and eventually move to fault-tolerant quantum computing, it is necessary to provide the scientific community with access to whitebox testbed systems. The Quantum Scientific Computing Open User Testbed (QSCOUT) provides scientists unique access to an innovative system to help advance quantum computing science.
Errors in quantum logic gates are usually modeled by quantum process matrices (CPTP maps). But process matrices can be opaque and unwieldy. We show how to transform the process matrix of a gate into an error generator that represents the same information more usefully. We construct a basis of simple and physically intuitive elementary error generators, classify them, and show how to represent the error generator of any gate as a mixture of elementary error generators with various rates. Finally, we show how to build a large variety of reduced models for gate errors by combining elementary error generators and/or entire subsectors of generator space. We conclude with a few examples of reduced models, including one with just 9N² parameters that describes almost all commonly predicted errors on an N-qubit processor.
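To make the construction concrete, one common convention (assumed here; the abstract itself does not fix notation) defines the error generator L of a noisy gate G with ideal target G_0 as the logarithm of the post-gate error map, and reduced models then expand L in the elementary-generator basis with real rates:

```latex
% One common convention (assumed): for a noisy gate G with ideal target G_0,
% the error generator L satisfies
\begin{equation*}
  G = e^{L} G_0, \qquad L = \log\!\big(G\,G_0^{-1}\big), \qquad
  L \;\approx\; \sum_k e_k\, E_k ,
\end{equation*}
% where {E_k} are elementary error generators and {e_k} the corresponding
% error rates; reduced models retain only a subset of the E_k or entire
% subsectors of generator space.
```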
In this article, we present a general methodology to combine the Discontinuous Petrov-Galerkin (DPG) method in space and time in the context of methods of lines for transient advection-reaction problems. We first introduce a semidiscretization in space with a DPG method, redefining the ideas of optimal testing and practicality of the method in this context. Then, we apply the recently developed DPG-based time-marching scheme, which is of exponential type, to the resulting system of Ordinary Differential Equations (ODEs). We also discuss how to efficiently compute the action of the exponential of the matrix coming from the space semidiscretization without assembling the full matrix. Finally, we verify the proposed method for 1D+time advection-reaction problems, showing optimal convergence rates for smooth solutions and more stable results for linear conservation laws compared to classical exponential integrators.
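As an illustrative sketch (Python/SciPy, not the authors' implementation), the key building block of such an exponential-type time-marching scheme is evaluating the action of a matrix exponential on the current state; for a toy 1D upwind advection-reaction semidiscretization this can be done without ever forming the dense exponential:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import expm_multiply

# Toy 1D advection-reaction semidiscretization du/dt = A u on a periodic mesh
# (first-order upwind advection plus linear reaction); parameters are illustrative.
n, dt, c, r = 200, 1e-2, 1.0, 0.5
h = 1.0 / n
A = (c / h) * (sp.eye(n, k=-1, format="csr")          # upwind neighbor
               + sp.eye(n, k=n - 1, format="csr")     # periodic wrap-around entry
               - sp.eye(n, format="csr")) \
    - r * sp.eye(n, format="csr")                     # linear reaction term

x = np.linspace(0.0, 1.0, n, endpoint=False)
u = np.exp(-100.0 * (x - 0.5) ** 2)                   # initial Gaussian bump

# One exponential-integrator step: exp(dt*A) @ u evaluated via the action of the
# exponential, without assembling the dense matrix exponential.
u_next = expm_multiply(dt * A, u)
```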
Entangling gates in trapped-ion quantum computers are most often applied to stationary ions with initial motional distributions that are thermal and close to the ground state, while those demonstrations that involve transport generally use sympathetic cooling to reinitialize the motional state prior to applying a gate. Future systems with more ions, however, will face greater nonthermal excitation due to increased amounts of ion transport, exacerbated by longer operational times and variations over the trap array. In addition, pregate sympathetic cooling may be limited due to time costs and laser access constraints. In this paper, we analyze the impact of such coherent motional excitation on entangling-gate error by performing simulations of Mølmer-Sørensen (MS) gates on a pair of trapped-ion qubits with both thermal and coherent excitation present in a shared motional mode at the start of the gate. We quantify how a small amount of coherent displacement erodes gate performance in the presence of experimental noise, and we demonstrate that adjusting the relative phase between the initial coherent displacement and the displacement induced by the gate or using Walsh modulation can suppress this error. We then use experimental data from transported ions to analyze the impact of coherent displacement on MS-gate error under realistic conditions.
Voronin, Alexey; He, Yunhui; Maclachlan, Scott; Olson, Luke N.; Tuminaro, Raymond S.
A well-known strategy for building effective preconditioners for higher-order discretizations of some PDEs, such as Poisson's equation, is to leverage effective preconditioners for their low-order analogs. In this work, we show that high-quality preconditioners can also be derived for the Taylor-Hood discretization of the Stokes equations in much the same manner. In particular, we investigate the use of geometric multigrid based on a low-order discretization of the Stokes operator as a preconditioner for the higher-order Taylor-Hood discretization of the Stokes system. We utilize local Fourier analysis to optimize the damping parameters for Vanka and Braess-Sarazin relaxation schemes and to achieve robust convergence. These results are then verified and compared against the measured multigrid performance. While geometric multigrid can be applied directly to the higher-order system, our ultimate motivation is to apply algebraic multigrid within solvers for higher-order systems via the low-order discretization, which will be considered in a companion paper.
We present a surrogate modeling framework for conservatively estimating measures of risk from limited realizations of an expensive physical experiment or computational simulation. Risk measures combine objective probabilities with the subjective values of a decision maker to quantify anticipated outcomes. Given a set of samples, we construct a surrogate model that produces estimates of risk measures that are always greater than their empirical approximations obtained from the training data. These surrogate models limit over-confidence in reliability and safety assessments and produce estimates of risk measures that converge much faster to the true value than purely sample-based estimates. We first detail the construction of conservative surrogate models that can be tailored to a stakeholder's risk preferences and then present an approach, based on stochastic orders, for constructing surrogate models that are conservative with respect to families of risk measures. Our surrogate models include biases that permit them to conservatively estimate the target risk measures. We provide theoretical results that show that these biases decay at the same rate as the L2 error in the surrogate model. Numerical demonstrations confirm that risk-adapted surrogate models do indeed overestimate the target risk measures while converging at the expected rate.
Neural networks are largely based on matrix computations. During forward inference, the most heavily used compute kernel is the matrix-vector multiplication (MVM): $W\vec{x}$. Inference is a first frontier for the deployment of next-generation hardware for neural network applications, as it is more readily deployed in edge devices, such as mobile devices or embedded processors with size, weight, and power constraints. Inference is also easier to implement in analog systems than training, which has more stringent device requirements. The MVM is therefore the main processing kernel used during inference.
In this paper, we develop an algorithm to efficiently solve risk-averse optimization problems posed in reflexive Banach space. Such problems often arise in many practical applications as, e.g., optimization problems constrained by partial differential equations with uncertain inputs. Unfortunately, for many popular risk models including the coherent risk measures, the resulting risk-averse objective function is nonsmooth. This lack of differentiability complicates the numerical approximation of the objective function as well as the numerical solution of the optimization problem. To address these challenges, we propose a primal–dual algorithm for solving large-scale nonsmooth risk-averse optimization problems. This algorithm is motivated by the classical method of multipliers and by epigraphical regularization of risk measures. As a result, the algorithm solves a sequence of smooth optimization problems using derivative-based methods. We prove convergence of the algorithm even when the subproblems are solved inexactly and conclude with numerical examples demonstrating the efficiency of our method.
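As a concrete example of the nonsmoothness being addressed (a standard formula, assumed for illustration; the paper treats general coherent risk measures), the conditional value-at-risk at level α admits the Rockafellar-Uryasev representation below, whose plus function is exactly the kink that epigraphical regularization smooths so that derivative-based methods can be applied to the subproblems.

```latex
% Standard Rockafellar--Uryasev form of the conditional value-at-risk (assumed
% example of a coherent, nonsmooth risk measure):
\begin{equation*}
  \mathrm{CVaR}_{\alpha}(X)
    = \inf_{t \in \mathbb{R}}
      \Big\{\, t + \tfrac{1}{1-\alpha}\,\mathbb{E}\big[(X - t)_{+}\big] \Big\},
  \qquad (x)_{+} := \max\{x,\,0\}.
\end{equation*}
```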
In this paper, we propose a hybrid method that uses stochastic and deterministic search to compute the maximum likelihood estimator of a low-rank count tensor with Poisson loss via state-of-the-art local methods. Our approach is inspired by Simulated Annealing for global optimization and allows for fine-grained parameter tuning as well as adaptive updates to algorithm parameters. We present numerical results that indicate our hybrid approach can compute better approximations to the maximum likelihood estimator with less computation than the state-of-the-art methods by themselves.
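The outline below is a generic sketch of this hybrid idea (hypothetical interface: `loglik` and `local_solve` stand in for the Poisson tensor log-likelihood and a state-of-the-art local CP solver, neither of which is shown), interleaving annealing-style perturbations with deterministic local refinement and an adaptive cooling schedule:

```python
import numpy as np

def hybrid_search(loglik, local_solve, x0, n_outer=20, temp0=1.0, cooling=0.8, rng=None):
    """Generic stochastic/deterministic hybrid (illustrative, not the paper's code)."""
    rng = np.random.default_rng(rng)
    x_best = local_solve(x0)                     # deterministic local refinement
    f_best = loglik(x_best)
    x_cur, f_cur, temp = x_best, f_best, temp0
    for _ in range(n_outer):
        # stochastic step: perturb the current iterate, then refine locally
        x_trial = local_solve(x_cur + temp * rng.standard_normal(x_cur.shape))
        f_trial = loglik(x_trial)
        # Metropolis-style acceptance on the log-likelihood gap
        if f_trial > f_cur or rng.random() < np.exp((f_trial - f_cur) / temp):
            x_cur, f_cur = x_trial, f_trial
        if f_cur > f_best:
            x_best, f_best = x_cur, f_cur
        temp *= cooling                          # cooling schedule (adaptive in spirit)
    return x_best
```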
Social systems are uniquely complex and difficult to study, but understanding them is vital to solving the world’s problems. The Ground Truth program developed a new way of testing the research methods that attempt to understand and leverage the Human Domain and its associated complexities. The program developed simulations of social systems as virtual world test beds. Not only were these simulations able to produce data on future states of the system under various circumstances and scenarios, but their causal ground truth was also explicitly known. Research teams studied these virtual worlds, facilitating deep validation of causal inference, prediction, and prescription methods. The Ground Truth program model provides a way to test and validate research methods to an extent previously impossible, and to study the intricacies and interactions of different components of research.
A major challenge in shape optimization is the coupling of finite element method (FEM) codes in a way that facilitates efficient computation of shape derivatives. This is particularly difficult with multiphysics problems involving legacy codes, where the costs of implementing and maintaining shape derivative capabilities are prohibitive. The volume and boundary methods are two approaches to computing shape derivatives. Each has a major drawback: the boundary method is less accurate, while the volume method is more invasive to the FEM code. We introduce the strip method, which computes shape derivatives on a strip adjacent to the boundary. The strip method makes code coupling simple. Like the boundary method, it queries the state and adjoint solutions at quadrature nodes, but requires no knowledge of the FEM code implementations. At the same time, it exhibits the higher accuracy of the volume method. As an added benefit, its computational complexity is comparable to that of the boundary method, that is, it is faster than the volume method. We illustrate the benefits of the strip method with numerical examples.
A straight fiber with nonlocal forces that are independent of bond strain is considered. These internal loads can either stabilize or destabilize the straight configuration. Transverse waves with long wavelength have unstable dispersion properties for certain combinations of nonlocal kernels and internal loads. When these unstable waves occur, deformation of the straight fiber into a circular arc can lower its potential energy in equilibrium. The equilibrium value of the radius of curvature is computed explicitly.
We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, matching the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high on/off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a > 10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.