Engineering and applied science rely on computational experiments to rigorously study physical systems. The mathematical models used to probe these systems are highly complex, and sampling-intensive studies often require prohibitively many simulations for acceptable accuracy. Surrogate models provide a means of circumventing the high computational expense of sampling such complex models. In particular, polynomial chaos expansions (PCEs) have been successfully used for uncertainty quantification studies of deterministic models where the dominant source of uncertainty is parametric. We discuss an extension to conventional PCE surrogate modeling to enable surrogate construction for stochastic computational models that have intrinsic noise in addition to parametric uncertainty. We develop a PCE surrogate on a joint space of intrinsic and parametric uncertainty, enabled by Rosenblatt transformations, which are evaluated via kernel density estimation of the associated conditional cumulative distributions. Furthermore, we extend the construction to random field data via the Karhunen-Loève expansion. We then take advantage of closed-form solutions for computing PCE Sobol indices to perform a global sensitivity analysis of the model which quantifies the intrinsic noise contribution to the overall model output variance. Additionally, the resulting joint PCE is generative in the sense that it allows generating random realizations at any input parameter setting that are statistically approximately equivalent to realizations from the underlying stochastic model. The method is demonstrated on a chemical catalysis example model and a synthetic example controlled by a parameter that enables a switch from unimodal to bimodal response distributions.
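To illustrate the kind of transform the joint-surrogate construction above relies on, the sketch below maps replicates of a skewed intrinsic-noise distribution to a standard-normal germ via an empirical-CDF probability-integral transform. This is a simplified, one-dimensional, unconditional stand-in for the KDE-based Rosenblatt transform described in the abstract; the gamma-distributed replicates are purely hypothetical.

```python
import numpy as np
from scipy import stats

def to_standard_normal(samples):
    """Map samples of an arbitrary 1-D distribution to a standard-normal
    germ via the probability-integral transform (empirical-CDF version)."""
    n = len(samples)
    # empirical CDF values in (0, 1), avoiding exactly 0 or 1
    ranks = stats.rankdata(samples) / (n + 1.0)
    return stats.norm.ppf(ranks)

# toy "stochastic model": replicates at a fixed input parameter setting
rng = np.random.default_rng(0)
replicates = rng.gamma(shape=2.0, scale=1.5, size=2000)   # skewed intrinsic noise
xi = to_standard_normal(replicates)                       # germ for the joint PCE
print(xi.mean(), xi.std())                                # approximately 0 and 1
```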
Understanding and accurately characterizing energy dissipation mechanisms in civil structures during earthquakes is an important element of seismic assessment and design. The most commonly used model is attributed to Rayleigh. This paper proposes a systematic approach to quantify the uncertainty associated with Rayleigh's damping model. Bayesian calibration with embedded model error is employed to treat the coefficients of the Rayleigh model as random variables using modal damping ratios. Through a numerical example, we illustrate how this approach works and how the calibrated model can address modeling uncertainty associated with the Rayleigh damping model.
Bayesian inference with a simple Gaussian error model is used to efficiently compute prediction variances for energies, forces, and stresses in the linear SNAP interatomic potential. The prediction variance is shown to have a strong correlation with the absolute error over approximately 24 orders of magnitude. Using this prediction variance, an active learning algorithm is constructed to iteratively train a potential by selecting the structures with the most uncertain properties from a pool of candidate structures. The relative importance of the energy, force, and stress errors in the objective function is shown to have a strong impact on the trajectory of their respective net error metrics when running the active learning algorithm. Batched training with different batch sizes is also tested against single-structure updates, and it is found that batches can be used to significantly reduce the number of retraining steps required with only minor impact on the active learning trajectory.
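A minimal sketch of the idea, assuming a generic linear model with random descriptor features in place of the actual SNAP bispectrum descriptors: Bayesian linear regression with a Gaussian error model yields a closed-form posterior, the prediction variance follows from the posterior covariance, and the candidate with the largest variance is selected for the next retraining step.

```python
import numpy as np

def posterior(Phi, y, sigma2=1e-2, tau2=1.0):
    """Gaussian posterior for linear weights in y = Phi @ w + noise."""
    A = Phi.T @ Phi / sigma2 + np.eye(Phi.shape[1]) / tau2
    S = np.linalg.inv(A)                      # posterior covariance
    m = S @ Phi.T @ y / sigma2                # posterior mean
    return m, S

def predictive_variance(Phi_cand, S, sigma2=1e-2):
    """Prediction variance for each candidate feature row."""
    return np.einsum("ij,jk,ik->i", Phi_cand, S, Phi_cand) + sigma2

rng = np.random.default_rng(1)
Phi_train = rng.normal(size=(20, 5))          # descriptors of trained structures (toy)
w_true = rng.normal(size=5)
y_train = Phi_train @ w_true + 0.1 * rng.normal(size=20)

m, S = posterior(Phi_train, y_train)
Phi_pool = rng.normal(size=(500, 5))          # candidate structures (toy)
var = predictive_variance(Phi_pool, S)
pick = np.argmax(var)                         # most uncertain candidate to label next
print("selected candidate:", pick, "variance:", var[pick])
```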
A new strategy is presented for computing anharmonic partition functions for the motion of adsorbates relative to a catalytic surface. Importance sampling is compared with conventional Monte Carlo and is found to be significantly more efficient. This new approach is applied to CH3* on Ni(111) as a test case. The motion of methyl relative to the nickel surface is found to be anharmonic, with significantly higher entropy compared to the standard harmonic oscillator model. The new method is freely available as part of the Minima-Preserving Neural Network within the AdTherm package.
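The contrast between plain and importance-sampled Monte Carlo estimates of a classical partition function can be sketched on a toy one-dimensional anharmonic potential (all constants hypothetical); a Gaussian proposal matched to the harmonic part of the potential keeps the weights bounded and reduces the estimator variance.

```python
import numpy as np

kT = 0.4
V = lambda x: 0.5 * x**2 + 0.3 * x**4        # toy anharmonic 1-D potential
boltz = lambda x: np.exp(-V(x) / kT)

rng = np.random.default_rng(2)
N = 20000

# plain Monte Carlo over a box [-L, L]
L = 6.0
xs = rng.uniform(-L, L, N)
w_plain = 2 * L * boltz(xs)
Z_plain, err_plain = w_plain.mean(), w_plain.std() / np.sqrt(N)

# importance sampling with a Gaussian matched to the harmonic part
s = np.sqrt(kT)                               # harmonic-oscillator width
xs = rng.normal(0.0, s, N)
q = np.exp(-0.5 * (xs / s)**2) / (s * np.sqrt(2 * np.pi))
w_is = boltz(xs) / q
Z_is, err_is = w_is.mean(), w_is.std() / np.sqrt(N)

print(f"plain MC:   Z = {Z_plain:.4f} +/- {err_plain:.4f}")
print(f"importance: Z = {Z_is:.4f} +/- {err_is:.4f}")
```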
The ShakeAlert Earthquake Early Warning (EEW) system aims to issue an advance warning to residents on the West Coast of the United States seconds before the ground shaking arrives, if the expected ground shaking exceeds a certain threshold. However, residents in tall buildings may experience much greater motion due to the dynamic response of the buildings. Therefore, there is an ongoing effort to extend ShakeAlert to include the contribution of building response to provide a more accurate estimation of the expected shaking intensity for tall buildings. Currently, the supposedly ideal solution of analyzing detailed finite element models of buildings under predicted ground-motion time histories is neither theoretically nor practically feasible. The authors have recently investigated existing simple methods to estimate peak floor acceleration (PFA) and determined that these simple formulas are not practically suitable. Instead, this article explores another approach by extending the Pacific Earthquake Engineering Research Center (PEER) performance-based earthquake engineering (PBEE) framework to EEW, considering that every component involved in building response prediction is uncertain in the EEW scenario. While this idea is not new and has been proposed by other researchers, it has two shortcomings: (1) the simple beam model used for response prediction is prone to modeling uncertainty, which has not been quantified, and (2) the ground motions used for probabilistic demand models are not suitable for EEW applications. In this article, we address these two issues by incorporating modeling errors into the parameters of the beam model and by using a new set of ground motions, respectively. We demonstrate how this approach could practically work using data from a 52-story building in downtown Los Angeles. Using the criteria and thresholds employed by previous researchers, we show that if peak ground acceleration (PGA) is accurately estimated, this approach can predict the expected level of human comfort in tall buildings.
Ground heat flux (G0) is a key component of the land-surface energy balance of high-latitude regions. Despite its crucial role in controlling permafrost degradation due to global warming, G0 is sparsely measured and not well represented in the outputs of global-scale model simulations. In this study, an analytical heat transfer model is tested to reconstruct G0 across seasons using soil temperature series from field measurements, Global Climate Model outputs, and climate reanalysis products. The probability density functions of ground heat flux and of model parameters are inferred using available G0 data (measured or modeled) for the snow-free period as a reference. When observed G0 is not available, a numerical model is applied using estimates of surface heat flux (dependent on parameters) as the top boundary condition. These estimates (and thus the corresponding parameters) are verified by comparing the distributions of simulated and measured soil temperature at several depths. Aided by state-of-the-art uncertainty quantification methods, the developed G0 reconstruction approach provides novel means for assessing the probabilistic structure of the ground heat flux for regional permafrost change studies.
In this report we present our findings and outcomes of the NNRDS (analysis of Neural Networks as Random Dynamical Systems) project. The work is largely motivated by the analogy between a large class of neural networks (NNs) and discretized ordinary differential equation (ODE) schemes. Namely, residual NNs, or ResNets, can be viewed as a discretization of neural ODEs (NODEs), where the NN depth plays the role of the time evolution. We employ several legacy tools from ODE theory, such as stiffness, nonlocality, and autonomicity, to regularize ResNets and thus improve their generalization capabilities. Furthermore, armed with NN analysis tools borrowed from ODE theory, we are able to efficiently augment NN predictions with uncertainty estimates, overcoming well-known dimensionality challenges and adding a degree of trust to NN predictions. Finally, we have developed a Python library, QUiNN (Quantification of Uncertainties in Neural Networks), that incorporates improved-architecture ResNets alongside classical feed-forward NNs and contains wrappers around PyTorch NN models, enabling several major classes of uncertainty quantification methods for NNs. Besides synthetic problems, we demonstrate the methods on datasets from climate modeling and materials science.
Neural ordinary differential equations (NODEs) have recently regained popularity as large-depth limits of a large class of neural networks. In particular, residual neural networks (ResNets) are equivalent to an explicit Euler discretization of an underlying NODE, where the transition from one layer to the next is one time step of the discretization. The relationship between continuous and discrete neural networks has been of particular interest. Notably, analysis from the ordinary differential equation viewpoint can potentially lead to new insights for understanding the behavior of neural networks in general. In this work, we take inspiration from differential equations to define the concept of stiffness for a ResNet via the interpretation of a ResNet as the discretization of a NODE. We then examine the effects of stiffness on the ability of a ResNet to generalize, via computational studies on example problems coming from climate and chemistry models. We find that penalizing stiffness does have a unique regularizing effect, but we see no benefit to penalizing stiffness over L2 regularization (penalization of network parameter norms) in terms of predictive performance.
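The ResNet-as-Euler-step viewpoint can be written down directly. The PyTorch sketch below is a generic residual block x_{k+1} = x_k + h f(x_k) stacked over depth, with a crude residual-magnitude penalty standing in for a stiffness regularizer; the stiffness notion in the paper involves the ODE right-hand side and its behavior, so the penalty here is only an illustrative proxy.

```python
import torch
import torch.nn as nn

class ResNetBlock(nn.Module):
    """One residual block read as an explicit Euler step of dx/dt = f(x):
    x_{k+1} = x_k + h * f(x_k)."""
    def __init__(self, dim, width=32, h=0.1):
        super().__init__()
        self.h = h
        self.f = nn.Sequential(nn.Linear(dim, width), nn.Tanh(),
                               nn.Linear(width, dim))

    def forward(self, x):
        return x + self.h * self.f(x)

depth = 10                                   # depth plays the role of time
net = nn.Sequential(*[ResNetBlock(4) for _ in range(depth)])
x0 = torch.randn(8, 4)
xT = net(x0)

# Crude stiffness proxy: magnitude of the residual increments at the input.
# (True ODE stiffness involves the Jacobian spectrum of f; this is only an
# illustrative penalty that could sit in the loss next to L2 weight decay.)
penalty = sum((blk.f(x0) ** 2).mean() for blk in net)
print(xT.shape, float(penalty))
```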
The use of simple models for response prediction of building structures is preferred in earthquake engineering for risk evaluations at regional scales, as they make computational studies more feasible. The primary impediment to their gainful use at present is the lack of viable methods for quantifying (and reducing) the modeling errors/uncertainties they bear. This study presents a Bayesian calibration method wherein the modeling error is embedded into the parameters of the model. The method is specifically described for coupled shear-flexural beam models here, but it can be applied to any parametric surrogate model. The major benefit the method offers is the ability to consider the modeling uncertainty in the forward prediction of any degree-of-freedom or composite response, regardless of the data used in calibration. The method is extensively verified using two synthetic examples. In the first example, the beam model is calibrated to represent a similar beam model but with enforced modeling errors. In the second example, the beam model is used to represent the detailed finite element model of a 52-story building. Both examples show the capability of the proposed solution to provide realistic uncertainty estimation around the mean prediction.
Runoff is a critical component of the terrestrial water cycle, and Earth system models (ESMs) are essential tools to study its spatiotemporal variability. Runoff schemes in ESMs typically include many parameters, so model calibration is necessary to improve the accuracy of simulated runoff. However, runoff calibration at a global scale is challenging because of the high computational cost and the lack of reliable observational datasets. In this study, we calibrated 11 runoff-relevant parameters in the Energy Exascale Earth System Model (E3SM) Land Model (ELM) using a surrogate-assisted Bayesian framework. First, the polynomial chaos expansion machinery with Bayesian compressed sensing is used to construct computationally inexpensive surrogate models for ELM-simulated runoff at 0.5° × 0.5° resolution for 1991-2010. The error metric between the ELM simulations and the benchmark data is selected to construct the surrogates, which facilitates efficient calibration and avoids the more conventional, but challenging, construction of high-dimensional surrogates for the ELM-simulated runoff. Second, Sobol' index sensitivity analysis is performed using the surrogate models to identify the most sensitive parameters, and our results show that, in most regions, ELM-simulated runoff is strongly sensitive to 3 of the 11 uncertain parameters. Third, a Bayesian method is used to infer the optimal values of the most sensitive parameters using an observation-based global runoff dataset as the benchmark. Our results show that model performance is significantly improved with the inferred parameter values. Although the parametric uncertainty of simulated runoff is reduced after the parameter inference, it remains comparable to the multimodel ensemble uncertainty represented by the global hydrological models in ISIMIP2a. Additionally, the annual global runoff trend during the simulation period is not well constrained by the inferred parameter values, suggesting the importance of including parametric uncertainty in future runoff projections.
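A stripped-down version of the surrogate-assisted calibration loop, with a toy error metric and an ordinary quadratic least-squares surrogate standing in for the sparse PCE/Bayesian-compressed-sensing surrogate used in the paper: the surrogate of the error metric, rather than of the full runoff field, is what the Metropolis sampler evaluates.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for an "expensive model" error metric (e.g., mismatch vs. benchmark
# runoff) evaluated at a handful of parameter samples; here a toy function.
def expensive_error_metric(theta):
    return (theta[0] - 0.3) ** 2 + 2.0 * (theta[1] + 0.1) ** 2 + 0.05

train_theta = rng.uniform(-1, 1, size=(50, 2))
train_err = np.array([expensive_error_metric(t) for t in train_theta])

# Cheap quadratic surrogate of the error metric (least squares).
def features(t):
    t = np.atleast_2d(t)
    return np.column_stack([np.ones(len(t)), t, t**2, t[:, [0]] * t[:, [1]]])

coef, *_ = np.linalg.lstsq(features(train_theta), train_err, rcond=None)
surrogate = lambda t: (features(t) @ coef)[0]

# Metropolis sampling of a posterior built on the surrogate error metric.
def log_post(t, scale=0.05):
    if np.any(np.abs(t) > 1):                 # uniform prior on [-1, 1]^2
        return -np.inf
    return -surrogate(t) / scale

chain, t = [], np.zeros(2)
lp = log_post(t)
for _ in range(5000):
    prop = t + 0.1 * rng.normal(size=2)
    lpp = log_post(prop)
    if np.log(rng.uniform()) < lpp - lp:
        t, lp = prop, lpp
    chain.append(t.copy())
print("posterior mean:", np.mean(chain[1000:], axis=0))
```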
A Bayesian inference strategy has been used to estimate uncertain inputs to global impurity transport code (GITR) modeling predictions of tungsten erosion and migration in the linear plasma device, PISCES-A. This allows quantification of GITR output uncertainty based on the uncertainties in measured PISCES-A plasma electron density and temperature profiles (n_e, T_e) used as inputs to GITR. The technique has been applied for comparison to dedicated experiments performed for high (4 × 10²² m⁻² s⁻¹) and low (5 × 10²¹ m⁻² s⁻¹) flux 250 eV He-plasma exposed tungsten (W) targets designed to assess the net and gross erosion of tungsten, and the corresponding W impurity transport. The W target design and orientation, impurity collector, and diagnostics have been designed to eliminate complexities associated with tokamak divertor plasma exposures (inclined target, mixed plasma species, re-erosion, etc.) to benchmark results against the trace impurity transport model simulated by GITR. The simulated results of the erosion, migration, and re-deposition of W during the experiment from the GITR code coupled to materials response models are presented. Specifically, the modeled and experimental W I emission spectroscopy data for the 429.4 nm line and net erosion through the target and collector mass difference measurements are compared. The methodology provides predictions of observable quantities of interest with quantified uncertainty, allowing estimation of moments, together with the sensitivities to plasma temperature and density.
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.1.2 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
Flooding impacts are on the rise globally, and concentrated in urban areas. Currently, there are no operational systems to forecast flooding at spatial resolutions that can facilitate emergency preparedness and response actions mitigating flood impacts. We present a framework for real-time flood modeling and uncertainty quantification that combines the physics of fluid motion with advances in probabilistic methods. The framework overcomes the prohibitive computational demands of high-fidelity modeling in real-time by using a probabilistic learning method relying on surrogate models that are trained prior to a flood event. This shifts the overwhelming burden of computation to the trivial problem of data storage, and enables forecasting of both flood hazard and its uncertainty at scales that are vital for time-critical decision-making before and during extreme events. The framework has the potential to improve flood prediction and analysis and can be extended to other hazard assessments requiring intense high-fidelity computations in real-time.
A new method for computing anharmonic thermophysical properties for adsorbates on metal surfaces is presented. Classical Monte Carlo phase space integration is performed to calculate the partition function for the motion of a hydrogen atom on Cu(111). A minima-preserving neural network potential energy surface is used within the integration routine. Two different sampling schemes for generating the training data are presented, and two different density functionals are used. The results are benchmarked against direct state counting results obtained using a discrete variable representation. The phase space integration results are in excellent quantitative agreement with the benchmark results. Additionally, both the discrete variable representation and the phase space integration results confirm that the motion of H on Cu(111) is highly anharmonic. The results were applied to calculate the free energy of dissociative adsorption of H2 and the resulting Langmuir isotherms at 400, 800, and 1200 K in a partial pressure range of 0-1 bar. The results show that the anharmonic effects lead to significantly higher predicted surface site fractions of hydrogen.
We present a new geodesic-based method for geometry optimization in a basis set of redundant internal coordinates. Our method updates the molecular geometry by following the geodesic generated by a displacement vector on the internal coordinate manifold, which dramatically reduces the number of steps required to converge to a minimum. Our method can be implemented in any existing optimization code, requiring only implementation of derivatives of the Wilson B-matrix and the ability to numerically solve an ordinary differential equation.
Kreitz, Bjarne; Sargsyan, Khachik; Mazeau, Emily J.; Blondal, Katrin; West, Richard H.; Wehinger, Gregor D.; Turek, Thomas; Goldsmith, C.F.
Automatic mechanism generation is used to determine mechanisms for CO2 hydrogenation on Ni(111) in a two-stage process while systematically considering the correlated uncertainty in DFT-based energetic parameters. In a coarse stage, all the possible chemistry is explored with gas-phase products down to the ppb level, while a refined stage discovers the core methanation submechanism. Five thousand unique mechanisms were generated, which contain minor perturbations in all parameters. Global uncertainty assessment, global sensitivity analysis, and degree of rate control analysis are performed to study the effect of this parametric uncertainty on the microkinetic model predictions. Comparison of the model predictions with experimental data on a Ni/SiO2 catalyst finds a feasible set of microkinetic mechanisms within the correlated uncertainty space that are in quantitative agreement with the measured data, without relying on explicit parameter optimization. Global uncertainty and sensitivity analyses provide tools to determine the pathways and key factors that control the methanation activity within the parameter space. Together, these methods reveal that the degree of rate control approach can be misleading if parametric uncertainty is not considered. The procedure of considering uncertainties in the automated mechanism generation is not unique to CO2 methanation and can easily be extended to other challenging heterogeneously catalyzed reactions.
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.1.1 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
We demonstrate a Bayesian method for the “real-time” characterization and forecasting of a partially observed COVID-19 epidemic. Characterization is the estimation of infection spread parameters using daily counts of symptomatic patients. The method is designed to help guide medical resource allocation in the early epoch of the outbreak. The estimation problem is posed as one of Bayesian inference and solved using a Markov chain Monte Carlo technique. The data used in this study were sourced before the arrival of the second wave of infection in July 2020. The proposed modeling approach, when applied at the country level, generally provides accurate forecasts at the regional, state, and country levels. The epidemiological model detected the flattening of the curve in California after public health measures were instituted. The method also detected different disease dynamics when applied to specific regions of New Mexico.
A novel modeling framework that simultaneously improves accuracy, predictability, and computational efficiency is presented. It embraces the benefits of three modeling techniques integrated together for the first time: surrogate modeling, parameter inference, and data assimilation. The use of polynomial chaos expansion (PCE) surrogates significantly decreases computational time. Parameter inference allows for faster model convergence, reduced uncertainty, and superior accuracy of simulated results. Ensemble Kalman filters assimilate errors that occur during forecasting. To examine the applicability and effectiveness of the integrated framework, we developed 18 approaches according to how surrogate models are constructed, what type of parameter distributions are used as model inputs, and whether model parameters are updated during the data assimilation procedure. We conclude that (1) PCE must be built over various forcing and flow conditions, and in contrast to previous studies, it does not need to be rebuilt at each time step; (2) model parameter specification that relies on constrained, posterior information of parameters (so-called Selected specification) can significantly improve forecasting performance and reduce uncertainty bounds compared to Random specification using prior information of parameters; and (3) no substantial differences in results exist between single and dual ensemble Kalman filters, but the latter better simulates flood peaks. The use of PCE effectively compensates for the computational load added by the parameter inference and data assimilation (up to ~80 times faster). Therefore, the presented approach contributes to a shift in modeling paradigm, arguing that complex, high-fidelity hydrologic and hydraulic models should be increasingly adopted for real-time and ensemble flood forecasting.
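The data-assimilation ingredient can be illustrated with a stochastic ensemble Kalman filter analysis step (toy state and observation operator, not the hydraulic model of the paper): the forecast ensemble is corrected toward perturbed observations using the ensemble sample covariance.

```python
import numpy as np

def enkf_update(X, y_obs, H, R, rng):
    """Stochastic ensemble Kalman filter analysis step.
    X: (n_state, n_ens) forecast ensemble; y_obs: (n_obs,) observation;
    H: (n_obs, n_state) observation operator; R: (n_obs, n_obs) obs-error cov."""
    n_ens = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)              # anomalies
    P = A @ A.T / (n_ens - 1)                          # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)       # Kalman gain
    # perturbed observations, one per ensemble member
    Y = y_obs[:, None] + rng.multivariate_normal(np.zeros(len(y_obs)), R, n_ens).T
    return X + K @ (Y - H @ X)

rng = np.random.default_rng(4)
X = rng.normal(1.0, 0.5, size=(3, 100))                # toy stage/discharge states
H = np.array([[1.0, 0.0, 0.0]])                        # observe the first state only
R = np.array([[0.05**2]])
Xa = enkf_update(X, np.array([1.3]), H, R, rng)
print("forecast mean:", X.mean(axis=1), "analysis mean:", Xa.mean(axis=1))
```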
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.1.0 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
Basis adaptation in Homogeneous Chaos spaces relies on a suitable rotation of the underlying Gaussian germ. Several rotations have been proposed in the literature, resulting in adaptations with different convergence properties. In this paper we present a new adaptation mechanism that builds on compressive sensing algorithms, resulting in a reduced polynomial chaos approximation with optimal sparsity. The developed adaptation algorithm consists of a two-step optimization procedure that computes the optimal coefficients and the input projection matrix of a low-dimensional chaos expansion with respect to an optimally rotated basis. We demonstrate the attractive features of our algorithm through several numerical examples, including the application to Large-Eddy Simulation (LES) calculations of turbulent combustion in a HIFiRE scramjet engine.
Model error estimation remains one of the key challenges in uncertainty quantification and predictive science. For computational models of complex physical systems, model error, also known as structural error or model inadequacy, is often the largest contributor to the overall predictive uncertainty. This work builds on a recently developed framework of embedded, internal model correction, in order to represent and quantify structural errors, together with model parameters, within a Bayesian inference context. We focus specifically on a Polynomial Chaos representation with additive modification of existing model parameters, enabling a non-intrusive procedure for efficient approximate likelihood construction, model error estimation, and disambiguation of model and data errors’ contributions to predictive uncertainty. The framework is demonstrated on several synthetic examples, as well as on a chemical ignition problem.
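A minimal sketch of the embedded-error idea on a toy exponential-decay model: a model parameter is replaced by a first-order polynomial chaos expansion in a standard-normal germ, and the pushforward of that randomized parameter yields a predictive band attributable to structural error. The parameter and embedding values below are hypothetical, not inferred.

```python
import numpy as np

# Toy forward model with a single parameter lam; the embedded-error approach
# replaces lam by lam0 + alpha*xi (a first-order PC in a standard-normal germ
# xi), so that alpha captures structural-error magnitude.
model = lambda lam, x: np.exp(-lam * x)

lam0, alpha = 0.8, 0.15        # hypothetical values (Bayesian inference would supply these)
rng = np.random.default_rng(5)
xi = rng.normal(size=5000)
x_grid = np.linspace(0.0, 3.0, 7)

# pushforward of the embedded parameter through the model
preds = model(lam0 + alpha * xi[:, None], x_grid[None, :])
mean, std = preds.mean(axis=0), preds.std(axis=0)
for x, m, s in zip(x_grid, mean, std):
    print(f"x={x:.2f}  prediction {m:.3f} +/- {s:.3f} (model-error band)")
```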
Model error estimation remains one of the key challenges in uncertainty quantification and predictive science. For computational models of complex physical systems, model error, also known as structural error or model inadequacy, is often the largest contributor to the overall predictive uncertainty. This work builds on a recently developed framework of embedded, internal model correction, in order to represent and quantify structural errors, together with model parameters, within a Bayesian inference context. We focus specifically on a polynomial chaos representation with additive modification of existing model parameters, enabling a nonintrusive procedure for efficient approximate likelihood construction, model error estimation, and disambiguation of model and data errors’ contributions to predictive uncertainty. The framework is demonstrated on several synthetic examples, as well as on a chemical ignition problem.
Rate coefficients are key quantities in gas phase kinetics and can be determined theoretically via master equation (ME) calculations. Rate coefficients characterize how fast a certain chemical species reacts away due to collisions into a specific product. Some of these collisions simply transfer energy between the colliding partners, in which case the initial chemical species can undergo a unimolecular reaction: dissociation or isomerization. Other collisions are reactive, and the colliding partners either exchange atoms (these are direct reactions) or form complexes that can themselves react further or get stabilized by deactivating collisions with a bath gas. The inputs to MEs are molecular parameters: geometries, energies, and frequencies determined from ab initio calculations. While the calculation of these rate coefficients using ab initio data is becoming routine in many cases, the determination of the uncertainties of the rate coefficients is often ignored, sometimes crudely assessed by independently varying just a few of the numerous parameters, and only occasionally studied in detail. In this study, molecular frequencies, barrier heights, well depths, and imaginary frequencies (needed to calculate quantum mechanical tunneling) were automatically perturbed in an uncorrelated fashion. Our Python tool, MEUQ, takes user requests to change all or specified well, barrier, or bimolecular product parameters for a reaction. We propagate the uncertainty in these input parameters and perform global sensitivity analysis of the rate coefficients for the ethyl + O2 system using state-of-the-art uncertainty quantification (UQ) techniques via a Python interface to the UQ Toolkit (www.sandia.gov/uqtoolkit). A total of 10,000 sets of rate coefficients were collected after perturbing 240 molecular parameters. With our methodology, sensitive mechanistic steps can be revealed to a modeler in a straightforward manner for identification of significant and negligible influences in bimolecular reactions.
Here, compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which is often encountered in uncertainty quantification analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several compressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers l1_ls, SpaRSA, CGIST, FPC_AS, and ADMM, we develop techniques to mitigate overfitting through an automated selection of regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these techniques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-crossflow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy, and computational trade-offs between polynomial bases of different degrees, and practicability of conducting compressive sensing for a realistic, high-dimensional physical application. Across test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistent lower errors and faster computational times.
A new method for fast evaluation of high dimensional integrals arising in quantum mechanics is proposed. Here, the method is based on sparse approximation of a high dimensional function followed by a low-rank compression. In the first step, we interpret the high dimensional integrand as a tensor in a suitable tensor product space and determine its entries by a compressed sensing based algorithm using only a few function evaluations. Secondly, we implement a rank reduction strategy to compress this tensor in a suitable low-rank tensor format using standard tensor compression tools. This allows representing a high dimensional integrand function as a small sum of products of low dimensional functions. Finally, a low dimensional Gauss–Hermite quadrature rule is used to integrate this low-rank representation, thus alleviating the curse of dimensionality. Finally, numerical tests on synthetic functions, as well as on energy correction integrals for water and formaldehyde molecules demonstrate the efficiency of this method using very few function evaluations as compared to other integration strategies.
We conduct a global sensitivity analysis (GSA) of the Energy Exascale Earth System Model (E3SM) land model (ELM) to calculate the sensitivity of five key carbon cycle outputs to 68 model parameters. This GSA is conducted by first constructing a Polynomial Chaos (PC) surrogate via a new Weighted Iterative Bayesian Compressive Sensing (WIBCS) algorithm for adaptive basis growth, leading to a sparse, high-dimensional PC surrogate with 3,000 model evaluations. The PC surrogate allows efficient extraction of GSA information, leading to further dimensionality reduction. The GSA is performed at 96 FLUXNET sites covering multiple plant functional types (PFTs) and climate conditions. About 20 of the model parameters are identified as sensitive, with the rest being relatively insensitive across all outputs and PFTs. These sensitivities are dependent on PFT and are relatively consistent among sites within the same PFT. The five model outputs have a majority of their highly sensitive parameters in common. A common subset of sensitive parameters is also shared among PFTs, but some parameters are specific to certain types (e.g., deciduous phenology). The relative importance of these parameters shifts significantly among PFTs and with climatic variables such as mean annual temperature.
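Once an orthonormal PC surrogate is available, main-effect Sobol indices follow in closed form from the coefficients. The sketch below uses a toy two-dimensional model, ordinary least squares in place of WIBCS, and normalized Legendre polynomials for uniform inputs.

```python
import numpy as np
from numpy.polynomial.legendre import legval
from itertools import product

def psi(n, x):
    """Legendre polynomial of degree n, orthonormal w.r.t. uniform on [-1, 1]."""
    c = np.zeros(n + 1); c[n] = 1.0
    return np.sqrt(2 * n + 1) * legval(x, c)

order, dim = 3, 2
mindex = [a for a in product(range(order + 1), repeat=dim) if sum(a) <= order]

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(400, dim))                    # training inputs (toy)
y = np.sin(2 * X[:, 0]) + 0.3 * X[:, 1] ** 2               # toy model output

# regression-based PCE (least squares here; the paper uses sparse WIBCS)
Phi = np.column_stack([psi(a[0], X[:, 0]) * psi(a[1], X[:, 1]) for a in mindex])
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# closed-form Sobol main-effect indices from the PCE coefficients
var_total = sum(ci**2 for ci, a in zip(c, mindex) if sum(a) > 0)
for i in range(dim):
    var_i = sum(ci**2 for ci, a in zip(c, mindex)
                if a[i] > 0 and all(a[j] == 0 for j in range(dim) if j != i))
    print(f"S_{i} = {var_i / var_total:.3f}")
```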
The development of scramjet engines is an important research area for advancing hypersonic and orbital flights. Progress toward optimal engine designs requires accurate flow simulations together with uncertainty quantification. However, performing uncertainty quantification for scramjet simulations is challenging due to the large number of uncertain parameters involved and the high computational cost of flow simulations. These difficulties are addressed in this paper by developing practical uncertainty quantification algorithms and computational methods, and deploying them in the current study to large-eddy simulations of a jet in crossflow inside a simplified HIFiRE Direct Connect Rig scramjet combustor. First, global sensitivity analysis is conducted to identify influential uncertain input parameters, which can help reduce the system's stochastic dimension. Second, because models of different fidelity are used in the overall uncertainty quantification assessment, a framework for quantifying and propagating the uncertainty due to model error is presented. These methods are demonstrated on a nonreacting jet-in-crossflow test problem in a simplified scramjet geometry, with parameter space up to 24 dimensions, using static and dynamic treatments of the turbulence subgrid model, and with two-dimensional and three-dimensional geometries.
The development of scramjet engines is an important research area for advancing hypersonic and orbital flights. Progress towards optimal engine designs requires accurate and computationally affordable flow simulations, as well as uncertainty quantification (UQ). While traditional UQ techniques can become prohibitive under expensive simulations and high-dimensional parameter spaces, polynomial chaos (PC) surrogate modeling is a useful tool for alleviating some of the computational burden. However, non-intrusive quadrature-based constructions of PC expansions relying on a single high-fidelity model can still be quite expensive. We thus introduce a two-stage numerical procedure for constructing PC surrogates while making use of multiple models of different fidelity. The first stage involves an initial dimension reduction through global sensitivity analysis using compressive sensing. The second stage utilizes adaptive sparse quadrature on a multifidelity expansion to compute PC surrogate coefficients in the reduced parameter space where quadrature methods can be more effective. The overall method is used to produce accurate surrogates and to propagate uncertainty induced by uncertain boundary conditions and turbulence model parameters, for performance quantities of interest from large eddy simulations of supersonic reactive flows inside a scramjet engine.
Compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which is often encountered in uncertainty quantification analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several compressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers l1_ls, SpaRSA, CGIST, FPC_AS, and ADMM, we develop techniques to mitigate overfitting through an automated selection of regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these techniques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-crossflow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy, and computational trade-offs between polynomial bases of different degrees, and practicability of conducting compressive sensing for a realistic, high-dimensional physical application. Across test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistent lower errors and faster computational times.
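A small illustration of LASSO-based sparse recovery with a cross-validated regularization constant, using a random measurement matrix as a stand-in for a polynomial chaos basis and scikit-learn's LassoCV in place of the solvers named above (l1_ls, SpaRSA, CGIST, FPC_AS, ADMM).

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(7)
n_samples, n_basis = 80, 300                  # underdetermined: fewer samples than bases
Phi = rng.normal(size=(n_samples, n_basis))   # stand-in for a PC measurement matrix

c_true = np.zeros(n_basis)                    # sparse "true" PC coefficients
c_true[rng.choice(n_basis, 8, replace=False)] = rng.normal(size=8)
y = Phi @ c_true + 0.01 * rng.normal(size=n_samples)

# LASSO with the regularization constant chosen by cross-validation, analogous
# in spirit to the automated selection described in the abstract.
lasso = LassoCV(cv=5, fit_intercept=False).fit(Phi, y)
c_hat = lasso.coef_
print("chosen alpha:", lasso.alpha_)
print("nonzeros recovered:", np.sum(np.abs(c_hat) > 1e-3), "of", np.sum(c_true != 0))
```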
Data movement is considered the main performance concern for exascale, including both on-node memory and off-node network communication. Indeed, many application traces show significant time spent in MPI calls, potentially indicating that faster networks must be provisioned for scalability. However, equating MPI times with network communication delays ignores synchronization delays and software overheads independent of network hardware. Based on point-to-point protocol details, we explore the decomposition of MPI time into communication, synchronization, and software stack components using architecture simulation. Detailed validation using Bayesian inference is used to identify the sensitivity of performance to specific latency/bandwidth parameters for different network protocols and to quantify associated uncertainties. The inference combined with trace replay shows that synchronization and MPI software stack overhead are at least as important as the network itself in determining time spent in communication routines.
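A toy decomposition of point-to-point time into latency and bandwidth terms can be sketched with a postal model T(m) = latency + m/bandwidth fit to hypothetical ping-pong measurements; the study above performs this kind of inference within an architecture simulator and with full Bayesian machinery rather than the crude least-squares uncertainty estimate below.

```python
import numpy as np

# Hypothetical ping-pong measurements: message size (bytes) vs. one-way time (s).
sizes = np.array([1e2, 1e3, 1e4, 1e5, 1e6, 1e7])
times = np.array([2.1e-6, 2.4e-6, 5.0e-6, 3.0e-5, 2.6e-4, 2.5e-3])

# Postal model T(m) = latency + m / bandwidth, fit by least squares.
A = np.column_stack([np.ones_like(sizes), sizes])
sol, res, *_ = np.linalg.lstsq(A, times, rcond=None)
latency, inv_bw = sol
print(f"latency ~ {latency*1e6:.2f} us, bandwidth ~ {1/inv_bw/1e9:.2f} GB/s")

# Rough parameter uncertainty from the residual variance.
dof = len(sizes) - 2
sigma2 = (res[0] / dof) if res.size else np.mean((times - A @ sol) ** 2)
cov = sigma2 * np.linalg.inv(A.T @ A)
print("std(latency), std(1/bandwidth):", np.sqrt(np.diag(cov)))
```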
Griffiths, Natalie A.; Hanson, Paul J.; Ricciuto, Daniel M.; Jensen, Anna M.; Malhotra, Avni; Mcfarlane, Karis J.; Norby, Richard J.; Sargsyan, Khachik; Sebestyen, Stephen D.; Shi, Xiaoying; Walker, Anthony P.; Ward, Eric J.; Warren, Jeffrey M.; Weston, David J.
We are conducting a large-scale, long-term climate change response experiment in an ombrotrophic peat bog in Minnesota to evaluate the effects of warming and elevated CO2 on ecosystem processes using empirical and modeling approaches. To better frame future assessments of peatland responses to climate change, we characterized and compared spatial vs. temporal variation in measured C cycle processes and their environmental drivers. We also conducted a sensitivity analysis of a peatland C model to identify how variation in ecosystem parameters contributes to model prediction uncertainty. High spatial variability in C cycle processes resulted in the inability to determine if the bog was a C source or sink, as the 95% confidence interval ranged from a source of 50 g C m⁻² yr⁻¹ to a sink of 67 g C m⁻² yr⁻¹. Model sensitivity analysis also identified that spatial variation in tree and shrub photosynthesis, allocation characteristics, and maintenance respiration all contributed to large variations in the pretreatment estimates of net C balance. Variation in ecosystem processes can be more thoroughly characterized if more measurements are collected for parameters that are highly variable over space and time, and especially if those measurements encompass environmental gradients that may be driving the spatial and temporal variation (e.g., hummock vs. hollow microtopographies, and wet vs. dry years). Together, the coupled modeling and empirical approaches indicate that variability in C cycle processes and their drivers must be taken into account when interpreting the significance of experimental warming and elevated CO2 treatments.
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.0.4 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
A new method is proposed for a fast evaluation of high-dimensional integrals of potential energy surfaces (PES) that arise in many areas of quantum dynamics. It decomposes a PES into a canonical low-rank tensor format, reducing its integral into a relatively short sum of products of low-dimensional integrals. The decomposition is achieved by the alternating least squares (ALS) algorithm, requiring only a small number of single-point energy evaluations. Therefore, it eradicates a force-constant evaluation as the hotspot of many quantum dynamics simulations and also possibly lifts the curse of dimensionality. This general method is applied to the anharmonic vibrational zero-point and transition energy calculations of molecules using the second-order diagrammatic vibrational many-body Green's function (XVH2) theory with a harmonic-approximation reference. In this application, high dimensional PES and Green's functions are both subjected to a low-rank decomposition. Evaluating the molecular integrals over a low-rank PES and Green's functions as sums of low-dimensional integrals using the Gauss–Hermite quadrature, this canonical-tensor-decomposition-based XVH2 (CT-XVH2) achieves an accuracy of 0.1 cm−1 or higher and nearly an order of magnitude speedup as compared with the original algorithm using force constants for water and formaldehyde.
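The payoff of a canonical low-rank (separable) representation is that a high-dimensional Gaussian-weighted integral collapses to a short sum of products of one-dimensional Gauss–Hermite quadratures. The sketch below integrates a hypothetical rank-2, six-dimensional integrand this way; it is not an actual potential energy surface.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

# Gauss-Hermite rule: integrates f(x) * exp(-x^2) over the real line.
nodes, weights = hermgauss(20)

def integrate_low_rank(factors):
    """Integrate a canonical low-rank (separable) function
        f(x_1,...,x_d) = sum_r prod_i g[r][i](x_i)
    against the Gaussian weight exp(-|x|^2), one 1-D quadrature per factor.
    The d-dimensional integral collapses to a short sum of products."""
    total = 0.0
    for rank_term in factors:                       # loop over rank-1 terms
        prod = 1.0
        for g in rank_term:                         # loop over dimensions
            prod *= np.sum(weights * g(nodes))
        total += prod
    return total

# toy rank-2, 6-dimensional integrand
d = 6
factors = [[(lambda x, k=k: np.cos(0.3 * k * x)) for k in range(1, d + 1)],
           [(lambda x, k=k: np.exp(-0.1 * k * x**2)) for k in range(1, d + 1)]]
print("low-rank Gauss-Hermite integral:", integrate_low_rank(factors))
```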
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.0.3 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
The development of scramjet engines is an important research area for advancing hypersonic and orbital flights. Progress towards optimal engine designs requires both accurate flow simulations and uncertainty quantification (UQ). However, performing UQ for scramjet simulations is challenging due to the large number of uncertain parameters involved and the high computational cost of flow simulations. We address these difficulties by combining UQ algorithms and numerical methods and applying them to the large eddy simulation of the HIFiRE scramjet configuration. First, global sensitivity analysis is conducted to identify influential uncertain input parameters, helping reduce the stochastic dimension of the problem and discover sparse representations. Second, as models of different fidelity are available and inevitably used in the overall UQ assessment, a framework for quantifying and propagating the uncertainty due to model error is introduced. These methods are demonstrated on a non-reacting scramjet unit problem with parameter space up to 24 dimensions, using 2D and 3D geometries with static and dynamic treatments of the turbulence subgrid model.
Here, we present the results of an application of Bayesian inference and maximum entropy methods for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain branching reaction H + O2 → OH + O. Available published data is in the form of summary statistics in terms of nominal values and error bars of the rate coefficient of this reaction at a number of temperature values obtained from shock-tube experiments. Our approach relies on generating data, in this case OH concentration profiles, consistent with the given summary statistics, using Approximate Bayesian Computation methods and a Markov Chain Monte Carlo procedure. The approach permits the forward propagation of parametric uncertainty through the computational model in a manner that is consistent with the published statistics. A consensus joint posterior on the parameters is obtained by pooling the posterior parameter densities given each consistent data set. To expedite this process, we construct efficient surrogates for the OH concentration using a combination of Padé and polynomial approximants. These surrogate models adequately represent forward model observables and their dependence on input parameters and are computationally efficient to allow their use in the Bayesian inference procedure. We also utilize Gauss-Hermite quadrature with Gaussian proposal probability density functions for moment computation resulting in orders of magnitude speedup in data likelihood evaluation. Despite the strong non-linearity in the model, the consistent data sets all result in nearly Gaussian conditional parameter probability density functions. The technique also accounts for nuisance parameters in the form of Arrhenius parameters of other rate coefficients with prescribed uncertainty. The resulting pooled parameter probability density function is propagated through stoichiometric hydrogen-air auto-ignition computations to illustrate the need to account for correlation among the Arrhenius rate parameters of one reaction and across rate parameters of different reactions.
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.0 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
We demonstrate algorithm-based resilience to silent data corruption (SDC) and hard faults in a task-based domain-decomposition preconditioner for elliptic PDEs.
We explore the scalability of a resilient task-based domain-decomposition preconditioner for elliptic PDEs, use selective reliability to study the impact of different levels of simulated SDC and hard faults, and examine the interplay between application resilience and the server-client programming model.
We discuss algorithm-based resilience to silent data corruption (SDC) in a task-based domain-decomposition preconditioner for partial differential equations (PDEs). The algorithm exploits a reformulation of the PDE as a sampling problem, followed by a solution update through data manipulation that is resilient to SDC. The implementation is based on a server-client model where all state information is held by the servers, while clients are designed solely as computational units. Scalability tests run up to ~51K cores show a parallel efficiency greater than 90%. We use a 2D elliptic PDE and a fault model based on random single bit-flips to demonstrate the resilience of the application to synthetically injected SDC. We discuss two fault scenarios: one based on the corruption of all data of a target task, and the other involving the corruption of a single data point. We show that for our application, given the test problem considered, a four-fold increase in the number of faults only yields a 2% change in the overhead to overcome their presence, from 7% to 9%. We then discuss potential savings in energy consumption via dynamic voltage/frequency scaling and its interplay with fault rates and application overhead.
We present a resilient domain-decomposition preconditioner for partial differential equations (PDEs). The algorithm reformulates the PDE as a sampling problem, followed by a solution update through data manipulation that is resilient to both soft and hard faults. We discuss an implementation based on a server-client model where all state information is held by the servers, while clients are designed solely as computational units. Servers are assumed to be “sandboxed”, while no assumption is made on the reliability of the clients. We explore the scalability of the algorithm up to ∼12k cores, build an SST/macro skeleton to extrapolate to ∼50k cores, and show the resilience under simulated hard and soft faults for a 2D linear Poisson equation.
We present a domain-decomposition-based preconditioner for the solution of partial differential equations (PDEs) that is resilient to both soft and hard faults. The algorithm is based on the following steps: first, the computational domain is split into overlapping subdomains; second, the target PDE is solved on each subdomain for sampled values of the local current boundary conditions; third, the subdomain solution samples are collected and fed into a regression step to build maps between the subdomains' boundary conditions; finally, the intersection of these maps yields the updated state at the subdomain boundaries. This reformulation allows us to recast the problem as a set of independent tasks. The implementation relies on an asynchronous server-client framework, where one or more reliable servers hold the data, while the clients ask for tasks and execute them. This framework provides resiliency to hard faults in that if a client crashes, it stops asking for work, and the servers simply distribute the work among all the other clients alive. Erroneous subdomain solves (e.g., due to soft faults) appear as corrupted data, which is either rejected if it causes a task to fail, or is seamlessly filtered out during the regression stage through a suitable noise model. Three different types of faults are modeled: hard faults modeling nodes (or clients) crashing, soft faults occurring during the communication of the tasks between server and clients, and soft faults occurring during task execution. We demonstrate the resiliency of the approach for a 2D elliptic PDE and explore the effect of the faults at various failure rates.
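The regression-stage filtering of corrupted subdomain solves can be caricatured in a few lines: fit a boundary-to-boundary map to hypothetical subdomain samples, flag samples whose residuals violate a simple noise model, and refit without them.

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical subdomain samples: local boundary condition g -> interface value u(g).
g = rng.uniform(0, 1, 40)
u = 0.7 * g + 0.1 + 0.005 * rng.normal(size=40)              # clean solves
u[rng.choice(40, 3, replace=False)] += rng.normal(0, 5, 3)   # bit-flip-like corruption

def fit_map(g, u):
    """Linear boundary-to-boundary map fit by least squares."""
    A = np.column_stack([np.ones_like(g), g])
    return np.linalg.lstsq(A, u, rcond=None)[0]

# first pass: fit, flag gross outliers against a simple noise model, refit
coef = fit_map(g, u)
resid = u - (coef[0] + coef[1] * g)
keep = np.abs(resid) < 5 * np.median(np.abs(resid))           # crude robust threshold
coef_clean = fit_map(g[keep], u[keep])
print("corrupted samples rejected:", np.sum(~keep), "map coefficients:", coef_clean)
```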
The objective of this work is to investigate the efficacy of using calibration strategies from Uncertainty Quantification (UQ) to determine model coefficients for LES. As the target methods are for engineering LES, uncertainty from numerical aspects of the model must also be quantified. The ultimate goal of this research thread is to generate a cost versus accuracy curve for LES such that the cost could be minimized given an accuracy prescribed by an engineering need. Realization of this goal would enable LES to serve as a predictive simulation tool within the engineering design process.
Direct solutions of the Chemical Master Equation (CME) governing Stochastic Reaction Networks (SRNs) are generally prohibitively expensive due to excessive numbers of possible discrete states in such systems. To enhance computational efficiency, we develop a hybrid approach where the evolution of states with low molecule counts is treated with the discrete CME model, while that of states with large molecule counts is modeled by the continuum Fokker-Planck equation. The Fokker-Planck equation is discretized using a second-order finite volume approach with appropriate treatment of flux components. The numerical construction at the interface between the discrete and continuum regions implements the transfer of probability reaction by reaction according to the stoichiometry of the system. The performance of this novel hybrid approach is explored for a two-species circadian model, with computational efficiency gains of about one order of magnitude.
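For intuition on why direct CME solutions become expensive, the sketch below solves the CME exactly for a single-species birth-death process on a truncated state space by exponentiating the generator matrix; the state space, and hence the matrix, grows quickly for multi-species systems, which is what the hybrid CME/Fokker-Planck treatment mitigates. Rates and truncation level are hypothetical.

```python
import numpy as np
from scipy.linalg import expm

# Direct CME solution for a single-species birth-death process,
#   0 --k_b--> X,   X --k_d--> 0,
# on a truncated state space {0, 1, ..., N}.
k_b, k_d, N = 5.0, 0.5, 60
A = np.zeros((N + 1, N + 1))                 # generator: dp/dt = A p
for n in range(N + 1):
    if n < N:
        A[n + 1, n] += k_b                   # birth: n -> n+1
        A[n, n] -= k_b
    if n > 0:
        A[n - 1, n] += k_d * n               # death: n -> n-1
        A[n, n] -= k_d * n

p0 = np.zeros(N + 1); p0[0] = 1.0            # start with zero molecules
p_t = expm(A * 2.0) @ p0                     # probability vector at t = 2
print("mean copy number at t=2:", p_t @ np.arange(N + 1))
```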
In this paper, a series of algorithms are proposed to address the problems in the NASA Langley Research Center Multidisciplinary Uncertainty Quantification Challenge. A Bayesian approach is employed to characterize and calibrate the epistemic parameters based on the available data, whereas a variance-based global sensitivity analysis is used to rank the epistemic and aleatory model parameters. A nested sampling of the aleatory-epistemic space is proposed to propagate uncertainties from model parameters to output quantities of interest.
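The nested aleatory-epistemic propagation can be sketched as a double loop: epistemic parameters sampled in an outer loop, aleatory variables propagated in an inner loop, producing a family of output CDFs (a probability box). The toy model and sample sizes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(9)

# Toy model with one epistemic parameter (theta, interval-valued) and one
# aleatory parameter (a, with a known distribution).
model = lambda theta, a: np.sin(theta) + 0.5 * a

n_epistemic, n_aleatory = 50, 2000
thetas = rng.uniform(0.0, 2.0, n_epistemic)            # outer loop: epistemic samples

# Inner loop: for each epistemic value, propagate the aleatory uncertainty,
# yielding one output CDF per epistemic sample.
quantile_levels = np.linspace(0.01, 0.99, 99)
cdf_family = []
for th in thetas:
    a = rng.normal(size=n_aleatory)
    cdf_family.append(np.quantile(model(th, a), quantile_levels))
cdf_family = np.array(cdf_family)

lower, upper = cdf_family.min(axis=0), cdf_family.max(axis=0)
print("p-box width at the median:", upper[49] - lower[49])
```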
The move towards extreme-scale computing platforms challenges scientific simulations in many ways. Given the recent tendencies in computer architecture development, one needs to reformulate legacy codes in order to cope with large amounts of communication, system faults, and requirements of low-memory usage per core. In this work, we develop a novel framework for solving PDEs via domain decomposition that reformulates the solution as a state of knowledge with a probabilistic interpretation. Such reformulation allows resiliency with respect to potential faults without having to apply fault detection, avoids unnecessary communication, and is generally well-suited for rigorous uncertainty quantification studies that target improvements of predictive fidelity of scientific models. We demonstrate our algorithm for one-dimensional PDE examples where artificial faults have been implemented as bit flips in the binary representation of subdomain solutions.
We present a methodology to assess the predictive fidelity of multiscale simulations by incorporating uncertainty in the information exchanged between the components of an atomistic-to-continuum simulation. We account for both the uncertainty due to finite sampling in molecular dynamics (MD) simulations and the uncertainty in the physical parameters of the model. Using Bayesian inference, we represent the expensive atomistic component by a surrogate model that relates the long-term output of the atomistic simulation to its uncertain inputs. We then present algorithms to solve for the variables exchanged across the atomistic-continuum interface in terms of polynomial chaos expansions (PCEs). We consider a simple Couette flow where velocities are exchanged between the atomistic and continuum components, while accounting for uncertainty in the atomistic model parameters and the continuum boundary conditions. Results show convergence of the coupling algorithm at a reasonable number of iterations. The uncertainty in the obtained variables significantly depends on the amount of data sampled from the MD simulations and on the width of the time averaging window used in the MD simulations.
This brief report explains the method used for parameter calibration and model validation in SST/Macro and the set of tools and workflow developed for this purpose.
In this project we have developed atmospheric measurement capabilities and a suite of atmospheric modeling and analysis tools that are well suited for verifying emissions of greenhouse gases (GHGs) on an urban-through-regional scale. We have for the first time applied the Community Multiscale Air Quality (CMAQ) model to simulate atmospheric CO2. This will allow for the examination of regional-scale transport and distribution of CO2 along with air pollutants traditionally studied using CMAQ at relatively high spatial and temporal resolution, with the goal of leveraging emissions verification efforts for both air quality and climate. We have developed a bias-enhanced Bayesian inference approach that can remedy the well-known problem of transport model errors in atmospheric CO2 inversions. We have tested the approach using data and model outputs from the TransCom3 global CO2 inversion comparison project. We have also performed two prototyping studies on inversion approaches in the generalized convection-diffusion context. One of these studies employed Polynomial Chaos Expansion to accelerate the evaluation of a regional transport model and enable efficient Markov Chain Monte Carlo sampling of the posterior for Bayesian inference. The other approach uses deterministic inversion of a convection-diffusion-reaction system in the presence of uncertainty. These approaches should, in principle, be applicable to realistic atmospheric problems with moderate adaptation. We outline a regional greenhouse gas source inference system that integrates (1) two approaches to atmospheric dispersion simulation and (2) a class of Bayesian inference and uncertainty quantification algorithms. We use two different and complementary approaches to simulate atmospheric dispersion. Specifically, we use a Eulerian chemical transport model, CMAQ, and a Lagrangian particle dispersion model, FLEXPART-WRF. These two models share the same WRF assimilated meteorology fields, making it possible to perform a hybrid simulation, in which the Eulerian model (CMAQ) can be used to compute the initial condition needed by the Lagrangian model, while the source-receptor relationships for a large state vector can be efficiently computed using the Lagrangian model in its backward mode. In addition, CMAQ has a complete treatment of atmospheric chemistry of a suite of traditional air pollutants, many of which could help attribute GHGs from different sources. The inference of emissions sources using atmospheric observations is cast as a Bayesian model calibration problem, which is solved using a variety of Bayesian techniques, such as the bias-enhanced Bayesian inference algorithm, which accounts for the intrinsic model deficiency; Polynomial Chaos Expansion to accelerate model evaluation and Markov Chain Monte Carlo sampling; and Karhunen-Loève (KL) Expansion to reduce the dimensionality of the state space. We have established an atmospheric measurement site in Livermore, CA and are collecting continuous measurements of CO2, CH4, and other species that are typically co-emitted with these GHGs. Measurements of co-emitted species can assist in attributing the GHGs to different emissions sectors. Automatic calibrations using traceable standards are performed routinely for the gas-phase measurements. We are also collecting standard meteorological data at the Livermore site as well as planetary boundary height measurements using a ceilometer. The location of the measurement site is well suited to sample air transported between the San Francisco Bay area and the California Central Valley.