Polynomial Chaos Expansions for Discrete Random Variables in Cyber Security Emulytics Experiments
Abstract not provided.
International Journal for Uncertainty Quantification
We propose a learning algorithm for discovering unknown parameterized dynamical systems from observational data of the state variables. Our method builds upon and extends recent work on discovering unknown dynamical systems, in particular approaches based on deep neural networks (DNNs). We propose a DNN structure, largely based on the residual network (ResNet), that learns not only the unknown form of the governing equation but also the random effects embedded in the system, which are generated by the random parameters. Once the DNN model is successfully constructed, it can produce system predictions over longer time horizons and for arbitrary parameter values. This enables uncertainty quantification, as solution statistics can be evaluated over the parameter space.
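As an illustration of the general idea in this abstract, the following is a minimal sketch of a ResNet-style one-step map that takes the state and the random parameters as inputs. It assumes PyTorch and uses synthetic training data; it is not the authors' architecture or code.

```python
# Minimal sketch (not the authors' code): a residual one-step map
# x_{k+1} = x_k + N(x_k, alpha), where alpha are the random parameters.
import torch
import torch.nn as nn

class ParametricResNetStep(nn.Module):
    def __init__(self, state_dim, param_dim, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + param_dim, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, state_dim),
        )

    def forward(self, x, alpha):
        # Residual update: the network learns the increment over one time step.
        return x + self.net(torch.cat([x, alpha], dim=-1))

def train(model, x_now, alpha, x_next, epochs=500, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x_now, alpha), x_next)
        loss.backward()
        opt.step()
    return model

# Tiny synthetic example: one-step data from dx/dt = -alpha * x with dt = 0.1.
alpha = torch.rand(256, 1) * 2.0
x0 = torch.rand(256, 2)
x1 = x0 * torch.exp(-alpha * 0.1)
model = train(ParametricResNetStep(state_dim=2, param_dim=1), x0, alpha, x1)
# Long-term prediction for a new parameter value: iterate x <- model(x, alpha_new).
```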
Additive Manufacturing
Additive Manufacturing (AM), commonly referred to as 3D printing, offers the ability not only to fabricate geometrically complex lattice structures but also parts in which lattice topologies in-fill volumes bounded by complex surface geometries. However, current AM processes produce defects on the strut and node elements that make up the lattice structure. This creates an inherent difference between the as-designed and as-fabricated geometries, which negatively affects predictions (via numerical simulation) of the lattice's mechanical performance. Although experimental and numerical analyses of an AM lattice's bulk structure, unit cells and struts have been performed, there is almost no research data on the mechanical response of the individual as-manufactured lattice node elements. This research proposes a methodology that, for the first time, allows non-destructive quantification of the mechanical response of node elements within an as-manufactured lattice structure. A custom-developed tool is used to extract and classify each individual node geometry from micro-computed tomography scans of an AM-fabricated lattice. Voxel-based finite element meshes are generated for numerical simulation, and the mechanical response distribution is compared to that of the idealised computer-aided design model. The method demonstrates compatibility with uncertainty quantification methods that provide opportunities for efficient prediction of a population of nodal responses from sampled data. Overall, the non-destructive and automated nature of the node extraction and response evaluation is promising for its application in qualification and certification of additively manufactured lattice structures.
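The custom extraction tool itself is not described in this abstract; the sketch below only illustrates the kind of voxel-level node isolation it performs, assuming a segmented micro-CT array and a nominal node centre from the as-designed lattice. All names and sizes here are hypothetical.

```python
# Minimal sketch (not the custom tool described above): crop a region around
# a nominal node centre in a binary micro-CT voxel array and keep the largest
# connected blob of material; each remaining voxel would become one hexahedral
# element in a voxel-based FE mesh.
import numpy as np
from scipy import ndimage

def extract_node_region(ct, centre, half_width):
    """Crop a cubic region around a nominal node centre."""
    slices = tuple(slice(max(c - half_width, 0), c + half_width) for c in centre)
    return ct[slices]

def largest_connected_component(region):
    """Keep only the largest connected component (the node itself)."""
    labels, n = ndimage.label(region)
    if n == 0:
        return np.zeros_like(region)
    sizes = ndimage.sum(region, labels, index=range(1, n + 1))
    return (labels == (np.argmax(sizes) + 1)).astype(np.uint8)

# Synthetic stand-in for a scan: a 200^3 binary array and one nominal centre.
ct = (np.random.rand(200, 200, 200) > 0.995).astype(np.uint8)
node = largest_connected_component(extract_node_region(ct, (100, 100, 100), 30))
print("voxels in extracted node region:", int(node.sum()))
```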
Abstract not provided.
International Journal for Uncertainty Quantification
This paper presents a multifidelity uncertainty quantification framework called MFNets. We seek to address three existing challenges that arise when experimental and simulation data from different sources are used to enhance statistical estimation and prediction with quantified uncertainty. Specifically, we demonstrate that MFNets can (1) fuse heterogeneous data sources arising from simulations with different parameterizations, e.g., simulation models with different uncertain parameters or data sets collected under different environmental conditions; (2) encode known relationships among data sources to reduce data requirements; and (3) improve the robustness of existing multifidelity approaches to corrupted data. MFNets construct a network of latent variables (LVs) to facilitate the fusion of data from an ensemble of sources of varying credibility and cost. These LVs are posited as explanatory variables that provide the source of correlation in the observed data. Furthermore, MFNets provide a way to encode prior physical knowledge to enable efficient estimation of statistics and/or construction of surrogates via conditional independence relations on the LVs. We highlight the utility of our framework with a number of theoretical results which assess the quality of the posterior mean as a frequentist estimator and compare it to standard sampling approaches that use single-fidelity, multilevel, and control variate Monte Carlo estimators. We also use the proposed framework to derive the Monte Carlo-based control variate estimator entirely from the use of Bayes' rule and linear-Gaussian models; to our knowledge, this is the first such derivation. Finally, we demonstrate the ability to work with different uncertain parameters across different models.
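For readers unfamiliar with the estimator mentioned at the end of this abstract, the following is a minimal sketch of the classical control variate estimator with a known low-fidelity mean, which the paper derives from a linear-Gaussian Bayesian model. This is the standard estimator on synthetic data, not the MFNets framework itself.

```python
# Classical control variate estimator (illustration only, synthetic data).
import numpy as np

rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)                 # shared random input
f_hi = np.exp(z)                           # expensive "high-fidelity" samples
f_lo = 1.0 + z                             # cheap "low-fidelity" samples
mu_lo = 1.0                                # known low-fidelity mean

# Optimal control variate weight: cov(f_hi, f_lo) / var(f_lo).
beta = np.cov(f_hi, f_lo)[0, 1] / np.var(f_lo, ddof=1)
est_mc = f_hi.mean()
est_cv = f_hi.mean() - beta * (f_lo.mean() - mu_lo)
print(f"plain MC: {est_mc:.4f}, control variate: {est_cv:.4f}, "
      f"exact: {np.exp(0.5):.4f}")
```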
Water Resources Research
In this synthesis, we assess present research and anticipate future development needs in modeling water quality in watersheds. We first discuss areas of potential improvement in the representation of freshwater systems pertaining to water quality, including representation of environmental interfaces, in-stream water quality and process interactions, soil health and land management, and (peri-)urban areas. In addition, we provide insights into the contemporary challenges in the practices of watershed water quality modeling, including quality control of monitoring data, model parameterization and calibration, uncertainty management, scale mismatches, and provisioning of modeling tools. Finally, we make three recommendations to provide a path forward for improving watershed water quality modeling science, infrastructure, and practices. These include building stronger collaborations between experimentalists and modelers, bridging gaps between modelers and stakeholders, and cultivating and applying procedural knowledge to better govern and support water quality modeling processes within organizations.
Journal of Machine Learning for Modeling and Computing
Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a larger effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.
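As a concrete (and deliberately naive) example of the bound-constraint family surveyed here, the sketch below draws Gaussian process posterior samples on a grid and rejects any draw violating a positivity constraint. This is only the simplest possible sampler for a truncated posterior, not any specific method from the survey.

```python
# Naive bound-constrained GP: rejection sampling of posterior draws on a grid.
import numpy as np

def rbf_kernel(a, b, ell=0.3, var=1.0):
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(1)
x_train = np.array([0.1, 0.4, 0.6, 0.9])
y_train = np.array([0.2, 0.8, 0.9, 0.1])       # data consistent with f >= 0
x_grid = np.linspace(0, 1, 50)
noise = 1e-4

# Standard GP posterior mean and covariance on the grid.
K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
Ks = rbf_kernel(x_grid, x_train)
Kss = rbf_kernel(x_grid, x_grid)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
mean = Ks @ alpha
v = np.linalg.solve(L, Ks.T)
cov = Kss - v.T @ v + 1e-8 * np.eye(len(x_grid))

# Rejection step: keep only posterior draws that are nonnegative everywhere.
draws = rng.multivariate_normal(mean, cov, size=2000)
kept = draws[(draws >= 0).all(axis=1)]
print(f"accepted {len(kept)} of 2000 draws; constrained mean at x=0.5:",
      kept[:, 25].mean() if len(kept) else "none accepted")
```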
This report summarizes work done under the Laboratory Directed Research and Development (LDRD) project titled "Incorporating physical constraints into Gaussian process surrogate models." In this project, we explored a variety of strategies for constraint implementations. We considered bound constraints, monotonicity and related convexity constraints, Gaussian processes which are constrained to satisfy linear operator constraints which represent physical laws expressed as partial differential equations, and intrinsic boundary condition constraints. We wrote three papers and are currently finishing two others. We developed initial software implementations for some approaches. This report summarizes the work done under this LDRD.
Journal of Computational Physics
Incorporating experimental data is essential for increasing the credibility of simulation-aided decision making and design. This paper presents a method which uses a computational model to guide the optimal acquisition of experimental data to produce data-informed predictions of quantities of interest (QoI). Many strategies for optimal experimental design (OED) select data that maximize some utility that measures the reduction in uncertainty of uncertain model parameters, for example the expected information gain between prior and posterior distributions of these parameters. In this paper, we seek to maximize the expected information gained from the push-forward of an initial (prior) density to the push-forward of the updated (posterior) density through the parameter-to-prediction map. The formulation presented is based upon the solution of a specific class of stochastic inverse problems which seeks a probability density that is consistent with the model and the data in the sense that the push-forward of this density through the parameter-to-observable map matches a target density on the observable data. While this stochastic inverse problem forms the mathematical basis for our approach, we develop a one-step algorithm, focused on push-forward probability measures, that leverages inference-for-prediction to bypass constructing the solution to the stochastic inverse problem. A number of numerical results are presented to demonstrate the utility of this optimal experimental design for prediction and facilitate comparison of our approach with traditional OED.
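To make the goal-oriented flavor of this OED criterion concrete, the sketch below compares two candidate observation designs in a linear-Gaussian setting, where the expected information gained about a scalar prediction q = c^T theta has a closed form. This illustrates the general idea of designing for a prediction quantity of interest only; it is not the push-forward/consistent-Bayesian algorithm developed in the paper.

```python
# Goal-oriented design comparison in a linear-Gaussian model (illustration).
import numpy as np

def prediction_info_gain(A, c, sigma=0.1):
    """0.5 * log(prior var / posterior var) of q = c^T theta, prior N(0, I)."""
    d = A.shape[1]
    post_cov = np.linalg.inv(np.eye(d) + A.T @ A / sigma**2)
    return 0.5 * np.log(c @ c / (c @ post_cov @ c))

c = np.array([1.0, 0.0])                      # prediction depends only on theta_1
design_1 = np.array([[1.0, 0.0]])             # sensor aligned with the QoI
design_2 = np.array([[0.0, 1.0]])             # sensor blind to the QoI
for name, A in [("design 1", design_1), ("design 2", design_2)]:
    print(name, "expected info gain in prediction:", prediction_info_gain(A, c))
```

Design 1 yields a large gain while design 2 yields none, because the second sensor is uninformative about the prediction even though it reduces parameter uncertainty.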
Abstract not provided.
The Arctic is warming, and feedbacks in the coupled Earth system may be driving the Arctic toward tipping events that could have critical downstream impacts for the rest of the globe. In this project we have focused on analyzing sea ice variability and loss in the coupled Earth system. Summer sea ice loss is happening rapidly, and although the loss may be smooth and reversible, it has significant consequences for other Arctic systems as well as geopolitical and economic implications. Accurate seasonal predictions of sea ice minimum extent and long-term estimates of the timing of a seasonally ice-free Arctic depend on a better understanding of the factors influencing sea ice dynamics and variation in this strongly coupled system. Under this project we have investigated the most influential factors in accurate predictions of September Arctic sea ice extent using machine learning models trained separately on observational data and on simulation data from five E3SM historical ensembles. Monthly averaged data from June, July, and August for a selection of ice, ocean, and atmosphere variables were used to train a random forest regression model. Gini importance measures were computed for each input feature with the testing data. We found that sea ice volume is most important earlier in the season (June) and that sea ice extent becomes a more important predictor closer to September. Results from this study provide insight into how feature importance changes with forecast length and illustrate differences between observational data and simulated Earth system data. We have additionally performed a global sensitivity analysis (GSA) using a fully coupled, ultra-low-resolution configuration of E3SM. To our knowledge, this is the first global sensitivity analysis involving the fully coupled E3SM Earth system model. We have found that parameter variations have a significant impact on the Arctic climate state and that atmospheric parameters related to cloud parameterizations are the most significant. We also find significant interactions between parameters from different components of E3SM. The results of this study provide invaluable insight into the relative importance of various parameters from the sea ice, atmosphere, and ocean components of E3SM (including cross-component parameter interactions) on various Arctic-focused quantities of interest (QOIs).
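The random forest workflow described here is straightforward to reproduce in outline. The sketch below uses scikit-learn with synthetic stand-in data; the feature names are hypothetical placeholders for the monthly averaged predictors, not the study's actual variable list.

```python
# Minimal sketch of the workflow: fit a random forest regressor on monthly
# predictors and inspect impurity-based feature importances (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = ["june_ice_volume", "july_ice_extent", "aug_sst", "aug_ice_extent"]
X = rng.standard_normal((300, len(features)))
y = 2.0 * X[:, 0] + 1.0 * X[:, 3] + 0.1 * rng.standard_normal(300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out data:", rf.score(X_te, y_te))
for name, imp in zip(features, rf.feature_importances_):
    print(f"{name:18s} importance = {imp:.3f}")
```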
Abstract not provided.
We present a numerical framework for recovering unknown non-autonomous dynamical systems with time-dependent inputs. To circumvent the difficulty presented by the non-autonomous nature of the system, our method transforms the solution state into piecewise integration of the system over a discrete set of time instances. The time-dependent inputs are then locally parameterized using a suitable model, for example polynomial regression, over the pieces determined by the time instances. This transforms the original system into a piecewise parametric system that is locally time invariant. We then design a deep neural network structure to learn the local models. Once the network model is constructed, it can be iterated over time to conduct global system prediction. We provide theoretical analysis of our algorithm and present a number of numerical examples to demonstrate the effectiveness of the method.
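The local parameterization step is the key transformation described in this abstract. The sketch below illustrates it in the simplest way, fitting a low-order polynomial to samples of a time-dependent input over one window; the resulting coefficients, together with the state at the start of the window, would be the inputs to a learned local model. This is an illustration only, not the paper's implementation.

```python
# Local parameterization of a time-dependent input on one window.
import numpy as np

def local_input_features(t_samples, u_samples, degree=2):
    """Polynomial coefficients of u(t) on one window, in local time."""
    tau = t_samples - t_samples[0]            # shift to local coordinates
    return np.polyfit(tau, u_samples, degree)

# Example: a window of width 0.1 with 5 samples of a forcing term u(t) = sin(3t).
t_window = np.linspace(0.3, 0.4, 5)
coeffs = local_input_features(t_window, np.sin(3 * t_window))
print("local polynomial coefficients (highest degree first):", coeffs)
```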
Journal of Computational Physics
We describe and analyze a variance reduction approach for Monte Carlo (MC) sampling that accelerates the estimation of statistics of computationally expensive simulation models using an ensemble of models with lower cost. These lower cost models — which are typically lower fidelity with unknown statistics — are used to reduce the variance in statistical estimators relative to a MC estimator with equivalent cost. We derive the conditions under which our proposed approximate control variate framework recovers existing multifidelity variance reduction schemes as special cases. We demonstrate that existing recursive/nested strategies are suboptimal because they use the additional low-fidelity models only to efficiently estimate the unknown mean of the first low-fidelity model. As a result, they cannot achieve variance reduction beyond that of a control variate estimator that uses a single low-fidelity model with known mean. However, there often exists about an order-of-magnitude gap between the maximum achievable variance reduction using all low-fidelity models and that achieved by a single low-fidelity model with known mean. We show that our proposed approach can exploit this gap to achieve greater variance reduction by using non-recursive sampling schemes. The proposed strategy reduces the total cost of accurately estimating statistics, especially in cases where only low-fidelity simulation models are accessible for additional evaluations. Several analytic examples and an example with a hyperbolic PDE describing elastic wave propagation in heterogeneous media are used to illustrate the main features of the methodology.
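The simplest two-model instance of the approximate control variate idea is easy to write down: the low-fidelity mean is unknown and is estimated from a larger set of cheap evaluations. The sketch below shows only this basic mechanism on synthetic data; the paper's framework covers general model ensembles and sampling schemes.

```python
# Two-model approximate control variate with an estimated low-fidelity mean.
import numpy as np

rng = np.random.default_rng(2)
n_hi, n_lo = 100, 10000                       # low-fidelity model is much cheaper
z = rng.standard_normal(n_lo)
f_lo = 1.0 + z                                # cheap model, mean unknown a priori
f_hi = np.exp(z[:n_hi])                       # expensive model on shared inputs

beta = np.cov(f_hi, f_lo[:n_hi])[0, 1] / np.var(f_lo, ddof=1)
mu_lo_hat = f_lo.mean()                       # estimated from all cheap samples
est_acv = f_hi.mean() - beta * (f_lo[:n_hi].mean() - mu_lo_hat)
print(f"approximate control variate estimate: {est_acv:.4f} "
      f"(plain MC: {f_hi.mean():.4f}, exact: {np.exp(0.5):.4f})")
```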
The Dakota toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the Dakota toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user's manual for the Dakota software and provides capability overviews and procedures for software execution, as well as a variety of example studies.
The Dakota toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the Dakota toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a theoretical manual for selected algorithms implemented within the Dakota software. It is not intended as a comprehensive theoretical treatment, since a number of existing texts cover general optimization theory, statistical analysis, and other introductory topics. Rather, this manual is intended to summarize a set of Dakota-related research publications in the areas of surrogate-based optimization, uncertainty quantification, and optimization under uncertainty that provide the foundation for many of Dakota's iterative analysis capabilities.
This article is concerned with the approximation of high-dimensional functions by kernel-based methods. Motivated by uncertainty quantification, which often necessitates the construction of approximations that are accurate with respect to a probability density function of random variables, we aim at minimizing the approximation error with respect to a weighted $L^p$-norm. We present a greedy procedure for designing computer experiments based upon a weighted modification of the pivoted Cholesky factorization. The method successively generates nested samples with the goal of minimizing error in regions of high probability. Numerical experiments validate that this new importance sampling strategy is superior to other sampling approaches, especially when used with non-product probability density functions. We also show how to use the proposed algorithm to efficiently generate surrogates for inferring unknown model parameters from data.
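The greedy selection loop behind a weighted pivoted Cholesky factorization is compact enough to sketch: at each step, the candidate maximizing the product of a weight and the residual kernel diagonal is added, and the factor is updated. The precise weighting used in the paper may differ; this only conveys the mechanism of greedy, density-aware sample selection.

```python
# Weighted pivoted Cholesky point selection (illustrative implementation).
import numpy as np

def weighted_pivoted_cholesky(K, w, m):
    """Select m pivots from a kernel matrix K using weights w."""
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()      # residual diagonal (Schur complement)
    L = np.zeros((n, m))
    pivots = []
    for j in range(m):
        i = int(np.argmax(w * d))            # weighted pivot rule
        pivots.append(i)
        L[:, j] = (K[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(d[i])
        d = np.maximum(d - L[:, j] ** 2, 0.0)
    return pivots, L

# Example: Gaussian-kernel candidates weighted by a standard normal density.
x = np.linspace(-3, 3, 200)
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.5) ** 2)
w = np.exp(-0.5 * x ** 2)
pivots, _ = weighted_pivoted_cholesky(K, w, 8)
print("selected sample locations:", np.sort(x[pivots]))
```

With this weighting, the selected points cluster where the density is high, which is the behavior the weighted modification is designed to produce.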
Abstract not provided.
In this paper, we present an adaptive algorithm to construct response surface approximations of high-fidelity models using a hierarchy of lower-fidelity models. Our algorithm is based on multi-index stochastic collocation and automatically balances physical discretization error and response surface error to construct an approximation of model outputs. This surrogate can be used for uncertainty quantification (UQ) and sensitivity analysis (SA) at a fraction of the cost of a purely high-fidelity approach. We demonstrate the effectiveness of our algorithm on a canonical test problem from the UQ literature and a complex multi-physics model that simulates the performance of an integrated nozzle for an unmanned aerospace vehicle. We find that when the input-output response is sufficiently smooth, our algorithm produces approximations that can be up to orders of magnitude more accurate than single-fidelity approximations for a fixed computational budget.
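The basic decomposition that multifidelity surrogates exploit can be shown in one dimension: build a surrogate of the cheap model from many samples and add a correction surrogate fit to a few expensive evaluations. Multi-index stochastic collocation is considerably more general (it adapts over both physical discretization and stochastic resolution); the sketch below, with hypothetical model functions, only conveys the basic idea.

```python
# Two-fidelity additive-correction surrogate in 1D (illustration only).
import numpy as np

def f_hi(x):                 # "expensive" model (evaluated sparingly)
    return np.sin(2 * np.pi * x) + 0.3 * x ** 2

def f_lo(x):                 # "cheap" model with a systematic discrepancy
    return np.sin(2 * np.pi * x)

x_lo = np.linspace(0, 1, 40)                  # many cheap evaluations
x_hi = np.linspace(0, 1, 5)                   # few expensive evaluations

lo_surrogate = np.polynomial.chebyshev.Chebyshev.fit(x_lo, f_lo(x_lo), deg=12)
correction = np.polynomial.chebyshev.Chebyshev.fit(
    x_hi, f_hi(x_hi) - lo_surrogate(x_hi), deg=3)

x_test = np.linspace(0, 1, 200)
error = np.max(np.abs(lo_surrogate(x_test) + correction(x_test) - f_hi(x_test)))
print("max error of two-fidelity surrogate:", error)
```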