Publications

Results 26–50 of 59

Variational Kalman Filtering with H∞-Based Correction for Robust Bayesian Learning in High Dimensions

Proceedings of the IEEE Conference on Decision and Control

Das, Niladri; Duersch, Jed A.; Catanach, Thomas A.

In this paper, we address the problem of convergence of the sequential variational inference filter (VIF) through the application of a robust variational objective and an H∞-norm-based correction for a linear Gaussian system. As the dimension of the state or parameter space grows, performing the full Kalman update with the dense covariance matrix of a large-scale system demands storage and computation that quickly become impractical. The VIF approach, based on mean-field Gaussian variational inference, reduces this burden by approximating the covariance variationally, usually with a diagonal covariance. The challenge is to retain convergence and correct for biases introduced by the sequential VIF steps. We seek a framework that improves feasibility while maintaining reasonable proximity to the optimal Kalman filter as data are assimilated. To accomplish this goal, an H∞-norm-based optimization perturbs the VIF covariance matrix to improve robustness. This yields a novel VIF-H∞ recursion that employs consecutive variational inference and H∞-based optimization steps. We describe the development of this method and present a numerical example to illustrate the effectiveness of the proposed filter.
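
A minimal sketch of the kind of filtering step this abstract describes, assuming a linear Gaussian model x_k = A x_{k-1} + w_k, y_k = H x_k + v_k: a standard Kalman predict/update followed by the mean-field (diagonal) covariance approximation. The H∞-based robustness correction is only indicated by a hypothetical hook, and for clarity the sketch forms the dense matrices that the VIF approximation is meant to avoid at scale; this is not the paper's implementation.

```python
# Sketch of one approximate filtering step: Kalman predict/update, then the mean-field
# (diagonal) covariance approximation described in the abstract. The hinf_correction
# argument is a hypothetical placeholder for the paper's H-infinity-based perturbation.
import numpy as np

def vif_step(m, P_diag, y, A, H, Q, R, hinf_correction=None):
    P = np.diag(P_diag)                                  # stored diagonal approximation

    # Predict step for the linear Gaussian model x_k = A x_{k-1} + w,  y_k = H x_k + v.
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q

    # Standard Kalman update (dense, for clarity; VIF avoids forming these at scale).
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    m_post = m_pred + K @ (y - H @ m_pred)
    P_post = (np.eye(len(m)) - K @ H) @ P_pred

    # Mean-field variational approximation: retain only the diagonal of the covariance.
    P_diag_post = np.diag(P_post).copy()

    # Hypothetical hook for the H-infinity-based robustness correction of the covariance.
    if hinf_correction is not None:
        P_diag_post = hinf_correction(P_diag_post)

    return m_post, P_diag_post
```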

More Details

CIS Project 22359, Final Technical Report. Discretized Posterior Approximation in High Dimensions

Duersch, Jed A.; Catanach, Thomas A.

Our primary aim in this work is to understand how to efficiently obtain reliable uncertainty quantification in automatic learning algorithms with limited training datasets. Standard approaches rely on cross-validation to tune hyperparameters. Unfortunately, when our datasets are too small, holdout datasets become unreliable, albeit unbiased, measures of prediction quality due to the lack of adequate sample size. We should not place confidence in holdout estimators under conditions wherein the sample variance is both large and unknown. More poignantly, our training experiments on limited data (Duersch and Catanach, 2021) show that even if we could improve estimator quality under these conditions, the typical training trajectory may never even encounter generalizable models.
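
A small numerical illustration (not from the report) of the point about holdout estimators: with a tiny holdout set the accuracy estimate remains unbiased, but its variance is large enough to make it an unreliable measure of prediction quality.

```python
# Simulate holdout accuracy estimates for a hypothetical model with true accuracy 0.75.
# Each estimate is the fraction of correct predictions on an i.i.d. holdout set of size n.
import numpy as np

rng = np.random.default_rng(0)
true_accuracy = 0.75
for n_holdout in (10, 100, 10_000):
    estimates = rng.binomial(n_holdout, true_accuracy, size=5_000) / n_holdout
    print(f"n={n_holdout:6d}  mean={estimates.mean():.3f}  std={estimates.std():.3f}")
# The mean stays near 0.75 (unbiased) while the spread shrinks roughly as 1/sqrt(n);
# with n=10 the standard deviation is about 0.14, far too large to rank models reliably.
```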

More Details

Analysis and Optimization of Seismo-Acoustic Monitoring Networks with Bayesian Optimal Experimental Design

Catanach, Thomas A.; Monogue, Kevin

The Bayesian optimal experimental design (OED) problem seeks to identify data, sensor configurations, or experiments which can optimally reduce uncertainty. The goal of OED is to find an experiment that maximizes the expected information gain (EIG) about quantities of interest given prior knowledge about expected data. Therefore, within the context of seismic monitoring, we can use Bayesian OED to configure sensor networks by choosing sensor locations, types, and fidelity in order to improve our ability to identify and locate seismic sources. In this work, we develop the framework necessary to use Bayesian OED to optimize the ability to locate seismic events from arrival time data of detected seismic phases. In order to utilize Bayesian OED, we must develop four elements: (1) a likelihood function that describes the uncertainty of detection and travel times; (2) a Bayesian solver that takes a prior and likelihood to identify the posterior; (3) an algorithm to compute EIG; and (4) an optimizer that finds a sensor network which maximizes EIG. Once we have developed this framework, we can explore many questions relevant to monitoring, such as: what multiphenomenology data can be used to optimally reduce uncertainty and how, how to trade off sensor fidelity and earth model uncertainty, and how sensor types, number, and locations influence uncertainty.
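
As an illustration of element (3), here is a sketch of a standard nested Monte Carlo estimator of the expected information gain, not necessarily the algorithm used in this work; `sample_prior`, `log_likelihood`, and `simulate_data` are hypothetical stand-ins for the seismic source prior, the arrival-time likelihood, and the forward model.

```python
# Nested Monte Carlo estimate of EIG(d) = E_{theta, y|d}[ log p(y|theta,d) - log p(y|d) ],
# where p(y|d) is approximated by an inner Monte Carlo average over fresh prior samples.
# All function names below are hypothetical placeholders, not the authors' code.
import numpy as np
from scipy.special import logsumexp

def expected_information_gain(design, sample_prior, log_likelihood, simulate_data,
                              n_outer=500, n_inner=500, rng=None):
    rng = rng or np.random.default_rng()
    outer_thetas = sample_prior(n_outer, rng)           # theta_i ~ p(theta)
    inner_thetas = sample_prior(n_inner, rng)           # samples for the evidence term
    eig = 0.0
    for theta in outer_thetas:
        y = simulate_data(theta, design, rng)           # synthetic arrival-time data
        log_lik = log_likelihood(y, theta, design)
        log_evidence = logsumexp(
            [log_likelihood(y, t, design) for t in inner_thetas]) - np.log(n_inner)
        eig += (log_lik - log_evidence) / n_outer
    return eig
```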

More Details

Parsimonious Inference: Information-Theoretic Foundations for a Complete Theory of Machine Learning (CIS-LDRD Project 218313 Final Technical Report)

Duersch, Jed A.; Catanach, Thomas A.; Gu, Ming

This work examines how we may cast machine learning within a complete Bayesian framework to quantify and suppress explanatory complexity from first principles. Our investigation into both the philosophy and mathematics of rational belief leads us to emphasize the critical role of Bayesian inference in learning well-justified predictions within a rigorous and complete extended logic. The Bayesian framework allows us to coherently account for evidence in the learned plausibility of potential explanations. As an extended logic, the Bayesian paradigm regards probability as a notion of degrees of truth. In order to satisfy critical properties of probability as a coherent measure, as well as maintain consistency with binary propositional logic, we arrive at Bayes' Theorem as the only justifiable mechanism to update our beliefs to account for empirical evidence. Yet, in the machine learning paradigm, where explanations are unconstrained algorithmic abstractions, we arrive at a critical challenge: Bayesian inference requires prior belief. Conventional approaches fail to yield a consistent framework in which we could compare prior plausibility among the infinities of potential choices in learning architectures. The difficulty of articulating well-justified prior belief over abstract models is the provenance of memorization in traditional machine learning training practices. This becomes exceptionally problematic in the context of limited datasets, when we wish to learn justifiable predictions from only a small amount of data.
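
For reference, the standard textbook statement of Bayes' Theorem that the report builds on (not reproduced from the report), with H a candidate explanation and E the observed evidence:

```latex
% Bayes' Theorem: the posterior plausibility of explanation H given evidence E.
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E) = \sum_{H'} P(E \mid H')\, P(H').
```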

More Details

Modeling Failure of Electrical Transformers due to Effects of a HEMP Event

Hansen, Clifford H.; Catanach, Thomas A.; Glover, Austin M.; Huerta, Jose G.; Stuart, Zach; Guttromson, Ross G.

Understanding the effect of a high-altitude electromagnetic pulse (HEMP) on the equipment in the United States electrical power grid is important to national security. A present challenge to this understanding is evaluating the vulnerability of transformers to a HEMP. Evaluating vulnerability by direct testing is cost-prohibitive, due to the wide variation in transformers, their high cost, and the large number of tests required to establish vulnerability with confidence. Alternatively, material and component testing can be performed to quantify a model for transformer failure, and the model can be used to assess vulnerability of a wide variety of transformers. This project develops a model of the probability of equipment failure due to effects of a HEMP. Potential failure modes are cataloged, and a model structure is presented which can be quantified by the results of small-scale coupon tests.
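
As a generic illustration of how a failure probability model can be quantified from small-scale pass/fail coupon tests, a lognormal fragility curve fit by maximum likelihood is sketched below; this assumed form and the data are purely hypothetical and are not the model structure developed in the report.

```python
# Hypothetical fragility model: P(failure | stress s) = Phi((ln s - mu) / beta),
# with mu and beta fit by maximum likelihood to pass/fail coupon-test outcomes.
# This is a generic illustration, not the report's model.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_likelihood(params, stresses, failed):
    mu, log_beta = params
    p = norm.cdf((np.log(stresses) - mu) / np.exp(log_beta))
    p = np.clip(p, 1e-12, 1 - 1e-12)                     # guard against log(0)
    return -np.sum(failed * np.log(p) + (1 - failed) * np.log(1 - p))

# Hypothetical coupon data: applied stress levels and observed failure indicators.
stresses = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
failed   = np.array([0,   0,   1,   0,   1,   1])
fit = minimize(neg_log_likelihood, x0=[np.log(2.0), 0.0],
               args=(stresses, failed), method="Nelder-Mead")
mu_hat, beta_hat = fit.x[0], np.exp(fit.x[1])
print(f"median failure stress ~ {np.exp(mu_hat):.2f}, dispersion beta ~ {beta_hat:.2f}")
```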

More Details

Characterization of Partially Observed Epidemics - Application to COVID-19

Safta, Cosmin S.; Ray, Jaideep R.; Laros, James H.; Catanach, Thomas A.; Chowdhary, Kamaljit S.; Debusschere, Bert D.; Galvan, Edgar; Geraci, Gianluca G.; Khalil, Mohammad K.; Portone, Teresa P.

This report documents a statistical method for the "real-time" characterization of partially observed epidemics. Observations consist of daily counts of symptomatic patients diagnosed with the disease. Characterization, in this context, refers to estimation of epidemiological parameters that can be used to provide short-term forecasts of the ongoing epidemic, as well as to provide gross information about the time-dependent infection rate. The characterization problem is formulated as a Bayesian inverse problem, and is predicated on a model for the distribution of the incubation period. The model parameters are estimated as distributions using a Markov chain Monte Carlo (MCMC) method, thus quantifying the uncertainty in the estimates. The method is applied to the COVID-19 pandemic of 2020, using data at the country, provincial (e.g., state), and regional (e.g., county) levels. The epidemiological model includes a stochastic component due to uncertainties in the incubation period. This model-form uncertainty is accommodated by a pseudo-marginal Metropolis-Hastings MCMC sampler, which produces posterior distributions that reflect this uncertainty. We approximate the discrepancy between the data and the epidemiological model using Gaussian and negative binomial error models; the latter is motivated by the over-dispersed count data. For small daily counts, we find the performance of the calibrated models to be similar for the two error models. For large daily counts, the negative binomial approximation is numerically unstable, unlike the Gaussian error model. Application of the model at the country level (for the United States, Germany, Italy, etc.) generally provided accurate forecasts, as the data consisted of large counts which suppressed the day-to-day variations in the observations. Further, the bulk of the data is sourced over the duration before the relaxation of the curbs on population mixing, and is not confounded by any discernible country-wide second wave of infections. At the state level, in states where reporting was poor or which evinced few infections (e.g., New Mexico), the variance in the data posed some, though not insurmountable, difficulties, and forecasts were able to capture the data with large uncertainty bounds. The method was found to be sufficiently sensitive to discern the flattening of the infection and epidemic curves due to shelter-in-place orders after around the 90% quantile of the incubation distribution (about 10 days for COVID-19). The proposed model was also used at a regional level to compare forecasts for the central and north-west regions of New Mexico. Modeling the data for these regions illustrated the different disease-spread dynamics captured by the model: while in the central region the daily counts peaked in late April, in the north-west region the ramp-up continued for approximately three more weeks.
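
A minimal sketch of the two error models compared above, evaluated for observed daily counts against model-predicted means; the negative binomial parameterization (mean mu, variance mu + alpha*mu^2) is an assumption for illustration, and the numbers are hypothetical, not data from the report.

```python
# Gaussian vs. over-dispersed negative binomial discrepancy between observed daily
# counts y and epidemiological-model predictions mu (illustrative values only).
import numpy as np
from scipy.stats import nbinom, norm

def gaussian_log_likelihood(y, mu, sigma):
    return np.sum(norm.logpdf(y, loc=mu, scale=sigma))

def negative_binomial_log_likelihood(y, mu, alpha):
    """Negative binomial with mean mu and over-dispersion alpha (variance mu + alpha*mu^2)."""
    r = 1.0 / alpha                      # number-of-successes parameter
    p = r / (r + mu)                     # success probability in scipy's convention
    return np.sum(nbinom.logpmf(y, r, p))

# Hypothetical daily symptomatic counts and model predictions.
y_obs  = np.array([12, 18, 25, 40, 61, 90])
mu_mod = np.array([10, 17, 28, 44, 66, 95])
print(gaussian_log_likelihood(y_obs, mu_mod, sigma=8.0))
print(negative_binomial_log_likelihood(y_obs, mu_mod, alpha=0.1))
```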

More Details

Generalizing information to the evolution of rational belief

Entropy

Duersch, Jed A.; Catanach, Thomas A.

Information theory provides a mathematical foundation to measure uncertainty in belief. Belief is represented by a probability distribution that captures our understanding of an outcome's plausibility. Information measures based on Shannon's concept of entropy include realization information, Kullback-Leibler divergence, Lindley's information in an experiment, cross entropy, and mutual information. We derive a general theory of information from first principles that accounts for evolving belief and recovers all of these measures. Rather than simply gauging uncertainty, information is understood in this theory to measure change in belief. We may then regard entropy as the information we expect to gain upon realization of a discrete latent random variable. This theory of information is compatible with the Bayesian paradigm in which rational belief is updated as evidence becomes available. Furthermore, this theory admits novel measures of information with well-defined properties, which we explored in both analysis and experiment. This view of information illuminates the study of machine learning by allowing us to quantify information captured by a predictive model and distinguish it from residual information contained in training data. We gain related insights regarding feature selection, anomaly detection, and novel Bayesian approaches.
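
A small numerical illustration (not from the paper) of the view described above: entropy as the information we expect to gain upon realization of a discrete outcome, and Kullback-Leibler divergence as the information gained when a prior belief is updated to a posterior.

```python
# Entropy and KL divergence for discrete beliefs, computed with natural logarithms (nats).
import numpy as np

def entropy(p):
    """Expected realization information -E_p[log p(x)] for a discrete distribution."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p, where=p > 0, out=np.zeros_like(p)))

def kl_divergence(p, q):
    """Expected log-ratio E_p[log p(x)/q(x)]: information gained updating belief q -> p."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * (np.log(p[mask]) - np.log(q[mask])))

prior     = np.array([0.25, 0.25, 0.25, 0.25])   # uninformed belief over 4 outcomes
posterior = np.array([0.70, 0.10, 0.10, 0.10])   # belief after observing evidence
print(entropy(prior), entropy(posterior))         # ~1.386 and ~0.940 nats
print(kl_divergence(posterior, prior))            # information gained by the update
```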

More Details

Bayesian inference of stochastic reaction networks using multifidelity sequential tempered Markov chain Monte Carlo

International Journal for Uncertainty Quantification

Catanach, Thomas A.; Vo, Huy D.; Munsky, Brian

Stochastic reaction network models are often used to explain and predict the dynamics of gene regulation in single cells. These models usually involve several parameters, such as the kinetic rates of chemical reactions, that are not directly measurable and must be inferred from experimental data. Bayesian inference provides a rigorous probabilistic framework for identifying these parameters by finding a posterior parameter distribution that captures their uncertainty. Traditional computational methods for solving inference problems, such as Markov chain Monte Carlo methods based on the classical Metropolis-Hastings algorithm, involve numerous serial evaluations of the likelihood function, which in turn requires expensive forward solutions of the chemical master equation (CME). We propose an alternate approach based on a multifidelity extension of the sequential tempered Markov chain Monte Carlo (ST-MCMC) sampler. This algorithm is built upon sequential Monte Carlo and solves the Bayesian inference problem by decomposing it into a sequence of efficiently solved subproblems that gradually increase both model fidelity and the influence of the observed data. We reformulate the finite state projection (FSP) algorithm, a well-known method for solving the CME, to produce a hierarchy of surrogate master equations to be used in this multifidelity scheme. To determine the appropriate fidelity, we introduce a novel information-theoretic criterion that seeks to extract the most information about the ultimate Bayesian posterior from each model in the hierarchy without inducing significant bias. This novel sampling scheme is tested with high-performance computing resources using biologically relevant problems.
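
A sketch of one generic ingredient of sequential tempered MCMC, the adaptive choice of the next inverse temperature by controlling the effective sample size of the importance weights; this illustrates the tempering ladder only, not the paper's multifidelity FSP hierarchy or its information-theoretic fidelity criterion.

```python
# Choose the next inverse temperature beta so that the effective sample size (ESS) of
# the importance weights w_i = exp((beta_new - beta_old) * loglike_i) stays near a
# target fraction of the particle population. log_likes is a NumPy array of the
# current particles' log-likelihood values.
import numpy as np

def next_beta(log_likes, beta_old, target_ess_frac=0.5):
    n = len(log_likes)

    def ess(beta_new):
        logw = (beta_new - beta_old) * log_likes
        logw -= logw.max()                          # stabilize the exponentials
        w = np.exp(logw)
        return np.sum(w) ** 2 / np.sum(w ** 2)      # Kish effective sample size

    if ess(1.0) >= target_ess_frac * n:
        return 1.0                                  # can jump straight to the posterior
    lo, hi = beta_old, 1.0
    for _ in range(60):                             # bisection on the ESS condition
        mid = 0.5 * (lo + hi)
        if ess(mid) >= target_ess_frac * n:
            lo = mid
        else:
            hi = mid
    return lo
```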

More Details