Publications

Results 51–75 of 243

Search results


Daily forecasting of regional epidemics of coronavirus disease with Bayesian uncertainty quantification, United States

Emerging Infectious Diseases

Lin, Yen T.; Neumann, Jacob; Miller, Ely F.; Posner, Richard G.; Mallela, Abhishek; Safta, Cosmin; Ray, Jaideep; Thakur, Gautam; Chinthavali, Supriya; Hlavacek, William S.

To increase situational awareness and support evidence-based policymaking, we formulated a mathematical model for coronavirus disease transmission within a regional population. This compartmental model accounts for quarantine, self-isolation, social distancing, a nonexponentially distributed incubation period, asymptomatic persons, and mild and severe forms of symptomatic disease. We used Bayesian inference to calibrate region-specific models for consistency with daily reports of confirmed cases in the 15 most populous metropolitan statistical areas in the United States. We also quantified uncertainty in parameter estimates and forecasts. This online learning approach enables early identification of new trends despite considerable variability in case reporting.


Robustness and Validation of Model and Digital Twins Deployment

Volkova, Svitana; Stracuzzi, David J.; Shafer, Jenifer; Ray, Jaideep; Pullum, Laura

For digital twins (DTs) to become a central fixture in mission critical systems, a better understanding is required of potential modes of failure, quantification of uncertainty, and the ability to explain a model’s behavior. These aspects are particularly important as the performance of a digital twin will evolve during model development and deployment for real-world operations.


Feature selection, clustering, and prototype placement for turbulence data sets

AIAA Scitech 2021 Forum

Barone, Matthew F.; Ray, Jaideep; Domino, Stefan P.

This paper explores unsupervised learning approaches for analysis and categorization of turbulent flow data. Single point statistics from several high-fidelity turbulent flow simulation data sets are classified using a Gaussian mixture model clustering algorithm. Candidate features are proposed, which include barycentric coordinates of the Reynolds stress anisotropy tensor, as well as scalar and angular invariants of the Reynolds stress and mean strain rate tensors. A feature selection algorithm is applied to the data in a sequential fashion, flow by flow, to identify a good feature set and an optimal number of clusters for each data set. The algorithm is first applied to Direct Numerical Simulation data for plane channel flow, and produces clusters that are consistent with turbulent flow theory and empirical results that divide the channel flow into a number of regions (viscous sub-layer, log layer, etc). Clusters are then identified for flow over a wavy-walled channel, flow over a bump in a channel, and flow past a square cylinder. Some clusters are closely identified with the anisotropy state of the turbulence, as indicated by the location within the barycentric map of the Reynolds stress tensor. Other clusters can be connected to physical phenomena, such as boundary layer separation and free shear layers. Exemplar points from the clusters, or prototypes, are then identified using a prototype selection method. These exemplars summarize the dataset by a factor of 10 to 1000. The clustering and prototype selection algorithms provide a foundation for physics-based, semi-automated classification of turbulent flow states and extraction of a subset of data points that can serve as the basis for the development of explainable machine-learned turbulence models.


Characterization of partially observed epidemics through Bayesian inference: application to COVID-19

Computational Mechanics

Safta, Cosmin; Ray, Jaideep; Sargsyan, Khachik

We demonstrate a Bayesian method for the “real-time” characterization and forecasting of a partially observed COVID-19 epidemic. Characterization is the estimation of infection spread parameters using daily counts of symptomatic patients. The method is designed to help guide medical resource allocation in the early epoch of the outbreak. The estimation problem is posed as one of Bayesian inference and solved using a Markov chain Monte Carlo technique. The data used in this study were sourced before the arrival of the second wave of infection in July 2020. The proposed modeling approach, when applied at the country level, generally provides accurate forecasts at the regional, state and country levels. The epidemiological model detected the flattening of the curve in California after public health measures were instituted. The method also detected different disease dynamics when applied to specific regions of New Mexico.
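As a rough illustration of posing parameter estimation as Bayesian inference and solving it with Markov chain Monte Carlo, the sketch below runs a random-walk Metropolis sampler on a toy problem. It is not the model from the paper: the exponential-growth curve, the flat prior on (0, 0.5), the Gaussian error model, the synthetic counts, and all function names and settings are assumptions made purely for illustration.

```python
import math
import random


def log_posterior(rate, counts, initial=10.0, sigma=5.0):
    """Flat prior on (0, 0.5) plus a Gaussian error model (both assumed here)."""
    if not 0.0 < rate < 0.5:
        return -math.inf
    sq_err = sum(
        (y - initial * math.exp(rate * day)) ** 2 for day, y in enumerate(counts)
    )
    return -0.5 * sq_err / sigma**2


def metropolis(counts, n_steps=5000, step=0.01, start=0.2, seed=0):
    """Random-walk Metropolis sampler for the daily growth rate."""
    rng = random.Random(seed)
    rate, logp = start, log_posterior(start, counts)
    samples = []
    for _ in range(n_steps):
        proposal = rate + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            rate, logp = proposal, logp_new
        samples.append(rate)
    return samples[n_steps // 5:]  # discard the first 20% as burn-in


# Synthetic, noise-free daily counts generated with a growth rate of 0.1 per day.
counts = [10.0 * math.exp(0.1 * day) for day in range(20)]
samples = metropolis(counts)
posterior_mean = sum(samples) / len(samples)
```

The retained samples approximate the posterior distribution of the growth rate, so their spread gives an uncertainty estimate alongside the point estimate `posterior_mean`.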


Predictive Skill of Deep Learning Models Trained on Limited Sequence Data

Safta, Cosmin; Lee, Kookjin L.; Ray, Jaideep

In this report we investigate the utility of one-dimensional convolutional neural network (CNN) models in epidemiological forecasting. Deep learning models, especially variants of recurrent neural networks (RNNs), have been studied for influenza forecasting, and have achieved higher forecasting skill compared to conventional models such as ARIMA models. In this study, we adapt two neural networks that employ one-dimensional temporal convolutional layers as a primary building block (temporal convolutional networks and the simple neural attentive meta-learner) for epidemiological forecasting and test them with influenza data from the US collected over 2010-2019. We find that epidemiological forecasting with CNNs is feasible, and their forecasting skill is comparable to, and at times superior to, that of RNNs. Thus CNNs and RNNs bring the power of nonlinear transformations to purely data-driven epidemiological models, a capability that heretofore has been limited to more elaborate mechanistic/compartmental disease models.


A Multi-Instance Learning Framework for Seismic Detectors

Ray, Jaideep; Wang, Fulton; Young, Christopher J.

In this report, we construct and test a framework for fusing the predictions of an ensemble of seismic wave detectors. The framework is drawn from multi-instance learning and is meant to improve the predictive skill of the ensemble beyond that of the individual detectors. We show how the framework allows the use of multiple features derived from the seismogram to detect seismic wave arrivals, as well as how it allows only the most informative features to be retained in the ensemble. The computational cost of the "ensembling" method is linear in the size of the ensemble, allowing a scalable method for monitoring multiple features/transformations of a seismogram. The framework is tested on teleseismic and regional P-wave arrivals at the IMS (International Monitoring System) station in Warramunga, NT, Australia and the PNSU station in the University of Utah's monitoring network.


Characterization of Partially Observed Epidemics - Application to COVID-19

Safta, Cosmin; Ray, Jaideep; Bays, Nathan R.; Catanach, Thomas A.; Chowdhary, Kenny; Debusschere, Bert J.; Galvan, Edgar; Geraci, Gianluca; Khalil, Mohammad; Portone, Teresa

This report documents a statistical method for the "real-time" characterization of partially observed epidemics. Observations consist of daily counts of symptomatic patients diagnosed with the disease. Characterization, in this context, refers to estimation of epidemiological parameters that can be used to provide short-term forecasts of the ongoing epidemic, as well as to provide gross information for the time-dependent infection rate. The characterization problem is formulated as a Bayesian inverse problem, and is predicated on a model for the distribution of the incubation period. The model parameters are estimated as distributions using a Markov Chain Monte Carlo (MCMC) method, thus quantifying the uncertainty in the estimates. The method is applied to the COVID-19 pandemic of 2020, using data at the country, provincial (e.g., state) and regional (e.g., county) levels. The epidemiological model includes a stochastic component due to uncertainties in the incubation period. This model-form uncertainty is accommodated by a pseudo-marginal Metropolis-Hastings MCMC sampler, which produces posterior distributions that reflect this uncertainty. We approximate the discrepancy between the data and the epidemiological model using Gaussian and negative binomial error models; the latter was motivated by the over-dispersed count data. For small daily counts we find the performance of the calibrated models to be similar for the two error models. For large daily counts the negative-binomial approximation is numerically unstable, unlike the Gaussian error model. Application of the model at the country level (for the United States, Germany, Italy, etc.) generally provided accurate forecasts, as the data consisted of large counts which suppressed the day-to-day variations in the observations. Further, the bulk of the data is sourced over the duration before the relaxation of the curbs on population mixing, and is not confounded by any discernible country-wide second wave of infections. At the state level, where reporting was poor or which evinced few infections (e.g., New Mexico), the variance in the data posed some, though not insurmountable, difficulties, and forecasts were able to capture the data with large uncertainty bounds. The method was found to be sufficiently sensitive to discern the flattening of the infection and epidemic curve due to shelter-in-place orders after around the 90% quantile of the incubation distribution (about 10 days for COVID-19). The proposed model was also used at a regional level to compare the forecasts for the central and north-west regions of New Mexico. Modeling the data for these regions illustrated different disease spread dynamics captured by the model. While in the central region the daily counts peaked in late April, in the north-west region the ramp-up continued for approximately three more weeks.
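To make the two error models named in this abstract concrete, the following sketch writes out a Gaussian log-likelihood and a negative binomial log-likelihood for a single daily count. The mean/dispersion parameterization of the negative binomial (variance equal to mean + mean²/dispersion) and the function names are assumptions chosen for illustration; the report may parameterize its error models differently.

```python
import math


def gaussian_loglik(y, mean, sigma):
    """Log-density of a count y under a Gaussian error model."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - 0.5 * ((y - mean) / sigma) ** 2


def negbin_loglik(y, mean, dispersion):
    """Log-pmf of a count y under a negative binomial error model.

    Assumed parameterization: variance = mean + mean**2 / dispersion,
    so smaller dispersion values mean stronger over-dispersion.
    """
    r = dispersion
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mean))
        + y * math.log(mean / (r + mean))
    )


# Sanity check: the negative binomial pmf sums to one and has the stated mean.
probs = [math.exp(negbin_loglik(y, mean=5.0, dispersion=2.0)) for y in range(500)]
total_prob = sum(probs)
pmf_mean = sum(y * p for y, p in enumerate(probs))
```

Summing either function over the observed days gives the log-likelihood that a sampler would evaluate at each proposed parameter value.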


Rigorous Data Fusion for Computationally Expensive Simulations

Winovich, Nickolas; Rushdi, Ahmad; Phipps, Eric T.; Ray, Jaideep; Lin, Guang; Ebeida, Mohamed S.

This manuscript comprises the final report for the 1-year, FY19 LDRD project "Rigorous Data Fusion for Computationally Expensive Simulations," wherein an alternative approach to Bayesian calibration was developed based on a new sampling technique called VoroSpokes. VoroSpokes is a novel quadrature and sampling framework defined with respect to Voronoi tessellations of bounded domains in $\mathbb{R}^d$ developed within this project. In this work, we first establish local quadrature and sampling results on convex polytopes using randomly directed rays, or spokes, to approximate the quantities of interest for a specified target function. A theoretical justification for both procedures is provided along with empirical results demonstrating the unbiased convergence in the resulting estimates/samples. The local quadrature and sampling procedures are then extended to global procedures defined on more general domains by applying the local results to the cells of a Voronoi tessellation covering the domain in consideration. We then demonstrate how the proposed global sampling procedure can be used to define a natural framework for adaptively constructing Voronoi Piecewise Surrogate (VPS) approximations based on local error estimates. Finally, we show that the adaptive VPS procedure can be used to form a surrogate model approximation to a specified, potentially unnormalized, density function, and that the global sampling procedure can be used to efficiently draw independent samples from the surrogate density in parallel. The performance of the resulting VoroSpokes sampling framework is assessed on a collection of Bayesian inference problems and is shown to provide highly accurate posterior predictions which align with the results obtained using traditional methods such as Gibbs sampling and random-walk Markov Chain Monte Carlo (MCMC). Importantly, the proposed framework provides a foundation for performing Bayesian inference tasks which is entirely independent from the theory of Markov chains.


Validation of calibrated k-ɛ model parameters for jet-in-crossflow

AIAA Aviation 2019 Forum

Miller, Nathan; Beresh, Steven J.; Ray, Jaideep

Previous efforts determined a set of calibrated model parameters for Reynolds-Averaged Navier-Stokes (RANS) simulations of a compressible jet in crossflow (JIC) using a k-ɛ turbulence model. These coefficients were derived from Particle Image Velocimetry (PIV) data of a complementary experiment using a limited set of flow conditions. Here, k-ɛ models using conventional (nominal) and calibrated parameters are rigorously validated against PIV data acquired under a much wider variety of JIC cases, including a flight configuration. The results from the simulations using the calibrated model parameters showed considerable improvements over those using the nominal values, even for cases that were not used in defining the calibrated parameters. This improvement is demonstrated using quality metrics defined specifically to test the spatial alignment of the jet core as well as the magnitudes of flow variables on the PIV planes. These results suggest that the calibrated parameters have applicability well outside the specific flow case used in defining them and that with the right model parameters, RANS results can be improved significantly over the nominal.


Estimation of inflow uncertainties in laminar hypersonic double-cone experiments

AIAA Scitech 2019 Forum

Ray, Jaideep; Kieweg, Sarah; Dinzl, Derek J.; Carnes, Brian; Weirs, Gregory; Freno, Brian A.; Howard, Micah; Smith, Thomas M.

We propose a probabilistic framework for assessing the consistency of an experimental dataset, i.e., whether the stated experimental conditions are consistent with the measurements provided. In case the dataset is inconsistent, our framework allows one to hypothesize and test sources of inconsistencies. This is crucial in model validation efforts. The framework relies on statistical inference to estimate experimental settings deemed untrustworthy, from measurements deemed accurate. The quality of the inferred variables is gauged by their ability to reproduce held-out experimental measurements; if the new predictions are closer to measurements than before, the cause of the discrepancy is deemed to have been found. The framework brings together recent advances in the use of Bayesian inference and statistical emulators in fluid dynamics with similarity measures for random variables to construct the hypothesis testing approach. We test the framework on two double-cone experiments executed in the LENS-XX wind tunnel and one in the LENS-I tunnel; all three have encountered difficulties when used in model validation exercises. However, the cause behind the difficulties with the LENS-I experiment is known, and our inferential framework recovers it. We also detect an inconsistency with one of the LENS-XX experiments, and hypothesize three causes for it. We check two of the hypotheses using our framework, and we find evidence that rejects them. We end by proposing that uncertainty quantification methods be used more widely to understand experiments and characterize facilities, and we cite three different methods to do so, the third of which we present in this paper.


Conditioning multi-model ensembles for disease forecasting

Ray, Jaideep; Cauthen, Katherine R.; Lefantzi, Sophia; Burks, Lynne

In this study we investigate how an ensemble of disease models can be conditioned to observational data, in a bid to improve its predictive skill. We use the ensemble of influenza forecasting models gathered by the US Centers for Disease Control and Prevention (CDC) as the exemplar. This ensemble is used every year to forecast the annual influenza outbreak in the United States. The models constituting this ensemble draw on very different modeling assumptions and approximations and are a diverse collection of methods to approximate epidemiological dynamics. Currently, each model's predictions are accorded the same importance, or weight, when compiling the ensemble's forecast. We consider this equally-weighted ensemble as the baseline case which has to be improved upon. In this study, we explore whether an ensemble forecast can be improved by "conditioning" the ensemble to whatever observational data is available from the ongoing outbreak. "Conditioning" can imply according the ensemble's members different weights which evolve over time, or simply performing the forecast using the top k (equally-weighted) models. In the latter case, the composition of the "top-k" set of models evolves over time. This is called "model averaging" in statistics. We explore four methods to perform model-averaging, three of which are new. We find that the CDC ensemble responds best to the "top-k-models" approach to model-averaging. All the new model-averaging methods perform better than the baseline equally-weighted ensemble. The four model-averaging methods treat the models as black-boxes and simply use their forecasts as inputs, i.e., one does not need access to the models at all, but rather only their forecasts. The model-averaging approaches reviewed in this report thus form a general framework for model-averaging any model ensemble.
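The "top-k-models" idea can be sketched in a few lines. The version below is an assumption-laden illustration, not the method from the report: it ranks black-box models by mean absolute error against the observations seen so far (the report's ranking criterion is not stated here) and averages the next forecasts of the best k with equal weight. The function name, the model names, and the numbers are all hypothetical.

```python
from statistics import fmean


def top_k_forecast(past_forecasts, observations, new_forecasts, k):
    """Equal-weight average of the k models with the lowest past error.

    past_forecasts: dict of model name -> earlier forecasts, aligned with observations
    new_forecasts:  dict of model name -> that model's forecast for the next period
    """
    errors = {
        name: fmean(abs(f - o) for f, o in zip(preds, observations))
        for name, preds in past_forecasts.items()
    }
    best = sorted(errors, key=errors.get)[:k]  # the "top-k" set, re-ranked on every call
    return fmean(new_forecasts[name] for name in best), best


# Hypothetical three-model ensemble and observations.
observations = [1.0, 2.0, 3.0]
past_forecasts = {"a": [1.0, 2.0, 3.0], "b": [2.0, 3.0, 4.0], "c": [5.0, 5.0, 5.0]}
new_forecasts = {"a": 4.0, "b": 6.0, "c": 10.0}

baseline = fmean(new_forecasts.values())  # equally-weighted ensemble of all models
conditioned, chosen = top_k_forecast(past_forecasts, observations, new_forecasts, k=2)
```

Because only forecasts are used as inputs, the models stay black boxes, and calling the function again as new observations arrive lets the membership of the top-k set change over time.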


Robust Bayesian calibration of a k-ϵ model for compressible jet-in-crossflow simulations

AIAA Journal

Ray, Jaideep; Dechant, Lawrence; Lefantzi, Sophia; Ling, Julia; Arunajatesan, Srinivasan

Compressible jet-in-crossflow interactions are difficult to simulate accurately using Reynolds-averaged Navier-Stokes (RANS) models. This could be due to simplifications inherent in RANS or the use of inappropriate RANS constants estimated by fitting to experiments of simple or canonical flows. Our previous work on Bayesian calibration of a k - ϵ model to experimental data had led to a weak hypothesis that inaccurate simulations could be due to inappropriate constants more than model-form inadequacies of RANS. In this work, Bayesian calibration of k - ϵ constants to a set of experiments that span a range of Mach numbers and jet strengths has been performed. The variation of the calibrated constants has been checked to assess the degree to which parametric estimates compensate for RANS's model-form errors. An analytical model of jet-in-crossflow interactions has also been developed, and estimates of k - ϵ constants that are free of any conflation of parametric and RANS's model-form uncertainties have been obtained. It has been found that the analytical k - ϵ constants provide mean-flow predictions that are similar to those provided by the calibrated constants. Further, both of them provide predictions that are far closer to experimental measurements than those computed using "nominal" values of these constants simply obtained from the literature. It can be concluded that the lack of predictive skill of RANS jet-in-crossflow simulations is mostly due to parametric inadequacies, and our analytical estimates may provide a simple way of obtaining predictive compressible jet-in-crossflow simulations.
