Publications

50 Results

Improved Subseasonal Forecasting of Extreme Polar Vortices Using Machine Learning

Ehrmann, Thomas; Gulian, Mamikon; Weylandt, Michael

Our research focused on forecasting the position and shape of the winter stratospheric polar vortex at a subseasonal timescale of 15 days in advance. To achieve this, we employed both statistical and neural network machine learning techniques. The analysis was performed on 42 winter seasons of reanalysis data provided by NASA, giving us a total of 6,342 days of data. The state of the polar vortex was characterized by using geometric moments to calculate the centroid latitude and the aspect ratio of an ellipse fit to the vortex. Time series for thirty additional precursors were calculated to help improve the predictive capabilities of the algorithms. A feature importance analysis of these precursors was performed using random forests to measure their predictive importance and determine the ideal number of precursors. Then, using the precursors identified as important, various statistical methods were tested for predictive accuracy, with random forest and nearest neighbor performing the best. An echo state network, a type of recurrent neural network that features a sparsely connected hidden layer and a reduced number of trainable parameters, allowing for rapid training and testing, was also implemented for the forecasting problem. Hyperparameter tuning was performed for each method using a subset of the training data. The algorithms were trained and tuned on the first 41 years of data, then tested for accuracy on the final year. In general, the centroid latitude of the polar vortex proved easier to predict than the aspect ratio across all algorithms. Random forest outperformed the other statistical forecasting algorithms overall but struggled to predict extreme values. Forecasts from the echo state network suggested strong predictive capability past 15 days, but further work is required to fully realize the potential of recurrent neural network approaches.
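As a rough illustration of the statistical side of this pipeline, the sketch below trains a random forest on synthetic stand-in data to forecast a scalar vortex diagnostic 15 days ahead and ranks precursors by feature importance. All data, sizes, and names here are synthetic placeholders, not the study's reanalysis data or configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for a vortex diagnostic (e.g. centroid latitude)
# plus precursor time series; the real study uses 42 winters of
# reanalysis data and thirty precursors.
n_days, n_precursors, lead = 1200, 5, 15
precursors = rng.standard_normal((n_days, n_precursors)).cumsum(axis=0)
target = precursors[:, 0] + 0.1 * rng.standard_normal(n_days)

# Predict the diagnostic `lead` days ahead from today's precursors.
X, y = precursors[:-lead], target[lead:]
split = int(0.9 * len(X))  # chronological split: train early, test late
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], y[:split])

# Feature importances rank precursors, as in the precursor-selection step.
ranking = np.argsort(model.feature_importances_)[::-1]
print("precursor ranking:", ranking)
```

Extreme events remain hard for tree ensembles, since predictions are averages over training targets and cannot extrapolate beyond them, which is consistent with the behavior reported above.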


Fractional Modeling in Action: a Survey of Nonlocal Models for Subsurface Transport, Turbulent Flows, and Anomalous Materials

Journal of Peridynamics and Nonlocal Modeling

D'Elia, Marta; Gulian, Mamikon; Suzuki, Jorge L.; Zayernouri, Mohsen

Modeling of phenomena such as anomalous transport via fractional-order differential equations has been established as an effective alternative to partial differential equations, due to the inherent ability to describe large-scale behavior with greater efficiency than fully resolved classical models. In this review article, we first provide a broad overview of fractional-order derivatives with a clear emphasis on the stochastic processes that underlie their use. We then survey three exemplary application areas — subsurface transport, turbulence, and anomalous materials — in which fractional-order differential equations provide accurate and predictive models. For each area, we report on the evidence of anomalous behavior that justifies the use of fractional-order models, and survey both foundational models as well as more expressive state-of-the-art models. We also propose avenues for future research, including more advanced and physically sound models, as well as tools for calibration and discovery of fractional-order models.
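For orientation, two standard fractional derivatives of order α ∈ (0, 1) that recur throughout the surveyed models are the Riemann–Liouville and Caputo forms (standard textbook definitions, stated here for reference; the article treats many generalizations):

```latex
D^{\alpha}_{RL} f(t) = \frac{1}{\Gamma(1-\alpha)}\,\frac{d}{dt}\int_0^t \frac{f(s)}{(t-s)^{\alpha}}\,ds,
\qquad
D^{\alpha}_{C} f(t) = \frac{1}{\Gamma(1-\alpha)}\int_0^t \frac{f'(s)}{(t-s)^{\alpha}}\,ds .
```

The power-law memory kernel (t − s)^{−α} is what lets these operators describe the long-range, heavy-tailed behavior underlying the anomalous transport applications surveyed.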


Mathematical Foundations for Nonlocal Interface Problems: Multiscale Simulations of Heterogeneous Materials (Final LDRD Report)

D'Elia, Marta; Bochev, Pavel B.; Foster, John E.; Glusa, Christian; Gulian, Mamikon; Gunzburger, Max; Trageser, Jeremy; Kuhlman, Kristopher L.; Martinez, Mario; Najm, Habib N.; Silling, Stewart; Tupek, Michael; Xu, Xiao

Nonlocal models provide a much-needed predictive capability for important Sandia mission applications, ranging from fracture mechanics for nuclear components to subsurface flow for nuclear waste disposal, where traditional partial differential equation (PDE) models fail to capture effects due to long-range forces at the microscale and mesoscale. However, utilization of this capability is seriously compromised by the lack of a rigorous nonlocal interface theory, which is required for both application and efficient solution of nonlocal models. To unlock the full potential of nonlocal modeling, we developed a mathematically rigorous and physically consistent interface theory and demonstrated its scope on mission-relevant exemplar problems.


Gaussian process regression constrained by boundary value problems

Computer Methods in Applied Mechanics and Engineering

Gulian, Mamikon; Frankel, A.; Swiler, Laura P.

We develop a framework for Gaussian process regression constrained by boundary value problems. The framework may be applied to infer the solution of a well-posed boundary value problem with a known second-order differential operator and boundary conditions, but for which only scattered observations of the source term are available. Scattered observations of the solution may also be used in the regression. The framework combines co-kriging and the linear transformation of a Gaussian process with kernels given by spectral expansions in eigenfunctions of the boundary value problem. Thus, it benefits from a reduced-rank property of covariance matrices. We demonstrate that the resulting framework yields more accurate and stable solution inference compared to physics-informed Gaussian process regression without boundary condition constraints.
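A minimal sketch of the reduced-rank idea, assuming a 1D Poisson problem with homogeneous Dirichlet conditions (an illustrative special case with an illustrative choice of spectral weights, not the paper's general construction): building the prior covariance from finitely many eigenfunctions of the boundary value problem yields a low-rank covariance matrix whose sample paths satisfy the boundary conditions by construction.

```python
import numpy as np

# -u'' = f on (0,1) with u(0) = u(1) = 0 has eigenpairs
# phi_n(x) = sqrt(2) sin(n*pi*x), lambda_n = (n*pi)^2.
N = 20                                       # number of eigenfunctions kept
x = np.linspace(0.0, 1.0, 101)
n = np.arange(1, N + 1)
Phi = np.sqrt(2.0) * np.sin(np.outer(x, n) * np.pi)   # (101, N) basis
gamma = 1.0 / (n * np.pi) ** 4               # decaying spectral weights (assumed)

K = Phi @ np.diag(gamma) @ Phi.T             # covariance of rank at most N
rank = np.linalg.matrix_rank(K)
print("covariance rank:", rank)              # far below the grid size 101
```

Because the covariance has rank at most N rather than the grid size, the linear algebra in the regression scales with N, which is the reduced-rank benefit mentioned above.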


Error-in-variables modelling for operator learning

Proceedings of Machine Learning Research

Patel, Ravi; Manickam, Indu; Lee, Myoungkyu; Gulian, Mamikon

Deep operator learning has emerged as a promising tool for reduced-order modelling and PDE model discovery. Leveraging the expressive power of deep neural networks, especially in high dimensions, such methods learn the mapping between functional state variables. While proposed methods have assumed noise only in the dependent variables, experimental and numerical data for operator learning typically exhibit noise in the independent variables as well, since both variables represent signals that are subject to measurement error. In regression on scalar data, failure to account for noisy independent variables can lead to biased parameter estimates. With noisy independent variables, linear models fitted via ordinary least squares (OLS) will show attenuation bias, wherein the slope will be underestimated. In this work, we derive an analogue of attenuation bias for linear operator regression with white noise in both the independent and dependent variables, showing that the norm upper bound of the operator learned via OLS decreases with increasing noise in the independent variable. In the nonlinear setting, we computationally demonstrate underprediction of the action of the Burgers operator in the presence of noise in the independent variable. We propose error-in-variables (EiV) models for two operator regression methods, MOR-Physics and DeepONet, and demonstrate that these new models reduce bias in the presence of noisy independent variables for a variety of operator learning problems. Considering the Burgers operator in 1D and 2D, we demonstrate that EiV operator learning robustly recovers operators in high-noise regimes that defeat OLS operator learning. We also introduce an EiV model for time-evolving PDE discovery and show that OLS and EiV perform similarly in learning the Kuramoto-Sivashinsky evolution operator from corrupted data, suggesting that the effect of bias in OLS operator learning depends on the regularity of the target operator.
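The classical scalar result that this work generalizes can be stated briefly. For y = βx + ε with the independent variable observed as x̃ = x + u, where u is zero-mean noise with variance σ_u² independent of x, the OLS slope estimate satisfies (standard errors-in-variables result):

```latex
\hat{\beta}_{\mathrm{OLS}} \;\xrightarrow{\;p\;}\; \beta \,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2},
```

so the estimated slope is attenuated toward zero as the noise variance σ_u² grows. The paper derives an operator-norm analogue of this attenuation for linear operator regression.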


Connections between nonlocal operators: From vector calculus identities to a fractional Helmholtz decomposition

Gulian, Mamikon; Mengesha, Tadele; Scott, James C.

Nonlocal vector calculus, which is based on the nonlocal forms of gradient, divergence, and Laplace operators in multiple dimensions, has shown promising applications in fields such as hydrology, mechanics, and image processing. In this work, we study the analytical underpinnings of these operators. We rigorously treat compositions of nonlocal operators, prove nonlocal vector calculus identities, and connect weighted and unweighted variational frameworks. We combine these results to obtain a weighted fractional Helmholtz decomposition which is valid for sufficiently smooth vector fields. Our approach identifies the function spaces in which the stated identities and decompositions hold, providing a rigorous foundation to the nonlocal vector calculus identities that can serve as tools for nonlocal modeling in higher dimensions.
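For reference, one common form of the operators in question (following the weighted nonlocal vector calculus in the literature; the notation here is illustrative): for a scalar field u, a two-point vector field ν, and an antisymmetric kernel α(x, y),

```latex
\mathcal{G}(u)(x,y) := \bigl(u(y) - u(x)\bigr)\,\alpha(x,y),
\qquad
\mathcal{D}(\nu)(x) := \int_{\Omega} \bigl(\nu(x,y) + \nu(y,x)\bigr)\cdot\alpha(x,y)\,dy,
```

with a nonlocal Laplacian arising from the composition of the nonlocal divergence with the nonlocal gradient. The compositions, identities, and function spaces for such operators are exactly what this work makes rigorous.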


Data-driven learning of nonlocal physics from high-fidelity synthetic data

Computer Methods in Applied Mechanics and Engineering

You, Huaiqian; Yu, Yue; Trask, Nathaniel A.; Gulian, Mamikon; D'Elia, Marta

A key challenge to nonlocal models is the analytical complexity of deriving them from first principles, and frequently their use is justified a posteriori. In this work we extract nonlocal models from data, circumventing these challenges and providing data-driven justification for the resulting model form. Extracting data-driven surrogates is a major challenge for machine learning (ML) approaches, due to nonlinearities and lack of convexity — it is particularly challenging to extract surrogates which are provably well-posed and numerically stable. Our scheme not only yields a convex optimization problem, but also allows extraction of nonlocal models whose kernels may be partially negative while maintaining well-posedness even in small-data regimes. To achieve this, based on established nonlocal theory, we embed in our algorithm sufficient conditions on the non-positive part of the kernel that guarantee well-posedness of the learnt operator. These conditions are imposed as inequality constraints to meet the requisite conditions of the nonlocal theory. We demonstrate this workflow for a range of applications, including reproduction of manufactured nonlocal kernels; numerical homogenization of Darcy flow associated with a heterogeneous periodic microstructure; nonlocal approximation to high-order local transport phenomena; and approximation of globally supported fractional diffusion operators by truncated kernels.
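A toy version of the constrained-regression idea, under simplifying assumptions: a 1D periodic grid, a short-range radial kernel, and a plain nonnegativity constraint standing in for the paper's sharper sufficient conditions (which also admit partially negative kernels). The kernel weights of a discretized nonlocal operator are recovered from (u, f) data pairs by constrained least squares, which is a convex problem.

```python
import numpy as np
from scipy.optimize import nnls

# Learn the radial kernel of a discretized 1D periodic nonlocal operator
#   (L_w u)_i = sum_k w_k (u_{i+k} + u_{i-k} - 2 u_i),  k = 1..horizon,
# from pairs (u, f = L_w u).  Illustrative setup, not the paper's scheme.
n, horizon = 64, 3
w_true = np.array([1.0, 0.5, 0.25])          # manufactured kernel

def apply_L(u, w):
    return sum(w[k - 1] * (np.roll(u, -k) + np.roll(u, k) - 2 * u)
               for k in range(1, len(w) + 1))

# Synthetic training pairs from smooth periodic fields.
grid = np.arange(n)
us = [np.sin(2 * np.pi * (m + 1) * grid / n + 0.3 * m) for m in range(5)]
A = np.column_stack([
    np.concatenate([np.roll(u, -k) + np.roll(u, k) - 2 * u for u in us])
    for k in range(1, horizon + 1)])
b = np.concatenate([apply_L(u, w_true) for u in us])

w_hat, _ = nnls(A, b)      # nonnegativity-constrained least squares
print("recovered kernel:", w_hat)
```

In this noise-free setting the manufactured kernel is recovered exactly; the constraint plays the role of the inequality constraints described above, which keep the learned operator well-posed even in small-data regimes.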


Analysis of Anisotropic Nonlocal Diffusion Models: Well-posedness of Fractional Problems for Anomalous Transport

Gulian, Mamikon

We analyze the well-posedness of an anisotropic, nonlocal diffusion equation. Establishing an equivalence between weighted and unweighted anisotropic nonlocal diffusion operators in the vein of unified nonlocal vector calculus, we apply our analysis to a class of fractional-order operators and present rigorous estimates for the solution of the corresponding anisotropic anomalous diffusion equation. Furthermore, we extend our analysis to the anisotropic diffusion-advection equation and prove well-posedness for fractional orders s ∊ [0.5, 1). We also present an application of the diffusion-advection equation to anomalous transport of solutes.


A block coordinate descent optimizer for classification problems exploiting convexity

CEUR Workshop Proceedings

Patel, Ravi; Trask, Nathaniel A.; Gulian, Mamikon; Cyr, Eric C.

Second-order optimizers hold intriguing potential for deep learning, but suffer from increased cost and sensitivity to the non-convexity of the loss surface as compared to gradient-based approaches. We introduce a coordinate descent method to train deep neural networks for classification tasks that exploits global convexity of the cross-entropy loss in the weights of the linear layer. Our hybrid Newton/Gradient Descent (NGD) method is consistent with the interpretation of hidden layers as providing an adaptive basis and the linear layer as providing an optimal fit of the basis to data. By alternating between a second-order method to find globally optimal parameters for the linear layer and gradient descent to train the hidden layers, we ensure an optimal fit of the adaptive basis to data throughout training. The size of the Hessian in the second-order step scales only with the number of weights in the linear layer and not with the depth and width of the hidden layers; furthermore, the approach is applicable to arbitrary hidden layer architectures. Previous work applying this adaptive basis perspective to regression problems demonstrated significant improvements in accuracy at reduced training cost, and this work can be viewed as an extension of that approach to classification problems. We first prove that the resulting Hessian matrix is symmetric positive semi-definite and that the Newton step realizes a global minimizer. By studying classification of manufactured two-dimensional point cloud data, we demonstrate both an improvement in validation error and a striking qualitative difference in the basis functions encoded in the hidden layer when trained using NGD. Application to image classification benchmarks for both dense and convolutional architectures reveals improved training accuracy, suggesting gains of second-order methods over gradient descent. A TensorFlow implementation of the algorithm is available at github.com/rgp62/.
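A minimal sketch of the convex sub-step behind this idea (illustrative, not the paper's TensorFlow implementation): with the hidden layer frozen, the cross-entropy loss is convex in the linear-layer weights, so a few Newton (IRLS) iterations reach its global minimizer. In NGD this solve alternates with gradient descent on the hidden layers; here the hidden layer is simply a fixed random basis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Manufactured binary point-cloud data, loosely in the spirit of the
# two-dimensional experiments described above.
n, d, width = 200, 2, 16
X = rng.standard_normal((n, d))
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # binary labels

W = rng.standard_normal((d, width))            # frozen hidden layer
H = np.tanh(X @ W)                             # adaptive basis
H = np.column_stack([H, np.ones(n)])           # bias column

beta = np.zeros(width + 1)                     # linear-layer weights
for _ in range(10):                            # Newton (IRLS) iterations
    z = np.clip(H @ beta, -30, 30)             # logits, clipped for stability
    p = 1.0 / (1.0 + np.exp(-z))               # predicted probabilities
    grad = H.T @ (p - y)
    S = p * (1.0 - p)                          # IRLS weights
    hess = H.T @ (H * S[:, None]) + 1e-6 * np.eye(width + 1)
    beta -= np.linalg.solve(hess, grad)        # Newton step on convex loss

acc = float(np.mean((H @ beta > 0) == (y > 0.5)))
print("training accuracy:", acc)
```

Note that the Hessian here is (width + 1) × (width + 1): as stated above, its size depends only on the linear layer, not on the hidden architecture that produced H.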


A Survey of Constrained Gaussian Process: Approaches and Implementation Challenges

Journal of Machine Learning for Modeling and Computing

Swiler, Laura P.; Gulian, Mamikon; Frankel, A.; Safta, Cosmin; Jakeman, John D.

Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a larger effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.


Incorporating physical constraints into Gaussian process surrogate models (LDRD Project Summary)

Swiler, Laura P.; Gulian, Mamikon; Frankel, A.; Jakeman, John D.; Safta, Cosmin

This report summarizes work done under the Laboratory Directed Research and Development (LDRD) project titled "Incorporating physical constraints into Gaussian process surrogate models." In this project, we explored a variety of strategies for constraint implementation. We considered bound constraints; monotonicity and related convexity constraints; Gaussian processes constrained to satisfy linear operator constraints representing physical laws expressed as partial differential equations; and intrinsic boundary condition constraints. We wrote three papers and are currently finishing two others. We also developed initial software implementations for some approaches.


A Unified Theory of Fractional Nonlocal and Weighted Nonlocal Vector Calculus

Gulian, Mamikon; Karniadakis, George E.; Olson, Hayley

Nonlocal and fractional-order models capture effects that classical partial differential equations cannot describe; for this reason, they are suitable for a broad class of engineering and scientific applications that feature multiscale or anomalous behavior. This has driven a desire for a vector calculus that includes nonlocal and fractional gradient, divergence and Laplacian type operators, as well as tools such as Green's identities, to model subsurface transport, turbulence, and conservation laws. In the literature, several independent definitions and theories of nonlocal and fractional vector calculus have been put forward. Some have been studied rigorously and in depth, while others have been introduced ad-hoc for specific applications. The goal of this work is to provide foundations for a unified vector calculus by (1) consolidating fractional vector calculus as a special case of nonlocal vector calculus, (2) relating unweighted and weighted Laplacian operators by introducing an equivalence kernel, and (3) proving a form of Green's identity to unify the corresponding variational frameworks for the resulting nonlocal volume-constrained problems. The proposed framework goes beyond the analysis of nonlocal equations by supporting new model discovery, establishing theory and interpretation for a broad class of operators, and providing useful analogues of standard tools from the classical vector calculus.


What is the fractional Laplacian? A comparative review with new results

Journal of Computational Physics

Lischke, Anna; Pang, Guofei; Gulian, Mamikon; Song, Fangying; Glusa, Christian; Zheng, Xiaoning; Mao, Zhiping; Cai, Wei; Meerschaert, Mark M.; Ainsworth, Mark; Karniadakis, George E.

The fractional Laplacian in ℝ^d, which we write as (−Δ)^{α/2} with α ∈ (0, 2), has multiple equivalent characterizations. Moreover, in bounded domains, boundary conditions must be incorporated in these characterizations in mathematically distinct ways, and there is currently no consensus in the literature as to which definition of the fractional Laplacian in bounded domains is most appropriate for a given application. The Riesz (or integral) definition, for example, admits a nonlocal boundary condition, where the value of a function must be prescribed on the entire exterior of the domain in order to compute its fractional Laplacian. In contrast, the spectral definition requires only the standard local boundary condition. These differences, among others, lead us to ask the question: “What is the fractional Laplacian?” Beginning from first principles, we compare several commonly used definitions of the fractional Laplacian theoretically, through their stochastic interpretations as well as their analytical properties. Then, we present quantitative comparisons using a sample of state-of-the-art methods. We discuss recent advances on nonzero boundary conditions and present new methods to discretize such boundary value problems: radial basis function collocation (for the Riesz fractional Laplacian) and nonharmonic lifting (for the spectral fractional Laplacian). In our numerical studies, we aim to compare different definitions on bounded domains using a collection of benchmark problems. We consider the fractional Poisson equation with both zero and nonzero boundary conditions, where the fractional Laplacian is defined according to the Riesz definition, the spectral definition, the directional definition, and the horizon-based nonlocal definition. We verify the accuracy of the numerical methods used in the approximations for each operator, and we focus on identifying differences in the boundary behaviors of solutions to equations posed with these different definitions. Through our efforts, we aim to further engage the research community in open problems and assist practitioners in identifying the most appropriate definition and computational approach to use for their mathematical models in addressing anomalous transport in diverse applications.
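Two of the definitions being compared, stated in their standard forms for reference: the Riesz (integral) fractional Laplacian on ℝ^d, and the spectral definition on a bounded domain Ω with Dirichlet eigenpairs (λ_n, φ_n) of −Δ,

```latex
(-\Delta)^{\alpha/2}_{\mathrm{Riesz}}\, u(x)
  = C_{d,\alpha}\,\mathrm{p.v.}\!\int_{\mathbb{R}^d}
      \frac{u(x)-u(y)}{|x-y|^{d+\alpha}}\,dy,
\qquad
(-\Delta)^{\alpha/2}_{\mathrm{spec}}\, u
  = \sum_{n\ge 1} \lambda_n^{\alpha/2}\,(u,\varphi_n)\,\varphi_n .
```

The first requires exterior data on all of ℝ^d \ Ω, while the second needs only the boundary data encoded in the eigenfunctions, which is precisely the distinction in boundary conditions discussed above.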


Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

Proceedings of Machine Learning Research

Cyr, Eric C.; Gulian, Mamikon; Patel, Ravi; Perego, Mauro; Trask, Nathaniel A.

Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.
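A toy sketch of the adaptive-basis training loop (an illustrative 1D regression with a hand-chosen sigmoid initialization, not the paper's experiments or initializations): alternate a global least squares solve for the linear layer, whose subproblem is linear in the coefficients, with a gradient step on the hidden-layer parameters.

```python
import numpy as np

# Fit y = sin(2*pi*x) on [0, 1] with a single tanh hidden layer.
n, width, lr = 100, 10, 0.005
x = np.linspace(0.0, 1.0, n)[:, None]
y = np.sin(2.0 * np.pi * x[:, 0])

# Hidden layer: tanh units with transitions spread across [0, 1]
# (a simple deterministic initialization, assumed for this sketch).
W = np.full((1, width), 20.0)
b = -20.0 * np.linspace(0.05, 0.95, width)

for _ in range(200):
    H = np.tanh(x @ W + b)                           # adaptive basis
    c, *_ = np.linalg.lstsq(H, y, rcond=None)        # optimal linear layer
    r = H @ c - y                                    # residual
    G = (r[:, None] * c[None, :]) * (1.0 - H**2)     # backprop through tanh
    W -= lr * (x.T @ G) / n                          # gradient step, hidden layer
    b -= lr * G.mean(axis=0)

mse = float(np.mean((np.tanh(x @ W + b) @ c - y) ** 2))
print("final MSE:", mse)
```

The least squares solve keeps the linear layer optimal for the current basis at every step, so the hidden layer is always adapted against the best achievable fit, which is the viewpoint described above.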
