Publications

Results 26–49 of 49

Search results

Data-driven learning of nonlocal physics from high-fidelity synthetic data

Computer Methods in Applied Mechanics and Engineering

You, Huaiqian; Yu, Yue; Trask, Nathaniel A.; Gulian, Mamikon G.; D'Elia, Marta D.

A key challenge to nonlocal models is the analytical complexity of deriving them from first principles, and frequently their use is justified a posteriori. In this work we extract nonlocal models from data, circumventing these challenges and providing data-driven justification for the resulting model form. Extracting data-driven surrogates is a major challenge for machine learning (ML) approaches, due to nonlinearities and lack of convexity — it is particularly challenging to extract surrogates which are provably well-posed and numerically stable. Our scheme not only yields a convex optimization problem, but also allows extraction of nonlocal models whose kernels may be partially negative while maintaining well-posedness even in small-data regimes. To achieve this, based on established nonlocal theory, we embed in our algorithm sufficient conditions on the non-positive part of the kernel that guarantee well-posedness of the learnt operator. These conditions are imposed as inequality constraints to meet the requisite conditions of the nonlocal theory. We demonstrate this workflow for a range of applications, including reproduction of manufactured nonlocal kernels; numerical homogenization of Darcy flow associated with a heterogeneous periodic microstructure; nonlocal approximation to high-order local transport phenomena; and approximation of globally supported fractional diffusion operators by truncated kernels.

A block coordinate descent optimizer for classification problems exploiting convexity

CEUR Workshop Proceedings

Patel, Ravi G.; Trask, Nathaniel A.; Gulian, Mamikon G.; Cyr, Eric C.

Second-order optimizers hold intriguing potential for deep learning, but suffer from increased cost and sensitivity to the non-convexity of the loss surface as compared to gradient-based approaches. We introduce a coordinate descent method to train deep neural networks for classification tasks that exploits global convexity of the cross-entropy loss in the weights of the linear layer. Our hybrid Newton/Gradient Descent (NGD) method is consistent with the interpretation of hidden layers as providing an adaptive basis and the linear layer as providing an optimal fit of the basis to data. By alternating between a second-order method to find globally optimal parameters for the linear layer and gradient descent to train the hidden layers, we ensure an optimal fit of the adaptive basis to data throughout training. The size of the Hessian in the second-order step scales only with the number of weights in the linear layer and not the depth and width of the hidden layers; furthermore, the approach is applicable to arbitrary hidden layer architecture. Previous work applying this adaptive basis perspective to regression problems demonstrated significant improvements in accuracy at reduced training cost, and this work can be viewed as an extension of this approach to classification problems. We first prove that the resulting Hessian matrix is symmetric semi-definite, and that the Newton step realizes a global minimizer. By studying classification of manufactured two-dimensional point cloud data, we demonstrate both an improvement in validation error and a striking qualitative difference in the basis functions encoded in the hidden layer when trained using NGD. Application to image classification benchmarks for both dense and convolutional architectures reveals improved training accuracy, suggesting gains of second-order methods over gradient descent. A TensorFlow implementation of the algorithm is available at github.com/rgp62/.
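
The second-order step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' TensorFlow implementation: it assumes a binary classification problem, treats the hidden-layer outputs as a fixed feature matrix, and adds a small ridge term so the Hessian can be inverted. The names `newton_step_linear_layer` and `cross_entropy` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy(features, labels, weights):
    """Mean binary cross-entropy, using log(1 + exp(z)) - y*z for stability."""
    z = features @ weights
    return np.mean(np.logaddexp(0.0, z) - labels * z)


def newton_step_linear_layer(features, labels, weights, ridge=1e-8):
    """One Newton step in the linear-layer weights with hidden features fixed.

    features: (n, m) hidden-layer outputs; labels: (n,) with entries in {0, 1}.
    The ridge term is an assumption of this sketch, not part of the abstract.
    """
    n = len(labels)
    p = sigmoid(features @ weights)
    grad = features.T @ (p - labels) / n
    # Hessian Phi^T diag(p * (1 - p)) Phi / n is symmetric positive semi-definite,
    # and its size depends only on the number of linear-layer weights m.
    hess = (features * (p * (1.0 - p))[:, None]).T @ features / n
    hess = hess + ridge * np.eye(hess.shape[0])
    return weights - np.linalg.solve(hess, grad), hess


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = rng.normal(size=(200, 3))  # stand-in for hidden-layer outputs
    y = (phi @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=200) > 0).astype(float)
    w = np.zeros(3)
    for _ in range(5):
        w, _ = newton_step_linear_layer(phi, y, w)
    print("cross-entropy after Newton steps:", cross_entropy(phi, y, w))
```

In a full block coordinate descent loop, a step like this would alternate with gradient descent updates of the hidden layers, which are omitted here.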

A Survey of Constrained Gaussian Process: Approaches and Implementation Challenges

Journal of Machine Learning for Modeling and Computing

Swiler, Laura P.; Gulian, Mamikon G.; Frankel, Ari L.; Safta, Cosmin S.; Jakeman, John D.

Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a larger effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.

Incorporating physical constraints into Gaussian process surrogate models (LDRD Project Summary)

Swiler, Laura P.; Gulian, Mamikon G.; Frankel, Ari L.; Jakeman, John D.; Safta, Cosmin S.

This report summarizes work done under the Laboratory Directed Research and Development (LDRD) project titled "Incorporating physical constraints into Gaussian process surrogate models." In this project, we explored a variety of strategies for constraint implementations. We considered bound constraints, monotonicity and related convexity constraints, Gaussian processes which are constrained to satisfy linear operator constraints which represent physical laws expressed as partial differential equations, and intrinsic boundary condition constraints. We wrote three papers and are currently finishing two others. We developed initial software implementations for some approaches. This report summarizes the work done under this LDRD.

A Unified Theory of Fractional Nonlocal and Weighted Nonlocal Vector Calculus

D'Elia, Marta D.; Gulian, Mamikon G.; Karniadakis, George; Olson, Hayley

Nonlocal and fractional-order models capture effects that classical partial differential equations cannot describe; for this reason, they are suitable for a broad class of engineering and scientific applications that feature multiscale or anomalous behavior. This has driven a desire for a vector calculus that includes nonlocal and fractional gradient, divergence and Laplacian type operators, as well as tools such as Green's identities, to model subsurface transport, turbulence, and conservation laws. In the literature, several independent definitions and theories of nonlocal and fractional vector calculus have been put forward. Some have been studied rigorously and in depth, while others have been introduced ad hoc for specific applications. The goal of this work is to provide foundations for a unified vector calculus by (1) consolidating fractional vector calculus as a special case of nonlocal vector calculus, (2) relating unweighted and weighted Laplacian operators by introducing an equivalence kernel, and (3) proving a form of Green's identity to unify the corresponding variational frameworks for the resulting nonlocal volume-constrained problems. The proposed framework goes beyond the analysis of nonlocal equations by supporting new model discovery, establishing theory and interpretation for a broad class of operators, and providing useful analogues of standard tools from the classical vector calculus.

What is the fractional Laplacian? A comparative review with new results

Journal of Computational Physics

Lischke, Anna; Pang, Guofei; Gulian, Mamikon G.; Song, Fangying; Glusa, Christian A.; Zheng, Xiaoning; Mao, Zhiping; Cai, Wei; Meerschaert, Mark M.; Ainsworth, Mark; Karniadakis, George E.

The fractional Laplacian in $\mathbb{R}^d$, which we write as $(-\Delta)^{\alpha/2}$ with $\alpha \in (0,2)$, has multiple equivalent characterizations. Moreover, in bounded domains, boundary conditions must be incorporated in these characterizations in mathematically distinct ways, and there is currently no consensus in the literature as to which definition of the fractional Laplacian in bounded domains is most appropriate for a given application. The Riesz (or integral) definition, for example, admits a nonlocal boundary condition, where the value of a function must be prescribed on the entire exterior of the domain in order to compute its fractional Laplacian. In contrast, the spectral definition requires only the standard local boundary condition. These differences, among others, lead us to ask the question: “What is the fractional Laplacian?” Beginning from first principles, we compare several commonly used definitions of the fractional Laplacian theoretically, through their stochastic interpretations as well as their analytical properties. Then, we present quantitative comparisons using a sample of state-of-the-art methods. We discuss recent advances on nonzero boundary conditions and present new methods to discretize such boundary value problems: radial basis function collocation (for the Riesz fractional Laplacian) and nonharmonic lifting (for the spectral fractional Laplacian). In our numerical studies, we aim to compare different definitions on bounded domains using a collection of benchmark problems. We consider the fractional Poisson equation with both zero and nonzero boundary conditions, where the fractional Laplacian is defined according to the Riesz definition, the spectral definition, the directional definition, and the horizon-based nonlocal definition. We verify the accuracy of the numerical methods used in the approximations for each operator, and we focus on identifying differences in the boundary behaviors of solutions to equations posed with these different definitions. Through our efforts, we aim to further engage the research community in open problems and assist practitioners in identifying the most appropriate definition and computational approach to use for their mathematical models in addressing anomalous transport in diverse applications.
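
As a rough illustration of the spectral definition mentioned in this abstract, the sketch below applies the operator on the interval (0, 1) with zero boundary values by scaling each sine-series coefficient by the corresponding Dirichlet eigenvalue raised to the power α/2. It is not one of the discretizations from the paper: the one-dimensional domain, the uniform interior grid, and the function name `spectral_fractional_laplacian` are all assumptions made for this example.

```python
import numpy as np


def spectral_fractional_laplacian(u_values, alpha):
    """Apply a spectral fractional Laplacian on (0, 1) with zero boundary values.

    u_values holds samples of u at the interior grid points x_j = j / (n + 1),
    j = 1..n. The Dirichlet Laplacian on (0, 1) has eigenfunctions sin(k*pi*x)
    with eigenvalues (k*pi)^2, so the fractional power multiplies the k-th
    sine coefficient by ((k*pi)^2)^(alpha/2) = (k*pi)^alpha.
    """
    u_values = np.asarray(u_values, dtype=float)
    n = len(u_values)
    k = np.arange(1, n + 1)
    x = np.arange(1, n + 1) / (n + 1)
    sines = np.sin(np.pi * np.outer(k, x))  # sines[k-1, j-1] = sin(k*pi*x_j)
    # Discrete sine coefficients; the factor 2/(n+1) comes from the discrete
    # orthogonality of the sine vectors on this grid.
    coeffs = (2.0 / (n + 1)) * (sines @ u_values)
    return sines.T @ ((k * np.pi) ** alpha * coeffs)


if __name__ == "__main__":
    x = np.arange(1, 64) / 64
    u = np.sin(np.pi * x)  # first Dirichlet eigenfunction
    result = spectral_fractional_laplacian(u, alpha=1.5)
    print("max deviation from pi**1.5 * u:", np.max(np.abs(result - np.pi**1.5 * u)))
```

The Riesz definition would instead require values of u on the whole exterior of the interval, which is the contrast in boundary conditions that the abstract draws.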

Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

Proceedings of Machine Learning Research

Cyr, Eric C.; Gulian, Mamikon G.; Patel, Ravi G.; Perego, Mauro P.; Trask, Nathaniel A.

Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.
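
The hybrid least squares/gradient descent idea can be sketched for a one-hidden-layer regression network as follows. This is a minimal illustration under stated assumptions, not the authors' optimizer or initialization: the tanh activation, the random initialization, the plain gradient step, and the names `hidden_basis` and `lsgd_fit` are all hypothetical choices for this example.

```python
import numpy as np


def hidden_basis(x, w, b):
    """Hidden-layer outputs tanh(w_m * x_i + b_m), viewed as an adaptive basis."""
    return np.tanh(np.outer(x, w) + b)  # shape (n_samples, width)


def lsgd_fit(x, y, width=6, steps=50, lr=1e-3, seed=0):
    """Alternate a least squares solve for the linear layer with a gradient
    descent step on the hidden-layer parameters (linear coefficients fixed)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=width)
    b = rng.normal(size=width)
    history = []
    for _ in range(steps):
        phi = hidden_basis(x, w, b)
        # Linear layer: best fit of the current basis to the data.
        c, *_ = np.linalg.lstsq(phi, y, rcond=None)
        resid = phi @ c - y
        history.append(np.mean(resid**2))
        # Gradient of the mean squared error with respect to the
        # pre-activations, using d tanh(z)/dz = 1 - tanh(z)^2.
        dpre = (2.0 / len(x)) * resid[:, None] * c[None, :] * (1.0 - phi**2)
        w = w - lr * (dpre * x[:, None]).sum(axis=0)
        b = b - lr * dpre.sum(axis=0)
    # Final least squares solve so the returned coefficients match w and b.
    c, *_ = np.linalg.lstsq(hidden_basis(x, w, b), y, rcond=None)
    return w, b, c, history


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 50)
    y = np.sin(np.pi * x)
    w, b, c, history = lsgd_fit(x, y)
    final_mse = np.mean((hidden_basis(x, w, b) @ c - y) ** 2)
    print("initial MSE:", history[0], "final MSE:", final_mse)
```

Because the linear coefficients are re-solved at every iteration, the output layer is always the least squares fit for the current hidden basis, which is the adaptive basis reading the abstract describes.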
