Nonlocal models provide a much-needed predictive capability for important Sandia mission applications, ranging from fracture mechanics for nuclear components to subsurface flow for nuclear waste disposal, where traditional partial differential equations (PDEs) models fail to capture effects due to long-range forces at the microscale and mesoscale. However, utilization of this capability is seriously compromised by the lack of a rigorous nonlocal interface theory, required for both application and efficient solution of nonlocal models. To unlock the full potential of nonlocal modeling we developed a mathematically rigorous and physically consistent interface theory and demonstrate its scope in mission-relevant exemplar problems.
A key challenge to nonlocal models is the analytical complexity of deriving them from first principles, and frequently their use is justified a posteriori. In this work we extract nonlocal models from data, circumventing these challenges and providing data-driven justification for the resulting model form. Extracting data-driven surrogates is a major challenge for machine learning (ML) approaches, due to nonlinearities and lack of convexity — it is particularly challenging to extract surrogates which are provably well-posed and numerically stable. Our scheme not only yields a convex optimization problem, but also allows extraction of nonlocal models whose kernels may be partially negative while maintaining well-posedness even in small-data regimes. To achieve this, based on established nonlocal theory, we embed in our algorithm sufficient conditions on the non-positive part of the kernel that guarantee well-posedness of the learnt operator. These conditions are imposed as inequality constraints to meet the requisite conditions of the nonlocal theory. We demonstrate this workflow for a range of applications, including reproduction of manufactured nonlocal kernels; numerical homogenization of Darcy flow associated with a heterogeneous periodic microstructure; nonlocal approximation to high-order local transport phenomena; and approximation of globally supported fractional diffusion operators by truncated kernels.
Second-order optimizers hold intriguing potential for deep learning, but suffer from increased cost and sensitivity to the non-convexity of the loss surface as compared to gradient-based approaches. We introduce a coordinate descent method to train deep neural networks for classification tasks that exploits global convexity of the cross-entropy loss in the weights of the linear layer. Our hybrid Newton/Gradient Descent (NGD) method is consistent with the interpretation of hidden layers as providing an adaptive basis and the linear layer as providing an optimal fit of the basis to data. By alternating between a second-order method to find globally optimal parameters for the linear layer and gradient descent to train the hidden layers, we ensure an optimal fit of the adaptive basis to data throughout training. The size of the Hessian in the second-order step scales only with the number weights in the linear layer and not the depth and width of the hidden layers; furthermore, the approach is applicable to arbitrary hidden layer architecture. Previous work applying this adaptive basis perspective to regression problems demonstrated significant improvements in accuracy at reduced training cost, and this work can be viewed as an extension of this approach to classification problems. We first prove that the resulting Hessian matrix is symmetric semi-definite, and that the Newton step realizes a global minimizer. By studying classification of manufactured two-dimensional point cloud data, we demonstrate both an improvement in validation error and a striking qualitative difference in the basis functions encoded in the hidden layer when trained using NGD. Application to image classification benchmarks for both dense and convolutional architectures reveals improved training accuracy, suggesting gains of second-order methods over gradient descent. A Tensorflow implementation of the algorithm is available at github.com/rgp62/.
Lischke, Anna; Pang, Guofei; Gulian, Mamikon G.; Song, Fangying; Glusa, Christian A.; Zheng, Xiaoning; Mao, Zhiping; Cai, Wei; Meerschaert, Mark M.; Ainsworth, Mark; Karniadakis, George E.
The fractional Laplacian in Rd, which we write as (−Δ)α/2 with α∈(0,2), has multiple equivalent characterizations. Moreover, in bounded domains, boundary conditions must be incorporated in these characterizations in mathematically distinct ways, and there is currently no consensus in the literature as to which definition of the fractional Laplacian in bounded domains is most appropriate for a given application. The Riesz (or integral) definition, for example, admits a nonlocal boundary condition, where the value of a function must be prescribed on the entire exterior of the domain in order to compute its fractional Laplacian. In contrast, the spectral definition requires only the standard local boundary condition. These differences, among others, lead us to ask the question: “What is the fractional Laplacian?” Beginning from first principles, we compare several commonly used definitions of the fractional Laplacian theoretically, through their stochastic interpretations as well as their analytical properties. Then, we present quantitative comparisons using a sample of state-of-the-art methods. We discuss recent advances on nonzero boundary conditions and present new methods to discretize such boundary value problems: radial basis function collocation (for the Riesz fractional Laplacian) and nonharmonic lifting (for the spectral fractional Laplacian). In our numerical studies, we aim to compare different definitions on bounded domains using a collection of benchmark problems. We consider the fractional Poisson equation with both zero and nonzero boundary conditions, where the fractional Laplacian is defined according to the Riesz definition, the spectral definition, the directional definition, and the horizon-based nonlocal definition. We verify the accuracy of the numerical methods used in the approximations for each operator, and we focus on identifying differences in the boundary behaviors of solutions to equations posed with these different definitions. Through our efforts, we aim to further engage the research community in open problems and assist practitioners in identifying the most appropriate definition and computational approach to use for their mathematical models in addressing anomalous transport in diverse applications.