Physics-constrained machine learning is emerging as an important topic in the field of machine learning for physics. One of the most significant advantages of incorporating physics constraints into machine learning methods is that the resulting model requires significantly less data to train. By incorporating physical rules into the machine learning formulation itself, the predictions are expected to be physically plausible. Gaussian process (GP) is perhaps one of the most common methods in machine learning for small datasets. In this paper, we investigate the possibility of constraining a GP formulation with monotonicity on three different material datasets, where one experimental and two computational datasets are used. The monotonic GP is compared against the regular GP, where a significant reduction in the posterior variance is observed. The monotonic GP is strictly monotonic in the interpolation regime, but in the extrapolation regime, the monotonic effect starts fading away as one goes beyond the training dataset. Imposing monotonicity on the GP comes at a small accuracy cost, compared to the regular GP. The monotonic GP is perhaps most useful in applications where data are scarce and noisy, and monotonicity is supported by strong physical evidence.
Making reliable predictions in the presence of uncertainty is critical to high-consequence modeling and simulation activities, such as those encountered at Sandia National Laboratories. Surrogate or reduced-order models are often used to mitigate the expense of performing quality uncertainty analyses with high-fidelity, physics-based codes. However, phenomenological surrogate models do not always adhere to important physics and system properties. This project develops surrogate models that integrate physical theory with experimental data through a maximally-informative framework that accounts for the many uncertainties present in computational modeling problems. Correlations between relevant outputs are preserved through the use of multi-output or co-predictive surrogate models; known physical properties (specifically monotoncity) are also preserved; and unknown physics and phenomena are detected using a causal analysis. By endowing surrogate models with key properties of the physical system being studied, their predictive power is arguably enhanced, allowing for reliable simulations and analyses at a reduced computational cost.
The modern scientific process often involves the development of a predictive computational model. To improve its accuracy, a computational model can be calibrated to a set of experimental data. A variety of validation metrics can be used to quantify this process. Some of these metrics have direct physical interpretations and a history of use, while others, especially those for probabilistic data, are more difficult to interpret. In this work, a variety of validation metrics are used to quantify the accuracy of different calibration methods. Frequentist and Bayesian perspectives are used with both fixed effects and mixed-effects statistical models. Through a quantitative comparison of the resulting distributions, the most accurate calibration method can be selected. Two examples are included which compare the results of various validation metrics for different calibration methods. It is quantitatively shown that, in the presence of significant laboratory biases, a fixed effects calibration is significantly less accurate than a mixed-effects calibration. This is because the mixed-effects statistical model better characterizes the underlying parameter distributions than the fixed effects model. The results suggest that validation metrics can be used to select the most accurate calibration model for a particular empirical model with corresponding experimental data.
A class of sequential multiscale models investigated in this study consists of discrete dislocation dynamics (DDD) simulations and continuum strain gradient plasticity (SGP) models to simulate the size effect in plastic deformation of metallic micropillars. The high-fidelity DDD explicitly simulates the microstructural (dislocation) interactions. These simulations account for the effect of dislocation densities and their spatial distributions on plastic deformation. The continuum SGP captures the size-dependent plasticity in micropillars using two length parameters. The main challenge in predictive DDD-SGP multiscale modeling is selecting the proper constitutive relations for the SGP model, which is necessitated by the uncertainty in computational prediction due to DDD's microstructural randomness. This contribution addresses these challenges using a Bayesian learning and model selection framework. A family of SGP models with different fidelities and complexities is constructed using various constitutive relation assumptions. The parameters of the SGP models are then learned from a set of training data furnished by the DDD simulations of micropillars. Bayesian learning allows the assessment of the credibility of plastic deformation prediction by characterizing the microstructural variability and the uncertainty in training data. Additionally, the family of the possible SGP models is subjected to a Bayesian model selection to pick the model that adequately explains the DDD training data. The framework proposed in this study enables learning the physics-based multiscale model from uncertain observational data and determining the optimal computational model for predicting complex physical phenomena, i.e., size effect in plastic deformation of micropillars.
Computational modeling and simulation are paramount to modern science. Computational models often replace physical experiments that are prohibitively expensive, dangerous, or occur at extreme scales. Thus, it is critical that these models accurately represent and can be used as replacements for reality. This paper provides an analysis of metrics that may be used to determine the validity of a computational model. While some metrics have a direct physical meaning and a long history of use, others, especially those that compare probabilistic data, are more difficult to interpret. Furthermore, the process of model validation is often application-specific, making the procedure itself challenging and the results difficult to defend. We therefore provide guidance and recommendations as to which validation metric to use, as well as how to use and decipher the results. Furthermore an example is included that compares interpretations of various metrics and demonstrates the impact of model and experimental uncertainty on validation processes.
This paper describes versions of OPAL, the Occam-Plausibility Algorithm (Farrell et al. in J Comput Phys 295:189–208, 2015) in which the use of Bayesian model plausibilities is replaced with information-theoretic methods, such as the Akaike information criterion and the Bayesian information criterion. Applications to complex systems of coarse-grained molecular models approximating atomistic models of polyethylene materials are described. All of these model selection methods take into account uncertainties in the model, the observational data, the model parameters, and the predicted quantities of interest. A comparison of the models chosen by Bayesian model selection criteria and those chosen by the information-theoretic criteria is given.