Publications Search

Yet Another Discriminant Analysis (YADA): A Probabilistic Model for Machine Learning Applications

Mathematics

Field, Richard V.; Smith, Michael R.; Wuest, Ellery J.; Ingram, Joe B.

This paper presents a probabilistic model for various machine learning (ML) applications. While deep learning (DL) has produced state-of-the-art results in many domains, DL models are complex and over-parameterized, which leads to high uncertainty about what the model has learned and about its decision process. Further, DL models are not probabilistic, which makes reasoning about their output challenging. In contrast, the proposed model, referred to as Yet Another Discriminant Analysis (YADA), is less complex than other methods, rests on a mathematically rigorous foundation, and can be used for a wide variety of ML tasks including classification, explainability, and uncertainty quantification. YADA is thus competitive in most cases with many state-of-the-art DL models. Ideally, a probabilistic model would represent the full joint probability distribution of its features, but doing so is often computationally expensive or intractable. Hence, many probabilistic models assume that the features are normally distributed, mutually independent, or both, which can severely limit their performance. YADA is an intermediate model that (1) captures the marginal distribution of each variable and the pairwise correlations between variables and (2) explicitly maps features to the space of multivariate Gaussian variables. Numerous mathematical properties of the YADA model can be derived, thereby improving the theoretical underpinnings of ML. The model can be statistically validated on new or held-out data using native properties of YADA. However, some engineering and practical challenges remain, and we enumerate those that must be addressed to make YADA more useful.
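
The two ingredients named in the abstract, per-feature marginals and pairwise correlations in a Gaussian space, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a rank-based empirical CDF for each marginal and a plain Pearson correlation of the transformed scores, and the helper names `to_gaussian_scores` and `correlation` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def to_gaussian_scores(column):
    """Map one feature column to standard-normal scores via its empirical CDF.

    Assumption: the marginal is estimated from ranks, which is only one of
    several ways to estimate a marginal distribution.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the probability strictly inside (0, 1).
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic, clearly non-Gaussian features (illustrative data only).
random.seed(0)
x1 = [random.expovariate(1.0) for _ in range(2000)]
x2 = [v ** 2 + 0.5 * random.random() for v in x1]

# Step 1: map each feature to Gaussian space through its own marginal.
z1 = to_gaussian_scores(x1)
z2 = to_gaussian_scores(x2)

# Step 2: summarize the dependence by the correlation of the Gaussian scores.
rho = correlation(z1, z2)
print(f"correlation in Gaussian space: {rho:.3f}")
```

In this sketch the marginals carry the shape of each feature while a single correlation coefficient per pair carries the dependence, which is the sense in which such a model sits between full independence and a full joint distribution.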

TYPE Journal Article YEAR 2024

Geometric Measures of Trustworthiness for Machine Learning Predictions

Smith, Michael R.; Datta, Esha; Field, Richard V.; Ingram, Joe B.; Domschot, Eva; Wuest, Ellery J.; Strnadova-Neeley, Veronika

This report details the findings from the research and investigation of Geometric Measures of Trustworthiness for Machine Learning Predictions. We explored the trustworthiness of machine learning (ML) models' predictions using geometric measures to quantify the similarity of a query point to the training data. Predictive uncertainty in ML can originate from at least three sources: (1) model uncertainty, which represents the uncertainty in the model form (e.g., decision tree vs. neural network) and in estimating the model parameters from the training data; (2) data uncertainty, which represents the natural complexities of the data, such as class overlap and inherent noise; and (3) distributional uncertainty, which represents the mismatch between the training and operational distributions. The proposed measures focus on quantifying and explaining the data and distributional uncertainties by examining the relationships between operational data and the training data.
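
As a rough illustration of what a geometric similarity measure between a query point and the training data can look like, the sketch below uses the distance to the k-th nearest training point. This is an assumed stand-in chosen for simplicity, not one of the measures proposed in the report; the function name `kth_neighbor_distance` and the toy data are hypothetical.

```python
import math


def kth_neighbor_distance(query, training_points, k=3):
    """Euclidean distance from `query` to its k-th nearest training point.

    A small value suggests the query lies in a region covered by the
    training data; a large value suggests a distributional mismatch.
    """
    distances = sorted(math.dist(query, p) for p in training_points)
    return distances[k - 1]


# Toy training set clustered near the origin (illustrative data only).
training = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.0)]

near = kth_neighbor_distance((0.1, 0.1), training)
far = kth_neighbor_distance((5.0, 5.0), training)
print(f"in-distribution query: {near:.3f}, out-of-distribution query: {far:.3f}")
```

A prediction for the second query would deserve less trust under this measure, since the query sits far from every training point.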

TYPE LDRD Report YEAR 2024
