LDRD Report

Extending Parsimonious Bayesian Inference

Duersch, Jed A.

Parsimonious Bayesian inference is a theoretical framework for efficient data assimilation that seeks to balance increased consistency between predictions and training data against corresponding increases in model complexity. Within this framework, over-training is understood as optimization that encodes excessive information within model parameters while achieving only small improvements in agreement between predictions and training data. This project aims to develop practical methods of limiting excess model information during optimization. One key observation is that practical heuristics for parsimonious learning in high dimensions must balance expressivity, i.e., the ability of the model to capture diverse predictions with only a few nonzero parameters, against discoverability, i.e., the ability to train the model with gradient-based optimization and drive parameters to low-information states. Accordingly, we developed logical activation functions that adaptively approximate arbitrary truth tables defining Boolean logic operations within a probabilistic framework. These functions have demonstrated the ability to learn exclusive disjunction (XOR) and conditioned disjunction (if [condition] then [result_if_true] else [result_if_false]) within a single layer of a neural network. Efficiently exploiting these activation functions to drive parsimonious learning required several other advances within the domain of variational inference. The most efficient form of complexity suppression is structured sparsification, which drives most model parameters to zero while achieving the structural coherence among the nonzeros needed for bandwidth reduction. Such models are not only far more efficient at suppressing information-theoretic complexity; they also reduce other forms of complexity: computation, communication, storage, and the number of dependencies needed to evaluate predictions. To support enhanced sparsification, this project examined new approaches to high-dimensional variational inference that allow us to calibrate and control parameter uncertainty during optimization. By identifying which parameters can sustain sparsifying perturbations with little impact on prediction quality, we can frame pruning as approximate Bayesian inference and thereby develop better pruning strategies. These advances also open paths to mitigate concerns with deploying advanced learning methods in resource-constrained environments, such as running models on power-limited or communication-limited devices.
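One concrete reading of the trade-off stated at the start of the abstract, assuming the standard variational free-energy objective (our notation; the report may use a different formulation):

\[
\mathcal{L}(q) \;=\; \mathbb{E}_{q(\theta)}\!\left[-\log p(\mathcal{D}\mid\theta)\right] \;+\; \mathrm{KL}\!\left(q(\theta)\,\middle\|\,p(\theta)\right),
\]

where the expected negative log-likelihood measures misfit between predictions and training data \(\mathcal{D}\), and the KL divergence measures the information the posterior \(q(\theta)\) encodes in the parameters beyond the prior \(p(\theta)\). Over-training then corresponds to optimization steps that grow the KL term substantially while shrinking the misfit term only slightly.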
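The logical activation functions are described above only at a high level; the following is a minimal sketch, in PyTorch, of one way a probabilistic truth-table gate could be realized. The class name LogicalGate, the sigmoid parameterization of the truth table, and the training loop are our illustrative assumptions, not the report's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LogicalGate(nn.Module):
    """Learnable two-input probabilistic gate (illustrative sketch).

    Inputs p and q in [0, 1] are read as probabilities of truth. Four
    logits parameterize a truth table t in (0, 1)^4 over the input
    combinations (0,0), (0,1), (1,0), (1,1); the output is the expected
    truth value of the gate under independent Bernoulli inputs. A
    three-input table with 8 entries covers conditioned disjunction
    (if c then a else b) in the same way.
    """
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(4))

    def forward(self, p, q):
        t = torch.sigmoid(self.logits)
        # Probability of each input combination under independence.
        w = torch.stack(
            [(1 - p) * (1 - q), (1 - p) * q, p * (1 - q), p * q], dim=-1
        )
        return (w * t).sum(dim=-1)

# A single gate -- one "layer" -- learns XOR by gradient descent.
gate = LogicalGate()
opt = torch.optim.Adam(gate.parameters(), lr=0.1)
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([0., 1., 1., 0.])  # exclusive disjunction
for _ in range(500):
    opt.zero_grad()
    loss = F.binary_cross_entropy(gate(x[:, 0], x[:, 1]), y)
    loss.backward()
    opt.step()
# The learned table approaches (0, 1, 1, 0), i.e., XOR.

Because the output is a convex-combination readout of the table, each corner of the input cube isolates one table entry, which illustrates the discoverability property the abstract emphasizes: gradients reach every entry directly.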
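The abstract frames pruning as approximate Bayesian inference: parameters whose posterior uncertainty comfortably covers zero can sustain a sparsifying perturbation with little impact on predictions. A common heuristic expression of this idea is a signal-to-noise-ratio rule over a factorized Gaussian posterior. The NumPy sketch below is our illustration of that heuristic, not the report's algorithm; the function snr_prune_mask, its threshold, and the grouping scheme are assumptions. The structured variant scores whole rows together so the surviving nonzeros stay coherent.

import numpy as np

def snr_prune_mask(mu, log_sigma, threshold=1.0, group_axis=None):
    """Sparsification mask from a factorized Gaussian posterior N(mu, sigma^2).

    Parameters whose signal-to-noise ratio |mu| / sigma falls below
    `threshold` can absorb a sparsifying perturbation, so they are pruned.
    With `group_axis` set, whole slices (e.g., a neuron's fan-in row) are
    scored by their root-mean-square SNR and kept or dropped together,
    yielding structured sparsity. Illustrative heuristic only.
    """
    sigma = np.exp(log_sigma)
    snr = np.abs(mu) / sigma
    if group_axis is not None:
        score = np.sqrt((snr ** 2).mean(axis=group_axis, keepdims=True))
        snr = np.broadcast_to(score, snr.shape)
    return snr >= threshold

rng = np.random.default_rng(0)
mu = rng.normal(size=(64, 128))
log_sigma = rng.normal(loc=-2.0, size=(64, 128))
mask = snr_prune_mask(mu, log_sigma, threshold=1.0, group_axis=1)  # score rows
sparse_mu = mu * mask  # zeroed rows correspond to removable neurons

Dropping whole rows rather than scattered entries is what connects sparsification to the abstract's other complexity measures: an entire neuron's computation, communication, and storage disappear with its row.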