Publications

8 Results

Search results

The Multiple Instance Learning Gaussian Process Probit Model

Proceedings of Machine Learning Research

Wang, Fulton W.; Pinar, Ali P.

In the Multiple Instance Learning (MIL) scenario, the training data consists of instances grouped into bags. Bag labels specify whether each bag contains at least one positive instance, but instance labels are not observed. Recently, Haußmann et al. [10] tackled the MIL instance label prediction task by introducing the Multiple Instance Learning Gaussian Process Logistic (MIL-GP-Logistic) model, an adaptation of the Gaussian Process Logistic Classification model that inherits its uncertainty quantification and flexibility. Notably, they give a fast mean-field variational inference procedure. However, due to their use of the logit link, they do not maximize the variational inference ELBO objective directly, but rather a lower bound on it. This approximation, as we show, hurts predictive performance. In this work, we propose the Multiple Instance Learning Gaussian Process Probit (MIL-GP-Probit) model, an adaptation of the Gaussian Process Probit Classification model to solve the MIL instance label prediction problem. Leveraging the analytical tractability of the probit link, we give a variational inference procedure based on variable augmentation that maximizes the ELBO objective directly. Applying it, we show MIL-GP-Probit is better calibrated than MIL-GP-Logistic on all 20 datasets of the benchmark 20 Newsgroups dataset collection, and achieves higher AUC than MIL-GP-Logistic on an additional 51 out of 59 datasets. Finally, we show how the probit formulation enables principled bag label predictions and a Gibbs sampling scheme. This is the first exact inference scheme for any Bayesian model for the MIL scenario.
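
The bag-labeling rule and the probit link described in the abstract can be illustrated with a minimal Python sketch. The function names are hypothetical, and the bag-probability formula assumes that instance labels are independent given fixed latent values; this is an illustrative simplification and not necessarily the formulation used in the paper.

```python
import math


def probit(f):
    """Standard normal CDF, i.e. the probit link applied to a latent value f."""
    return 0.5 * (1.0 + math.erf(f / math.sqrt(2.0)))


def bag_label(instance_labels):
    """MIL rule: a bag is positive iff it contains at least one positive instance."""
    return int(any(instance_labels))


def bag_positive_probability(latent_values):
    """P(bag is positive) given latent values, one per instance.

    Assumption (not from the abstract): instance labels are conditionally
    independent given the latent values, so
    P(bag positive) = 1 - prod_i (1 - Phi(f_i)).
    """
    prob_all_negative = 1.0
    for f in latent_values:
        prob_all_negative *= 1.0 - probit(f)
    return 1.0 - prob_all_negative


if __name__ == "__main__":
    print(bag_label([0, 0, 1]))
    print(bag_positive_probability([0.0, 0.0]))
```

With two instances whose latent values are both zero, each instance is positive with probability 0.5, so the bag is positive with probability 0.75 under this independence assumption.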

Developing an Active Learning algorithm for learning Bayesian classifiers under the Multiple Instance Learning scenario

Wang, Fulton W.; Pinar, Ali P.

In the Multiple Instance Learning scenario, the training data consists of instances grouped into bags, and each bag is labelled with whether it is positive, i.e. contains at least one positive instance. First, Active Learning, in which additional labels can be iteratively requested, has the potential to allow more accurate classifiers to be learned with fewer labels. Active Learning has been applied to Multiple Instance Learning under two settings: when bag labels of unlabelled bags can be requested, and when instance labels within bags known to be positive can be requested. Second, Bayesian Active Learning methods have the potential to learn accurate classifiers with few labels, because they explicitly track the classifier uncertainty and can thus address its knowledge gaps. Yet, there does not exist any Bayesian Active Learning method for the Multiple Instance Learning scenario. In this work, we develop the first such method. We develop a Bayesian classifier for the Multiple Instance Learning scenario, show how it can be efficiently used for Bayesian Active Learning, and perform experiments assessing its performance. While its performance exceeds that obtained when no Active Learning is used, it is sometimes better and sometimes worse than the naive baseline of uncertainty sampling, depending on the situation. This suggests future work: building more customizable Bayesian Active Learning methods for the Multiple Instance Learning scenario, customizable to whether bag or instance label accuracy is targeted and to the labeling budget.
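
The uncertainty sampling baseline mentioned in the abstract can be sketched in a few lines of Python. This is a generic illustration for a binary classifier, not the authors' implementation; the function name and the input format (a list of predicted positive-class probabilities for the unlabelled candidates) are assumptions.

```python
def uncertainty_sampling(probabilities):
    """Return the index of the candidate whose predicted probability is
    closest to 0.5, i.e. the one the classifier is least certain about.

    `probabilities` is assumed to hold P(label = positive) for each
    unlabelled candidate (a bag or an instance, depending on the setting).
    Ties are broken in favour of the earliest candidate.
    """
    if not probabilities:
        raise ValueError("no candidates to choose from")
    return min(range(len(probabilities)),
               key=lambda i: abs(probabilities[i] - 0.5))


if __name__ == "__main__":
    # The second candidate (index 1) is the most uncertain one.
    print(uncertainty_sampling([0.9, 0.55, 0.1]))
```

In an active learning loop, the selected candidate would be sent for labelling, the classifier retrained, and the probabilities recomputed before the next request.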

A Multi-Instance learning Framework for Seismic Detectors

Ray, Jaideep R.; Wang, Fulton W.; Young, Christopher J.

In this report, we construct and test a framework for fusing the predictions of an ensemble of seismic wave detectors. The framework is drawn from multi-instance learning and is meant to improve the predictive skill of the ensemble beyond that of the individual detectors. We show how the framework allows the use of multiple features derived from the seismogram to detect seismic wave arrivals, as well as how it allows only the most informative features to be retained in the ensemble. The computational cost of the "ensembling" method is linear in the size of the ensemble, allowing a scalable method for monitoring multiple features/transformations of a seismogram. The framework is tested on teleseismic and regional P-wave arrivals at the IMS (International Monitoring System) station in Warramunga, NT, Australia and the PNSU station in the University of Utah's monitoring network.

Modeling Complex Relationships in Large-Scale Data using Hypergraphs (LDRD Final Report)

Dunlavy, Daniel D.; Wang, Fulton W.; Wolf, Michael W.; Ellingwood, Nathan D.

This SAND report documents the findings of the LDRD project, "Modeling Complex Relationships in Large-Scale Data using Hypergraphs". The project ran from October 2017 through September 2019. The focus of the project was the development and application of hypergraph data analytics to Sandia relational data applications. In this project, we attempted to apply a hypergraph data analysis method, specifically hypergraph eigenvector centrality, to Sandia mission problems to identify influential entities (people, locations, times, etc.) in the data. Unfortunately, the application data led to graph and hypergraph representations containing disconnected components. To date, there are no well-established techniques for applying eigenvector centrality to such graphs and hypergraphs. In this report, we present several heuristics for computing eigenvector centrality for disconnected graphs. We believe this is an important start to understanding how to approach the analogous problem for hypergraphs, but this project concluded before we made progress on that problem. The ideas, methods, and suggestions presented here can be used for further research into this challenging problem. We also present our ideas for generating graphs with known degree and centrality distributions. The goal in presenting this work is to identify a procedure for analyzing such graphs once the problem of disconnected components has been addressed. When working with a single data set, this generator can be used to create many instances of graphs that can be used to analyze the robustness of the centrality computations for the original data set. Although the results did not match perfectly in the case of the Facebook Ego dataset used in the experiments presented here, this again represents a good start in the direction of a graph generator for such problems. We note that there are potential trade-offs between how the degree and centrality distributions are fit to the original data, and we suggest several possible avenues for follow-on research efforts.
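
To make the disconnected-graph difficulty concrete, the following Python sketch shows one simple heuristic: compute eigenvector centrality separately within each connected component and weight each component's scores by its share of the nodes. This is an illustrative assumption only; the abstract does not specify the report's heuristics, and this sketch is not claimed to be one of them. The function names, the adjacency-dict input (assumed undirected, so every edge appears in both directions), and the use of power iteration on A + I (to avoid oscillation on bipartite components) are all choices made for this example.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an undirected graph given as
    {node: [neighbours]}, each component as a sorted list of nodes."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(sorted(comp))
    return components


def component_centrality(adj, nodes, iters=200):
    """Power iteration on (A + I) restricted to one connected component.
    Scores are normalised to sum to 1 within the component."""
    score = {v: 1.0 for v in nodes}
    for _ in range(iters):
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in nodes}
        total = sum(new.values())
        score = {v: s / total for v, s in new.items()}
    return score


def disconnected_eigenvector_centrality(adj):
    """Hypothetical heuristic: per-component eigenvector centrality,
    scaled by the fraction of all nodes that the component contains."""
    n = len(adj)
    result = {}
    for comp in connected_components(adj):
        for v, w in component_centrality(adj, comp).items():
            result[v] = w * len(comp) / n
    return result


if __name__ == "__main__":
    # Three components: an edge, a 3-node path, and an isolated node.
    graph = {0: [1], 1: [0], 2: [3], 3: [2, 4], 4: [3], 5: []}
    for node, value in sorted(disconnected_eigenvector_centrality(graph).items()):
        print(node, round(value, 4))
```

Weighting by component size is only one of many possible choices; it keeps the scores summing to 1 overall but makes cross-component comparisons depend entirely on that weighting, which is part of why the problem is hard.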
