Publications / SAND Report

Preliminary Results on Applying Nonparametric Clustering and Bayesian Consensus Clustering Methods to Multimodal Data

Chen, Maximillian G.; Darling, Michael C.; Stracuzzi, David J.

In this report, we present preliminary research into nonparametric clustering methods for multi-source imagery data and quantifying the performance of these models. In many domain areas, data sets do not necessarily follow well-defined and well-known probability distributions, such as the normal, gamma, and exponential. This is especially true when combining data from multiple sources describing a common set of objects (which we call multimodal analysis), where the data in each source can follow different distributions and need to be analyzed in conjunction with one another. This necessitates nonparametric den- sity estimation methods, which allow the data to better dictate the distribution of the data. One prominent example of multimodal analysis is multimodal image analysis, when we an- alyze multiple images taken using different radar systems of the same scene of interest. We develop uncertainty analysis methods, which are inherent in the use of probabilistic models but often not taken advance of, to assess the performance of probabilistic clustering methods used for analyzing multimodal images. This added information helps assess model perfor- mance and how much trust decision-makers should have in the obtained analysis results. The developed methods illustrate some ways in which uncertainty can inform decisions that arise when designing and using machine learning models. Acknowledgements This work was funded by the Sandia National Laboratories Laboratory Directed Research and Development (LDRD) program.