Publications

Results 1–50 of 64

Exploring Explicit Uncertainty for Binary Analysis (EUBA)

Leger, Michelle A.; Darling, Michael C.; Jones, Stephen T.; Matzen, Laura E.; Stracuzzi, David J.; Wilson, Andrew T.; Bueno, Denis B.; Christensen, Matthew C.; Ginaldi, Melissa J.; Hannasch, David A.; Heidbrink, Scott H.; Howell, Breannan C.; Leger, Chris; Reedy, Geoffrey E.; Rogers, Alisa N.; Williams, Jack A.

Reverse engineering (RE) analysts struggle to address critical questions about the safety of binary code accurately and promptly, and the program analysis tools that support them are sometimes simply wrong. These tools must approximate in order to provide any information at all, which introduces uncertainty into their results, and those uncertainties compound as one analysis feeds into the next. We hypothesize that exposing the sources, impacts, and control of uncertainty to human binary analysts will allow them to approach their hardest problems with high-powered analytic techniques that they know when to trust. Combining expertise in binary analysis algorithms, human cognition, uncertainty quantification, verification and validation, and visualization, we pursue research that should benefit binary software analysis efforts across the board. We find a strong analogy between RE and exploratory data analysis (EDA); we begin to characterize the sources and types of uncertainty found in RE practice, both in the process itself and in its supporting analyses; we explore a domain-specific focus on uncertainty in pointer analysis, showing that more precise models help analysts answer small information-flow questions faster and more accurately; and we test a general population on domain-general sudoku problems, showing that adding "knobs" to an analysis does not significantly slow performance. This document describes our explorations of uncertainty in binary analysis.

AI-Enhanced Co-Design for Next-Generation Microelectronics: Innovating Innovation (Workshop Report)

Descour, Michael R.; Tsao, Jeffrey Y.; Stracuzzi, David J.; Wakeland, Anna K.; Schultz, David R.; Smith, William S.; Weeks, Jacquilyn A.

On April 6-8, 2021, Sandia National Laboratories hosted a virtual workshop to explore the potential for developing AI-Enhanced Co-Design for Next-Generation Microelectronics (AICoM). The workshop brought together two themes. The first theme was articulated in the 2018 Department of Energy Office of Science (DOE SC) “Basic Research Needs for Microelectronics” (BRN) report, which called for a “fundamental rethinking” of the traditional design approach to microelectronics, in which subject matter experts (SMEs) in each microelectronics discipline (materials, devices, circuits, algorithms, etc.) work near-independently. Instead, the BRN called for a non-hierarchical, egalitarian vision of co-design, wherein “each scientific discipline informs and engages the others” in “parallel but intimately networked efforts to create radically new capabilities.” The second theme was the recognition of the continuing breakthroughs in artificial intelligence (AI) that are currently enhancing and accelerating the solution of traditional design problems in materials science, circuit design, and electronic design automation (EDA).

Robustness and Validation of Model and Digital Twins Deployment

Volkova, Svitlana V.; Stracuzzi, David J.; Shafer, Jenifer S.; Ray, Jaideep R.; Pullum, Laura P.

For digital twins (DTs) to become a central fixture in mission-critical systems, we need a better understanding of their potential failure modes, the ability to quantify their uncertainty, and the ability to explain their behavior. These aspects are particularly important because the performance of a digital twin will evolve during model development and deployment for real-world operations.

Generating uncertainty distributions for seismic signal onset times

Bulletin of the Seismological Society of America

Peterson, Matthew G.; Vollmer, Charles V.; Brogan, Ronald; Stracuzzi, David J.; Young, Christopher J.

Signal arrival-time estimation plays a critical role in a variety of downstream seismic analyses, including location estimation and source characterization. Any arrival-time errors propagate through subsequent data-processing results. In this article, we detail a general framework for refining estimated seismic signal arrival times along with full estimation of their associated uncertainty. Using the standard short-term average/long-term average (STA/LTA) threshold algorithm to identify a search window, we demonstrate how to refine the pick estimate through two different approaches. In both cases, new waveform realizations are generated through bootstrap algorithms to produce full a posteriori estimates of the uncertainty in the seismic signal's onset arrival time. These onset arrival uncertainty estimates provide additional data-derived information from the signal and have the potential to influence seismic analysis along several fronts.
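
The pick-then-bootstrap procedure is easy to sketch. The Python snippet below is a minimal illustration of the idea, not the authors' implementation: it uses a toy AIC-style changepoint picker on a synthetic trace, and a simple noise-resampling bootstrap stands in for the paper's waveform-realization schemes. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sta_lta(x, n_sta=50, n_lta=500):
    """Short-term average / long-term average ratio of signal energy."""
    e = x ** 2
    sta = np.convolve(e, np.ones(n_sta) / n_sta, mode="same")
    lta = np.convolve(e, np.ones(n_lta) / n_lta, mode="same")
    return sta / np.maximum(lta, 1e-12)

def aic_pick(x):
    """Toy AIC-style changepoint pick: the onset is the split point that
    best separates low-variance noise from higher-variance signal."""
    n = len(x)
    k = np.arange(20, n - 20)
    aic = [i * np.log(np.var(x[:i]) + 1e-12) +
           (n - i) * np.log(np.var(x[i:]) + 1e-12) for i in k]
    return k[int(np.argmin(aic))]

# Synthetic trace: background noise, then a higher-variance signal at sample 3000.
x = rng.normal(0, 1.0, 6000)
x[3000:] += rng.normal(0, 4.0, 3000)

# 1. Use the STA/LTA trigger to locate a coarse search window.
trigger = int(np.argmax(sta_lta(x) > 3.0))
window = x[max(trigger - 500, 0): trigger + 500]

# 2. Bootstrap: generate new waveform realizations by resampling the
#    pre-onset noise level, re-pick each one, and collect the picks into
#    an empirical (a posteriori) distribution over onset time.
noise_scale = np.std(window[:200])  # pre-onset noise estimate
picks = []
for _ in range(200):
    realization = window + rng.normal(0, noise_scale, len(window))
    picks.append(aic_pick(realization))
picks = np.asarray(picks)
print(f"onset sample in window: {picks.mean():.1f} +/- {picks.std():.1f}")
```

The spread of the re-picked onsets is the empirical uncertainty the abstract refers to; a production workflow would replace the toy `aic_pick` with the paper's two refinement approaches.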

Quantifying Uncertainty to Improve Decision Making in Machine Learning

Stracuzzi, David J.; Darling, Michael C.; Peterson, Matthew G.; Chen, Maximillian G.

Data-driven modeling, including machine learning methods, continues to play an increasing role in society. Data-driven methods impact decision making for applications ranging from everyday determinations about which news people see and control of self-driving cars to high-consequence national security situations related to cyber security and analysis of nuclear weapons reliability. Although modern machine learning methods have made great strides in model induction and show excellent performance in a broad variety of complex domains, uncertainty remains an inherent aspect of any data-driven model. In this report, we provide an update to the preliminary results on uncertainty quantification for machine learning presented in SAND2017-6776. Specifically, we improve upon the general problem definition and expand upon the experiments conducted for the earlier report. Most importantly, we summarize key lessons learned about how and when uncertainty quantification can inform decision making and provide valuable insights into the quality of learned models and potential improvements to them.

Acknowledgements: The authors thank Kristina Czuchlewski, John Feddema, Todd Jones, Chris Young, Rudy Garcia, Rich Field, Ann Speed, Randy Brost, Stephen Dauphin, and countless others for providing helpful discussion and comments throughout the life of this project. This work was funded by the Sandia National Laboratories Laboratory Directed Research and Development (LDRD) program.
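
As one concrete example of the kind of decision support the report describes, here is a hedged Python sketch: an ensemble classifier's predictive entropy is used to decide which predictions are safe to act on automatically and which should be deferred to an analyst. The dataset, model, and threshold are illustrative assumptions, not the report's actual experiments.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Predictive class distribution; its entropy measures how much the ensemble's
# trees disagree -- one simple proxy for model uncertainty.
proba = model.predict_proba(X_te)
entropy = -np.sum(proba * np.log(np.clip(proba, 1e-12, None)), axis=1)

# Decision rule: act on confident predictions, defer the rest to an analyst.
threshold = 0.4  # illustrative value; tune for the application at hand
confident = entropy < threshold
acc_confident = (model.predict(X_te)[confident] == y_te[confident]).mean()
print(f"auto-decided {confident.mean():.0%} of cases, "
      f"accuracy on those: {acc_confident:.3f}")
```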

Preliminary Results on Applying Nonparametric Clustering and Bayesian Consensus Clustering Methods to Multimodal Data

Chen, Maximillian G.; Darling, Michael C.; Stracuzzi, David J.

In this report, we present preliminary research into nonparametric clustering methods for multi-source imagery data and into quantifying the performance of these models. In many domain areas, data sets do not necessarily follow well-defined and well-known probability distributions, such as the normal, gamma, and exponential. This is especially true when combining data from multiple sources describing a common set of objects (which we call multimodal analysis), where the data in each source can follow different distributions and must be analyzed in conjunction with one another. This necessitates nonparametric density estimation methods, which allow the data to better dictate the shape of the distribution. One prominent example of multimodal analysis is multimodal image analysis, in which we analyze multiple images of the same scene of interest taken using different radar systems. We develop methods that exploit the uncertainty information inherent in probabilistic models, but often not taken advantage of, to assess the performance of probabilistic clustering methods used for analyzing multimodal images. This added information helps assess model performance and how much trust decision-makers should place in the obtained analysis results. The developed methods illustrate some ways in which uncertainty can inform decisions that arise when designing and using machine learning models.

Acknowledgements: This work was funded by the Sandia National Laboratories Laboratory Directed Research and Development (LDRD) program.
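
The report's specific models are not reproduced here, but scikit-learn's `BayesianGaussianMixture` with a Dirichlet-process prior is a compact stand-in for the kind of nonparametric clustering described, and its posterior assignment probabilities yield exactly the sort of per-point uncertainty the abstract emphasizes. The data and hyperparameters below are illustrative.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import BayesianGaussianMixture

X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.5, random_state=0)

# Truncated Dirichlet-process mixture: the model is given more components
# than it needs and prunes the extras, letting the data dictate cluster count.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)

# Posterior responsibilities P(cluster | point): their entropy flags points
# whose assignment is ambiguous -- a direct, data-driven uncertainty measure.
resp = dpgmm.predict_proba(X)
entropy = -np.sum(resp * np.log(np.clip(resp, 1e-12, None)), axis=1)
print("effective clusters:", int(np.sum(dpgmm.weights_ > 0.01)))
print("most ambiguous points:", np.argsort(entropy)[-5:])
```

In a multimodal setting, the same assignment-entropy diagnostic can be computed per source and compared across sources to see where the modalities agree or conflict.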

Data-driven uncertainty quantification for multisensor analytics

Proceedings of SPIE - The International Society for Optical Engineering

Stracuzzi, David J.; Darling, Michael C.; Chen, Maximillian G.; Peterson, Matthew G.

We discuss uncertainty quantification in multisensor data integration and analysis, including estimation methods and the role of uncertainty in decision making and trust in automated analytics. The challenges associated with automatically aggregating information across multiple images, identifying subtle contextual cues, and detecting small changes in noisy activity patterns are well-established in the intelligence, surveillance, and reconnaissance (ISR) community. In practice, such questions cannot be adequately addressed with discrete counting, hard classifications, or yes/no answers. For a variety of reasons ranging from data quality to modeling assumptions to inadequate definitions of what constitutes "interesting" activity, variability is inherent in the output of automated analytics, yet it is rarely reported. Consideration of these uncertainties can provide nuance to automated analyses and engender trust in their results. In this work, we assert the importance of uncertainty quantification for automated data analytics and outline a research agenda. We begin by defining uncertainty in the context of machine learning and statistical data analysis, identify its sources, and motivate the importance and impact of its quantification. We then illustrate these issues and discuss methods for data-driven uncertainty quantification in the context of a multi-source image analysis example. We conclude by identifying several specific research issues and by discussing the potential long-term implications of uncertainty quantification for data analytics, including sensor tasking and analyst trust in automated analytics.
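
To make the paper's "report variability, not hard counts" argument concrete, here is a small hedged sketch: a bootstrap over simulated detector scores turns a single thresholded event count into an interval. The scores, threshold, and distributions are invented for illustration; in a real pipeline the scores would come from an automated analytic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy detector scores for 300 observed events across several sensors.
scores = np.concatenate([rng.beta(2, 5, 250),   # mostly background
                         rng.beta(5, 2, 50)])   # some genuinely interesting

# A hard threshold gives a single, overconfident count...
count = int(np.sum(scores > 0.5))

# ...while bootstrapping the events yields a distribution over that count,
# the kind of variability the paper argues analytics should report.
boot = [np.sum(rng.choice(scores, size=len(scores), replace=True) > 0.5)
        for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"events above threshold: {count} "
      f"(95% bootstrap interval: {lo:.0f}-{hi:.0f})")
```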
