Publications

9 Results

Test and Evaluation of Systems with Embedded Machine Learning Components

ITEA Journal of Test and Evaluation

Smith, Michael R.; Cueller, Christopher R.; Jose, Deepu J.; Ingram, Joey; Martinez, Carianne M.; Debonis, Mark

As Machine Learning (ML) continues to advance, it is being integrated into more systems. Often, the ML component represents a significant portion of the system, reducing the burden on the end user or significantly improving task performance. However, the ML component models a complex, unknown phenomenon that is learned from collected data rather than explicitly programmed. Despite the improvement in task performance, the resulting models are often black boxes. Evaluating the credibility and vulnerabilities of ML models thus represents a gap in current test and evaluation practice. For high-consequence applications, the lack of testing and evaluation procedures is a significant source of uncertainty and risk. To help reduce that risk, we present considerations for evaluating systems with an embedded ML component within a red-teaming-inspired methodology. We focus on (1) cyber vulnerabilities of an ML model, (2) evaluating performance gaps, and (3) adversarial ML vulnerabilities.
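To make the third focus area concrete, the following is a minimal, illustrative sketch (not the paper's methodology) of the kind of adversarial-ML vulnerability a red-team evaluation might probe: a fast-gradient-sign perturbation against a toy logistic-regression model. The model weights, inputs, and epsilon here are invented for demonstration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Perturb input x by eps in the direction that increases the loss
    (fast gradient sign method) for a logistic-regression model."""
    p = sigmoid(w @ x + b)   # model's predicted probability of the true class
    grad_x = (p - y) * w     # gradient of cross-entropy loss w.r.t. the input
    return x + eps * np.sign(grad_x)

# Toy model and input (illustrative values only)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.5, 0.2])
y = 1.0                      # true label

x_adv = fgsm_perturb(x, y, w, b, eps=0.3)
p_clean = sigmoid(w @ x + b)
p_adv = sigmoid(w @ x_adv + b)
# A successful perturbation lowers the model's confidence in the true label.
```

Even this two-feature example shows why adversarial robustness belongs in test and evaluation: a small, bounded input change can substantially shift a model's output.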

SAGE Intrusion Detection System: Sensitivity Analysis Guided Explainability for Machine Learning

Smith, Michael R.; Laros, James H.; Ames, Arlo L.; Carey, Alycia N.; Cueller, Christopher R.; Field, Richard V.; Maxfield, Trevor; Mitchell, Scott A.; Morris, Elizabeth S.; Moss, Blake C.; Nyre-Yu, Megan N.; Rushdi, Ahmad R.; Stites, Mallory C.; Smutz, Charles S.; Zhou, Xin Z.

This report details the results of a three-fold investigation of sensitivity analysis (SA) for machine learning (ML) explainability (MLE): (1) the mathematical assessment of the fidelity of an explanation with respect to a learned ML model, (2) quantifying the trustworthiness of a prediction, and (3) the impact of MLE on the efficiency of end users, assessed through multiple user studies. We focused on the cybersecurity domain because its data is inherently non-intuitive. As ML is used in an increasing number of domains, including domains where being wrong can carry high consequences, MLE has been proposed as a means of generating end-user trust in a learned ML model. However, little analysis has been performed to determine whether explanations accurately represent the target model and whether the explanations themselves should be trusted beyond subjective inspection. Current state-of-the-art MLE techniques only provide a list of important features based on heuristic measures, and/or make assumptions about the data and the model that are not representative of real-world data and models. Further, most are designed without considering their usefulness to an end user in a broader context. To address these issues, we present a notion of explanation fidelity based on Shapley values from cooperative game theory. We find that all of the investigated MLE methods produce explanations that are incongruent with the ML model being explained, because, for computational reasons, they make critical assumptions of feature independence and linear feature interactions. We also find that in deployed settings, explanations are rarely used, for a variety of reasons: several other tools are trusted more than the explanations, and there is little incentive to use them.
In the cases where explanations are used, we found a danger that explanations persuade end users to wrongly accept false positives and false negatives. However, ML model developers and maintainers find the explanations more useful for ensuring that the ML model does not have obvious biases. In light of these findings, we suggest a number of future directions, including developing MLE methods that directly model non-linear feature interactions, and adopting design principles that account for the usefulness of explanations to the end user. We also augment explanations with a set of trustworthiness measures that quantify geometric aspects of the data to determine whether the model output should be trusted.
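For readers unfamiliar with Shapley values, the following hedged sketch (illustrative only, not the report's implementation) computes exact Shapley attributions for a tiny model by enumerating all feature coalitions, replacing absent features with a baseline value. The toy model includes an interaction term precisely because, as the abstract notes, methods that assume feature independence and linearity mis-handle such interactions; exact enumeration is exponential in the number of features, which is why practical MLE tools approximate.

```python
import itertools
import math

def shapley_values(model, x, baseline):
    """Exact Shapley values by coalition enumeration (feasible only for
    small feature counts). Absent features take their baseline value."""
    n = len(x)
    phi = [0.0] * n
    features = list(range(n))
    for i in features:
        others = [j for j in features if j != i]
        for r in range(len(others) + 1):
            for S in itertools.combinations(others, r):
                # Standard Shapley coalition weight |S|! (n - |S| - 1)! / n!
                weight = (math.factorial(len(S)) *
                          math.factorial(n - len(S) - 1) / math.factorial(n))
                with_i = [x[j] if (j in S or j == i) else baseline[j]
                          for j in features]
                without_i = [x[j] if j in S else baseline[j]
                             for j in features]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

# Toy nonlinear model with an interaction term (invented for illustration).
model = lambda v: v[0] + 2 * v[1] + v[0] * v[1]
phi = shapley_values(model, x=[1.0, 1.0], baseline=[0.0, 0.0])
# Efficiency property: attributions sum to model(x) - model(baseline).
```

The interaction term's contribution is split evenly between the two features, something a purely linear attribution cannot represent.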

Creating a User-Centric Data Flow Visualization: A Case Study

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Butler, Karin B.; Leger, Michelle A.; Bueno, Denis B.; Cueller, Christopher R.; Haass, Michael J.; Loffredo, Timothy; Reedy, Geoffrey E.; Tuminaro, Julian T.

Vulnerability analysts protecting software lack adequate tools for understanding data flow in binaries. We present a case study in which we used human factors methods to develop a taxonomy for understanding data flow, along with the visual representations needed to support decision making for binary vulnerability analysis. Using an iterative process, we refined and evaluated the taxonomy by generating three different data flow visualizations for small binaries, training an analyst to use these visualizations, and testing the utility of the visualizations for answering data flow questions. Throughout the process, and with minimal training, analysts were able to use the visualizations to understand data flow related to security assessment. Our results indicate that the data flow taxonomy is promising as a mechanism for improving analyst understanding of data flow in binaries and for supporting efficient decision making during analysis.

Creating an Interprocedural Analyst-Oriented Data Flow Representation for Binary Analysts (CIAO)

Leger, Michelle A.; Butler, Karin B.; Bueno, Denis B.; Crepeau, Matthew; Cueller, Christopher R.; Godwin, Alex; Haass, Michael J.; Loffredo, Timothy; Mangal, Ravi; Matzen, Laura E.; Nguyen, Vivian; Orso, Alessandro; Reedy, Geoffrey E.; Stasko, John T.; Stites, Mallory C.; Tuminaro, Julian T.; Wilson, Andrew T.

National security missions require understanding third-party software binaries, a key element of which is reasoning about how data flows through a program. However, vulnerability analysts protecting software lack adequate tools for understanding data flow in binaries. To reduce the human time burden for these analysts, we used human factors methods in a rolling discovery process to derive user-centric visual representation requirements. We encountered three main challenges: analysis projects span weeks, analysis goals significantly affect approaches and required knowledge, and analyst tools, techniques, conventions, and prioritization are based on personal preference. To address these challenges, we initially focused our human factors methods on an attack surface characterization task. We generalized our results using a two-stage modified sorting task, creating requirements for a data flow visualization. We implemented these requirements partially in manual static visualizations, which we informally evaluated, and partially in automatically generated interactive visualizations, which have yet to be integrated into workflows for evaluation. Our observations and results indicate that 1) this data flow visualization has the potential to enable novel code navigation, information presentation, and information sharing, and 2) it is an excellent time to pursue research applying human factors methods to binary analysis workflows.

Collaborative analytics for biological facility characterization

Proceedings of SPIE - The International Society for Optical Engineering

Caswell, Jacob C.; Cairns, Kelsey L.; Ting, Christina T.; Hansberger, Mark W.; Stoebner, Matthew A.; Brounstein, Tom R.; Cueller, Christopher R.; Jurrus, Elizabeth R.

Thousands of facilities worldwide are engaged in biological research activities. One of DTRA's missions is to fully understand the types of facilities involved in collecting, investigating, and storing biological materials. This characterization enables DTRA to increase situational awareness and identify potential partners focused on biodefense and biosecurity. As a result of this mission, DTRA created a database to identify biological facilities from publicly available, open-source information. This paper describes an ongoing effort to automate data collection and entry of facilities into this database. To frame our analysis more concretely, we consider the following motivating question: how would a decision maker respond to a pathogen outbreak during the 2018 Winter Olympics in South Korea? To address this question, we aim to further characterize the existing South Korean facilities in DTRA's database and to identify new candidate facilities for entry, so that decision makers can identify local facilities properly equipped to assist and respond to an event. We employ text and social analytics on bibliometric data from South Korean facilities and a list of select pathogen agents to identify patterns and relationships within scientific publication graphs.
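As a minimal sketch of the social-analytics step described above (the record format and facility names are invented assumptions, not DTRA's pipeline), one can build a co-affiliation graph from bibliometric records and rank facility pairs by how often they publish together, a simple relationship signal for facility characterization:

```python
from collections import Counter
from itertools import combinations

# Hypothetical bibliometric records: each paper lists the facilities
# affiliated with its authors.
papers = [
    {"title": "Pathogen surveillance methods", "facilities": ["A", "B"]},
    {"title": "Select agent detection assay", "facilities": ["A", "C"]},
    {"title": "Outbreak response logistics", "facilities": ["A", "B", "C"]},
]

# Count each unordered facility pair once per paper; sorting makes the
# pair key canonical, and set() deduplicates repeated affiliations.
edge_counts = Counter()
for paper in papers:
    for pair in combinations(sorted(set(paper["facilities"])), 2):
        edge_counts[pair] += 1

# Facility pairs ranked by collaboration frequency.
ranked = edge_counts.most_common()
```

Edge weights like these feed directly into standard graph analyses (community detection, centrality) used to surface well-connected facilities.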
