Page 2 – Center for Computing Research (CCR)

Using Machine Learning in Adversarial Environments

Davis, Warren L.; Dunlavy, Daniel D.; Vorobeychik, Yevgeniy V.; Butler, Karin B.; Forsythe, Chris F.; Letter, Matthew L.; Murchison, Nicole M.; Nauer, Kevin S.

Cyber defense today is an asymmetric battle. We need to better understand what options are available to give defenders an advantage. Our project combines machine learning, optimization, and game theory to obscure our defensive posture from the information that adversaries are able to observe. The main conceptual contribution of this research is to separate the problem of prediction, for which machine learning is used, from the problem of computing optimal operational decisions based on such predictions, coupled with a model of adversarial response. This research includes modeling of the attacker and defender, formulation of useful optimization models for studying adversarial interactions, and user studies to measure the impact of the modeling approaches in realistic settings.
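
The separation of prediction from operational decision-making described above can be illustrated with a minimal sketch. It assumes a classifier has already produced a probability that each event is malicious. The function names, cost values, and review budget are hypothetical and are not taken from the report, and no adversary model is included.

```python
def should_inspect(p_malicious, inspect_cost, miss_cost):
    """Inspect when the expected loss of ignoring an event exceeds the inspection cost."""
    return p_malicious * miss_cost > inspect_cost


def choose_under_budget(scores, budget):
    """Return the indices of the `budget` highest-scoring events, in index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


# Hypothetical predicted probabilities from some upstream classifier.
scores = [0.1, 0.9, 0.4, 0.7]

# Decision step 1: a per-event rule driven by assumed operational costs.
decisions = [should_inspect(p, inspect_cost=1.0, miss_cost=10.0) for p in scores]

# Decision step 2: a capacity constraint, e.g. analysts can review only two events.
reviewed = choose_under_budget(scores, budget=2)
print(decisions, reviewed)
```

The classifier only supplies `scores`; the costs and the budget live entirely in the decision step, so either side can be changed without touching the other.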

TYPE SAND Report YEAR 2016

OSTI DOI

Development of Machine Learning Models for Turbulent Wall Pressure Fluctuations

Barone, Matthew F.; Ling, Julia L.; Davis, Warren L.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Using Machine Learning in Adversarial Environments

Davis, Warren L.

Intrusion/anomaly detection systems are among the first lines of cyber defense. Commonly, they use either signatures or machine learning (ML) to identify threats, but fail to account for sophisticated attackers trying to circumvent them. We propose to embed machine learning within a game-theoretic framework that performs adversarial modeling, develops methods for optimizing operational response based on ML, and integrates the resulting optimization codebase into the existing ML infrastructure developed by the Hybrid LDRD. Our approach addresses three key shortcomings of ML in adversarial settings: 1) resulting classifiers are typically deterministic and, therefore, easy to reverse engineer; 2) ML approaches only address the prediction problem, but do not prescribe how one should operationalize predictions, nor account for operational costs and constraints; and 3) ML approaches do not model attackers’ response and can be circumvented by sophisticated adversaries. The principal novelty of our approach is to construct an optimization framework that blends ML, operational considerations, and a model predicting attackers’ reactions, with the goal of computing an optimal moving target defense. One important challenge is to construct a model of an adversary that is tractable, yet realistic. We aim to advance the science of attacker modeling by considering game-theoretic methods, and by engaging experimental subjects with red teaming experience in trying to actively circumvent an intrusion detection system, and learning a predictive model of such circumvention activities. In addition, we will generate metrics to test that a particular model of an adversary is consistent with available data.
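
The first shortcoming listed above, that a deterministic classifier is easy to reverse engineer, can be illustrated with a small sketch in which the alert threshold is redrawn at random for every decision. This is only an illustration of randomization in general, not the optimization framework the report proposes. The function name `randomized_alert` and the threshold range are assumptions.

```python
import random


def randomized_alert(score, low, high, rng):
    """Alert if the score exceeds a threshold drawn uniformly from [low, high]."""
    threshold = rng.uniform(low, high)
    return score > threshold


rng = random.Random(0)  # seeded only so the sketch is repeatable

# A borderline score is flagged on some probes and not on others, so an
# attacker probing the detector cannot read off a single fixed cutoff.
alerts = sum(randomized_alert(0.6, 0.4, 0.8, rng) for _ in range(1000))
print(alerts)
```

Scores above `high` are always flagged and scores below `low` never are; only the band in between is randomized.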

TYPE Other Report YEAR 2016

OSTI DOI

The Hybrid Toolkit for Cybersecurity Analytics

Davis, Warren L.; Nebergall, Christopher N.; Han, Eunsil H.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Grandmaster: Interactive Text-Based Analytics of Social Media

Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015

Fabian, Nathan D.; Davis, Warren L.; Raybourn, Elaine M.; Lakkaraju, Kiran L.; Whetzel, Jonathan H.

TYPE Conference Poster YEAR 2016

Scopus OSTI DOI

Sandia MICrONS Phase 1 kickoff slides

Chance, Frances S.; Aimone, James B.; Carlson, Kristofor D.; Davis, Warren L.; Shead, Timothy M.; Vineyard, Craig M.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

Grandmaster: Interactive text-based analytics of social media

Fabian, Nathan D.; Davis, Warren L.; Raybourn, Elaine M.; Lakkaraju, Kiran L.; Whetzel, Jonathan H.

TYPE Conference Poster YEAR 2015

OSTI DOI

Active Learning in the Era of Big Data

Jamieson, Kevin J.; Davis, Warren L.

Active learning methods automatically adapt data collection by selecting the most informative samples in order to accelerate machine learning. Because of this, testing and comparing active learning algorithms in the real world requires collecting new datasets (adaptively), rather than simply applying algorithms to benchmark datasets, as is the norm in (passive) machine learning research. To facilitate the development, testing, and deployment of active learning for real applications, we have built an open-source software system for large-scale active learning research and experimentation. The system, called NEXT, provides a unique platform for real-world, reproducible active learning research. This paper details the challenges of building the system and demonstrates its capabilities with several experiments. The results show how experimentation can help expose strengths and weaknesses of active learning algorithms, in sometimes unexpected and enlightening ways.
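
As a rough illustration of what "selecting the most informative samples" can mean, the sketch below implements uncertainty sampling, one common active learning strategy for binary classifiers. It is not the NEXT system's API and is not necessarily a strategy evaluated in the paper. The function names and the example probabilities are assumptions.

```python
def uncertainty(p):
    """Uncertainty of a binary prediction; largest when p is 0.5."""
    return 1.0 - abs(2.0 * p - 1.0)


def select_query(pool_probs):
    """Return the index of the unlabeled sample the model is least sure about."""
    return max(range(len(pool_probs)), key=lambda i: uncertainty(pool_probs[i]))


# Hypothetical model outputs P(label = 1) for four unlabeled samples.
pool_probs = [0.95, 0.10, 0.52, 0.80]
query_index = select_query(pool_probs)
print(query_index)
```

In a full loop, the selected sample would be labeled, the model retrained, and the pool rescored, which is why the collected dataset depends on the algorithm being tested.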

TYPE Other Report YEAR 2015

OSTI DOI

Data privacy and security considerations for personal assistants for learning (PAL)

International Conference on Intelligent User Interfaces, Proceedings IUI

Raybourn, Elaine M.; Fabian, Nathan D.; Davis, Warren L.; Parks, Raymond C.; McClain, Jonathan T.; Trumbo, Derek T.; Regan, Damon; Durlach, Paula J.

A hypothetical scenario is utilized to explore privacy and security considerations for intelligent systems, such as a Personal Assistant for Learning (PAL). Two categories of potential concerns are addressed: factors facilitated by user models, and factors facilitated by systems. Among the strategies presented for risk mitigation is a call for ongoing, iterative dialog among privacy, security, and personalization researchers during all stages of development, testing, and deployment.

TYPE Conference Poster YEAR 2015

Scopus OSTI

Machine Learning in Adversarial Environments

Davis, Warren L.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Hybrid methods for cybersecurity analysis

Davis, Warren L.; Dunlavy, Daniel D.

Early 2010 saw a significant change in adversarial techniques aimed at network intrusion: a shift from malware delivered via email attachments toward the use of hidden, embedded hyperlinks to initiate sequences of downloads and interactions with web sites and network servers containing malicious software. Enterprise security groups were well poised and experienced in defending against the former attacks, but the new types of attacks were larger in number, more challenging to detect, dynamic in nature, and required the development of new technologies and analytic capabilities. The Hybrid LDRD project was aimed at delivering new capabilities in large-scale data modeling and analysis to enterprise security operators and analysts and understanding the challenges of detection and prevention of emerging cybersecurity threats. Leveraging previous LDRD research efforts and capabilities in large-scale relational data analysis, large-scale discrete data analysis and visualization, and streaming data analysis, new modeling and analysis capabilities were quickly brought to bear on the problems in email phishing and spear phishing attacks in the Sandia enterprise security operational groups at the onset of the Hybrid project. As part of this project, a software development and deployment framework was created within the security analyst workflow tool sets to facilitate the delivery and testing of new capabilities as they became available, and machine learning algorithms were developed to address the challenge of dynamic threats. Furthermore, researchers from the Hybrid project were embedded in the security analyst groups for almost a full year, engaged in daily operational activities and routines, creating an atmosphere of trust and collaboration between the researchers and security personnel.
The Hybrid project has altered the way that research ideas can be incorporated into the production environments of Sandia's enterprise security groups, reducing time to deployment from months and years to hours and days for the application of new modeling and analysis capabilities to emerging threats. The development and deployment framework has been generalized into the Hybrid Framework and incorporated into several LDRD, WFO, and DOE/CSL projects and proposals. Most importantly, the Hybrid project has provided Sandia security analysts with new, scalable, extensible analytic capabilities that have resulted in alerts not detectable using their previous workflow tool sets.

TYPE SAND Report YEAR 2014

OSTI DOI

Incremental learning for automated knowledge capture

Davis, Warren L.; Dixon, Kevin R.; Martin, Nathaniel M.; Wendt, Jeremy D.

People responding to high-consequence national-security situations need tools to help them make the right decision quickly. The dynamic, time-critical, and ever-changing nature of these situations, especially those involving an adversary, requires models of decision support that can dynamically react as a situation unfolds and changes. Automated knowledge capture is a key part of creating individualized models of decision making in many situations because it has been demonstrated to be a very robust way to populate computational models of cognition. However, existing automated knowledge capture techniques only populate a knowledge model with data prior to its use, after which the knowledge model is static and unchanging. In contrast, humans, including our national-security adversaries, continually learn, adapt, and create new knowledge as they make decisions and witness their effect. This artificial dichotomy between creation and use exists because the majority of automated knowledge capture techniques are based on traditional batch machine-learning and statistical algorithms. These algorithms are primarily designed to optimize the accuracy of their predictions and are only secondarily, if at all, concerned with issues such as speed, memory use, or ability to be incrementally updated. Thus, when new data arrives, batch algorithms used for automated knowledge capture currently require significant recomputation, frequently from scratch, which makes them ill-suited for use in dynamic, time-critical, high-consequence decision-making environments. In this work we seek to explore and expand upon the capabilities of dynamic, incremental models that can adapt to an ever-changing feature space.
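
The contrast between batch recomputation and incremental updating can be shown with a small, generic sketch. It uses Welford's online algorithm for a running mean and variance, chosen here only because it is a compact and well-known incremental method. It is not one of the models developed in the report, and the class name `RunningStats` is an assumption.

```python
class RunningStats:
    """Running mean and variance, updated one observation at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        """Fold a single new observation into the statistics in O(1) time."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far (0.0 if empty)."""
        return self._m2 / self.count if self.count else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)  # no pass over earlier data is needed
print(stats.mean, stats.variance)
```

A batch implementation would store every observation and recompute both statistics from scratch on each arrival; the incremental version keeps three numbers and does constant work per update.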

TYPE SAND Report YEAR 2013

OSTI DOI