Publications

21 Results

Tracking Cyber Adversaries with Adaptive Indicators of Compromise

Proceedings - 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017

Doak, Justin E.; Ingram, Joey; Mulder, Samuel A.; Naegle, John H.; Cox, Jonathan A.; Aimone, James B.; Dixon, Kevin R.; James, Conrad D.; Follett, David R.

A forensics investigation after a breach often uncovers network and host indicators of compromise (IOCs) that can be deployed to sensors to allow early detection of the adversary in the future. Over time, the adversary will change tactics, techniques, and procedures (TTPs), which will also change the data generated. If the IOCs are not kept up-to-date with the adversary's new TTPs, the adversary will no longer be detected once all of the IOCs become invalid. Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular expressions (regexes), up-to-date with a dynamic adversary. Our framework solves the TTK problem in an automated, cyclic fashion to bracket a previously discovered adversary. This tracking is accomplished through a data-driven approach of self-adapting a given model based on its own detection capabilities. In our initial experiments, we found that the true positive rate (TPR) of the adaptive solution degrades much less over time than that of the naïve solution, suggesting that self-updating the model allows the continued detection of positives (i.e., adversaries). The cost for this performance is in the false positive rate (FPR), which increases over time for the adaptive solution, but remains constant for the naïve solution. However, the difference in overall detection performance, as measured by the area under the curve (AUC), between the two methods is negligible. This result suggests that self-updating the model over time should be done in practice to continue to detect known, evolving adversaries.
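
The TPR/FPR trade-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: the sample log lines, the two regex sets, and the `rates` helper are all hypothetical, chosen only to show how a stricter regex IOC can miss a changed adversary while a broader one catches it at the cost of false positives.

```python
import re

def rates(patterns, samples):
    """Return (TPR, FPR) for a set of regex IOCs over labeled samples.

    samples is a list of (text, is_adversary) pairs (hypothetical format).
    """
    compiled = [re.compile(p) for p in patterns]
    tp = fp = fn = tn = 0
    for text, is_adversary in samples:
        flagged = any(c.search(text) for c in compiled)
        if flagged and is_adversary:
            tp += 1
        elif flagged:
            fp += 1
        elif is_adversary:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical labeled traffic: the adversary has changed its request path.
samples = [
    ("GET /update.php?id=17", True),
    ("GET /update2.php?id=99", True),
    ("GET /index.html", False),
    ("GET /updates/news.html", False),
]

static_iocs = [r"/update\.php\?id=\d+"]   # original IOC, never updated
broadened_iocs = [r"/update"]             # looser IOC, assumed for illustration

print(rates(static_iocs, samples))
print(rates(broadened_iocs, samples))
```

On this toy data the static IOC misses the changed request, while the broadened IOC catches both adversary requests but also flags one benign request.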

Incremental learning for automated knowledge capture

Davis, Warren L.; Dixon, Kevin R.; Martin, Nathaniel M.; Wendt, Jeremy D.

People responding to high-consequence national-security situations need tools to help them make the right decision quickly. The dynamic, time-critical, and ever-changing nature of these situations, especially those involving an adversary, requires models of decision support that can dynamically react as a situation unfolds and changes. Automated knowledge capture is a key part of creating individualized models of decision making in many situations because it has been demonstrated as a very robust way to populate computational models of cognition. However, existing automated knowledge capture techniques only populate a knowledge model with data prior to its use, after which the knowledge model is static and unchanging. In contrast, humans, including our national-security adversaries, continually learn, adapt, and create new knowledge as they make decisions and witness their effect. This artificial dichotomy between creation and use exists because the majority of automated knowledge capture techniques are based on traditional batch machine-learning and statistical algorithms. These algorithms are primarily designed to optimize the accuracy of their predictions and are only secondarily, if at all, concerned with issues such as speed, memory use, or ability to be incrementally updated. Thus, when new data arrives, batch algorithms used for automated knowledge capture currently require significant recomputation, frequently from scratch, which makes them ill-suited for use in dynamic, time-critical, high-consequence decision-making environments. In this work we seek to explore and expand upon the capabilities of dynamic, incremental models that can adapt to an ever-changing feature space.
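
As a small illustration of the batch-versus-incremental distinction drawn in this abstract, the sketch below uses Welford's online algorithm to keep a running mean and variance that are updated one observation at a time, with no recomputation over past data. This is a generic textbook technique swapped in for illustration, not a method taken from the report; the `RunningStats` class name is hypothetical.

```python
class RunningStats:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the statistics in O(1) time."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self._m2 / self.n if self.n else 0.0

stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)  # no pass over earlier values is needed

print(stats.mean, stats.variance)
```

A batch implementation would store all values and recompute both statistics from scratch on every arrival; the incremental version keeps only three numbers.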

Discriminative feature-rich models for syntax-based machine translation

Dixon, Kevin R.

This report describes the campus executive LDRD “Discriminative Feature-Rich Models for Syntax-Based Machine Translation,” which was an effort to foster a better relationship between Sandia and Carnegie Mellon University (CMU). The primary purpose of the LDRD was to fund the research of a promising graduate student at CMU; in this case, Kevin Gimpel was selected from the pool of candidates. This report gives a brief overview of Kevin Gimpel's research.

COMET: A recipe for learning and using large ensembles on massive data

Proceedings - IEEE International Conference on Data Mining, ICDM

Basilico, Justin D.; Munson, M.A.; Kolda, Tamara G.; Dixon, Kevin R.; Kegelmeyer, William P.

COMET is a single-pass MapReduce algorithm for learning on large-scale data. It builds multiple random forest ensembles on distributed blocks of data and merges them into a mega-ensemble. This approach is appropriate when learning from massive-scale data that is too large to fit on a single machine. To get the best accuracy, IVoting should be used instead of bagging to generate the training subset for each decision tree in the random forest. Experiments with two large datasets (5GB and 50GB compressed) show that COMET compares favorably (in both accuracy and training time) to learning on a subsample of data using a serial algorithm. Finally, we propose a new Gaussian approach for lazy ensemble evaluation which dynamically decides how many ensemble members to evaluate per data point; this can reduce evaluation cost by 100X or more. © 2011 IEEE.
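
The idea of lazy ensemble evaluation mentioned in this abstract can be sketched roughly as follows. This is not the paper's Gaussian rule, which is not reproduced here; it is a simplified stand-in that assumes binary voters, a normal approximation to the running vote fraction, and hypothetical parameters `z` and `min_votes`. Members are evaluated one at a time, and evaluation stops once the vote fraction is confidently on one side of 0.5.

```python
import math

def lazy_vote(members, x, z=2.58, min_votes=10):
    """Evaluate binary voters lazily; return (predicted_label, votes_used).

    members: callables returning 0 or 1 for input x (assumed interface).
    z: normal quantile controlling how confident the early stop must be.
    """
    ones = 0
    n = 0
    for n, member in enumerate(members, start=1):
        ones += member(x)
        if n >= min_votes:
            # Smoothed vote fraction so the interval never has zero width.
            p = (ones + 1) / (n + 2)
            half_width = z * math.sqrt(p * (1 - p) / n)
            if abs(p - 0.5) > half_width:
                break  # remaining votes are unlikely to flip the majority
    return (1 if 2 * ones > n else 0), n

# Hypothetical ensembles: one unanimous, one evenly split.
unanimous = [lambda x: 1] * 1000
split = [(lambda x: 1) if i % 2 == 0 else (lambda x: 0) for i in range(100)]

print(lazy_vote(unanimous, None))
print(lazy_vote(split, None))
```

When the members agree, only a small fraction of the ensemble is evaluated; when they are evenly split, the loop falls back to evaluating every member.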

Optical requirements with turbulence correction for long-range biometrics

Proceedings of SPIE - The International Society for Optical Engineering

Choi, Junoh; Soehnel, Grant H.; Bagwell, Brett B.; Dixon, Kevin R.; Wick, David V.

Iris recognition utilizes distinct patterns found in the human iris to perform identification. Image acquisition is a critical first step towards successful operation of iris recognition systems. However, the quality of iris images required by standard iris recognition algorithms puts hard constraints on the imaging optical systems, and as a result the systems demonstrated to date require a relatively short subject stand-off distance. In this paper, we study long-range iris recognition at distances as large as 200 meters, and determine conditions the imaging system must satisfy for identification at longer stand-off distances. © 2009 SPIE.

Modeling an unstructured driving domain: A comparison of two cognitive frameworks

2008 BRIMS Conference - Behavior Representation in Modeling and Simulation

Best, Bradley J.; Dixon, Kevin R.; Speed, Ann; Fleetwood, Michael D.

This paper outlines a comparison between two cognitive modeling frameworks: Atomic Components of Thought - Rational (ACT-R; Anderson & Lebiere, 1998) and a framework under development at Sandia National Laboratories. Both frameworks are based on the cognitive psychological literature, although they represent different theoretical perspectives on cognition, with ACT-R being a production-rule-based system and the Sandia framework being a dynamical-systems or connectionist-type approach. This comparison involved a complex driving domain in which both the car being driven and the driver were equipped with sensors that provided information to each framework. The output of each framework was a classification of the real-world situation that the driver was in, e.g., being overtaken on the autobahn. Comparisons between the two frameworks included validation against human ratings of the driving situations via videotapes of driving sessions, along with twelve creation and performance metrics regarding the method and ease of framework population, processor requirements, and maximum real-time data sampling rate.

Supervised machine learning for modeling human recognition of vehicle-driving situations

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS

Dixon, Kevin R.; Lippitt, Carl E.; Forsythe, James C.

A classification system is developed to identify driving situations from labeled examples of previous occurrences. The purpose of the classifier is to provide physical context to a separate system that mitigates unnecessary distractions, allowing the driver to maintain focus during periods of high difficulty. While watching videos of driving, we asked different users to indicate their perceptions of the current situation. We then trained a classifier to emulate the human recognition of driving situations using the Sandia Cognitive Framework. In unstructured conditions, such as driving in urban areas and on the German autobahn, the classifier was able to correctly predict human perceptions of driving situations over 95% of the time. This paper focuses on the learning algorithms used to train the driving-situation classifier. Future work will reduce the human effort needed to train the system. © 2005 IEEE.
