Publications

Results 1–25 of 56

Search results

Bayesian Networks for Interpretable Cyberattack Detection

Proceedings of the Annual Hawaii International Conference on System Sciences

Yang, Barnett; Hoffman, Matthew J.; Brown, Nathanael J.

The challenge of cyberattack detection can be illustrated by the complexity of the MITRE ATT&CK™ matrix, which catalogues more than 200 attack techniques (most with multiple sub-techniques). To reliably detect cyberattacks, we propose an evidence-based approach which fuses multiple cyber events over varying time periods to help differentiate normal from malicious behavior. We use Bayesian Networks (BNs), probabilistic graphical models consisting of a set of variables and their conditional dependencies, for fusion/classification due to their interpretable nature, ability to tolerate sparse or imbalanced data, and resistance to overfitting. Our technique utilizes a small collection of expert-informed cyber intrusion indicators to create a hybrid detection system that combines data-driven training with expert knowledge to form a host-based intrusion detection system (HIDS). We demonstrate a software pipeline for efficiently generating and evaluating various BN classifier architectures for specific datasets and discuss explainability benefits thereof.
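
As a rough illustration of how a Bayesian network can fuse several indicators into one interpretable score, the sketch below uses the simplest possible BN structure: a single "attack" node with conditionally independent indicator children (a naive Bayes network). This is not the architecture described in the paper. The indicator names, the prior, and every probability are hypothetical values chosen only for the example.

```python
def posterior_attack(prior, likelihoods, observed):
    """Posterior P(attack | observed indicators) for a naive Bayes network.

    prior       -- P(attack) before any evidence (hypothetical value)
    likelihoods -- {indicator: (P(fires | attack), P(fires | normal))}
    observed    -- {indicator: True if it fired, False otherwise}
    """
    p_attack, p_normal = prior, 1.0 - prior
    for name, fired in observed.items():
        p_fire_attack, p_fire_normal = likelihoods[name]
        # Multiply in the likelihood of what was actually observed.
        p_attack *= p_fire_attack if fired else 1.0 - p_fire_attack
        p_normal *= p_fire_normal if fired else 1.0 - p_fire_normal
    # Normalize so the two hypotheses sum to one.
    return p_attack / (p_attack + p_normal)


# Hypothetical indicators and probabilities, for illustration only.
LIKELIHOODS = {
    "new_admin_account": (0.6, 0.01),
    "odd_hour_login": (0.7, 0.10),
}

both = posterior_attack(
    0.01, LIKELIHOODS, {"new_admin_account": True, "odd_hour_login": True}
)
one = posterior_attack(
    0.01, LIKELIHOODS, {"new_admin_account": False, "odd_hour_login": True}
)
print(f"both indicators fired: {both:.3f}")
print(f"only odd-hour login:   {one:.3f}")
```

Each factor in the product can be read off directly, which is one sense in which such a classifier is interpretable: an analyst can see which indicator moved the posterior and by how much.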

Automated EWMA Anomaly Detection Pipeline

Proceedings of the American Control Conference

Gilletly, Samuel G.; Cauthen, Katherine R.; Mott, Joshua; Brown, Nathanael J.

There is a need to perform offline anomaly detection in count data streams to simultaneously identify both systemic changes and outliers. We propose a new algorithmic method, called the Anomaly Detection Pipeline, which leverages common statistical process control procedures in a novel way to accomplish this. The method we propose does not require user-defined control or phase I training data, automatically identifying regions of stability for improved parameter estimation to support change point detection. The method does not require data to be normally distributed, and it detects outliers relative to the regimes in which they occur. Our proposed method performs comparably to state-of-the-art change point detection methods, provides additional capabilities, and is extendable to a larger set of possible data streams than known methods.
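
For readers unfamiliar with the statistical process control building block named in the title, here is a minimal sketch of a textbook EWMA control chart. It is not the Anomaly Detection Pipeline itself: unlike the method in the abstract, this sketch assumes a fixed, user-chosen baseline window for estimating the mean and standard deviation. The function name and the parameters `lam`, `width_factor`, and `baseline` are illustrative assumptions.

```python
import math


def ewma_flags(counts, lam=0.2, width_factor=3.0, baseline=20):
    """Return indices where a textbook EWMA statistic leaves its control limits.

    Assumption: the first `baseline` points are in control and are used to
    estimate the process mean and standard deviation.
    """
    base = counts[:baseline]
    mu = sum(base) / len(base)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in base) / (len(base) - 1))

    z = mu  # the EWMA statistic starts at the baseline mean
    flagged = []
    for t, x in enumerate(counts, start=1):
        z = lam * x + (1 - lam) * z
        # Time-varying limit; it converges to sigma * sqrt(lam / (2 - lam)).
        half_width = width_factor * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        if abs(z - mu) > half_width:
            flagged.append(t - 1)
    return flagged


# Synthetic count stream: a stable regime, then a level shift at index 20.
stream = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10] * 2 + [20] * 10
print(ewma_flags(stream))
```

A chart like this reacts to sustained shifts rather than single spikes, because each new count only contributes a fraction `lam` to the statistic.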

Comparison of distribution selection methods

Communications in Statistics: Simulation and Computation

Chiew, Esther; Cauthen, Katherine R.; Brown, Nathanael J.; Nozick, Linda K.

Many methods have been suggested to choose between distributions. Relatively little work has examined whether these methods accurately recover the distributions being studied. Hence, this research compares several popular distribution selection methods through a Monte Carlo simulation study and identifies which are robust for several types of discrete probability distributions. In addition, we study whether it matters when a distribution selection method fails to pick the correct probability distribution by calculating the expected distance, which is the amount of information lost by each distribution selection method relative to the generating probability distribution.

Detecting Communities and Attributing Purpose to Human Mobility Data

Proceedings - Winter Simulation Conference

John, Esther W.; Cauthen, Katherine R.; Brown, Nathanael J.; Nozick, Linda K.

Many individuals' mobility can be characterized by strong patterns of regular movements and is influenced by social relationships. Social networks are also often organized into overlapping communities which are associated in time or space. We develop a model that can generate the structure of a social network and attribute purpose to individuals' movements, based solely on records of individuals' locations over time. This model distinguishes the attributed purpose of check-ins based on temporal and spatial patterns in check-in data. Because a location-based social network dataset with authoritative ground-truth to test our entire model does not exist, we generate large scale datasets containing social networks and individual check-in data to test our model. We find that our model reliably assigns community purpose to social check-in data, and is robust over a variety of different situations.

A Minimally Supervised Event Detection Method

Lecture Notes in Networks and Systems

Hoffman, Matthew J.; Bussell, Sammy J.; Brown, Nathanael J.

Solving classification problems with machine learning often entails laborious manual labeling of test data, requiring valuable time from a subject matter expert (SME). This process can be even more challenging when each sample is multidimensional. In the case of an anomaly detection system, a standard two-class problem, the dataset is likely imbalanced with few anomalous observations and many “normal” observations (e.g., credit card fraud detection). We propose a unique methodology that quickly identifies individual samples for SME tagging while automatically classifying commonly occurring samples as normal. In order to facilitate such a process, the relationships among the dimensions (or features) must be easily understood by both the SME and system architects such that tuning of the system can be readily achieved. The resulting system demonstrates how combining human knowledge with machine learning can create an interpretable classification system with robust performance.

Heuristic approach to Satellite Range Scheduling with Bounds using Lagrangian Relaxation

IEEE Systems Journal

Brown, Nathanael J.; Arguello, Bryan A.; Nozick, Linda K.; Xu, Ningxiong

This paper focuses on scheduling antennas to track satellites using a novel heuristic method. The objectives pursued in developing a schedule are two-fold: (1) minimize the priority-weighted number of time periods that satellites are not tracked; and (2) equalize the percentage of time each satellite is uncovered. The heuristic method is a population-based local search tailored to the unique characteristics of this problem. In order to validate the performance of the heuristic, bounds are developed using Lagrangian relaxation. The heuristic method and the bounds are applied to several test problems. In all cases, the heuristic identifies a solution that is better than the upper bound and is generally quite close to (though necessarily larger than) the lower bound, with about an order of magnitude reduction in computation time. Lastly, a comparison with CPLEX 12.7 is provided.
