Moving target defenses (MTDs) are widely used as an active defense strategy for thwarting cyberattacks on cyber-physical systems by increasing the diversity of software and network paths. Recently, machine learning (ML) and deep learning (DL) models have been shown to defeat some cyber defenses by learning attack detection patterns and defense strategies, raising concerns about the susceptibility of MTDs to ML and DL methods. In this article, we analyze the effectiveness of ML and DL models at deciphering MTD methods and ultimately evading MTD-based protections in real-time systems. Specifically, we consider an MTD algorithm that periodically randomizes address assignments within the MIL-STD-1553 protocol, a military-standard serial data bus. Two ML- and DL-based tasks are performed on the MIL-STD-1553 protocol to measure how effectively the learning models decipher the MTD algorithm: 1) determining whether address assignments change, i.e., whether the given system employs an MTD protocol, and if it does, 2) predicting the future address assignments. The supervised learning models (random forest and k-nearest neighbors) effectively detected the address assignment changes and classified whether the given system was equipped with the specified MTD protocol. In contrast, the unsupervised learning model (k-means) was significantly less effective. The DL model (long short-term memory) was able to predict future addresses with varying effectiveness depending on the MTD algorithm's settings.
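As a minimal sketch of the first task, the snippet below trains a random forest to classify whether a 1553 bus trace exhibits MTD-style address randomization. The features (distinct remote terminal addresses per window, address-change rate) and the synthetic data generator are illustrative assumptions, not the features or data used in the study.

```python
# Hypothetical sketch: classify whether a MIL-STD-1553 trace shows MTD-style
# address randomization. Feature choices and synthetic data are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_window_features(n_windows, mtd_enabled):
    """Per-window features: [distinct RT addresses seen, address-change rate]."""
    if mtd_enabled:
        distinct = rng.integers(8, 31, size=n_windows)       # many addresses observed
        change_rate = rng.uniform(0.2, 0.9, size=n_windows)  # frequent reassignment
    else:
        distinct = rng.integers(1, 6, size=n_windows)         # static assignments
        change_rate = rng.uniform(0.0, 0.05, size=n_windows)
    return np.column_stack([distinct, change_rate])

X = np.vstack([make_window_features(500, True), make_window_features(500, False)])
y = np.concatenate([np.ones(500), np.zeros(500)])  # 1 = MTD present, 0 = absent

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```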
Network intrusion detection systems (NIDS) are commonly used to detect malware communications, including command-and-control (C2) traffic from botnets. NIDS performance assessments have been studied for decades, but mathematical modeling has rarely been used to explore NIDS performance. This paper details a mathematical model that describes a NIDS performing packet inspection and its detection of malware C2 traffic. The paper further describes an emulation testbed and a set of cyber experiments that used the testbed to validate the model. These experiments included a commonly used NIDS (Snort) and traffic with contents from a pervasive malware (Emotet). Results are presented for two scenarios: a nominal scenario and a "stressed" scenario in which the NIDS cannot process all incoming packets. Model and experiment results match well, with model estimates mostly falling within 95% confidence intervals on the experiment means. Model results were produced 70 to 3,000 times faster than the experimental results. Consequently, the model's predictive capability could potentially be used to support decisions about NIDS configuration and effectiveness that require high-confidence results, quantification of uncertainty, and exploration of large parameter spaces. Furthermore, the experiments provide an example of how emulation testbeds can be used to validate cyber models that include stochastic variability.
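The abstract does not reproduce the model itself; as a rough, hedged illustration of the kind of relationship such a model might capture, the sketch below relates packet drops under load to detection probability, assuming independent drops and detection on the first inspected signature-matching packet.

```python
# Illustrative only: not the paper's model. Assume the NIDS drops each packet
# independently with probability `drop_rate` when overloaded, and a C2 session
# is detected if at least one of its `n_signature_pkts` matching packets is
# inspected.
def detection_probability(drop_rate: float, n_signature_pkts: int) -> float:
    """P(detect) = 1 - drop_rate ** n_signature_pkts under the independence assumption."""
    return 1.0 - drop_rate ** n_signature_pkts

# Nominal scenario: nothing is dropped, detection is certain if a rule matches.
print(detection_probability(0.0, 3))   # 1.0
# "Stressed" scenario: 40% of packets dropped, 3 signature-matching packets.
print(detection_probability(0.4, 3))   # ~0.936
```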
Cyberattacks against industrial control systems have increased over the last decade, making it more critical than ever for system owners to have the tools necessary to understand the cyber resilience of their systems. However, existing tools are often qualitative, driven by subject matter expertise, or highly generic, making thorough, data-driven cyber resilience analysis challenging. The ADROC project proposed to develop a platform that enables efficient, repeatable, data-driven cyber resilience analysis for cyber-physical systems. The approach consists of two modeling phases: computationally efficient mathematical modeling and high-fidelity emulation. The first phase allows scenarios of low concern to be quickly filtered out, conserving resources available for analysis. The second phase supports more detailed scenario analysis, which is more predictive of real-world systems. Data extracted from experiments is used to calculate cyber resilience metrics. ADROC then ranks scenarios based on these metrics, enabling prioritization of system resources to improve cyber resilience.
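A minimal sketch of the two-phase triage-and-rank idea described above follows; the function names, thresholds, and stand-in models are illustrative assumptions, not ADROC's actual interface.

```python
# Hedged sketch of a two-phase workflow: a cheap math-model estimate filters
# out low-concern scenarios, survivors go to high-fidelity emulation, and
# scenarios are ranked by the resulting resilience metric (worst first).
def triage_and_rank(scenarios, math_model, emulate, concern_threshold=0.2):
    """Return scenarios of concern ranked by emulated resilience metric, lowest first."""
    # Phase 1: fast math-model screening.
    of_concern = [s for s in scenarios if math_model(s) >= concern_threshold]
    # Phase 2: high-fidelity emulation only for the scenarios that survive screening.
    scored = [(s, emulate(s)) for s in of_concern]
    return sorted(scored, key=lambda pair: pair[1])

# Toy usage with stand-in models (dictionaries acting as scoring functions).
scenarios = ["dos_on_hmi", "plc_firmware_tamper", "benign_misconfig"]
math_model = {"dos_on_hmi": 0.6, "plc_firmware_tamper": 0.8, "benign_misconfig": 0.05}.get
emulate = {"dos_on_hmi": 0.55, "plc_firmware_tamper": 0.30}.get
print(triage_and_rank(scenarios, math_model, emulate))
```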
Recent high-profile cyber attacks on critical infrastructures have raised awareness about the severe and widespread impacts that these attacks can have on everyday life. This awareness has spurred research into making industrial control systems and other cyber-physical systems more resilient. A plethora of cyber resilience metrics and frameworks have been proposed for cyber resilience assessments, but these approaches typically assume that the data required to populate the metrics is readily available, an assumption that is frequently not valid. This paper describes a new cyber experimentation platform that can be used to generate relevant data and to calculate resilience metrics that quantify how resilient specified industrial control systems are to specified threats. The platform and analysis process are demonstrated through a use case involving the control system for a pressurized water reactor.
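The abstract does not name the specific resilience metrics the platform computes. As a hedged example of the kind of metric that can be calculated from experiment time-series data, the sketch below computes a normalized area-under-the-performance-curve index; the reactor coolant-flow numbers are invented for illustration.

```python
# Hedged sketch: one commonly used resilience formulation -- normalized area
# under the system performance curve over an incident window. Not necessarily
# the metric used by the platform described above.
import numpy as np

def resilience_index(time_s: np.ndarray, performance: np.ndarray, nominal: float) -> float:
    """Ratio of delivered to nominal performance over the window (1.0 = fully resilient)."""
    delivered = np.trapz(performance, time_s)
    ideal = nominal * (time_s[-1] - time_s[0])
    return delivered / ideal

# Example: performance degrades during an attack and partially recovers.
t = np.array([0, 60, 120, 180, 240, 300], dtype=float)  # seconds
perf = np.array([1.0, 1.0, 0.4, 0.4, 0.8, 0.95])         # normalized performance
print(f"resilience index: {resilience_index(t, perf, nominal=1.0):.2f}")
```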
This report summarizes the activities performed as part of the Science and Engineering of Cybersecurity by Uncertainty quantification and Rigorous Experimentation (SECURE) Grand Challenge LDRD project. We provide an overview of the research done in this project, including work on cyber emulation, uncertainty quantification, and optimization. We present examples of integrated analyses performed on two case studies: a network scanning/detection study and a malware command and control study. We highlight the importance of experimental workflows and list references of papers and presentations developed under this project. We outline lessons learned and suggestions for future work.
Cyber testbeds provide an important mechanism for experimentally evaluating cyber security performance. However, as an experimental discipline, reproducible cyber experimentation is essential to assure valid, unbiased results. Even minor differences in setup, configuration, and testbed components can affect the experiments and, thus, the reproducibility of results. This paper documents a case study in reproducing an earlier emulation study, with the reproduced emulation experiment conducted by a different research group on a different testbed. We describe lessons learned from this process, both in terms of the reproducibility of the original study and in terms of the different testbed technologies used by the two groups. This paper also addresses the question of how to compare results between the two groups' experiments, identifying candidate metrics for comparison and quantifying the results of this reproduction study.
Concerns about cyber threats to space systems are increasing. Researchers are developing intrusion detection and protection systems to mitigate these threats, but the sparsity of cyber threat data poses a significant challenge to these efforts. Development of credible threat data sets is needed to overcome this challenge. This paper describes the extension/development of three data generation algorithms (generative adversarial networks, variational auto-encoders, and a generative algorithm for multi-variate timeseries) to generate cyber threat data for space systems. The algorithms are applied to a use case that leverages the NASA Operational Simulation for Small Satellites (NOS$^3$) platform. Qualitative and quantitative measures are applied to evaluate the generated data. Strengths and weaknesses of each algorithm are presented, and suggested improvements are provided. For this use case, the generative algorithm for multi-variate timeseries performed best according to both qualitative and quantitative measures.
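The abstract does not state which quantitative measures were used; as a hedged illustration of one simple option, the sketch below compares generated and real telemetry feature-by-feature with a two-sample Kolmogorov-Smirnov statistic (lower values indicate closer distributional agreement). The data arrays are synthetic stand-ins.

```python
# Hedged sketch: a per-feature two-sample KS statistic as one possible
# quantitative measure of generated-vs-real data similarity. Not necessarily
# the measure used in the study.
import numpy as np
from scipy.stats import ks_2samp

def per_feature_ks(real: np.ndarray, generated: np.ndarray) -> np.ndarray:
    """KS statistic for each column of two (samples x features) arrays."""
    return np.array([ks_2samp(real[:, j], generated[:, j]).statistic
                     for j in range(real.shape[1])])

rng = np.random.default_rng(1)
real = rng.normal(size=(1000, 4))                   # stand-in for real telemetry
generated = rng.normal(loc=0.1, size=(1000, 4))     # stand-in for synthetic data
print("KS statistics per feature:", per_feature_ks(real, generated))
```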
To combat dynamic, cyber-physical disturbances in the electric grid, online and adaptive remedial action schemes (RASs) are needed to achieve fast and effective response. However, a major challenge lies in reducing the computational burden of the analyses needed to inform selection of appropriate controls. This paper proposes the use of a role and interaction discovery (RID) algorithm that leverages control sensitivities to gain insight into controller roles and support groups. Using these results, a procedure is developed to reduce the control search space, cutting computation time while achieving effective control response. A case study is presented that considers corrective line switching to mitigate geomagnetically induced current (GIC)-saturated reactive power losses in a 20-bus test system. Results demonstrated significant reduction of both the control search space and reactive power losses using the RID approach.
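The RID algorithm itself is not detailed in the abstract. As a hedged illustration of the general idea of grouping controllers by the similarity of their sensitivity signatures so that only one representative per group need be searched, the sketch below clusters rows of a controls-by-quantities sensitivity matrix; the matrix values and cluster count are invented.

```python
# Hedged sketch: cluster control sensitivity vectors into "support groups" and
# keep one representative per group to shrink the control search space.
# Illustrative stand-in, not the RID algorithm from the paper.
import numpy as np
from sklearn.cluster import KMeans

# Each row: sensitivity of GIC-related reactive power losses to switching one line.
sensitivity = np.array([
    [0.90, 0.10, 0.05],
    [0.85, 0.12, 0.07],   # behaves like line 0 -> same support group
    [0.05, 0.80, 0.15],
    [0.08, 0.78, 0.10],
    [0.02, 0.05, 0.95],
])

groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(sensitivity)
print("support-group labels per control:", groups)

# Reduced search space: evaluate one representative control from each group.
representatives = [np.where(groups == g)[0][0] for g in np.unique(groups)]
print("representative controls to evaluate:", representatives)
```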
Port scanning is a commonly applied technique in the discovery phase of cyber attacks. As such, defending against it has long been the subject of many research and modeling efforts. Though modeling efforts can search large parameter spaces to find effective defensive parameter settings, confidence in modeling results can be hampered by limited or omitted validation efforts. In this paper, we introduce a novel mathematical model that describes port scanning progress by an attacker and intrusion detection by a defender. The paper further describes a set of emulation experiments that we conducted with a virtual testbed and used to validate the model. Results are presented for two scanning strategies: a slow, stealthy approach and a fast, loud approach. Estimates from the model fall within 95% confidence intervals on the means estimated from the experiments. Consequently, the model's predictive capability provides confidence in its use for evaluating and developing defensive strategies against port scanning.
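The paper's model is not reproduced in the abstract. As a hedged stand-in for the kind of quantities such a model might relate, the sketch below computes the expected fraction of the port space enumerated after a given time at a given scan rate, and the probability that a rate-based IDS has alerted by then (treating alerts as a Poisson process whose intensity is assumed proportional to scan speed; the coefficient is invented).

```python
# Illustrative sketch: scan coverage versus probability of detection for a slow
# "stealthy" scan and a fast "loud" scan. Not the model from the paper.
import math

def expected_coverage(scan_rate_pps: float, t_s: float, n_ports: int = 65535) -> float:
    """Fraction of the port space probed after t seconds (capped at 1)."""
    return min(scan_rate_pps * t_s / n_ports, 1.0)

def detection_probability(scan_rate_pps: float, t_s: float, alert_coeff: float = 1e-3) -> float:
    """P(at least one alert by time t) under an assumed rate-proportional alert intensity."""
    return 1.0 - math.exp(-alert_coeff * scan_rate_pps * t_s)

for rate, label in [(1.0, "slow/stealthy"), (1000.0, "fast/loud")]:
    t = 600.0  # ten minutes
    print(f"{label}: coverage={expected_coverage(rate, t):.3f}, "
          f"P(detected)={detection_probability(rate, t):.3f}")
```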