Network intrusion detection systems (NIDS) are commonly used to detect malware communications, including command-and-control (C2) traffic from botnets. NIDS performance assessments have been studied for decades, but mathematical modeling has rarely been used to explore NIDS performance. This paper details a mathematical model that describes a NIDS performing packet inspection and its detection of malware C2 traffic. The paper further describes an emulation testbed and a set of cyber experiments that used the testbed to validate the model. These experiments included a commonly used NIDS (Snort) and traffic with content from a pervasive malware family (Emotet). Results are presented for two scenarios: a nominal scenario and a “stressed” scenario in which the NIDS cannot process all incoming packets. Model and experiment results match well, with model estimates mostly falling within 95% confidence intervals on the experiment means. Model results were produced 70 to 3,000 times faster than the experimental results. Consequently, the model's predictive capability could potentially be used to support decisions about NIDS configuration and effectiveness that require high-confidence results, quantification of uncertainty, and exploration of large parameter spaces. Furthermore, the experiments provide an example of how emulation testbeds can be used to validate cyber models that include stochastic variability.
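For intuition only, a minimal sketch of the kind of relationship such a model might capture (the symbols $\lambda$, $\mu$, $p_{\mathrm{sig}}$, and the functional form below are illustrative assumptions, not the paper's actual formulation): if C2 packets arrive at rate $\lambda$, the NIDS can inspect packets at a maximum rate $\mu$, and each inspected C2 packet fires a signature with probability $p_{\mathrm{sig}}$, then the probability of detecting at least one of $n$ C2 packets could be written as
\[
P_{\mathrm{detect}}(n) = 1 - \bigl(1 - p_{\mathrm{insp}}\,p_{\mathrm{sig}}\bigr)^{n},
\qquad
p_{\mathrm{insp}} = \min\!\left(1, \frac{\mu}{\lambda}\right),
\]
so that $p_{\mathrm{insp}} < 1$ only in the stressed scenario, where the NIDS cannot process all incoming packets.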
Virtual machine emulation environments provide ideal testbeds for cybersecurity evaluations because they run real software binaries in a scalable, offline test setting suitable for assessing the impacts of software security flaws on a system. Verification of such emulations determines whether the environment is working as intended, and it can focus on various aspects such as timing realism, traffic realism, and resource realism. In this paper, we study resource realism and issues associated with virtual machine resource utilization. We examine telemetry metrics gathered from a series of structured experiments that involve large numbers of parallel emulations designed to eventually oversubscribe host resources. We present an approach that uses telemetry metrics for emulation verification, and we demonstrate this approach on two cyber scenarios. Descriptions of the experimental configurations are provided, along with a detailed discussion of the statistical tests used to compare telemetry metrics. Results demonstrate the potential for a structured experimental framework, combined with statistical analysis of telemetry metrics, to support emulation verification. We conclude with comments on generalizability and potential future work.
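As a hedged illustration of the kind of statistical comparison described above (the file names, metric column, and choice of a two-sample Kolmogorov–Smirnov test are assumptions for exposition, not the paper's exact verification procedure), a telemetry metric from a baseline run could be compared against one from an oversubscribed run as follows:

```python
# Sketch only: compare one telemetry metric (here, guest CPU utilization)
# between a baseline emulation run and an oversubscribed run using a
# two-sample Kolmogorov-Smirnov test.
import pandas as pd
from scipy.stats import ks_2samp

baseline = pd.read_csv("telemetry_baseline.csv")        # hypothetical export
stressed = pd.read_csv("telemetry_oversubscribed.csv")  # hypothetical export

stat, p_value = ks_2samp(baseline["cpu_util"], stressed["cpu_util"])
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3g}")
if p_value < 0.05:
    print("Distributions differ: telemetry suggests resource oversubscription.")
```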
This report summarizes the activities performed as part of the Science and Engineering of Cybersecurity by Uncertainty quantification and Rigorous Experimentation (SECURE) Grand Challenge LDRD project. We provide an overview of the research done in this project, including work on cyber emulation, uncertainty quantification, and optimization. We present examples of integrated analyses performed on two case studies: a network scanning/detection study and a malware command-and-control study. We highlight the importance of experimental workflows and list references for papers and presentations developed under this project. We conclude with lessons learned and suggestions for future work.
Cyber testbeds provide an important mechanism for experimentally evaluating cyber security performance. However, as in any experimental discipline, reproducible cyber experimentation is essential to ensure valid, unbiased results. Even minor differences in setup, configuration, and testbed components can affect the experiments and, thus, the reproducibility of results. This paper documents a case study in reproducing an earlier emulation study, with the reproduced experiment conducted by a different research group on a different testbed. We describe lessons learned from this process, both in terms of the reproducibility of the original study and in terms of the different testbed technologies used by the two groups. This paper also addresses the question of how to compare results between the two groups' experiments, identifying candidate metrics for comparison and quantifying the results of this reproduction study.
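One hedged example of a candidate comparison metric (the measurements and the choice of Welch's t-test plus a relative-difference summary are hypothetical illustrations, not the specific metrics adopted in the paper):

```python
# Sketch only: compare a metric reported by two groups' runs of the same
# emulation experiment using Welch's t-test and a relative difference.
import numpy as np
from scipy.stats import ttest_ind

group_a = np.array([10.2, 9.8, 10.5, 10.1, 9.9])    # original study (hypothetical)
group_b = np.array([10.6, 10.3, 10.8, 10.4, 10.7])  # reproduction (hypothetical)

t_stat, p_value = ttest_ind(group_a, group_b, equal_var=False)  # Welch's t-test
rel_diff = abs(group_a.mean() - group_b.mean()) / group_a.mean()
print(f"Welch t = {t_stat:.2f}, p = {p_value:.3f}, "
      f"relative difference = {rel_diff:.1%}")
```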
Port scanning is a commonly applied technique in the discovery phase of cyber attacks, and defending against such scans has long been the subject of many research and modeling efforts. Though modeling efforts can search large parameter spaces to find effective defensive parameter settings, confidence in modeling results can be hampered by limited or omitted validation. In this paper, we introduce a novel mathematical model that describes port scanning progress by an attacker and intrusion detection by a defender. The paper further describes a set of emulation experiments that we conducted with a virtual testbed and used to validate the model. Results are presented for two scanning strategies: a slow, stealthy approach and a fast, loud approach. Estimates from the model fall within 95% confidence intervals on the means estimated from the experiments. Consequently, the model's predictive capability provides confidence in its use for evaluating and developing defensive strategies against port scanning.
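As a hedged illustration of the trade-off such a model captures (the quantities below are assumptions for exposition, not the paper's formulation): if the defender flags each probe independently with probability $p$ and the attacker probes $N$ ports at rate $r$, then
\[
P(\text{scan completes undetected}) = (1 - p)^{N},
\qquad
\mathbb{E}[\text{time to detection}] = \frac{1}{r\,p},
\]
which makes explicit how a fast, loud scan finishes quickly but is detected sooner in expectation, while a slow, stealthy scan stretches the expected time to detection at the cost of a longer campaign.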
In this paper, we report preliminary results from a novel coupling of cyber-physical emulation and interdiction optimization to better understand the impact of a CrashOverride malware attack on a notional electric system. We conduct cyber experiments in which CrashOverride issues commands to remote terminal units (RTUs) that control substations within a power control area. We identify worst-case loss-of-load outcomes with cyber interdiction optimization; the proposed approach is a bilevel formulation that incorporates RTU mappings to controllable loads, transmission lines, and generators in the upper level (attacker model) and a DC optimal power flow (DCOPF) in the lower level (defender model). Overall, our preliminary results indicate that interdiction optimization can guide the design of experiments rather than requiring a 'full factorial' approach. Likewise, for systems with important dependencies between SCADA/ICS controls and power grid operations, cyber-physical emulations should drive improved parameterization and surrogate models for use in scalable optimization techniques.
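A generic sketch of this style of bilevel interdiction problem (the notation, budget constraint, and RTU-to-asset mapping below are illustrative, not the exact formulation in the paper): the attacker chooses which RTUs to compromise, and the defender responds with a DCOPF that minimizes shed load,
\[
\max_{x \in \{0,1\}^{|R|},\ \mathbf{1}^{\top}x \le k}\ \ \min_{g,\,f,\,\theta,\,s}\ \sum_{d \in D} s_d,
\]
where the lower level enforces nodal power balance, DC power flow $f_{ij} = B_{ij}(\theta_i - \theta_j)$ on lines not disabled by the attack, generator limits $0 \le g_u \le (1 - a_u(x))\,\bar{g}_u$, and load-shed limits $0 \le s_d \le \bar{d}_d$; here $a(x)$ encodes the mapping from compromised RTUs to the transmission lines, generators, and controllable loads they disable, and $k$ is the attacker's budget.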
The Data Inferencing on Semantic Graphs (DISeG) project was a two-year investigation of applying inferencing techniques (focusing on belief propagation) to social graphs, with an emphasis on semantic graphs (also called multi-layer graphs). While working on this problem, we developed a new directed version of inferencing that we call Directed Propagation (Chapters 2 and 4) and identified new semantic graph sampling problems (Chapter 3).
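For context, standard (undirected, loopy) belief propagation, the starting point from which Directed Propagation departs, can be sketched in a few lines; the toy graph, priors, and edge potential below are hypothetical, and this is not the DISeG Directed Propagation algorithm itself:

```python
# Sketch only: loopy belief propagation for binary node labels on a small
# undirected graph, of the general kind applied to social/semantic graphs.
import numpy as np
import networkx as nx

G = nx.Graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])  # toy graph
prior = {n: np.array([0.5, 0.5]) for n in G}       # uninformative priors
prior["a"] = np.array([0.9, 0.1])                  # evidence observed at node a
edge_pot = np.array([[0.8, 0.2], [0.2, 0.8]])      # homophily edge potential

# messages[(i, j)] is the message node i sends to its neighbor j
messages = {(i, j): np.ones(2) for i in G for j in G[i]}
for _ in range(20):                                # fixed number of sweeps
    new = {}
    for i in G:
        for j in G[i]:
            # combine node i's prior with messages from neighbors other than j
            incoming = prior[i].copy()
            for k in G[i]:
                if k != j:
                    incoming *= messages[(k, i)]
            msg = edge_pot.T @ incoming            # marginalize over node i's label
            new[(i, j)] = msg / msg.sum()
    messages = new

# final belief at each node: prior times all incoming messages, normalized
for n in G:
    belief = prior[n].copy()
    for k in G[n]:
        belief *= messages[(k, n)]
    print(n, (belief / belief.sum()).round(3))
```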