The DARPA SEARCHLIGHT Dataset of Application Network Traffic
ACM International Conference Proceeding Series
Researchers are in constant need of reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ~750GB dataset that includes ~2000 systematically conducted experiments and the resulting packet captures with video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset has bidirectional and encrypted traffic with complete ground truth that can be widely used for assessments and evaluation of AI/ML algorithms.