Publications / Conference Paper

The DARPA SEARCHLIGHT Dataset of Application Network Traffic

Ardi, Calvin; Aubry, Connor; Kocoloski, Brian; Deangelis, Dave; Hussain, Alefiya; Troglia, Matthew; Schwab, Stephen

Researchers constantly need reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ~750 GB dataset comprising ~2000 systematically conducted experiments and the resulting packet captures of video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset contains bidirectional, encrypted traffic with complete ground truth and can be used widely to assess and evaluate AI/ML algorithms.