A Signal Processing Approach for Cyber Data Classification with Deep Neural Networks
Procedia Computer Science
Recent cyber security events have demonstrated the need for algorithms that adapt to the rapidly evolving threat landscape of complex network systems. In particular, human analysts often fail to identify data exfiltration when it is encrypted or disguised as innocuous data. Signature-based approaches for identifying data types are easily fooled and analysts can only investigate a small fraction of network events. However, neural networks can learn to identify subtle patterns in a suitably chosen input space. To this end, we have developed a signal processing approach for classifying data files which readily adapts to new data formats. We evaluate the performance for three input spaces consisting of the power spectral density, byte probability distribution and sliding-window entropy of the byte sequence in a file. By combining all three, we trained a deep neural network to discriminate amongst nine common data types found on the Internet with 97.4% accuracy.