Statistical Network Behavior Based Threat Detection

01 May 2017

Malware, short for malicious software, continues to morph and change, and anti-virus software may have trouble detecting malicious software that has not been seen before. By employing machine learning techniques, one can learn the general behavior patterns of different threat types and use them to detect previously unseen variants of those threats. We have developed a machine-learning-based malware detection system that uses features derived from a user's network flows to external hosts. A novel aspect of our technique is that user features are built from two classes of communicating hosts, those that are common and those that are rare, with features derived for each class separately. The network data used to train the detector comes from malware samples run in a sandbox and from normal users' traffic collected from an LTE wireless network provider. Specifically, we use the AdaBoost algorithm as the classification engine and obtain good performance, with a ~2% false positive rate and ~94% accuracy for detecting threats mixed with normal traffic. We also provide high- and low-confidence regions for detecting subclasses of threats based on their types.
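To illustrate the idea of deriving per-user features separately for common and rare hosts, the following is a minimal sketch and not the system described above. The flow record layout, the host names, the `min_users` threshold used to call a host "common", and the four feature names are all assumptions made for illustration; the abstract does not specify how hosts are classified or which features are computed.

```python
from collections import defaultdict

# Hypothetical flow records: (user, external_host, bytes_sent).
FLOWS = [
    ("u1", "cdn.example.com", 1200),
    ("u2", "cdn.example.com", 800),
    ("u3", "cdn.example.com", 500),
    ("u1", "mail.example.com", 300),
    ("u2", "mail.example.com", 700),
    ("u3", "odd-host.example.net", 9000),
]


def split_hosts(flows, min_users=2):
    """Return the set of 'common' hosts.

    Assumption: a host is common if at least `min_users` distinct users
    contact it; every other host is treated as rare.
    """
    users_per_host = defaultdict(set)
    for user, host, _ in flows:
        users_per_host[host].add(user)
    return {h for h, users in users_per_host.items() if len(users) >= min_users}


def user_features(flows, common_hosts):
    """Aggregate flow counts and byte totals per user, per host class."""
    feats = defaultdict(lambda: {
        "common_flows": 0, "common_bytes": 0,
        "rare_flows": 0, "rare_bytes": 0,
    })
    for user, host, nbytes in flows:
        kind = "common" if host in common_hosts else "rare"
        feats[user][kind + "_flows"] += 1
        feats[user][kind + "_bytes"] += nbytes
    return dict(feats)
```

Calling `user_features(FLOWS, split_hosts(FLOWS))` yields one feature dictionary per user. In a pipeline like the one outlined in the abstract, vectors of this kind would then be labeled (sandboxed malware versus normal traffic) and passed to a boosted classifier such as AdaBoost.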