Evidential classification and feature selection for cyber-threat hunting
journal contribution posted on 2021-05-05, 08:17 authored by Matt Beechey, Kostas Kyriakopoulos, Sangarapillai Lambotharan
In recent years, there has been immense research interest in applying Machine Learning (ML) to defend networked systems from cyber threats. A particular challenge in this domain is the identification and selection of appropriate features that ensure prompt and correct cyber threat detection. This work proposes a novel approach that leverages recent advances in evidence theory to provide deep insight into the effect of each feature’s uncertainty on the overall classification decision. As a result, a network security analyst may rank the features in a dataset from the most to the least ambiguous, without requiring expert domain knowledge in cyber threats. Ultimately, this enables the creation of cyber threat phenotypes, which may be used to detect and differentiate between similarly manifested cyber threats. The proposed approach is evaluated on a recent, challenging scenario of network security attacks and compared against multiple feature selection techniques. Based on the selected features, cyber threat classification analysis is performed using seven state-of-the-art ML classification algorithms. The results indicate that the proposed evidence-based feature selection method performs better than, or at least as well as, the state of the art. Against the best performing state-of-the-art technique, Decision Tree, the proposed technique’s features enabled the classification process to take place in 93.25% of the time, whilst maintaining a high F1 Score of 0.99. Furthermore, the proposed technique’s features enable a faster classification process requiring, on average, just 29.25% of the time compared to the average across the other evaluated techniques.
- Mechanical, Electrical and Manufacturing Engineering
Published in Knowledge-Based Systems
- AM (Accepted Manuscript)
Rights holder © Elsevier
Publisher statement This paper was accepted for publication in the journal Knowledge-Based Systems and the definitive published version is available at https://doi.org/10.1016/j.knosys.2021.107120.