Evidential classification and feature selection for cyber-threat hunting

In recent years, there has been immense research interest in applying Machine Learning (ML) to defend networked systems against cyber threats. A particular challenge in this domain is identifying and selecting appropriate features that ensure prompt and correct cyber threat detection. This work proposes a novel approach that leverages recent advances in evidence theory to provide deep insight into the effect of each feature’s uncertainty on the overall classification decision. As a result, a network security analyst may rank the features in a dataset from the most to the least ambiguous, without requiring expert domain knowledge in cyber threats. Ultimately, this enables the creation of cyber threat phenotypes, which may be used to detect and differentiate between similarly manifested cyber threats. The proposed approach is evaluated on a recent, challenging scenario of network security attacks and compared against multiple feature selection techniques. Based on the selected features, cyber threat classification analysis is performed using seven state-of-the-art ML classification algorithms. The results indicate that the proposed evidence-based feature selection method performs better than, or at least as well as, the state of the art. Against the best performing state-of-the-art technique, Decision Tree, the proposed technique’s features enabled the classification process to complete in 93.25% of the time, whilst maintaining a high F1 Score of 0.99. Furthermore, the proposed technique’s features enable a faster classification process requiring, on average, just 29.25% of the time taken across the other evaluated techniques.

History

School

  • Mechanical, Electrical and Manufacturing Engineering

Published in

Knowledge-Based Systems

Volume

226

Publisher

Elsevier

Version

AM (Accepted Manuscript)

Rights holder

© Elsevier

Publisher statement

This paper was accepted for publication in the journal Knowledge-Based Systems and the definitive published version is available at https://doi.org/10.1016/j.knosys.2021.107120.

Acceptance date

2021-05-01

Publication date

2021-05-03

Copyright date

2021

ISSN

0950-7051

Language

en

Depositor

Matt Beechey. Deposit date: 4 May 2021

Article number

107120