Loughborough University

Analysis of hidden Markov model learning algorithms for the detection and prediction of multi-stage network attacks

journal contribution
posted on 2020-03-09, 15:43 authored by Timothy Chadza, Kostas Kyriakopoulos, Sangarapillai Lambotharan
Hidden Markov Models have been extensively used for determining computer systems under a Multi-Stage Network Attack (MSA); however, acquisition of optimal model training parameters remains a formidable challenge. This paper critically analyses the detection and prediction accuracy of a wide range of training and initialisation algorithms including the expectation-maximisation, spectral, Baum-Welch, differential evolution, K-means, and segmental K-means. The performance of these algorithms has been evaluated, both individually and in a hybrid approach, for detecting all the states and current state, and predicting the next state (NS), and the next observation (NO) of a given alert observation sequence. For generating this alert sequence, the Snort signature-based intrusion detection system was utilised, using either bespoke or default rules, to raise alerts while examining the DARPA 2000 MSA dataset. The investigation also sheds further light on alternative approaches for forecasting the possible NS and NO in an MSA campaign, as well as the impact of window size on the prediction performance for all analysed techniques. The results and discussion emphasise the appropriateness of various techniques for the prediction of NS and NO. Furthermore, NO prediction accuracy has indicated a performance increase of up to 44.95% in the proposed hybrid approaches.

History

School

  • Mechanical, Electrical and Manufacturing Engineering

Published in

Future Generation Computer Systems

Volume

108

Issue

July 2020

Pages

636 - 649

Publisher

Elsevier

Version

  • AM (Accepted Manuscript)

Rights holder

© Elsevier

Publisher statement

This paper was accepted for publication in the journal Future Generation Computer Systems and the definitive published version is available at https://doi.org/10.1016/j.future.2020.03.014

Acceptance date

2020-03-03

Publication date

2020-03-09

Copyright date

2020

ISSN

0167-739X

Language

  • en

Depositor

Dr Kostas Kyriakopoulos. Deposit date: 8 March 2020
