Empirical study of automatic dataset labelling

Aparicio-Navarro, Francisco; Kyriakopoulos, Kostas; Parish, David

conference contribution

posted on 2015-05-12, 15:14 authored by Francisco Aparicio-Navarro, Kostas Kyriakopoulos, David Parish

Correctly labelled datasets are commonly required. Three particular scenarios that showcase this need are highlighted. The first is the use of supervised Intrusion Detection Systems (IDSs), which need labelled datasets for their training process. The second is the evaluation of IDSs: the real nature of the analysed datasets must be known in order to assess how efficiently an IDS detects intrusions. The third is the use of feature selection, which works only if the processed datasets are labelled. In normal conditions, collecting labelled datasets from real communication networks is impossible. In previous work, we developed a novel approach to automatically generate labelled network traffic datasets using an unsupervised anomaly-based IDS. The approach was empirically shown to be an efficient unsupervised labelling approach, but it was evaluated using a single dataset. This paper extends our previous work by using a greater number of datasets, gathered from a real IEEE 802.11 network testbed. The datasets contain different wireless-specific attacks. This paper also proposes a new and more precise method to calculate the boundary threshold used in the labelling process.

Funding

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) Grant number EP/K014307/1 and the MOD University Research Collaboration in Signal Processing.

History

School

Mechanical, Electrical and Manufacturing Engineering

Published in

2014 9th International Conference for Internet Technology and Secured Transactions, ICITST 2014

Pages

372 - 378

Citation

APARICIO-NAVARRO, F.J., KYRIAKOPOULOS, K.G. and PARISH, D.J., 2015. Empirical study of automatic dataset labelling. IN: Proceedings of the 9th International Conference for Internet Technology and Secured Transactions, ICITST 2014, pp. 372-378.

Publisher

Version

AM (Accepted Manuscript)

Publication date

2015

Notes

© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.