Generalisation Power Analysis for finding a stable set of features using evolutionary computation feature selection algorithms

Salesi, Sadegh; Cosma, Georgina

journal contribution

posted on 2021-09-06, 08:36 authored by Sadegh Salesi, Georgina Cosma

Evolutionary Computation (EC) algorithms are powerful techniques for feature selection tasks; however, they reach different solutions in each run, and this is known as the stability issue. Existing solutions to finding a stable subset of features when using an EC algorithm include aggregation and frequency-based methods. These methods may return feature subsets that achieve weak or inconsistent classification performance when used to build classifiers, a limitation known as ‘lack of generalisation power’. To address this limitation, this paper proposes a novel algorithm called Generalisation Power Analysis (GPA) that measures the performance of feature subsets in terms of generalisation power and hence evaluates their ability to achieve optimal or near-optimal accuracy over multiple classifiers. GPA has been designed to work with the stochastic nature of EC algorithms. Experiments with eleven benchmark datasets revealed that the proposed GPA approach consistently outperformed alternative methods in finding subsets that achieved high generalisation power. Although GPA requires more computation time than alternative approaches because it embeds multiple classifiers, the advantages of using GPA during feature selection outweigh this limitation, since the outcome is a robust prediction model developed using a subset of features that is not biased towards a specific classifier.
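
The notion of judging a feature subset by how often it reaches near-optimal accuracy across several classifiers can be illustrated with a small sketch. This is not the GPA algorithm or the Generalisation Power Index defined in the paper: the function name `near_optimal_share`, the tolerance `tol`, the subset and classifier labels, and the accuracy figures are all hypothetical and chosen only to make the idea concrete.

```python
from typing import Dict


def near_optimal_share(
    acc: Dict[str, Dict[str, float]], tol: float = 3.0
) -> Dict[str, float]:
    """For each feature subset, return the share of classifiers on which
    its accuracy is within `tol` percentage points of the best accuracy
    any subset achieved with that classifier.

    `acc[subset][classifier]` is an accuracy in percent. This scoring
    rule is an illustrative assumption, not the paper's definition.
    """
    classifiers = sorted({c for row in acc.values() for c in row})
    # Best accuracy observed per classifier across all candidate subsets.
    best = {c: max(row[c] for row in acc.values()) for c in classifiers}
    return {
        subset: sum(row[c] >= best[c] - tol for c in classifiers) / len(classifiers)
        for subset, row in acc.items()
    }


# Made-up accuracies (percent) for subsets returned by three EC runs.
acc = {
    "run_1": {"knn": 90.0, "svm": 80.0, "tree": 70.0},
    "run_2": {"knn": 88.0, "svm": 87.0, "tree": 86.0},
    "run_3": {"knn": 70.0, "svm": 88.0, "tree": 75.0},
}

scores = near_optimal_share(acc, tol=3.0)
most_general = max(scores, key=scores.get)
print(scores, most_general)
```

In this toy data the subset from `run_1` has the highest single accuracy, yet only the subset from `run_2` stays close to the best result on every classifier, which is the kind of classifier-independent behaviour the abstract refers to as generalisation power.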

Funding

The Leverhulme Trust Research Project Grant RPG-2016-252 entitled “Novel Approaches for Constructing Optimised Multimodal Data Spaces”

History

School

Science

Department

Computer Science

Published in

Knowledge-Based Systems

Volume

231

Publisher

Elsevier BV

Version

AM (Accepted Manuscript)

Publisher statement

This paper was accepted for publication in the journal Knowledge-Based Systems and the definitive published version is available at https://doi.org/10.1016/j.knosys.2021.107450.

Acceptance date

2021-08-24

Publication date

2021-08-27

Copyright date

2021

DOI

https://doi.org/10.1016/j.knosys.2021.107450

ISSN

0950-7051

Publisher version

https://doi.org/10.1016/j.knosys.2021.107450

Language

en

Depositor

Dr Georgina Cosma. Deposit date: 29 August 2021

Article number

107450

Keywords

Feature selection; Generalisation Power Analysis; Generalisation Power Index; Machine learning; Evolutionary computation; Feature selection stability

Licence

CC BY-NC-ND 4.0
