Forecasting monthly airline passenger numbers with small datasets using feature engineering and a modified principal component analysis

Al-Ruzeiqi, Sara

doi:10.26174/thesis.lboro.12249779.v1

thesis

posted on 2020-05-14, 15:43 authored by Sara Al-Ruzeiqi

In this study, a machine learning approach based on time series models, feature engineering, feature extraction, and feature derivation is proposed to improve air passenger forecasting. Different types of datasets were created to extract new features from the core data. An experiment was undertaken with artificial neural networks to test the performance of neurons in the hidden layer, to optimise the dimensions of all layers, and to obtain an optimal choice of connection weights, so that the nonlinear optimisation problem could be solved directly. A method of tuning deep learning models using H2O (a feature-rich, open source machine learning platform known for its R and Spark integration and its ease of use) is also proposed, in which the network model is trained on samples of selected features from the dataset in order to ensure diversity of the samples and to improve training. A successful application of deep learning requires setting numerous parameters in order to achieve greater model accuracy. The number of hidden layers and the number of neurons in each layer are key parameters of such a network. Hyper-parameter tuning approaches such as grid search and random search aid in setting these important parameters. Moreover, a new ensemble strategy is suggested that shows potential to optimise parameter settings and hence save computational resources throughout the tuning process of the models. The main objective, besides improving the performance metric, is to obtain a distribution on some hold-out datasets that resembles the original distribution of the training data. Particular attention is focused on creating a modified version of Principal Component Analysis (PCA) that uses a different correlation matrix, obtained from a correlation coefficient based on kinetic energy, to derive new features. The data were collected from several airline datasets to build a deep prediction model for forecasting airline passenger numbers.
Preliminary experiments show that fine-tuning provides an efficient approach for tuning the ultimate number of hidden layers and the number of neurons in each layer when compared with the grid search method. Similarly, the results show that the modified version of PCA is more effective in data dimension reduction, class separability, and classification accuracy than traditional PCA.
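
The idea of running PCA on a correlation matrix that can be swapped for a different one may be illustrated with a minimal sketch. The abstract does not define the kinetic-energy-based correlation coefficient, so the sketch below uses the standard Pearson coefficient as a stand-in; the function names (`pearson_corr`, `pca_from_correlation`), the `corr_fn` parameter, and the synthetic data are all assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def pearson_corr(X):
    """Standard Pearson correlation matrix of the columns of X.

    Stand-in only: the thesis uses a different, kinetic-energy-based
    coefficient whose definition is not given in the abstract.
    """
    return np.corrcoef(X, rowvar=False)


def pca_from_correlation(X, corr_fn=pearson_corr, n_components=2):
    """PCA via eigendecomposition of a pluggable correlation matrix."""
    # Standardise each feature (column) to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Build the correlation matrix with whichever coefficient is supplied,
    # then symmetrise it to guard against small numerical asymmetries.
    R = corr_fn(X)
    R = (R + R.T) / 2.0

    # eigh is appropriate for symmetric matrices; it returns eigenvalues
    # in ascending order, so reorder to pick the largest ones first.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    explained = eigvals[order] / eigvals.sum()

    # Project the standardised data onto the leading eigenvectors.
    scores = Z @ components
    return scores, components, explained


# Synthetic example: 36 "monthly" rows and 4 features (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(36, 4))
X[:, 1] = 0.9 * X[:, 0] + 0.1 * X[:, 1]  # make two features correlated

scores, components, explained = pca_from_correlation(X)
```

Under these assumptions, a modified PCA of the kind the abstract describes would correspond to passing a different `corr_fn` that returns a symmetric feature-by-feature matrix; the rest of the procedure stays unchanged.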

School

Science

Department

Computer Science

Publisher

Loughborough University

Rights holder

Publication date

2019

Notes

A doctoral thesis. Submitted in partial fulfilment of the requirements for the award of Doctor of Philosophy of Loughborough University.

Language

en

Supervisor(s)

Christian W. Dawson

Qualification name

PhD

Qualification level

Doctoral

This submission includes a signed certificate in addition to the thesis file(s)

I have submitted a signed certificate

Administrator link

https://repository.lboro.ac.uk/account/articles/12251135

Keywords

Feature Engineering, Deep Learning, Principal Component Analysis (PCA), algorithm, prediction, Information and Computing Sciences not elsewhere classified

Licence

CC BY-NC-ND 4.0
