Most information sources today generate data sequentially and rapidly, in the form of data streams.
The evolving nature of the underlying processes often changes the data distribution, a phenomenon
known as concept drift, which is difficult to detect and reduces the accuracy of supervised
learning algorithms. As a consequence, online machine learning algorithms that can update
themselves as the data distribution changes are required.
Although many strategies have been developed to tackle this problem, most of them are designed
for classification problems. Therefore, in the domain of regression problems, there is a need for the
development of accurate algorithms with dynamic updating mechanisms that can operate in a
computational time compatible with today’s demanding market. In this article, the authors propose
a new bagging ensemble approach based on Neural Network with Random Weights for online data
stream regression. The proposed method improves prediction accuracy and reduces the required
computational time compared to a recent algorithm for online data stream regression from the
literature. The experiments are carried out using four synthetic datasets to evaluate
the algorithm's response to concept drift, along with four benchmark datasets from different
industries. The results indicate improved prediction accuracy, effective handling of
concept drift and much shorter update times than the existing approach.
Additionally, the use of Design of Experiments as an effective tool for hyperparameter tuning is
demonstrated.
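
To make the idea of a bagging ensemble of random-weight networks for stream regression more concrete, the following Python sketch may help. It is not the authors' algorithm: the abstract does not specify the update rule, so the Poisson(1) resampling used for online bagging, the recursive least-squares update of the output weights, and all names (RandomWeightNet, OnlineBaggingRegressor, prequential_errors, n_hidden, reg) are assumptions made purely for illustration. The sketch also contains no explicit concept drift mechanism.

```python
import numpy as np


class RandomWeightNet:
    """Single-hidden-layer network whose hidden weights are random and fixed.

    Only the linear output weights are learned. Here they are updated one
    sample at a time with recursive least squares (an assumed choice).
    """

    def __init__(self, n_inputs, n_hidden, rng, reg=1e-2):
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))  # fixed
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)              # fixed
        self.beta = np.zeros(n_hidden)                               # learned
        self.P = np.eye(n_hidden) / reg  # inverse of the regularised Gram matrix

    def _hidden(self, x):
        return np.tanh(x @ self.W + self.b)

    def predict(self, x):
        return float(self._hidden(x) @ self.beta)

    def update(self, x, y):
        # Standard recursive least-squares step for the output weights.
        h = self._hidden(x)
        Ph = self.P @ h
        gain = Ph / (1.0 + h @ Ph)
        self.beta += gain * (y - h @ self.beta)
        self.P -= np.outer(gain, Ph)


class OnlineBaggingRegressor:
    """Ensemble that averages several RandomWeightNet members.

    Each member sees every incoming sample k times, with k drawn from
    Poisson(1), which approximates bootstrap resampling on a stream.
    """

    def __init__(self, n_inputs, n_models=10, n_hidden=20, seed=0):
        self.rng = np.random.default_rng(seed)
        self.models = [
            RandomWeightNet(n_inputs, n_hidden, self.rng) for _ in range(n_models)
        ]

    def predict(self, x):
        return float(np.mean([m.predict(x) for m in self.models]))

    def update(self, x, y):
        for m in self.models:
            for _ in range(self.rng.poisson(1.0)):
                m.update(x, y)


def prequential_errors(model, X, y):
    """Test-then-train loop: predict each sample before learning from it."""
    errors = []
    for xi, yi in zip(X, y):
        errors.append(abs(model.predict(xi) - yi))
        model.update(xi, yi)
    return np.array(errors)
```

Because every member only solves a small linear problem for its output weights, each update is cheap, which is one plausible reason such ensembles can be fast to update on a stream. A drift-aware variant would need an extra component, for example a forgetting factor or member replacement, which this sketch deliberately leaves out.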
Funding
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001
Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University
School
Mechanical, Electrical and Manufacturing Engineering
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.