Time-series event-based prediction: an unsupervised learning framework based on genetic programming

In this paper, we propose an unsupervised learning framework based on Genetic Programming (GP) to predict the position of a user-defined target event in a time-series. GP is used to automatically build a library of candidate temporal features. The proposed framework receives a training set $S = \{(V_a) \mid a = 0, \ldots, n\}$, where each $V_a$ is a time-series vector such that $\forall V_a \in S,\ V_a = \{(x_t) \mid t = 0, \ldots, t_{max}\}$, where $t_{max}$ is the size of the time-series. All $V_a \in S$ are assumed to be generated from the same environment. The framework uses a divide-and-conquer strategy for the training phase, which works as follows. The user specifies the target event to be predicted (e.g., the highest value, the second-highest value, etc.). The framework then classifies the training samples into different bins, where $Bins = \{(b_i) \mid i = 0, \ldots, t_{max}\}$, based on the time-slot $t$ of the target event in each training sample $V_a$. Each $b_i \in Bins$ contains a subset of $S$. For each $b_i$, the framework further classifies its samples into statistically independent clusters. To achieve this, each $b_i$ is treated as an independent problem in which GP is used to evolve programs that extract statistical features from the members of $b_i$ and classify them into different clusters using the K-Means algorithm. At the end of the training process, GP is used to build an 'event detector' that receives an unseen time-series and predicts the time-slot where the target event is expected to occur. Empirical evidence on artificially generated data and real-world data shows that the proposed framework significantly outperforms standard Radial Basis Function Networks, a standard GP system, Gaussian Process regression, Linear regression, and Polynomial regression.
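The binning and per-bin clustering steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`target_slot`, `build_bins`, `features`, `k_means`) are hypothetical, the target event is assumed to be the highest value, the two hand-written features (mean and standard deviation) merely stand in for the GP-evolved feature extractors, and the number of clusters `k` is fixed by hand because the abstract does not say how it is chosen. The final GP-built event detector is not sketched.

```python
from collections import defaultdict
from statistics import mean, pstdev


def target_slot(series):
    """Time-slot of the target event; assumed here to be the highest value."""
    return max(range(len(series)), key=lambda t: series[t])


def build_bins(training_set):
    """Group the indices a of the series V_a by the slot of their target event."""
    bins = defaultdict(list)
    for a, series in enumerate(training_set):
        bins[target_slot(series)].append(a)
    return dict(bins)


def features(series):
    """Hand-written stand-in (assumption) for the GP-evolved feature extractors."""
    return (mean(series), pstdev(series))


def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded deterministically with the first k points."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


# Toy training set S: four series of length 4 (illustrative data only).
S = [
    [1, 5, 2, 0],
    [0, 9, 1, 1],
    [3, 1, 0, 8],
    [10, 50, 20, 0],
]

bins = build_bins(S)  # {1: [0, 1, 3], 3: [2]}

# Treat bin b_1 as an independent problem and cluster its members.
members = bins[1]
labels = k_means([features(S[a]) for a in members], k=2)  # [0, 0, 1]
```

In this toy run, three series peak at slot 1 and therefore share a bin; within that bin, the two low-amplitude series fall into one cluster and the high-amplitude series into another.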