Time series count data models: an empirical application to traffic accidents

Quddus, Mohammed A.

journal contribution

posted on 2009-09-15, 12:56 authored by Mohammed A. Quddus

Count data are primarily categorised as cross-sectional, time series, and panel. Over the past decade, Poisson and Negative Binomial (NB) models have been used widely to analyse cross-sectional and time series count data, and random effect and fixed effect Poisson and NB models have been used to analyse panel count data. However, recent literature suggests that although the underlying distributional assumptions of these models are appropriate for cross-sectional count data, they cannot account for the serial correlation often found in pure time series count data. Real-valued time series models, such as the autoregressive integrated moving average (ARIMA) model introduced by Box and Jenkins, have been used in many applications over the last few decades. However, when modelling non-negative integer-valued data such as traffic accidents at a junction over time, Box and Jenkins models may be inappropriate. This is mainly due to the normality assumption of errors in the ARIMA model. Over the last few years, a new class of time series models known as integer-valued autoregressive (INAR) Poisson models has been studied by many authors. This class of models is particularly applicable to the analysis of time series count data because these models retain the properties of Poisson regression and are able to deal with serial correlation, and it therefore offers an alternative to the real-valued time series models. The primary objective of this paper is to introduce the class of INAR models for the time series analysis of traffic accidents in Great Britain. Different types of time series count data are considered: aggregated time series data where both the spatial and temporal units of observation are relatively large (e.g., Great Britain and years) and disaggregated time series data where both the spatial and temporal units are relatively small (e.g., congestion charging zone and months).
The performance of the INAR models is compared with the class of Box and Jenkins real-valued models. The results suggest that the performance of these two classes of models is quite similar in terms of coefficient estimates and goodness of fit for the case of aggregated time series traffic accident data. This is because the mean of the counts is high, in which case the normal approximation, and hence the ARIMA model, may be satisfactory. However, the performance of INAR Poisson models is found to be much better than that of the ARIMA model for the case of the disaggregated time series traffic accident data, where the counts are relatively low. The paper ends with a discussion of the limitations of INAR models in dealing with seasonality and unobserved heterogeneity.
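
The abstract does not state the model equations, so the following is only a minimal sketch of how a first-order INAR Poisson process is commonly written: each count is a binomially thinned copy of the previous count plus a Poisson innovation. It is not the paper's estimated specification (which may include covariates). The function names and the parameter values alpha = 0.5 and lam = 1.0 are illustrative assumptions, and only the Python standard library is used.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def binomial_thin(rng, count, alpha):
    """Binomial thinning: each of `count` events survives with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate y_t = thin(y_{t-1}, alpha) + e_t with e_t ~ Poisson(lam).

    Assumed textbook INAR(1) form; hypothetical helper, not taken from the paper.
    """
    rng = random.Random(seed)
    # Start from a draw at the stationary mean lam / (1 - alpha).
    y = poisson_draw(rng, lam / (1.0 - alpha))
    series = []
    for _ in range(n):
        y = binomial_thin(rng, y, alpha) + poisson_draw(rng, lam)
        series.append(y)
    return series


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return cov / var


if __name__ == "__main__":
    counts = simulate_inar1(n=5000, alpha=0.5, lam=1.0, seed=42)
    print("sample mean:", sum(counts) / len(counts))
    print("lag-1 autocorrelation:", lag1_autocorrelation(counts))
```

Under these assumptions the simulated series stays integer-valued and non-negative, its mean settles near lam / (1 - alpha), and its lag-1 autocorrelation is close to alpha. This is the kind of low-count, serially correlated behaviour that a Gaussian ARIMA error term approximates poorly.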

History

School

Architecture, Building and Civil Engineering

Citation

QUDDUS, M.A., 2008. Time series count data models: an empirical application to traffic accidents. Accident Analysis and Prevention, 40 (5), pp. 1732-1741.

Publisher

© Elsevier

Version

AM (Accepted Manuscript)

Publication date

2008

Notes

This article was published in the journal, Accident Analysis and Prevention [© Elsevier]. The definitive version is available at: http://dx.doi.org/10.1016/j.aap.2008.06.011