Posted on 2009-09-15, 12:56. Authored by Mohammed A. Quddus
Count data are primarily categorised as cross-sectional, time series, and panel. Over the past decade,
Poisson and Negative Binomial (NB) models have been used widely to analyse cross-sectional and time
series count data, and random effect and fixed effect Poisson and NB models have been used to analyse panel
count data. However, recent literature suggests that although the underlying distributional assumptions
of these models are appropriate for cross-sectional count data, they are not capable of taking into account
the effect of serial correlation often found in pure time series count data. Real-valued time series models,
such as the autoregressive integrated moving average (ARIMA) model introduced by Box and Jenkins,
have been used in many applications over the last few decades. However, when modelling non-negative
integer-valued data such as traffic accidents at a junction over time, Box and Jenkins models may be
inappropriate. This is mainly due to the normality assumption of errors in the ARIMA model. Over the
last few years, a new class of time series models, known as integer-valued autoregressive (INAR) Poisson
models, has been studied by many authors. This class of models is particularly applicable to the analysis
of time series count data because these models retain the properties of Poisson regression and are able to
deal with serial correlation, and they therefore offer an alternative to the real-valued time series models.
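To illustrate how an INAR model combines Poisson counts with serial correlation, the sketch below simulates a first-order INAR process. The abstract does not state the exact specification used in the paper, so this assumes the standard INAR(1) form with binomial thinning and Poisson innovations: each count is the number of "survivors" from the previous count (each kept with probability alpha) plus a fresh Poisson arrival with mean lam. The function names, parameter values, and seed are hypothetical and chosen only for illustration.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method; suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def binomial_thin(rng, alpha, count):
    """Binomial thinning: keep each of `count` events with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(alpha, lam, n, seed=0):
    """Simulate n steps of y_t = thin(alpha, y_{t-1}) + Poisson(lam) (assumed form)."""
    rng = random.Random(seed)
    # Start from a draw with the stationary mean lam / (1 - alpha).
    y = poisson_draw(rng, lam / (1.0 - alpha))
    series = []
    for _ in range(n):
        y = binomial_thin(rng, alpha, y) + poisson_draw(rng, lam)
        series.append(y)
    return series


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    numer = sum((xs[t] - mean) * (xs[t - 1] - mean) for t in range(1, n))
    return numer / denom


if __name__ == "__main__":
    counts = simulate_inar1(alpha=0.5, lam=2.0, n=5000, seed=1)
    print("sample mean:", sum(counts) / len(counts))
    print("lag-1 autocorrelation:", lag1_autocorr(counts))
```

Under these assumptions the series stays non-negative and integer-valued by construction, its stationary mean is lam / (1 - alpha), and its lag-1 autocorrelation is alpha, which is the sense in which such a model keeps Poisson-type counts while allowing for serial correlation.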
The primary objective of this paper is to introduce the class of INAR models for the time series analysis of
traffic accidents in Great Britain. Different types of time series count data are considered: aggregated time
series data where both the spatial and temporal units of observation are relatively large (e.g., Great Britain
and years) and disaggregated time series data where both the spatial and temporal units are relatively
small (e.g., congestion charging zone and months). The performance of the INAR models is compared
with the class of Box and Jenkins real-valued models. The results suggest that the performance of these
two classes of models is quite similar in terms of coefficient estimates and goodness of fit for the case of
aggregated time series traffic accident data. This is because the mean of the counts is high, in which case
the normal approximations and the ARIMA model may be satisfactory. However, the performance of INAR
Poisson models is found to be much better than that of the ARIMA model for the case of the disaggregated
time series traffic accident data, where the counts are relatively low. The paper ends with a discussion of
the limitations of INAR models in dealing with seasonality and unobserved heterogeneity.
History
School
Architecture, Building and Civil Engineering
Citation
QUDDUS, M.A., 2008. Time series count data models: an empirical application to traffic accidents. Accident Analysis and Prevention, 40 (5), pp. 1732-1741.