A marginalized random effects hurdle negative binomial model for analyzing refined-scale crash frequency data
Rongjie Yu
Yiyun Wang
Mohammed Quddus
Jian Li
2134/37857
https://repository.lboro.ac.uk/articles/journal_contribution/A_marginalized_random_effects_hurdle_negative_binomial_model_for_analyzing_refined-scale_crash_frequency_data/9458564
Crash frequency prediction models, which reveal the relationship between crash
occurrences and their influencing factors, have long been an important subject of safety research.
Recently, hourly refined-scale crash frequency analysis has become attractive because it allows time-varying explanatory information (e.g.
traffic volume and operating speed) to be introduced. However, crash frequency data with short time
intervals raise the analytical issues of excessive zeros and unobserved heterogeneity. In this study, a marginalized random effects hurdle negative binomial (MREHNB)
model was developed in which the hurdle modelling structure handles the excessive
zeros issue and site-specific random effect terms capture the factors associated with
unobserved heterogeneity. Moreover, the marginalized inference approach was
introduced here for the first time to obtain marginal mean inference for the overall population rather
than subject-specific estimates. Empirical analyses were conducted based on data
from the Shanghai urban expressway system, and the MREHNB model was compared
with the hurdle negative binomial (HNB) and random effects hurdle
negative binomial (REHNB) models. In terms of goodness of fit, the REHNB and MREHNB
models showed substantial improvement over the HNB model, while there was
no distinct difference between the REHNB and MREHNB models. For the
estimated parameters, however, the MREHNB model provided more precise inference.
Furthermore, the MREHNB model provided interesting findings on crash
contributing factors: for example, a higher ratio of local vehicles within the traffic volume
increased the probability of crash occurrence, and a non-linear relationship was
found between traffic volume and crash frequency, with moderate volume
levels having the highest crash occurrence probability. Finally, the modeling results
and the modeling technique are discussed in depth.
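As an illustration of the hurdle structure described above, the following is a minimal sketch of a hurdle negative binomial distribution. It is not the authors' implementation. The NB2 parameterization (mean `mu`, dispersion `alpha`) and the function names `nb_pmf`, `hurdle_nb_pmf`, and `hurdle_nb_mean` are assumptions made for this example, and the site-specific random effects and the marginalization step are not included.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf (assumed NB2 form: variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha  # inverse dispersion
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def hurdle_nb_pmf(y, p_pos, mu, alpha):
    """Hurdle NB pmf: a binary part decides zero vs. positive counts,
    and positive counts follow a zero-truncated negative binomial."""
    if y == 0:
        return 1.0 - p_pos  # probability of not crossing the hurdle
    truncation = 1.0 - nb_pmf(0, mu, alpha)  # renormalize over y >= 1
    return p_pos * nb_pmf(y, mu, alpha) / truncation


def hurdle_nb_mean(p_pos, mu, alpha):
    """Overall mean count implied by the two parts of the hurdle model."""
    return p_pos * mu / (1.0 - nb_pmf(0, mu, alpha))
```

Here `p_pos` is the probability that at least one crash occurs in an interval. With hourly data this probability is small, so most of the probability mass sits at zero without forcing the count part of the model to account for it.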
2019-06-03 08:51:45
Marginalized model
Site-specific random effects term
Hurdle negative binomial model
Excessive zeros
Unobserved heterogeneity
Built Environment and Design not elsewhere classified