Wasserstein Generative Adversarial Network to Address the Imbalanced Data Problem in Real-Time Crash Risk Prediction
Date
2022
Language
en
Subject
Abstract
Real-time crash risk prediction models aim to identify pre-crash conditions as part of active traffic safety management. However, traditional models, which were mainly developed through matched case-control sampling, have been criticised for their biased estimates. In this study, the state-of-the-art class balancing method known as the Wasserstein Generative Adversarial Network (WGAN) was introduced to address the class imbalance problem in model development. An extremely imbalanced dataset consisting of 257 crashes and over 10 million non-crash cases from the M1 Motorway in the United Kingdom for 2017 was then used to evaluate the proposed method. The real-time crash prediction model was developed using a Deep Neural Network (DNN) and Logistic Regression (LR). Crash predictions were performed under different crash to non-crash ratios, where synthetic crashes were generated by WGAN, the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN) sampling, respectively. Comparisons were then made with algorithmic-level class balancing methods such as cost-sensitive learning and ensemble methods. Our findings suggest that WGAN clearly outperforms the other oversampling methods in handling the extremely imbalanced sample, and the DNN model subsequently achieves a crash prediction sensitivity of about 70% with a 5% false alarm rate. Based on the findings of this study, proactive traffic management strategies including Variable Speed Limit (VSL) and Dynamic Message Signs (DMS) could be deployed to reduce the probability of crash occurrence. © 2000-2011 IEEE.
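The abstract reports performance as sensitivity at a fixed false alarm rate. The sketch below shows one way such a figure could be computed from model scores. It is illustrative only and is not the authors' evaluation code: the function name `sensitivity_at_false_alarm_rate`, its parameters, and the thresholding rule (flag a crash when the score is strictly above the threshold) are all assumptions.

```python
import numpy as np


def sensitivity_at_false_alarm_rate(y_true, scores, max_far=0.05):
    """Pick a score threshold whose false alarm rate does not exceed
    max_far, then report the sensitivity at that threshold.

    Hypothetical helper: y_true holds 1 for crash and 0 for non-crash,
    scores holds predicted crash risk (higher means riskier), and
    max_far must lie in [0, 1).
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)

    neg = np.sort(scores[~y_true])  # non-crash scores, ascending
    pos = scores[y_true]            # crash scores

    n = neg.size
    # Largest number of non-crash cases we may flag as crashes.
    k = int(np.floor(max_far * n))
    # Threshold at the (k + 1)-th largest non-crash score, so at most
    # k non-crash scores lie strictly above it.
    threshold = neg[n - k - 1]

    far = float(np.mean(neg > threshold))          # false alarm rate
    sensitivity = float(np.mean(pos > threshold))  # true positive rate
    return threshold, sensitivity, far
```

Because the threshold is set from the non-crash scores alone, the same routine could be applied to each model and each crash to non-crash ratio, so that sensitivities are compared at an equal false alarm budget.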