Article

Application of Machine Learning Techniques to Predict the Occurrence of Distraction-affected Crashes with Phone-Use Data

Journal

TRANSPORTATION RESEARCH RECORD
Volume 2676, Issue 2, Pages 692-705

Publisher

SAGE PUBLICATIONS INC
DOI: 10.1177/03611981211045371

Keywords

distraction-affected crashes; machine learning; XGBoost; SHAP; phone use while driving

This study examines whether phone-use information helps predict distraction-affected crashes, applying machine learning techniques such as XGBoost for analysis and prediction. The results indicate a close relationship between phone use and distraction-affected crashes. The study also finds that distraction-affected crashes are more likely to occur on road segments with higher exposure, uneven traffic flow conditions, or a higher volume of medium trucks.
Distraction occurs when a driver's attention is diverted from driving to a secondary task. The number of distraction-affected crashes has been increasing in recent years. Accurately predicting these crashes is critical for roadway agencies seeking to reduce distracted driving behaviors and the crashes that result from them. Phone-use data and machine learning techniques have become increasingly available to safety researchers and can potentially improve the prediction of distraction-affected crashes. This study therefore first examines whether phone-use events provide essential information about distraction-affected crashes. The authors apply a machine learning technique, XGBoost, under two scenarios (with and without phone-use events) and compare its performance with two conventional statistical models: a logistic regression model and a mixed-effects logistic regression model. The comparison demonstrates the superiority of XGBoost over logistic regression on a high-dimensional, unbalanced dataset. Further, the study implements SHAP (SHapley Additive exPlanation) to interpret the results, analyze the importance of individual features related to distraction-affected crashes, and test whether it can improve prediction accuracy. The trained XGBoost model achieves a sensitivity of 91.59%, a specificity of 85.92%, and an accuracy of 88.72%. The XGBoost and SHAP results suggest that: (1) phone-use information is an important factor associated with the occurrence of distraction-affected crashes; and (2) distraction-affected crashes are more likely to occur on roadway segments with higher exposure (i.e., length and traffic volume), uneven traffic flow conditions, or higher medium truck volume.
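The three performance figures quoted in the abstract follow the standard confusion-matrix definitions, and reporting sensitivity and specificity alongside accuracy matters on an unbalanced dataset. The short Python sketch below shows how the three quantities relate. It is an illustration only: the function name `classification_metrics` and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: crash segments predicted as crash      fn: crash segments missed
    tn: non-crash segments predicted correctly fp: non-crash segments flagged
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / positives          # true positive rate
    specificity = tn / negatives          # true negative rate
    accuracy = (tp + tn) / (positives + negatives)
    return sensitivity, specificity, accuracy


# Hypothetical counts for an unbalanced test set (NOT the paper's data):
# 50 segments with a distraction-affected crash, 950 without.
sens, spec, acc = classification_metrics(tp=45, fn=5, tn=800, fp=150)

# Accuracy is the prevalence-weighted average of sensitivity and specificity,
# so on unbalanced data it is dominated by the majority class.
prevalence = 50 / 1000
weighted = prevalence * sens + (1 - prevalence) * spec

print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
print(f"prevalence-weighted average={weighted:.4f}")
```

With these made-up counts, a classifier that labeled every segment as non-crash would score 95% accuracy with zero sensitivity, which is why accuracy alone says little about how well crash segments are detected.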
