Journal of Jilin University (Engineering and Technology Edition), 2024, Vol. 54, Issue (3): 719-726. doi: 10.13229/j.cnki.jdxbgxb.20221029

Real-time crash prediction of elevated expressway based on gene expression programming algorithm

Xiao-chi MA1,2,3, Jian LU1,2,3

  1. Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing 211189, China
  2. Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing 211189, China
  3. School of Transportation, Southeast University, Nanjing 211189, China
  • Received: 2022-08-15 Online: 2024-03-01 Published: 2024-04-18
  • Contact: Jian LU E-mail: sean98ma@seu.edu.cn; lujian_1972@seu.edu.cn

Abstract:

To predict crashes on elevated expressways effectively, this study takes the Yan'an elevated expressway in Shanghai as the research object. Based on its traffic flow and crash data, an improved gene expression programming (GEP) algorithm with an additional elite gene bank and an extinction mechanism was applied to derive a ‘Crash Prediction Empirical Formula’. The prediction accuracy and interpretability of the empirical formula were verified by comparison with the results of machine learning and statistical analysis. Crashes on another expressway were then predicted with the empirical formula, without retraining or calibration, to verify its portability. The results indicate that, on the Yan'an elevated expressway dataset, the empirical formula performs significantly better than traditional logistic regression, and that its area under the receiver operating characteristic curve (AUC) and F1-score are comparable to those of the artificial neural network (ANN) model, with 74% of crashes identified correctly. The good performance of the empirical formula on the Hangzhou Shangtang elevated expressway dataset shows that it has basic portability. In conclusion, for the crash risk prediction problem the GEP algorithm combines high accuracy with interpretability and shows portability, which helps to establish a low-cost and efficient crash prediction system.

Key words: engineering of traffic and transportation system, crash prediction, gene expression programming, elevated expressway

CLC Number: 

  • U491.31

Fig.1

Improved gene expression programming flowchart

Table 1

Confusion matrix of classification problems

Classification result | Predicted 1 | Predicted 0
Actual 1 | TP | FN
Actual 0 | FP | TN
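
As an illustration of how the cells of Table 1 feed the metrics reported later (sensitivity, F1-score, Youden index), the following minimal sketch uses the standard textbook definitions. The function name and the example counts are hypothetical and are not taken from the paper, and the paper may compute or average its metrics differently.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Derive common metrics from the four cells of a binary confusion matrix."""
    if min(tp + fn, fp + tn, tp + fp) == 0:
        raise ValueError("metrics are undefined for an empty row or empty predicted-positive column")
    sensitivity = tp / (tp + fn)           # share of actual crashes that are flagged
    specificity = tn / (tn + fp)           # share of non-crashes correctly left unflagged
    precision = tp / (tp + fp)             # share of flagged cases that are actual crashes
    f1_score = 2 * tp / (2 * tp + fp + fn) # harmonic mean of precision and sensitivity
    youden_index = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1_score": f1_score,
        "youden_index": youden_index,
    }


# Hypothetical counts chosen only for illustration (not data from the paper).
metrics = classification_metrics(tp=74, fn=26, fp=40, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, 74 of 100 actual crashes are flagged, so the sensitivity is 0.74; the Youden index additionally penalizes the false alarms in the FP cell.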

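Table 2

Data structure and statistical description

Variable | Meaning | Minimum | Maximum | Mean
crash | Whether a crash occurred | 0.0 | 1.0 | 0.28
CI1 | Congestion index | 16.9 | 74.2 | 47.9
AS1/(km·h⁻¹) | Speed | 11.3 | 70.0 | 36.4
SDV1/(km·h⁻¹) | Standard deviation of speed | 0.0 | 2.3 | 0.4
SDV2/(km·h⁻¹) | Standard deviation of speed | 0.0 | 2.4 | 0.4
VOL1/veh | Lane traffic volume | 45.5 | 145.0 | 108.0
AS2U/(km·h⁻¹) | Speed | 10.0 | 72.2 | 35.9
SDV1U/(km·h⁻¹) | Standard deviation of speed | 0.0 | 2.8 | 0.5
VOL1U/veh | Lane traffic volume | 5.0 | 150.0 | 90.8
AS1D/(km·h⁻¹) | Speed | 10.5 | 68.4 | 39.3
SDV1D/(km·h⁻¹) | Standard deviation of speed | 0.0 | 2.7 | 0.4
SDV2D/(km·h⁻¹) | Standard deviation of speed | 0.0 | 2.2 | 0.4
VOL1D/veh | Lane traffic volume | 27.5 | 148.0 | 102.7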

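Table 3

Operation parameter values of GEP

Parameter | Value | Parameter | Value
Initial population size | 100 | Root insertion sequence (RIS) transposition rate | 0.1
Function set F | + - * / sin sqrt sigmoid | Insertion sequence (IS) length | 1~3
Terminal set T | The 12 independent variables in Table 2 and constants | One-point recombination rate | 0.4
Chromosome gene length | 18 | Two-point recombination rate | 0.2
Number of chromosomes | 4 | Gene recombination rate | 0.1
Constant set | Random constants | Elite gene collection generations | 20
Mutation rate | 0.035 | Elite gene release generations | 40
Insertion sequence (IS) transposition rate | 0.1 | Elite gene release proportion/% | 10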
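
To make the settings in Table 3 concrete, the sketch below collects them in a configuration object and spells out the function set F as callables with their arities. All names here (`GEPParams`, `FUNCTION_SET`, the field names) are hypothetical. The protected division and protected square root are assumptions commonly made in symbolic regression to keep evolved expressions finite; the source does not say how the authors handle division by zero or negative square-root arguments.

```python
import math
import operator
from dataclasses import dataclass


def protected_div(a: float, b: float) -> float:
    # Assumption: return 1.0 when the denominator is (almost) zero.
    return a / b if abs(b) > 1e-9 else 1.0


def protected_sqrt(a: float) -> float:
    # Assumption: take the root of the absolute value to avoid domain errors.
    return math.sqrt(abs(a))


def sigmoid(a: float) -> float:
    # Numerically stable logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


# Function set F from Table 3: symbol -> (arity, implementation).
FUNCTION_SET = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, protected_div),
    "sin": (1, math.sin),
    "sqrt": (1, protected_sqrt),
    "sigmoid": (1, sigmoid),
}


@dataclass(frozen=True)
class GEPParams:
    """Numeric settings copied from Table 3; field names are illustrative."""
    population_size: int = 100
    gene_length: int = 18
    chromosome_count: int = 4
    mutation_rate: float = 0.035
    is_transposition_rate: float = 0.1
    ris_transposition_rate: float = 0.1
    is_length_range: tuple = (1, 3)
    one_point_recombination_rate: float = 0.4
    two_point_recombination_rate: float = 0.2
    gene_recombination_rate: float = 0.1
    elite_collection_generations: int = 20
    elite_release_generations: int = 40
    elite_release_percent: float = 10.0


params = GEPParams()
print(params)
```

Keeping the arity next to each function is a design choice that lets a decoder know how many arguments to consume when it turns a gene into an expression tree.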

Table 4

Prediction accuracy of three models

Metric | GEP empirical formula | Logistic regression | ANN model
AUC | 0.702 | 0.664 | 0.714
Sensitivity | 0.742 | 0.510 | 0.697
F1-score | 0.717 | 0.469 | 0.681
Youden index | 0.083 | 0.227 | 0.332

Table 5

Estimation results of logistic regression

Variable | Parameter estimate B | Significance | Exp(B)
AS1 | -0.053 | 0.000 | 0.948
AS1D | 0.022 | 0.094 | 1.022
SDV1D | 0.852 | 0.069 | 2.345
SDV2D | -0.998 | 0.056 | 0.368
VOL1D | -0.017 | 0.018 | 0.982
Constant | 1.657 | 0.046 | 5.245
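
The Exp(B) column in Table 5 is the exponential of each coefficient, that is, the multiplicative change in crash odds per unit increase of the variable. The sketch below recomputes it from the printed B values and evaluates the fitted logit at the variable means from Table 2. It assumes that the model was fitted on the raw (unstandardized) variables and that the outcome is crash = 1; neither is confirmed by the source. The function names are hypothetical.

```python
import math

# Coefficients B and intercept as printed in Table 5.
COEFFICIENTS = {
    "AS1": -0.053,
    "AS1D": 0.022,
    "SDV1D": 0.852,
    "SDV2D": -0.998,
    "VOL1D": -0.017,
}
INTERCEPT = 1.657


def odds_ratios() -> dict:
    """Exp(B) for each variable: odds multiplier per unit increase."""
    return {name: math.exp(b) for name, b in COEFFICIENTS.items()}


def crash_probability(x: dict) -> float:
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(B_i * x_i))))."""
    z = INTERCEPT + sum(b * x[name] for name, b in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


# Variable means taken from Table 2 (assumes raw, unstandardized inputs).
mean_conditions = {"AS1": 36.4, "AS1D": 39.3, "SDV1D": 0.4, "SDV2D": 0.4, "VOL1D": 102.7}

ratios = odds_ratios()
p_mean = crash_probability(mean_conditions)
for name, ratio in ratios.items():
    print(f"{name}: exp(B) = {ratio:.3f}")
print(f"crash probability at mean conditions: {p_mean:.3f}")
```

The recomputed odds ratios agree with the Exp(B) column to within the rounding of B. For example, each additional km·h⁻¹ of AS1 multiplies the crash odds by about 0.948.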

Fig.2

Partial dependence plots (PDP) of the ANN for interpreting variables

Table 6

Prediction accuracy of two models

Metric | GEP empirical formula | Logistic regression
AUC | 0.683 | 0.653
Sensitivity | 0.756 | 0.515
F1-score | 0.679 | 0.462
Youden index | 0.070 | 0.199