Journal of Jilin University(Engineering and Technology Edition), 2025, Vol. 55, Issue (10): 3180-3188. doi: 10.13229/j.cnki.jdxbgxb.20240017

Modeling interaction policy of autonomous vehicle and pedestrian based on deep reinforcement learning

Wei-chao HU1,2, Zhen-ming YANG3, Peng-cheng YU2, Yan-yan CHEN1, She-qiang MA3

  1. Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100124, China
  2. Research Institute for Road Safety of the Ministry of Public Security, Beijing 100062, China
  3. School of Traffic Management, People’s Public Security University of China, Beijing 100091, China
  • Received: 2024-01-03 Online: 2025-10-01 Published: 2026-02-03

Abstract:

To facilitate safe and efficient interactions between autonomous vehicles (AVs) and pedestrians, this study uses the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm to build a pedestrian-vehicle interaction model for mixed traffic containing both AVs and human-driven vehicles (HDVs). The model learns interaction strategies that enable AVs to avoid accidents without direct inter-vehicle communication. Compared with several benchmark algorithms, the proposed algorithm shows substantial improvements in training efficacy, collision frequency, and traffic capacity. The robustness of the model is also assessed across varied risk scenarios. The results show that as the randomness of pedestrian behavior (behavioral noise) increases, the interaction delay of both vehicle types increases. Notably, the collision rate of AVs first rises and then declines, indicating an adaptive learning phase. Under high noise, AVs avoid collisions more effectively than HDVs, highlighting their greater resilience in chaotic urban traffic. These outcomes underscore the potential of MADDPG-based frameworks to contribute to safer, more efficient AV integration in mixed traffic.

Key words: engineering of transportation system, autonomous vehicle, interaction of vehicle and pedestrian, deep reinforcement learning, multi-agent system

CLC Number: 

  • U491

Fig.1 DDPG working flow chart

Fig.2 Training reward comparison

Fig.3 Velocity comparison

Fig.4 AV/HDV collision rate

Fig.5 AV/HDV delay

Fig.6 Agent interaction diagram

Fig.7 Velocity change of AV1 and AV2
