Journal of Jilin University (Engineering and Technology Edition), 2022, Vol. 52, Issue (11): 2718-2727. doi: 10.13229/j.cnki.jdxbgxb20210412


A driving decision-making approach based on multi-sensing and multi-constraint reward function

Zhong-li WANG, Hao WANG, Yan SHEN, Bai-gen CAI

  1. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
  Received: 2021-05-10    Online: 2022-11-01    Published: 2022-11-16

Abstract:

Because traffic scenes are complicated and volatile, deep learning-based approaches and most deep reinforcement learning approaches cannot meet the requirements of real applications. To address these issues, a reinforcement learning approach with multi-sensing inputs and a multi-constraint reward function under the soft actor-critic (SAC) framework, denoted MSMC-SAC, is proposed. The inputs include front-view camera images and LiDAR data, as well as a bird's-eye view generated from the perception results. The multi-modal input is encoded by an encoding network to obtain a representation in latent space, the reconstructed information is used as the input of the reinforcement learning module, and a reward function considering constraints such as lateral and longitudinal error, heading, smoothness, and driving speed is designed. The performance of the proposed method in several typical traffic scenarios is simulated and verified with CARLA, and the multi-constraint reward mechanism is analyzed. The simulation results show that the proposed approach can generate driving policies in many traffic scenarios and outperforms existing state-of-the-art methods.
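As a rough, non-authoritative sketch of the pipeline described above (not the authors' code): multi-sensor observations are encoded into a latent vector, which a SAC actor maps to continuous control actions. The module names (MultiSensorEncoder, GaussianActor), the 9-channel observation stack, the 128-dimensional latent space, and all layer sizes are illustrative assumptions; PyTorch is used only for convenience.

    # Illustrative sketch, not the paper's implementation.
    import torch
    import torch.nn as nn

    class MultiSensorEncoder(nn.Module):
        """Encode stacked sensor channels (image / LiDAR / bird's-eye view) into a latent vector."""
        def __init__(self, in_channels: int = 9, latent_dim: int = 128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.fc = nn.Linear(64, latent_dim)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            h = self.conv(x).flatten(1)          # (B, 64)
            return self.fc(h)                    # (B, latent_dim)

    class GaussianActor(nn.Module):
        """SAC-style actor: a squashed Gaussian over continuous actions, e.g. [steer, throttle]."""
        def __init__(self, latent_dim: int = 128, action_dim: int = 2):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU())
            self.mu = nn.Linear(256, action_dim)
            self.log_std = nn.Linear(256, action_dim)

        def forward(self, z: torch.Tensor) -> torch.Tensor:
            h = self.net(z)
            std = self.log_std(h).clamp(-5, 2).exp()
            dist = torch.distributions.Normal(self.mu(h), std)
            return torch.tanh(dist.rsample())    # reparameterised, squashed action

    # Usage with a dummy batch: camera, LiDAR and bird's-eye view stacked on channels.
    obs = torch.randn(4, 9, 84, 84)
    action = GaussianActor()(MultiSensorEncoder()(obs))
    print(action.shape)  # torch.Size([4, 2])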

Key words: vehicle engineering, deep reinforcement learning, driving policy, multi-reward function

CLC Number: 

  • U469.79

Fig.1  Bird's-eye view representation

Fig.2  Diagram of system structure

Fig.3  Definition of vehicle attitude error based on VFG

Fig.4  Plot of negative exponential error function
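Fig.4 suggests that each reward term has the negative exponential form exp(-k·|e|), which maps zero error to a maximum contribution of 1 and decays smoothly as the error grows. A minimal sketch of how such terms might be combined into the multi-constraint reward described in the abstract is given below; all weights are illustrative assumptions, and the names r_ey and r_h simply mirror the ablation labels used in Table 3.

    # Illustrative sketch of a multi-constraint reward with negative exponential terms.
    import math

    def reward(e_y, e_x, e_heading, jerk, speed, target_speed,
               k_y=0.5, k_x=0.5, k_h=2.0, k_s=0.1, k_v=0.2):
        """Combine lateral/longitudinal error, heading error, smoothness and speed
        tracking into one reward. Each term equals 1 at zero error and decays
        exponentially with the error magnitude. Weights k_* are illustrative."""
        r_ey = math.exp(-k_y * abs(e_y))                   # lateral (cross-track) error
        r_ex = math.exp(-k_x * abs(e_x))                   # longitudinal error
        r_h  = math.exp(-k_h * abs(e_heading))             # heading error [rad]
        r_s  = math.exp(-k_s * abs(jerk))                  # smoothness (control change)
        r_v  = math.exp(-k_v * abs(speed - target_speed))  # speed tracking
        return r_ey + r_ex + r_h + r_s + r_v

    # e.g. a vehicle 0.3 m off the lane centre with a 5 degree heading error, on target speed:
    print(reward(0.3, 0.0, math.radians(5), 0.0, 10.0, 10.0))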

Fig.5  Traffic map for training

Fig.6  Raw input data under different traffic scenarios

Fig.7  Reconstruction results corresponding to the input data in Fig.6

Fig.8  Reward curves of each algorithm using only raw sensor data

Table 1  Mean and standard deviation of algorithm returns using raw data

Algorithm    Mean return    Return std. dev.
MSMC-SAC     420.6          88.7
DDPG         55.8           57.5
DQN          187.5          146.1
TD3          361.3          123.9

Fig.9  Algorithm reward curves with raw data and bird's-eye view input

Table 2  Algorithm return statistics with raw data and bird's-eye view input

Algorithm    Mean return    Return std. dev.
MSMC-SAC     462.2          127.2
DDPG         283.1          156.6
DQN          163.7          137.1
TD3          332.3          128.0

Fig.10  Simulation results of DQN on curved road sections

Fig.11  Simulation results of DDPG

Fig.12  Simulation results of TD3

Fig.13  Simulation results of the proposed method in different scenarios

Fig.14  Analysis of the impact of constraint terms r_ey and r_h

Table 3  Algorithm return statistics with some reward terms removed

Algorithm           Mean return    Return std. dev.
MSMC-SAC            462.2          127.2
No_rey MSMC-SAC     288.1          113.1
No_rh MSMC-SAC      103.8          100.3