吉林大学学报(工学版) ›› 2025, Vol. 55 ›› Issue (12): 3918-3927.doi: 10.13229/j.cnki.jdxbgxb.20240606

• 交通运输工程·土木工程 • 上一篇    

基于语言推理和认知记忆的自动驾驶决策模型

王祥1(),谭国真1,彭衍飞1,任浩2,李健平1   

  1. 1.大连理工大学 计算机科学与技术学院,辽宁 大连 116081
    2.清华大学 精密仪器系,北京 100084
  • 收稿日期:2024-05-31 出版日期:2025-12-01 发布日期:2026-02-03
  • 作者简介:王祥(1997-),男,博士研究生.研究方向:人工智能,自动驾驶.E-mail:DrWangXiang@mail.dlut.edu.cn
  • 基金资助:
    国家自然科学基金重点项目(U1808206)

Autonomous driving decision⁃making model based on language reasoning and cognitive memory

Xiang WANG1(),Guo-zhen TAN1,Yan-fei PENG1,Hao REN2,Jian-ping LI1   

  1. 1.School of Computer Science and Technology,Dalian University of Technology,Dalian 116081,China
    2.Department of Precision Instrument,Tsinghua University,Beijing 100084,China
  • Received:2024-05-31 Online:2025-12-01 Published:2026-02-03

摘要:

针对传统自动驾驶系统安全性能不足、学习效率低等问题,提出了一种可持续学习和理解语言信息的自动驾驶安全决策模型。该模型参考人类驾驶的推理决策和经验积累过程,以大型语言模型(LLM)作为决策智能体,将思维链推理、两阶段注意力机制和认知记忆存储与检索整合到驾驶过程的上下文安全学习中;同时,采用运动学模块将LLM决策转化为可操作的驾驶命令,实现安全驾驶经验的持续学习。实验结果表明,本文决策模型相较于基于规则、强化学习和知识的方法,在安全、效率方面有显著提升,并具备持续学习和根据人类指令调整驾驶行为的能力,可为类人自动驾驶提供参考。

关键词: 车辆工程, 自动驾驶, 持续学习, 大语言模型, 思维链推理, 两阶段注意力机制

Abstract:

To address the issues of insufficent safety performance and low learning inefficient in traditional autonomous driving systems, an autonomous driving safety decision-making model capable of continuous learning and understanding linguistic information was proposed. Referring to the reasoning decision-making and experience accumulation processes in human driving, this model leverages a large language model (LLM) as the decision-making agent, integrating chain-of-thought reasoning, a two-stage attention mechanism, and cognitive memory storage and retrieval into the contextual safety learning of the driving process. Meanwhile, a kinematic module is employed to convert LLM decisions into executable driving commands, enabling the continuous learning of safe driving experiences. Experimental results demonstrate that the proposed decision-making model significantly improves safety and efficiency compared to rule, reinforcement learning, and knowledge-based approaches, and possesses the capability of continuous learning and adapting driving behaviors based on human instructions, providing a reference for human-like autonomous driving.

Key words: vehicle engineering, autonomous driving, continuous learning, large language model, chain-of-thought reasoning, two-stage attention mechanism

中图分类号: 

  • U495

图1

模型总体架构"

图2

自注意力头结构"

图3

两阶段记忆查询"

图4

上下文学习指导决策"

图5

车辆运动学模型"

表1

Llama 2数据流"

数据流参数类型说明
输入modelstringllama2-7b-chat-v2
messageslist输入的内容
result_formatstring用户返回的内容类型
输出request_idstringllama2-7b-chat-v2
outputlist调用结果信息

usage.input_

tokens

int用户输入文本转换为Token后的长度

usage.output_

tokens

int模型生成回复转换为Token后的长度

表2

注意力机制参数设置"

结构取值
输入[·,6]
层数编码器[64,64];2个注意力头;dk =32;解码器[64,64]
参数量3.4×104

表3

决策评价指标"

场景方法CζtPDPdCn

高速

环境

LRCMM012.826.41.351.3113.89
RBM426.819.73.253.1327.83
MFRLM835.215.73.843.4546.95
KDM314.629.52.472.3120.84

十字

路口

LRCMM032.847.21.840.9124.22
RBM275.132.14.212.6448.11
MFRLM968.430.55.281.0650.50
KDM742.558.53.983.6140.31

图6

高速环境决策过程"

图7

十字路口决策过程"

图8

持续学习结果"

表4

消融实验"

场景TAMCMCζtPDPdCn

高速

环境

××669.534.34.353.9448.00
×464.736.53.843.1243.82
×338.628.42.371.2626.90
014.727.11.441.2914.97

十字

路口

××233.748.94.213.9538.03
×332.547.84.183.8437.35
×132.147.33.893.5235.57
031.746.23.743.4334.55

表5

不同指令下的驾驶行为"

指导命令aˉ/(m·s-2sˉ/radvˉ/(m·s-1gˉ/mlˉ
DMA3.500.0334.78.67
DMC0.210.0121.241.81
NEC1.870.0228.325.62

图9

驾驶行为"

[1] 马依宁, 姜为, 吴靖宇, 等. 基于不同风格行驶模型的自动驾驶仿真测试自演绎场景研究[J]. 中国公路报, 2023, 36(2): 216-228.
Ma Yi-ning, Jiang Wei, Wu Jing-yu, et al. Self- evolution scenarios for simulation tests of autonomous vehicles based on different models of driving styles[J]. China Journal of Highway and Transport, 2023, 36(2): 216-228.
[2] 李伟东, 马草原, 史浩, 等. 基于分层强化学习的自动驾驶决策控制算法[J]. 吉林大学学报: 工学版, 2025, 55(5): 1798-1805.
Li Wei-dong, Ma Cao-yuan, Shi Hao, et al. An automatic driving decision control algorithm based on
hierarchical reinforcement learning[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(5): 1798-1805.
[3] 朱波, 张纪伟, 谈东奎, 等. 基于多源传感器与导航地图的端到端自动驾驶方法[J]. 汽车安全与节能学报, 2022, 13(4): 738-749.
Zhu Bo, Zhang Ji-wei, Tan Dong-kui, et al. End-to-end autonomous driving method based on multi-source sensor and navigation map[J]. Journal of Automotive Safety and Energy, 2022, 13(4): 738-749.
[4] Zhang Q X, Zhao Y H, Wang Y J, et al. Towards cross-task universal perturbation against black-box object detectors in autonomous driving[J]. Computer Networks, 2020, 180: No.107388.
[5] Wang S Y, Zhu Y X, Li Z H, et al. ChatGPT as your vehicle co-pilot: An initial attempt[J]. IEEE Transactions on Intelligent Vehicles, 2023, 8(12): 4706-4721.
[6] Cui Y D, Huang S C, Zhong J M, et al. DriveLLM: charting the path toward full autonomous driving with large language models[J]. IEEE Transactions on Intelligent Vehicles, 2023, 9(1): 1450-1464.
[7] Kojima T, Gu S S, Reid M, et al. Large language models are zero-shot reasoners[J]. Advances in Neural Information Processing Systems, 2022, 35: 22199- 22213.
[8] 王祥, 谭国真. 基于知识与大语言模型的高速环境自动驾驶决策研究[J]. 系统仿真学报, 2025(5): 1246-1255.
Wang Xiang, Tan Guo-zhen. Research on decision-making of autonomous driving in highway environment based on knowledge and large language model[J]. Journal of System Simulation, 2025(5): 1246-1255.
[9] Peng Y F, Tan G Z, Si H W, et al. DRL-GAT-SA: Deep reinforcement learning for autonomous driving planning based on graph attention networks and simplex architecture[J]. Journal of Systems Architecture, 2022, 126: No.102505.
[10] 胡宏宇, 张慧珺, 姚荣涵, 等. L3级自动驾驶接管过程驾驶员情景意识研究[J]. 吉林大学学报: 工学版, 2024, 54(2): 410-418.
Hu Hong-yu, Zhang Hui-jun, Yao Rong-han, et al. Driver's situational awareness in takeover process of L3 automated vehicles[J]. Journal of Jilin University (Engineering and Technology Edition), 2024, 54(2): 410-418.
[11] Nie X T, Liang Y P, Ohkura K. Autonomous highway driving using reinforcement learning with safety check system based on time-to-collision[J]. Artificial Life and Robotics, 2023, 28(1): 158-165.
[12] Chang M K, Lee S H, Chung C C. Comparative evaluation of dynamic and kinematic vehicle models[C]∥Conference on Decision and Control, Los Angeles, CA, USA, 2015: 648-653.
[13] Treiber M, Hennecke A, Helbing D. Congested traffic states in empirical observations and microscopic simulations[J]. Physical Review E, 2000, 62(2): 1805.
[14] Xin L, Kong Y T, Li S E, et al. Enable faster and smoother spatio-temporal trajectory planning for autonomous vehicles in constrained dynamic environment[J].Journal of Automobile Engineering, 2021, 235(4): 1101-1112.
[15] Li G F, Li S L, Li S, et al. Deep reinforcement learning enabled decision-making for autonomous driving at intersections[J]. Automotive Innovation, 2020, 3: 374-385.
[1] 兰巍,周政,王冠宇,王伟,张苗苗. 基于机器学习的汽车设计智能拟合方法[J]. 吉林大学学报(工学版), 2025, 55(9): 2858-2863.
[2] 孙天骏,杨惠喆,蔡荣贵,冯嘉仪,冉锐,刘斌. 面向纯电动汽车自适应巡航系统的人性化起停控制策略[J]. 吉林大学学报(工学版), 2025, 55(9): 2847-2857.
[3] 李寿涛,贾湘怡,朱军,郭洪艳,于丁力. 基于Level-K的智能驾驶汽车无信控交叉路口决策方法[J]. 吉林大学学报(工学版), 2025, 55(9): 3069-3078.
[4] 朱冰,孟鹏翔,刘斌,韩嘉懿,赵健,陈志成,宋东鉴,陶晓文. 基于交通环境信息的虚拟车道线拟合方法[J]. 吉林大学学报(工学版), 2025, 55(9): 2935-2945.
[5] 赵俊武,曲婷,胡云峰. 基于自适应采样的智能车辆轨迹规划方法[J]. 吉林大学学报(工学版), 2025, 55(8): 2802-2816.
[6] 于贵申,陈鑫,唐悦,赵春晖,牛艾佳,柴辉,那景新. 激光表面处理对铝-铝粘接接头剪切强度的影响[J]. 吉林大学学报(工学版), 2025, 55(8): 2555-2569.
[7] 高金武,孙少龙,王舜尧,高炳钊. 基于电机转矩补偿的增程器转速波动抑制策略[J]. 吉林大学学报(工学版), 2025, 55(8): 2475-2486.
[8] 贾美霞,胡建军,肖凤. 基于多软件联合的车用电机变工况多物理场仿真方法[J]. 吉林大学学报(工学版), 2025, 55(6): 1862-1872.
[9] 宋学伟,于泽平,肖阳,王德平,袁泉,李欣卓,郑迦文. 锂离子电池老化后性能变化研究进展[J]. 吉林大学学报(工学版), 2025, 55(6): 1817-1833.
[10] 肖纯,易子淳,周炳寅,张少睿. 基于改进鸽群优化算法的燃料电池汽车模糊能量管理策略[J]. 吉林大学学报(工学版), 2025, 55(6): 1873-1882.
[11] 王健,贾晨威. 面向智能网联车辆的轨迹预测模型[J]. 吉林大学学报(工学版), 2025, 55(6): 1963-1972.
[12] 李伟东,马草原,史浩,曹衡. 基于分层强化学习的自动驾驶决策控制算法[J]. 吉林大学学报(工学版), 2025, 55(5): 1798-1805.
[13] 卢荡,索艳茹,孙宇航,吴海东. 基于无量纲格式的轮胎侧倾侧偏力学特性预测[J]. 吉林大学学报(工学版), 2025, 55(5): 1516-1524.
[14] 高镇海,郑程元,赵睿. 真实与虚拟场景下自动驾驶车辆的主动安全性验证与确认综述[J]. 吉林大学学报(工学版), 2025, 55(4): 1142-1162.
[15] 张涛,林黄达,余中军. 混合动力车辆换挡的实时滚动优化控制方法[J]. 吉林大学学报(工学版), 2025, 55(4): 1215-1224.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!