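Online selection method for the traffic control and route guidance collaboration mode based on the Q-learning algorithm

Journal of Jilin University (Engineering and Technology Edition), 2010, Vol. 40, Issue (05): 1215-1219.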

YANG Qing-fang^1,2, YANG Chao^2

1. State Key Laboratory of Automobile Dynamic Simulation, Jilin University, Changchun 130022, China; 2. College of Transportation, Jilin University, Changchun 130022, China

Received: 2009-11-05 Online: 2010-09-01 Published: 2010-09-01
Corresponding author: YANG Qing-fang (b. 1966), female, professor, doctoral supervisor. Research interest: intelligent transportation systems. E-mail: yangqf@jlu.edu.cn
Supported by: National High Technology Research and Development Program of China ("863" Program) (2007AA12Z242)

Abstract:

The Q-learning algorithm is used to select the collaboration mode between traffic control and route guidance online. Multiple agents are first trained with the Q-learning algorithm; their internal inference then yields the optimal collaboration mode for each traffic state, so that the collaboration mode can be selected and switched online. Simulation results show that the proposed Q-learning-based selection method achieves good collaborative control under ordinary traffic congestion and adapts better to continually changing traffic conditions than the off-line mode selection method. The method therefore helps to avoid severe traffic congestion and to improve road network performance.

Key words: communications and transportation engineering, collaboration of traffic control and route guidance, mode selection, Q-learning algorithm, reward function
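
The abstract does not give the state definition, the set of collaboration modes, or the reward function used in the paper. The sketch below is therefore only an illustration of how tabular Q-learning could pick a mode per traffic state. The state names, mode names, reward values, random state transitions, and hyperparameters are all made-up assumptions, not the authors' design.

```python
import random

# Hypothetical traffic states and collaboration modes (not taken from the paper).
STATES = ["free_flow", "ordinary_congestion", "heavy_congestion"]
MODES = ["control_only", "guidance_only", "control_plus_guidance"]

# Made-up reward table standing in for the paper's (unspecified) reward function.
TOY_REWARD = {
    "free_flow": {"control_only": 1.0, "guidance_only": 0.2, "control_plus_guidance": 0.5},
    "ordinary_congestion": {"control_only": 0.3, "guidance_only": 0.4, "control_plus_guidance": 1.0},
    "heavy_congestion": {"control_only": 0.1, "guidance_only": 1.0, "control_plus_guidance": 0.5},
}


def toy_environment(state, mode, rng):
    """Toy stand-in for a traffic simulator: returns (reward, next_state)."""
    reward = TOY_REWARD[state][mode]
    next_state = rng.choice(STATES)  # assumption: the traffic state drifts randomly
    return reward, next_state


def train(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration and the standard Q-learning update."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in MODES} for s in STATES}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            mode = rng.choice(MODES)                  # explore
        else:
            mode = max(q[state], key=q[state].get)    # exploit current estimate
        reward, next_state = toy_environment(state, mode, rng)
        best_next = max(q[next_state].values())
        # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        q[state][mode] += alpha * (reward + gamma * best_next - q[state][mode])
        state = next_state
    return q


def select_mode(q, state):
    """Online selection: return the mode with the highest learned Q-value for this state."""
    return max(q[state], key=q[state].get)


if __name__ == "__main__":
    q_table = train()
    for s in STATES:
        print(s, "->", select_mode(q_table, s))
```

Once the table is trained, online selection reduces to a lookup in `select_mode`, which is what allows the mode to be switched as the observed traffic state changes. In a real setting the toy environment would be replaced by a traffic simulator or field measurements.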

CLC Number:

U491

YANG Qing-fang, YANG Chao. On line selection method of the traffic control and route guidance collaboration mode based on Q learning algorithm[J]. Journal of Jilin University (Engineering and Technology Edition), 2010, 40(05): 1215-1219.

