Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue (4): 1427-1436. doi: 10.13229/j.cnki.jdxbgxb20200588

• Computer Science and Technology •

Dynamic multiple object detection algorithm for vehicle forward based on improved YOLOv3

Li-sheng JIN1,2, Bai-cang GUO1, Fang-rong WANG3, Jian SHI4

  1. School of Vehicle and Energy, Yanshan University, Qinhuangdao 066004, China
  2. Hebei Key Laboratory of Special Delivery Equipment, Yanshan University, Qinhuangdao 066004, China
  3. College of Communication Engineering, Jilin University, Changchun 130022, China
  4. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2020-07-31 Online: 2021-07-01 Published: 2021-07-14
  • Contact: Jian SHI E-mail: jinls@ysu.edu.cn; sj920443563@126.com
  • About the author: Li-sheng JIN (1975-), male, professor, doctoral supervisor. Research interests: automotive safety, intelligent vehicle navigation. E-mail: jinls@ysu.edu.cn
  • Funding: National Natural Science Foundation of China (52072333); Key Research and Development Program of Hebei Province (21340801D)

Abstract:

Object detection plays an important role in the safe driving of driverless vehicles. Current environment-perception detection methods mostly detect a single object class, or treat every object in an image as a target to be detected; few studies divide and detect the objects in front of the vehicle in a targeted way. To address this, the objects to be detected in front of the vehicle are divided into two categories. The first is dynamic targets, which are more dangerous and may move at any time: four-wheeled vehicles, two-wheeled vehicles and people. The second is static targets, which are less dangerous and do not move: traffic lights and traffic signs. For the higher-risk dynamic targets, an improved YOLOv3 object detection algorithm that can be ported to embedded platforms is proposed. Because the original YOLOv3 algorithm yields a large model that is difficult to run in real time on embedded devices, the original backbone network Darknet-53 is replaced with the lightweight backbone MobileNetV2 for feature extraction, Group Normalization is added during training, and Adam is used as the optimizer. The model is trained on a subset extracted from the BDD100K dataset and tested on BDD100K data not used in training and on the Team_test dataset, which was collected and annotated by the research group. The results show that, compared with the original YOLOv3, the missing rate (MR) of the proposed algorithm stays within 5% and mAP increases by 0.020, while the number of parameters is reduced by about 89% and the inference time on a CPU is reduced by about 70%.

Key words: driverless technology, environment perception, deep learning, multiple object detection, lightweight model
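The abstract names Adam as the optimizer used in training. As a rough illustration of what a single Adam update does, the sketch below applies the standard Adam rule to a toy one-dimensional objective. The function name `adam_minimize`, the toy objective and the hyperparameters (commonly used defaults) are assumptions for illustration only; the paper's training settings are not given on this page.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a scalar function from its gradient using Adam updates."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy objective f(x) = (x - 3)^2 with gradient 2(x - 3); unrelated to the paper's loss.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

The per-parameter scaling by the square root of the second-moment estimate is what distinguishes Adam from plain momentum; in a real detector the same update would be applied element-wise to every weight tensor by the training framework.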

CLC Number: U491.2

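Fig. 1 Schematic of the YOLOv3 network structure

Fig. 2 Darknet-53 network

Fig. 3 Decomposition of the standard convolution kernel

Fig. 4 Comparison of the micro-structures of MobileNets and MobileNetV2

Fig. 5 Comparison of the micro-structures of ResNet and MobileNetV2

Fig. 6 MobileNetV2 network

Fig. 7 Schematic of BN and GN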
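Fig. 7 contrasts batch normalization (BN) with group normalization (GN). As a hedged sketch of the GN computation only, the pure-Python function below normalizes one sample whose channels are split into groups, using the mean and variance of each group. The name `group_norm`, the nested-list layout and the example values are illustrative assumptions, the learnable scale and shift are omitted, and this is not the authors' implementation.

```python
import math

def group_norm(channels, num_groups, eps=1e-5):
    """Normalize one sample, given as a list of channels (each a list of floats)."""
    if len(channels) % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = len(channels) // num_groups
    out = []
    for gi in range(num_groups):
        group = channels[gi * per_group:(gi + 1) * per_group]
        values = [v for ch in group for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        # Statistics come from this sample's group only, never from the batch.
        out.extend([[(v - mean) * scale for v in ch] for ch in group])
    return out

# Four channels of two values each, split into two groups of two channels.
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(sample, num_groups=2)
print(normalized)
```

Because the statistics are computed per sample, the result does not depend on batch size, which is the usual motivation for preferring GN over BN when batches are small.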

Table 1 Datasets used in the experiments

Object    BDD100K      Team_test
Car       1 021 847    10 716
Bus       16 505       1 078
Truck     42 963       1 082
Bike      10 229       894
Motor     4 296        673
Person    129 262      6 023
Rider     6 461        834

(Entries are instance counts.)

Fig. 8 Loss curve

Table 2 Comparison of MR and EFR of the two models

Model             MR/%    EFR/%
YOLOv3            7.3     4.6
Proposed model    4.9     3.7

Fig. 9 Confusion matrix

Table 3 Comparison of AP and mAP of the two models

AP        YOLOv3    Proposed model
Car       0.833     0.864
Bus       0.819     0.827
Truck     0.784     0.775
Bike      0.743     0.763
Motor     0.716     0.729
Person    0.805     0.836
Rider     0.753     0.783
mAP       0.780     0.800
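As a quick cross-check of Table 3, the sketch below recomputes mAP as the unweighted mean of the seven per-class AP values. The unweighted mean is an assumption about how the reported mAP was formed; the page does not state the averaging procedure, and the reported figures may come from unrounded AP values.

```python
# Per-class AP transcribed from Table 3: (YOLOv3, proposed model).
AP = {
    "Car": (0.833, 0.864),
    "Bus": (0.819, 0.827),
    "Truck": (0.784, 0.775),
    "Bike": (0.743, 0.763),
    "Motor": (0.716, 0.729),
    "Person": (0.805, 0.836),
    "Rider": (0.753, 0.783),
}

def mean_ap(values):
    """Unweighted mean of per-class average precision."""
    values = list(values)
    return sum(values) / len(values)

map_yolov3 = mean_ap(a for a, _ in AP.values())
map_proposed = mean_ap(b for _, b in AP.values())
print(f"YOLOv3: {map_yolov3:.3f}, proposed: {map_proposed:.3f}")
```

Under this assumption the means come out at about 0.779 and 0.797, close to the reported 0.780 and 0.800. The table also shows that the gain is not uniform: Truck AP is slightly lower for the proposed model (0.775 against 0.784).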

Table 4 Comparison of Params and Inference Time of the two models

Index                      YOLOv3    Proposed model
Params/MB                  246.5     28.3
Inference Time (CPU)/ms    256       81
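The "about 89%" and "about 70%" reductions quoted in the abstract can be checked against Table 4 with a simple relative-reduction calculation. The helper name `relative_reduction` is illustrative; the inputs are the table values.

```python
def relative_reduction(baseline, new):
    """Percentage by which `new` is smaller than `baseline`."""
    return (baseline - new) / baseline * 100.0

# Values transcribed from Table 4.
params_reduction = relative_reduction(246.5, 28.3)   # model size in MB
time_reduction = relative_reduction(256.0, 81.0)     # CPU inference time in ms
speedup = 256.0 / 81.0

print(f"Params reduced by {params_reduction:.1f}%")
print(f"CPU inference time reduced by {time_reduction:.1f}%")
print(f"Speed-up factor: {speedup:.2f}x")
```

From the table values the reductions work out to roughly 88.5% and 68.4%, which matches the rounded figures in the abstract.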

Fig. 10 Visualization of detection results on the test set

Fig. 11 Visualization of detection results in real scenes
