Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue (4): 1427-1436. doi: 10.13229/j.cnki.jdxbgxb20200588

Dynamic multiple object detection algorithm for vehicle forward based on improved YOLOv3

Li-sheng JIN1,2, Bai-cang GUO1, Fang-rong WANG3, Jian SHI4

  1. School of Vehicle and Energy, Yanshan University, Qinhuangdao 066004, China
  2. Hebei Key Laboratory of Special Delivery Equipment, Yanshan University, Qinhuangdao 066004, China
  3. College of Communication Engineering, Jilin University, Changchun 130022, China
  4. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2020-07-31 Online: 2021-07-01 Published: 2021-07-14
  • Contact: Jian SHI E-mail: jinls@ysu.edu.cn; sj920443563@126.com

Abstract:

Object detection plays an important role in the safe operation of driverless vehicles. Current environment-perception detectors mostly either detect a single object class or treat every object in an image as a target to be detected. Most existing studies have not focused on categorizing the objects in front of the vehicle and detecting them by category. To address this, this paper divides the objects to be detected in front of the vehicle into two categories. The first is dynamic targets, which pose a high risk and may move at any time: four-wheel vehicles, two-wheel vehicles, and people. The second is static targets, which pose less danger and do not move: traffic lights and traffic signs. For the dynamic multiple objects in front of the vehicle, an improved object detection algorithm based on YOLOv3 is proposed that can be ported to embedded systems. Because the original YOLOv3 struggles to achieve real-time detection on embedded terminals, the original backbone network, Darknet-53, is replaced with MobileNetV2 for feature extraction, Group Normalization is added during training, and Adam is used as the optimizer. A subset extracted from the BDD100K dataset is used for training. The model is tested on a portion of BDD100K not used in training and on the Team_test dataset produced by our research group. The results show that, compared with the original YOLOv3, the proposed algorithm keeps the miss rate (MR) within 5% and increases mAP by 0.020, while the YOLOv3-MobileNetV2 model has about 89% fewer parameters and about 70% lower inference time on a CPU.

Key words: driverless technology, environment perception, deep learning, multiple object detection, lightweight model

CLC Number: U491.2

Fig.1 Network structure of YOLOv3

Fig.2 Network of Darknet-53

Fig.3 Standard convolutional filter decomposition
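
As a rough illustration of what the decomposition in Fig.3 buys, the sketch below counts the weights of a standard convolution and of its depthwise plus pointwise factorization. It assumes the figure refers to the depthwise separable convolution used by MobileNets. The function names and the layer sizes (3×3 kernel, 32 input channels, 64 output channels) are illustrative assumptions rather than values from the paper, and biases are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard convolution: K * K * M * N."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise (K * K * M) plus a 1x1 pointwise (M * N) convolution."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
K, M, N = 3, 32, 64
standard = standard_conv_params(K, M, N)
separable = depthwise_separable_params(K, M, N)
ratio = separable / standard  # algebraically equal to 1/N + 1/K**2

print(f"standard: {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio: {ratio:.4f}")
```

For a 3×3 kernel the ratio approaches 1/9 as the number of output channels grows, which is one reason a backbone built from such layers can be much smaller than one built from standard convolutions.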

Fig.4 Microstructure comparison between MobileNets and MobileNetV2

Fig.5 Microstructure comparison between ResNet and MobileNetV2

Fig.6 Network of MobileNetV2

Fig.7 Diagrams of BN and GN
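
To make the contrast in Fig.7 concrete, here is a minimal pure-Python sketch of Group Normalization for a single sample, assuming the figure compares batch normalization (BN) with group normalization (GN). The function name `group_norm`, the nested-list data layout, and the toy input are assumptions made for illustration. The learnable scale and shift are omitted, and this is not the authors' implementation.

```python
import math


def group_norm(sample, num_groups, eps=1e-5):
    """Normalize one sample given as a list of channels (each a flat list of values).

    Channels are split into num_groups contiguous groups, and each group is
    normalized with its own mean and variance, so no batch statistics are used.
    """
    num_channels = len(sample)
    if num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = num_channels // num_groups

    normalized = []
    for g in range(num_groups):
        channels = sample[g * per_group:(g + 1) * per_group]
        values = [v for channel in channels for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        for channel in channels:
            normalized.append([(v - mean) * inv_std for v in channel])
    return normalized


# Toy input: 4 channels with 2 spatial positions each, split into 2 groups.
x = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
y = group_norm(x, num_groups=2)
for channel in y:
    print([round(v, 3) for v in channel])
```

Because the statistics are computed within each sample, the result does not depend on batch size, which is the usual motivation for preferring GN when only small batches fit in memory.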

Table 1 Data sets for testing

Instance object    BDD100K      Team_test
Car                1,021,847    10,716
Bus                16,505       1,078
Truck              42,963       1,082
Bike               10,229       894
Motor              4,296        673
Person             129,262      6,023
Rider              6,461        834

Fig.8 Loss change curve

Table 2 Comparison of MR and EFR for the two models

Model             MR/%    EFR/%
YOLOv3            7.3     4.6
Proposed model    4.9     3.7

Fig.9 Confusion matrix

Table 3 Comparison of AP and mAP for the two models

AP by class    YOLOv3    Proposed model
Car            0.833     0.864
Bus            0.819     0.827
Truck          0.784     0.775
Bike           0.743     0.763
Motor          0.716     0.729
Person         0.805     0.836
Rider          0.753     0.783
mAP            0.780     0.800
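
As a quick cross-check of Table 3, the sketch below averages the seven per-class AP values, assuming mAP here is the unweighted mean over classes. The variable names are illustrative. The unweighted means come out at roughly 0.779 and 0.797, close to the reported 0.780 and 0.800. The small gap may come from rounding of the per-class figures or from a different averaging convention, which the table does not state.

```python
from statistics import mean

# Per-class AP values copied from Table 3: (YOLOv3, proposed model).
ap_by_class = {
    "Car": (0.833, 0.864),
    "Bus": (0.819, 0.827),
    "Truck": (0.784, 0.775),
    "Bike": (0.743, 0.763),
    "Motor": (0.716, 0.729),
    "Person": (0.805, 0.836),
    "Rider": (0.753, 0.783),
}

# Assumption: mAP is the unweighted mean of the per-class APs.
map_yolov3 = mean(ap[0] for ap in ap_by_class.values())
map_proposed = mean(ap[1] for ap in ap_by_class.values())

# Classes where the proposed model scores lower than the baseline.
regressions = [name for name, ap in ap_by_class.items() if ap[1] < ap[0]]

print(f"YOLOv3 mean AP:   {map_yolov3:.3f}")
print(f"Proposed mean AP: {map_proposed:.3f}")
print(f"Classes with lower AP: {regressions}")
```

The per-class comparison also shows that the gain is not uniform: Truck is the one class where the proposed model's AP is slightly lower than the baseline.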

Table 4 Comparison of Params and Inference Time

Index                       YOLOv3    Proposed model
Params/MB                   246.5     28.3
Inference Time (CPU)/ms     256       81
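
The relative reductions quoted in the abstract (about 89% for parameters and about 70% for CPU inference time) can be checked against Table 4 with a few lines of arithmetic. The helper name `relative_reduction` is an assumption introduced for this sketch.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Values copied from Table 4.
params_reduction = relative_reduction(246.5, 28.3)  # model size in MB
time_reduction = relative_reduction(256.0, 81.0)    # CPU inference time in ms
speedup = 256.0 / 81.0

print(f"Params reduction:         {params_reduction:.1f}%")
print(f"Inference time reduction: {time_reduction:.1f}%")
print(f"Speed-up factor:          {speedup:.2f}x")
```

The table values give roughly 88.5% and 68.4%, consistent with the rounded figures in the abstract.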

Fig.10 Visualization of detection on the test set

Fig.11 Visualization of detection in real-world scenes
