Journal of Jilin University (Engineering and Technology Edition), 2019, Vol. 49, Issue (5): 1661-1667. doi: 10.13229/j.cnki.jdxbgxb20180642

Pedestrian-vehicle detection based on deep learning

Qian XU1,2, Ying LI1,2, Gang WANG1,2

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Received: 2018-06-19 Online: 2019-09-01 Published: 2019-09-11
  • Contact: Gang WANG E-mail: xuqian16@jlu.edu.cn; wanggang.jlu@gmail.com
  • About the author: Qian XU (1990-), male, PhD candidate. Research interest: computer vision. E-mail: xuqian16@jlu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61602206)

Abstract:

A pedestrian-vehicle detection network (PVDNet) based on deep learning is proposed for object detection in autonomous-driving environments. First, in the lower layers of the network, the skip connection structure is improved into a Multi-Level Skip Connection (MLSC), which speeds up convergence and improves the accuracy to which the model converges. Second, in the top layers, a Multi-Layer Features Fusion (MLFF) method fuses low-level features with high-level features to improve detection accuracy. Finally, in the output layer, a One-Dimensional Convolution (ODC) method replaces the fully connected layer, which reduces the number of model parameters and increases detection speed. Experiments on the PascalVOC2007, PascalVOC2012, MS COCO, and KITTI datasets show that, compared with the original Faster R-CNN, PVDNet raises the mean average precision (mAP) for pedestrian and vehicle detection by 3.7, 6.1, 5.6, and 9.62 percentage points, respectively.

Key words: artificial intelligence, object detection, deep learning, self-driving
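The abstract says that MLFF fuses low-level features with high-level features but gives no operator-level detail. The following is a minimal sketch of one common way to do such a fusion, assuming nearest-neighbour upsampling of the coarser map followed by channel concatenation. The function names, the list-based feature-map layout, and the toy inputs are all hypothetical and are not taken from the paper.

```python
from typing import List

# A feature map stored as [channels][height][width]. Plain lists keep the
# sketch dependency-free; a real detector would use tensors instead.
FeatureMap = List[List[List[float]]]


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Repeat every value `factor` times along both height and width."""
    return [
        [
            [value for value in row for _ in range(factor)]
            for row in channel
            for _ in range(factor)
        ]
        for channel in fmap
    ]


def fuse_features(low: FeatureMap, high: FeatureMap) -> FeatureMap:
    """Bring `high` to the spatial size of `low`, then stack the channels.

    Assumption: fusion is channel concatenation after nearest-neighbour
    upsampling. The source does not say which operator MLFF really uses.
    """
    low_h, low_w = len(low[0]), len(low[0][0])
    high_h, high_w = len(high[0]), len(high[0][0])
    if low_h % high_h or low_w % high_w or low_h // high_h != low_w // high_w:
        raise ValueError("low-level size must be an integer multiple of high-level size")
    return low + upsample_nearest(high, low_h // high_h)


# Hypothetical toy inputs: one 4x4 low-level channel, two 2x2 high-level channels.
low_level = [[[float(r * 4 + c) for c in range(4)] for r in range(4)]]
high_level = [
    [[10.0, 20.0], [30.0, 40.0]],
    [[-1.0, -2.0], [-3.0, -4.0]],
]

fused = fuse_features(low_level, high_level)
print(len(fused), len(fused[0]), len(fused[0][0]))  # channels, height, width
```

Concatenation keeps both sources of information intact and leaves the weighting to later layers. Element-wise addition would be an equally plausible choice when the channel counts match.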

CLC number:

  • TP301.6

Fig. 1 Overall architecture of the Faster R-CNN network

Fig. 2 Overall structure of PVDNet

Fig. 3 One-dimensional convolution
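According to the abstract, the one-dimensional convolution replaces the fully connected output layer in order to reduce the number of model parameters. As a rough illustration of why weight sharing can shrink a layer, the sketch below counts the parameters of a fully connected layer and of a standard 1-D convolution. It assumes that ODC behaves like an ordinary 1-D convolution, which the source does not confirm, and the layer sizes are hypothetical values chosen only for the arithmetic.

```python
def fc_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def conv1d_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights plus biases of a standard 1-D convolution.

    The same kernel is reused at every position, so the count does not
    depend on the length of the input.
    """
    return in_channels * out_channels * kernel_size + out_channels


# Hypothetical sizes chosen only for illustration (not taken from the paper):
# a 7x7x512 pooled feature flattened into a 4096-unit fully connected layer,
# versus a 1-D convolution with kernel size 7 mapping 512 channels to 4096.
fc = fc_params(7 * 7 * 512, 4096)
odc = conv1d_params(512, 4096, 7)
print(f"fully connected: {fc:,} parameters")
print(f"1-D convolution: {odc:,} parameters")
print(f"reduction factor: {fc / odc:.2f}")
```

With these assumed sizes the convolution needs roughly one seventh of the parameters. The actual saving in PVDNet depends on the layer shapes reported in the full paper.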

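Table 1 Pedestrian and vehicle detection results on each test set (AP and mAP in %)

Dataset          Metric          Faster R-CNN   YOLO    SSD(300)   PVDNet
Pascal VOC2007   Pedestrian AP   76.7           57.3    74.5       81.4
Pascal VOC2007   Vehicle AP      84.5           65.4    80.8       87.2
Pascal VOC2007   mAP             80.60          61.35   77.65      84.30
Pascal VOC2012   Pedestrian AP   79.6           63.5    77.5       84.3
Pascal VOC2012   Vehicle AP      76.4           55.8    74.7       83.9
Pascal VOC2012   mAP             78.00          59.65   76.10      84.10
MS COCO          Pedestrian AP   57.9           36.2    53.6       62.7
MS COCO          Vehicle AP      67.5           51.2    62.4       73.9
MS COCO          mAP             62.70          43.70   58.00      68.30
KITTI            Pedestrian AP   65.91          24.35   88.69      85.32
KITTI            Vehicle AP      79.11          35.86   66.41      78.93
KITTI            mAP             72.51          30.11   77.55      82.13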
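The mAP rows in the table above appear to be the arithmetic mean of the pedestrian AP and the vehicle AP, and the gains quoted in the abstract match the difference between the PVDNet and Faster R-CNN mAP values. The short check below reproduces that arithmetic from the tabulated APs. The helper names are hypothetical, and rounding half up to two decimals is an assumption that happens to reproduce the printed figures.

```python
from decimal import Decimal, ROUND_HALF_UP

# (pedestrian AP, vehicle AP) per dataset, copied from the table above.
RESULTS = {
    "Pascal VOC2007": {"Faster R-CNN": ("76.7", "84.5"), "PVDNet": ("81.4", "87.2")},
    "Pascal VOC2012": {"Faster R-CNN": ("79.6", "76.4"), "PVDNet": ("84.3", "83.9")},
    "MS COCO": {"Faster R-CNN": ("57.9", "67.5"), "PVDNet": ("62.7", "73.9")},
    "KITTI": {"Faster R-CNN": ("65.91", "79.11"), "PVDNet": ("85.32", "78.93")},
}


def mean_ap(pedestrian_ap: str, vehicle_ap: str) -> Decimal:
    """Average the two class APs and round half up to two decimals."""
    total = Decimal(pedestrian_ap) + Decimal(vehicle_ap)
    return (total / 2).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def map_gains(results: dict) -> dict:
    """mAP gain of PVDNet over Faster R-CNN, in percentage points."""
    gains = {}
    for dataset, models in results.items():
        baseline = mean_ap(*models["Faster R-CNN"])
        proposed = mean_ap(*models["PVDNet"])
        gains[dataset] = proposed - baseline
    return gains


gains = map_gains(RESULTS)
for dataset, gain in gains.items():
    print(f"{dataset}: +{gain} percentage points")
```

Decimal arithmetic is used instead of floats so that values such as 30.105 round the same way as in the printed table.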

Fig. 4 PVDNet detection results on the test sets

Table 2 Comparison of model detection speed on a Titan X

Algorithm      Frame rate/FPS
Faster R-CNN   7
YOLO           155
SSD(300)       58
PVDNet         25
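Frame rates are easier to compare once converted into per-frame latency and relative speed-up. The sketch below does this for the frame rates listed above. The function names are hypothetical, and the conversion assumes that frames are processed one at a time, so latency is simply the reciprocal of the frame rate.

```python
# Frame rates in frames per second, copied from the table above.
FPS = {"Faster R-CNN": 7, "YOLO": 155, "SSD(300)": 58, "PVDNet": 25}


def latency_ms(fps: float) -> float:
    """Per-frame processing time in milliseconds, assuming no batching."""
    return 1000.0 / fps


def speedup(fps_table: dict, model: str, baseline: str) -> float:
    """How many times faster `model` runs than `baseline`."""
    return fps_table[model] / fps_table[baseline]


for name, fps in FPS.items():
    print(f"{name}: {latency_ms(fps):.1f} ms per frame")
print(f"PVDNet vs Faster R-CNN: {speedup(FPS, 'PVDNet', 'Faster R-CNN'):.2f}x")
```

On these figures PVDNet takes about 40 ms per frame, roughly 3.6 times faster than Faster R-CNN but slower than the single-stage detectors YOLO and SSD(300).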

Fig. 5 Examples of incorrect PVDNet predictions on the test sets
