Expressway small object detection algorithm based on deep learning

doi:10.13229/j.cnki.jdxbgxb.20230939

Abstract:

To address the difficulty of detecting small, distant pedestrians and vehicles in real time in images captured by roadside cameras on expressways, an improved object detection algorithm, YOLOv5s-3S-4PDH, is proposed. First, the Shufflenetv2-Stem-SPPF network structure is adopted to improve the running speed of the model. Second, accelerated normalized weighted fusion of feature maps and a 160×160 small-object detection layer are introduced to improve small-object detection performance. Third, an improved decoupled head mechanism is introduced to improve the localization and classification accuracy for small objects. Finally, Focal EIoU is used as the localization loss function to accelerate training convergence. Comparative experiments on a self-built pedestrian and vehicle dataset show that, compared with the YOLOv5s baseline, the proposed algorithm reduces the computation and the parameter count by 10.1% and 24.6%, respectively, and increases detection speed and accuracy by 15.4% and 2.1%, respectively. A transfer learning experiment on the VisDrone2019 dataset shows that the average precision of the proposed algorithm is higher than that of YOLOv5s for every object category. The YOLOv5s-3S-4PDH algorithm therefore meets the real-time and accuracy requirements of small object detection while also showing generalization ability.

Key words: transportation planning and management, expressway, object detection, deep learning
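The "accelerated normalized weighted fusion" named in the abstract appears to correspond to the fast normalized fusion used in BiFPN, which the ablation table lists as a component. A minimal sketch of that fusion rule follows, assuming the standard form O = Σ wᵢ·Iᵢ / (ε + Σ wⱼ) with non-negative weights. The function name, the flat-list feature representation, and the value of `eps` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           weights: Sequence[float],
                           eps: float = 1e-4) -> List[float]:
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Assumed form: O = sum_i (w_i / (eps + sum_j w_j)) * I_i, with w_i >= 0.
    """
    if len(features) != len(weights):
        raise ValueError("one weight is required per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU keeps every learnable weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    # Dividing by the sum is cheaper than a softmax and keeps weights in [0, 1].
    total = sum(clipped) + eps
    normalized = [w / total for w in clipped]

    return [sum(w * f[i] for w, f in zip(normalized, features))
            for i in range(length)]


if __name__ == "__main__":
    shallow = [1.0, 2.0]   # hypothetical high-resolution feature values
    deep = [3.0, 4.0]      # hypothetical upsampled low-resolution feature values
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise average of the inputs, and a negative weight is clipped to zero so that the corresponding input is ignored.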

CLC number: U495

Hui-zhi XU, Dong-sheng HAO, Xiao-ting XU, Shi-sen JIANG. Expressway small object detection algorithm based on deep learning[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(6): 2003-2014.

Figures and tables (18)

Figs. 1-8 and Table 1: captions and images not available.

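Table 2. Experimental hyperparameters

Hyperparameter	Value
Initial learning rate	0.01
Cyclic learning rate	0.01
Momentum	0.937
Weight decay coefficient	0.0005
Warm-up epochs	3.0
Warm-up momentum	0.8
Warm-up initial bias learning rate	0.1
Bounding-box regression loss coefficient	0.05
Classification loss coefficient	0.5
Confidence loss coefficient	1.0
Positive-sample weight in objectness BCE loss	1.0
Positive-sample weight in classification BCE loss	1.0
IoU training threshold	0.2
Anchor width-to-height ratio	4.0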
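The hyperparameter values above coincide with the defaults commonly shipped with YOLOv5. As a hedged illustration, they can be collected in a dictionary that uses YOLOv5-style key names. The key mapping, the linear decay schedule, and the epoch count below are assumptions; the source states only the values.

```python
# Values copied from the hyperparameter table. The key names follow the
# convention of YOLOv5 hyperparameter YAML files; this mapping is an assumption.
HYP = {
    "lr0": 0.01,             # initial learning rate
    "lrf": 0.01,             # cyclic learning rate (assumed: final LR = lr0 * lrf)
    "momentum": 0.937,       # momentum
    "weight_decay": 0.0005,  # weight decay coefficient
    "warmup_epochs": 3.0,    # warm-up epochs
    "warmup_momentum": 0.8,  # warm-up momentum
    "warmup_bias_lr": 0.1,   # warm-up initial bias learning rate
    "box": 0.05,             # bounding-box regression loss coefficient
    "cls": 0.5,              # classification loss coefficient
    "obj": 1.0,              # confidence loss coefficient
    "obj_pw": 1.0,           # positive-sample weight in objectness BCE loss
    "cls_pw": 1.0,           # positive-sample weight in classification BCE loss
    "iou_t": 0.2,            # IoU training threshold
    "anchor_t": 4.0,         # anchor width-to-height ratio
}


def learning_rate(epoch: int, total_epochs: int, hyp: dict = HYP) -> float:
    """Linear decay from lr0 down to lr0 * lrf (assumed schedule)."""
    factor = (1.0 - epoch / total_epochs) * (1.0 - hyp["lrf"]) + hyp["lrf"]
    return hyp["lr0"] * factor


if __name__ == "__main__":
    total_epochs = 100  # hypothetical; the source does not state the epoch count
    for epoch in (0, 50, 100):
        print(epoch, learning_rate(epoch, total_epochs))
```

Under these assumptions the learning rate starts at 0.01 and ends at 0.0001, which is how a "cyclic learning rate" of 0.01 would act as a final learning-rate factor.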

Table 3. Comparison of different backbone networks

Model	Backbone	AP@0.5 (pedestrian) / %	AP@0.5 (vehicle) / %	mAP@0.5 / %	FPS	Params / 10⁶	FLOPs / 10⁹
YOLOv5s	C3Net	89.3	95.3	92.3	26.6	7.02	15.8
YOLOv5s-1S	Shufflenetv2	83.0	93.9	88.5	50.2	3.34	7.3
YOLOv5s-2S	Shufflenetv2-Stem	86.7	92.3	90.4	48.7	3.36	9.0
YOLOv5s-3S	Shufflenetv2-Stem-SPPF	88.1	94.9	91.5	46.5	4.02	9.5

Table 4. Ablation and comparison experiment results

No.	Focal EIoU	BiFPN	Decoupled head	Added 160×160 detection layer	AP@0.5 (pedestrian) / %	AP@0.5 (vehicle) / %	mAP@0.5 / %	FPS	Params / 10⁶	FLOPs / 10⁹
YOLOv8s					91.8	95.9	93.8	29.6	11.1	28.4
1					88.1	94.9	91.5	46.5	4.02	9.5
2	✓				88.3 (+0.2)	95.2 (+0.3)	91.8 (+0.3)	45.4	4.02	9.5
3	✓	✓			88.6 (+0.3)	95.3 (+0.1)	91.9 (+0.1)	40.1	4.17	10.2
4	✓	✓	✓		90.1 (+1.5)	95.4 (+0.1)	92.7 (+0.8)	38.6	4.40	10.9
5	✓	✓	✓	✓	93.1 (+3.0)	95.5 (+0.1)	94.4 (+1.7)	30.7	5.31	14.2
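The first ablation step replaces the localization loss with Focal EIoU. The source does not reproduce the formula, so the sketch below follows the Focal-EIoU formulation of Zhang et al. as generally described: an EIoU term (IoU loss plus normalized center, width, and height distances) weighted by IoU raised to the power γ. The box format `(x1, y1, x2, y2)`, the function name, and γ = 0.5 are assumptions for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def focal_eiou_loss(pred: Box, target: Box,
                    gamma: float = 0.5, eps: float = 1e-9) -> float:
    """Focal-EIoU loss for one box pair: IoU**gamma * L_EIoU (assumed form)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    center_dist2 = (((px1 + px2) - (tx1 + tx2)) / 2.0) ** 2 \
        + (((py1 + py2) - (ty1 + ty2)) / 2.0) ** 2

    eiou = (1.0 - iou
            + center_dist2 / (cw ** 2 + ch ** 2 + eps)  # center distance term
            + (pw - tw) ** 2 / (cw ** 2 + eps)          # width difference term
            + (ph - th) ** 2 / (ch ** 2 + eps))         # height difference term

    # The focal factor down-weights low-IoU (low-quality) box pairs.
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Identical boxes give a loss close to zero. With this multiplicative focal factor, a pair with no overlap also contributes zero loss, which is a property of the assumed form rather than something stated in the source.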

Table 5. Algorithm performance comparison (%)

Baseline algorithm	mAP@0.5	FPS	Params	FLOPs
YOLOv5s	+2.1	+15.4	-24.6	-10.1
YOLOv8s	+0.6	+3.7	-52.2	-50.0
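The comparison figures can be checked against the raw values reported in the result tables. The sketch below assumes that the mAP@0.5 entry is a difference in percentage points and that FPS, Params, and FLOPs are relative changes with respect to each baseline. The helper names are illustrative.

```python
def relative_change_percent(baseline: float, proposed: float) -> float:
    """Relative change of `proposed` with respect to `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


# (mAP@0.5 / %, FPS, Params / 10^6, FLOPs / 10^9), copied from the result tables.
RESULTS = {
    "YOLOv5s": (92.3, 26.6, 7.02, 15.8),
    "YOLOv8s": (93.8, 29.6, 11.1, 28.4),
    "YOLOv5s-3S-4PDH": (94.4, 30.7, 5.31, 14.2),
}


def compare(baseline_name: str, proposed_name: str = "YOLOv5s-3S-4PDH") -> dict:
    """Changes of the proposed model relative to one baseline, rounded to 0.1."""
    b_map, b_fps, b_params, b_flops = RESULTS[baseline_name]
    p_map, p_fps, p_params, p_flops = RESULTS[proposed_name]
    return {
        "mAP@0.5 (points)": round(p_map - b_map, 1),
        "FPS (%)": round(relative_change_percent(b_fps, p_fps), 1),
        "Params (%)": round(relative_change_percent(b_params, p_params), 1),
        "FLOPs (%)": round(relative_change_percent(b_flops, p_flops), 1),
    }


if __name__ == "__main__":
    for name in ("YOLOv5s", "YOLOv8s"):
        print(name, compare(name))
```

Under these assumptions every entry is reproduced except the parameter reduction relative to YOLOv5s, which comes out near -24.4% rather than the reported -24.6%. The gap may stem from rounding of the tabulated parameter counts.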

Figs. 9-11: captions and images not available.

Table 6. Transfer learning experiment results

Category	YOLOv5s P / %	YOLOv5s R / %	YOLOv5s AP@0.5 / %	YOLOv5s-3S-4PDH P / %	YOLOv5s-3S-4PDH R / %	YOLOv5s-3S-4PDH AP@0.5 / %
pedestrian	48.5	39.7	40.7	56.4	41.7	45.6
people	45.2	35.6	33.5	49.8	35.2	36.4
bicycle	29.1	16.9	13.8	29.0	16.7	14.5
car	64.0	73.5	74.4	67.2	80.9	81.1
van	47.5	36.9	36.8	50.7	41.8	42.2
truck	55.3	30.9	32.2	47.0	31.6	32.3
tricycle	40.7	23.1	19.9	44.2	25.1	24.6
awning-tricycle	24.0	11.6	10.4	26.7	15.4	13.3
bus	61.1	43.8	46.8	62.5	50.2	53.3
motor	48.0	43.2	39.1	53.7	45.3	44.6
mAP@0.5 / %	34.8			38.8
mAP@0.5:0.95 / %	19.2			22.1
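The mAP@0.5 row can be checked as the unweighted mean of the ten per-class AP@0.5 values. The short sketch below does this for both models, with the values copied from the table and illustrative names.

```python
from statistics import mean

# Per-class AP@0.5 / % on VisDrone2019: (YOLOv5s, YOLOv5s-3S-4PDH).
AP50 = {
    "pedestrian": (40.7, 45.6),
    "people": (33.5, 36.4),
    "bicycle": (13.8, 14.5),
    "car": (74.4, 81.1),
    "van": (36.8, 42.2),
    "truck": (32.2, 32.3),
    "tricycle": (19.9, 24.6),
    "awning-tricycle": (10.4, 13.3),
    "bus": (46.8, 53.3),
    "motor": (39.1, 44.6),
}


def mean_ap(column: int) -> float:
    """Unweighted mean of the per-class AP values, rounded to one decimal."""
    return round(mean(ap[column] for ap in AP50.values()), 1)


def classes_improved() -> int:
    """Number of classes where the proposed model has the higher AP@0.5."""
    return sum(1 for base, proposed in AP50.values() if proposed > base)


if __name__ == "__main__":
    print("YOLOv5s mAP@0.5:", mean_ap(0))
    print("YOLOv5s-3S-4PDH mAP@0.5:", mean_ap(1))
    print("classes improved:", classes_improved(), "of", len(AP50))
```

The means agree with the tabulated 34.8% and 38.8%, and the AP@0.5 of the proposed model is higher in all ten categories, although the margin for truck is only 0.1 points.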

Fig. 12: caption and image not available.



Item	Specification
CPU	Intel(R) Core(TM) i9-12900
Memory	64 GB
GPU	NVIDIA RTX A4000
Operating system	Windows 11
Programming language	Python 3.9.10
Deep learning framework	PyTorch 1.12.1
IDE	PyCharm Community Edition 2022.3.2
CUDA	12.0
cuDNN	8.3.2