Journal of Jilin University (Science Edition) ›› 2023, Vol. 61 ›› Issue (3): 557-566.

基于注意力改进的自适应空间特征融合目标检测算法

逄晨曦, 李文辉   

  1. 吉林大学 计算机科学与技术学院, 长春 130012
  • 收稿日期:2022-02-28 出版日期:2023-05-26 发布日期:2023-05-26
  • Corresponding author: LI Wenhui E-mail: liwh@jlu.edu.cn

Adaptive Spatial Feature Fusion Object Detection Algorithm Based on Attention Improvement

PANG Chenxi, LI Wenhui   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received:2022-02-28 Online:2023-05-26 Published:2023-05-26

摘要: 针对传统目标检测存在小目标特征提取能力差、 识别率低等问题, 提出一种基于YOLOv4改进的目标检测算法, 采用注意力改进的自适应空间特征融合策略生成金字塔形特征表示, 解决了目标检测尺度变化带来的挑战. 通过这种新的数据驱动的金字塔特征融合策略, 在不影响小目标识别的前提下, 提高了中、 大目标的精度. 其将注意力学习图像特征和提取特征相结合, 提高了特征检测的准确性. 使用新的损失函数结合自适应空间特征融合策略和指数滑动平均, 基于YOLOv4, 在数据集MS COCO上多次实验的仿真结果表明, 该算法在速度和精度之间取得了最佳折中, 对于数据集MS COCO, mAP达到41.5%, AP50达到63.8%, 相比于原算法提升了1.1%. 改进算法对数据集MS COCO具有较高的鲁棒性, 从而有效提高了目标的检测识别率.

关键词: 目标检测, 卷积神经网络, 特征金字塔, 注意力机制

Abstract: To address the poor small-target feature extraction and low recognition rate of traditional object detection, we propose an improved object detection algorithm based on YOLOv4. It uses an attention-improved adaptive spatial feature fusion (AIASFF) strategy to generate a pyramid feature representation, which addresses the challenge posed by scale variation in object detection. This new data-driven pyramid feature fusion strategy improves accuracy on medium and large targets without affecting small-target recognition. The method combines image features learned by attention with the extracted features to improve the accuracy of feature detection. A new loss function is combined with the adaptive spatial feature fusion strategy and an exponential moving average. Simulation results from multiple experiments on the MS COCO dataset, based on YOLOv4, show that the algorithm achieves the best trade-off between speed and accuracy: mAP reaches 41.5% and AP50 reaches 63.8%, which is 1.1% higher than the original algorithm. The improved algorithm is highly robust on the MS COCO dataset and effectively improves the detection and recognition rate of targets.

Key words: object detection, convolutional neural network, feature pyramid, attention mechanism

CLC Number: 

  • TP391.4
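
The adaptive spatial feature fusion idea named in the abstract can be illustrated with a minimal sketch. It assumes that the pyramid levels have already been resized to a common resolution and that the fusion weights at each spatial location are obtained by a softmax over per-level scores, as in the general adaptive spatial feature fusion formulation. The attention component, the loss function, and the exponential moving average described by the authors are not shown, and the names `softmax` and `fuse_levels` are hypothetical, not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(levels, weight_logits):
    """Fuse same-sized 2D feature maps with per-location softmax weights.

    levels:        list of K maps, each H x W (assumed already resized
                   to a common resolution)
    weight_logits: list of K maps of unnormalised scores, same shape;
                   in a real network these would be learned
    """
    height, width = len(levels[0]), len(levels[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Weights at each location are non-negative and sum to 1.
            weights = softmax([w[i][j] for w in weight_logits])
            fused[i][j] = sum(a * lvl[i][j] for a, lvl in zip(weights, levels))
    return fused


# Toy input: three 2 x 2 "pyramid levels" with constant values 1, 2 and 3.
levels = [[[float(k)] * 2 for _ in range(2)] for k in (1, 2, 3)]

# Equal scores give equal weights, so the result is the plain average.
equal_logits = [[[0.0] * 2 for _ in range(2)] for _ in range(3)]
fused_equal = fuse_levels(levels, equal_logits)

# A high score for the first level makes it dominate at every location.
biased_logits = [[[10.0] * 2 for _ in range(2)]] + equal_logits[1:]
fused_biased = fuse_levels(levels, biased_logits)
```

Because the weights are computed per location, different regions of the fused map can draw on different pyramid levels, which is what makes the fusion spatially adaptive.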