吉林大学学报(工学版) ›› 2023, Vol. 53 ›› Issue (12): 3518-3528.doi: 10.13229/j.cnki.jdxbgxb.20220166

• 计算机科学与技术 •

基于多重注意力机制的无锚框目标跟踪算法

刘晶红1,邓安平1,2,陈琪琪1,2,彭佳琦3,左羽佳1

  1.中国科学院 长春光学精密机械与物理研究所,长春 130033
    2.中国科学院大学,北京 100039
    3.中国人民解放军陆军装备部驻沈阳地区军代局驻长春地区第一军代室,长春 130022
  • 收稿日期:2022-02-21 出版日期:2023-12-01 发布日期:2024-01-12
  • 作者简介:刘晶红(1967-),女,研究员,硕士.研究方向:光电成像.E-mail:liu1577@126.com
  • 基金资助:
    国家自然科学基金面上项目(62175233)

Anchor-free target tracking algorithm based on multiple attention mechanism

Jing-hong LIU1, An-ping DENG1,2, Qi-qi CHEN1,2, Jia-qi PENG3, Yu-jia ZUO1

  1.Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
    2.University of Chinese Academy of Sciences, Beijing 100039, China
    3.The First Military Representative Office of the Military Representative Bureau of the Army Equipment Department of the Chinese People's Liberation Army in Shenyang and Changchun, Changchun 130022, China
  • Received:2022-02-21 Online:2023-12-01 Published:2024-01-12

摘要:

针对现有孪生神经网络跟踪算法两个分支相互独立缺少信息交互,在受到目标遮挡、相似目标干扰等挑战下无法精确鲁棒跟踪目标的现状,提出了一种基于多重注意力机制的无锚框目标跟踪算法。使用多重注意力机制编码目标模板特征和搜索区域特征,通过自注意力机制提升特征显著性后,利用互注意力机制聚合目标模板与搜索区域之间的特征信息,强化了算法对目标和背景的鉴别能力。同时,引入无锚框机制,以逐像素的方式完成端到端的视觉目标跟踪任务,避免了锚框机制人为干预的弊端。实验结果表明,在OTB50、OTB100、GOT-10K公开数据集上,本文提出的基于多重注意力机制的无锚框目标跟踪算法针对目标遮挡以及相似目标干扰等挑战具有较强的鲁棒性,有效提升了跟踪算法的准确率和成功率。

关键词: 计算机视觉, 目标跟踪, 注意力机制, 无锚框机制

Abstract:

Siamese network based trackers have two independent branches that lack information interaction, so they cannot track targets accurately and robustly under challenges such as target occlusion and similar-object interference. To solve this problem, an anchor-free target tracking algorithm based on a multiple attention mechanism was proposed. The multiple attention mechanism was used to encode the target template features and search region features: after enhancing feature saliency through a self-attention mechanism, a mutual attention mechanism was used to aggregate feature information between the target template and the search region, which strengthens the algorithm's ability to discriminate between target and background. At the same time, an anchor-free mechanism was introduced to complete the end-to-end visual target tracking task pixel by pixel, avoiding the drawbacks of the human intervention required by anchor-based mechanisms. Extensive experiments were conducted on the challenging OTB50, OTB100 and GOT-10K benchmarks. The results show that the proposed anchor-free target tracking algorithm based on the multiple attention mechanism is strongly robust against target occlusion and similar-object interference, and effectively improves the precision rate and success rate of the tracker.
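The self-/mutual-attention aggregation described in the abstract can be sketched roughly as follows. This is an illustrative single-head formulation, not the authors' implementation; the feature-map sizes (a 7×7 template, a 25×25 search region, 64 channels) are assumptions for the example only:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: weight each value vector by
    # the similarity between queries and keys.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
d = 64                                      # channel dimension (assumed)
template = rng.standard_normal((49, d))     # 7x7 template features, flattened
search = rng.standard_normal((625, d))      # 25x25 search-region features, flattened

# Self-attention first enhances the saliency of each branch's own features ...
template_sa = attention(template, template, template)
search_sa = attention(search, search, search)

# ... then mutual (cross) attention aggregates template information into the
# search-region features that feed the per-pixel tracking head.
fused = attention(search_sa, template_sa, template_sa)
print(fused.shape)  # (625, 64)
```

In a real tracker the queries, keys and values would come from learned projections of convolutional feature maps; the sketch only shows how the two branches exchange information instead of staying independent.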

Key words: computer vision, object tracking, attention mechanism, anchor-free

中图分类号: TP391.4

图1　本文算法框架图

图2　空间注意力机制图

图3　通道注意力机制图

图4　互注意力机制图

图5　OTB50对比图

图6　OTB50数据集上4个挑战属性的准确率和成功率对比图

图7　算法热力图对比图

图8　定性结果对比图

表1　不同算法在GOT-10K数据集的实验对比结果

指标      SiamFC   SiamRPN   SiamRPN++   ATOM    SiamCAR   本文
AO        0.374    0.463     0.516       0.556   0.569     0.576
SR0.50    0.404    0.549     0.620       0.634   0.670     0.672
SR0.75    0.144    0.253     0.334       0.402   0.415     0.439
FPS       25.8     74        26          21      18        17
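For readers unfamiliar with the GOT-10K metrics above: AO (average overlap) is the mean IoU between predicted and ground-truth boxes over all frames, and SR0.50/SR0.75 (success rates) are the fractions of frames whose overlap exceeds 0.5/0.75. A minimal sketch with made-up per-frame overlap values:

```python
def iou(a, b):
    # Boxes as (x, y, w, h); returns intersection-over-union.
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def ao_and_sr(overlaps, thr):
    # AO: mean overlap; SR@thr: fraction of frames above the threshold.
    ao = sum(overlaps) / len(overlaps)
    sr = sum(o > thr for o in overlaps) / len(overlaps)
    return ao, sr

# Hypothetical per-frame overlaps for one sequence (0.0 = target lost).
overlaps = [0.82, 0.61, 0.48, 0.77, 0.0]
ao, sr50 = ao_and_sr(overlaps, 0.5)
print(ao, sr50)  # 0.536 0.6
```

On the full benchmark these quantities are averaged over all sequences, which is why a tracker can have a high AO but a noticeably lower SR0.75.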

表2　运行效率对比实验

算法      FLOPs    Params   FPS
SiamCAR   83.2G    91.9M    18
本文      84.5G    93.9M    17
1 Guo Dong-yan, Wang Jun, Cui Ying, et al. SiamCAR: siamese fully convolutional classification and regression for visual tracking[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6269-6277.
2 Baker S, Matthews I. Lucas-kanade 20 years on: a unifying framework[J]. International Journal of Computer Vision, 2004, 56(3): 221-255.
3 Collins R T. Mean-shift blob tracking through scale space[C]∥2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, 2003: No. II-234.
4 Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(3): 583-596.
5 Tao R, Gavves E, Smeulders A W M. Siamese instance search for tracking[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 1420-1429.
6 Bertinetto L, Valmadre J, Henriques J F, et al. Fully-convolutional siamese networks for object tracking[C]∥European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 850-865.
7 Li Bo, Yan Junjie, Wu Wei, et al. High performance visual tracking with siamese region proposal network[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 8971-8980.
8 Li Bo, Wu Wei, Wang Qiang, et al. SiamRPN++: evolution of siamese visual tracking with very deep networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 4282-4291.
9 Zhu Zheng, Wang Qiang, Li Bo, et al. Distractor-aware siamese networks for visual object tracking[C]∥Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 101-117.
10 王侃, 苏航, 曾浩, 等. 表观增强的深度目标跟踪算法[J]. 吉林大学学报: 工学版, 2022, 52(11): 2676-2684.
Wang Kan, Su Hang, Zeng Hao, et al. Deep target tracking using augmented apparent information[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(11): 2676-2684.
11 Roy A G, Navab N, Wachinger C. Concurrent spatial and channel 'squeeze & excitation' in fully convolutional networks[C]∥International Conference on Medical Image Computing and Computer-assisted Intervention, Granada, Spain, 2018: 421-429.
12 Wang X L, Girshick R, Gupta A, et al. Non-local neural networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 7794-7803.
13 Woo S, Park J, Lee J Y, et al. CBAM: convolutional block attention module[C]∥Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 3-19.
14 He An-feng, Luo Chong, Tian Xin-mei, et al. A twofold siamese network for real-time object tracking[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 4834-4843.
15 Wang Qiang, Teng Zhu, Xing Jun-liang, et al. Learning attentions: residual attentional siamese network for high performance online visual tracking[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 4854-4863.
16 才华, 王学伟, 付强, 等. 基于动态模板更新的孪生网络目标跟踪算法[J]. 吉林大学学报: 工学版, 2022, 52(5): 1106-1116.
Cai Hua, Wang Xue-wei, Fu Qiang, et al. Siamese network target tracking algorithm based on dynamic template updating[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(5): 1106-1116.
17 He Kai-ming, Zhang Xiang-yu, Ren Shao-qing, et al. Deep residual learning for image recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 770-778.
18 Howard A G, Zhu M L, Chen B, et al. Mobilenets: efficient convolutional neural networks for mobile vision applications[J/OL]. [2022-02-01].
19 Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context[C]∥European Conference on Computer Vision, Zurich, Switzerland, 2014: 740-755.
20 Huang Liang-hua, Zhao Xin, Huang Kai-qi. Got-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 43(5): 1562-1577.
21 Deng J, Dong W, Socher R, et al. Imagenet: a large-scale hierarchical image database[C]∥2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 2009: 248-255.
22 Wu Yi, Jongwoo Lim, Yang Ming-hsuan. Online object tracking: a benchmark[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, US, 2013: 2411-2418.