Journal of Jilin University (Engineering and Technology Edition) ›› 2022, Vol. 52 ›› Issue (12): 2906-2915. doi: 10.13229/j.cnki.jdxbgxb20210414

• Computer Science and Technology •

Real-time robust RGB-T target tracking based on dual Siamese network

Lei FU1, Wen-bin GU1, Yong-bao AI2, Wei LI3, Nan ZHENG1, Liu-yang WANG1

  1. College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
  2. National Academy of Defense Science and Technology Innovation, Academy of Military Sciences, Beijing 100071, China
  3. 93182 PLA Troops, Shenyang 110000, China

  • Received: 2021-05-10 Online: 2022-12-01 Published: 2022-12-08
  • Contact: Wen-bin GU E-mail: fulei10@mails.jlu.edu.cn; guwenbin11@126.com
  • About the author: Lei FU (1985-), male, Ph.D. candidate. Research interests: computer vision. E-mail: fulei10@mails.jlu.edu.cn
  • Supported by: National Defense Science and Technology Innovation Special Zone Project


Abstract:

Aiming at the poor robustness of Siamese-architecture trackers in visible and thermal infrared image fusion tracking, a real-time RGB-Thermal (RGB-T) target tracking method based on feature fusion of a dual Siamese network was proposed. First, two Siamese networks were used to extract features from the template and search branches of the visible and infrared images, yielding feature layers for the two modalities. Second, a self-attention feature enhancement module (SFEM) was used to enhance the features of both modalities, and a dual-modal feature fusion (DMFF) module was used to fuse the enhanced features of the template branch and the search branch, respectively. Finally, a task-specific cross-correlation was performed between the fused template and search branches, and the target position was obtained through the classification and regression branches, completing the tracking.

Test results on the grayscale-thermal object tracking dataset (GTOT) show that the proposed method achieves a precision rate (PR) of 91.8%, a success rate (SR) of 78.1%, and a running speed of 60 f/s. Compared with other RGB-T fusion tracking methods, the proposed method maintains real-time processing speed while offering higher robustness.
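The fusion-then-correlation pipeline summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under simplifying assumptions, not the authors' implementation: the backbone features are random toy arrays, and the channel-gating function below is only a stand-in for the paper's SFEM/DMFF attention modules. What it shows is the overall flow from per-modality template/search features, through attention-weighted fusion, to a cross-correlation response map whose peak indicates the target position.

```python
import numpy as np

def channel_attention(feat):
    # Squeeze-style channel gate: global average pool -> sigmoid weights,
    # shape (C, H, W) -> (C, 1, 1), broadcast over spatial positions.
    w = feat.mean(axis=(1, 2), keepdims=True)
    return 1.0 / (1.0 + np.exp(-w))

def fuse_modalities(rgb_feat, t_feat):
    # Simplified dual-modal fusion: gate each modality by its own channel
    # attention, then sum (a stand-in for the SFEM + DMFF stages).
    return channel_attention(rgb_feat) * rgb_feat + channel_attention(t_feat) * t_feat

def cross_correlate(template, search):
    # Slide the fused template over the fused search features; each spatial
    # offset yields one similarity score (the response map).
    c, th, tw = template.shape
    _, sh, sw = search.shape
    out = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template * search[:, i:i + th, j:j + tw])
    return out

rng = np.random.default_rng(0)
# Toy per-modality features for the template (z) and search (x) branches.
z_rgb, z_t = rng.standard_normal((2, 8, 6, 6))
x_rgb, x_t = rng.standard_normal((2, 8, 22, 22))

z = fuse_modalities(z_rgb, z_t)     # fused template features
x = fuse_modalities(x_rgb, x_t)     # fused search features
response = cross_correlate(z, x)    # (17, 17) response map
cy, cx = np.unravel_index(response.argmax(), response.shape)
print(response.shape, (cy, cx))     # peak location ~ predicted target position
```

In the paper the correlation is followed by separate classification and regression heads; here the argmax of the single response map plays the role of the classification branch only.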

Key words: computer application, RGB-Thermal infrared, dual Siamese network, channel attention, spatial attention

CLC number: TP391.41

Fig. 1 Overall framework of the SEFF network

Fig. 2 Schematic diagram of the SFEM

Fig. 3 Schematic diagram of the DMFF

Fig. 4 Examples of registered image pairs in the UAV RGB-T dataset

Fig. 5 Performance comparison of different algorithms

Table 1 Attribute-based PR score comparison (%)

Algorithm          OCC    LI     LSV    FM     TC     SO     DEF
SiamDW[23]+RGB-T   67.5   70.0   68.9   71.1   63.5   76.4   69.1
RT-MDNet[22]       73.3   77.2   79.1   78.1   73.7   85.6   73.1
DAPNet[8]          87.3   90.0   84.7   82.3   89.3   93.7   91.9
DAFNet[9]          87.3   89.9   82.2   80.9   89.8   93.9   94.7
MANet[10]          88.2   91.4   86.9   84.7   88.9   93.2   92.3
SEFF               88.8   91.4   94.1   87.9   91.7   90.6   89.3

Table 2 Attribute-based SR score comparison (%)

Algorithm          OCC    LI     LSV    FM     TC     SO     DEF
SiamDW[23]+RGB-T   53.6   58.8   56.5   57.6   51.7   58.5   58.2
RT-MDNet[22]       57.6   63.8   63.7   64.1   59.0   63.4   61.0
DAPNet[8]          67.4   72.2   64.8   61.9   69.0   69.2   77.1
DAFNet[9]          68.4   72.7   66.5   64.2   70.3   69.8   76.5
MANet[10]          69.6   73.6   70.6   69.4   70.2   70.0   75.2
SEFF               75.2   77.4   79.9   73.0   77.5   75.1   75.3

Fig. 6 Ablation experiment results on the GTOT dataset

Table 3 Execution speed comparison of different algorithms

Algorithm    GPU      Speed/(f·s⁻¹)
DAPNet[8]    1080Ti   2
DAFNet[9]    1080Ti   26
MANet[10]    1080     1.11
SEFF         1080Ti   60

Fig. 7 Comparison of tracking results of the five algorithms

1 Lu Z, Rathod V, Votel R, et al. RetinaTrack: online single stage joint detection and tracking[C]∥Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 14656-14666.
2 Xu Tao, Ma Ke, Liu Cai-hua. Multi-object pedestrian tracking based on deep learning[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(1): 27-38.
3 Meng Lu, Yang Xu. A survey of object tracking algorithms[J]. Acta Automatica Sinica, 2019, 45(7): 1244-1260.
4 Yuan Jing, Li Yang, Dong Xing-liang, et al. Target tracking with a mobile robot based on online classification for motion patterns[J]. Chinese Journal of Scientific Instrument, 2017, 38(3): 568-577.
5 Hao J, Zhou Y, Zhang G, et al. A review of target tracking algorithm based on UAV[C]∥Proceedings of the 2018 IEEE International Conference on Cyborg and Bionic Systems, Shenzhen, China, 2018: 328-333.
6 Zhang X, Ye P, Leung H, et al. Object fusion tracking based on visible and infrared images: a comprehensive review[J]. Information Fusion, 2020, 63: 166-187.
7 Li C, Wu X, Zhao N, et al. Fusing two-stream convolutional neural networks for RGB-T object tracking[J]. Neurocomputing, 2018, 281: 78-85.
8 Zhu Y, Li C, Luo B, et al. Dense feature aggregation and pruning for RGBT tracking[C]∥The 27th ACM International Conference on Multimedia, Nice, France, 2019: 465-472.
9 Gao Y, Li C, Zhu Y, et al. Deep adaptive fusion network for high performance RGBT tracking[C]∥2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, South Korea, 2019: 91-99.
10 Li C, Lu A, Zheng A, et al. Multi-adapter RGBT tracking[C]∥2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, South Korea, 2019: 2262-2270.
11 Zhang X C, Ye P, Peng S Y, et al. SiamFT: an RGB-infrared fusion tracking method via fully convolutional Siamese networks[J]. IEEE Access, 2019, 7: 122122-122133.
12 Shen Ya-li. RGBT dual-modal Siamese tracking network with feature fusion[J]. Infrared and Laser Engineering, 2021, 50(3): 20200459.
13 Xu Y, Wang Z, Li Z, et al. Towards robust and accurate visual tracking with target estimation guidelines[C]∥Association for the Advancement of Artificial Intelligence (AAAI), New York, USA, 2020: 12549-12556.
14 Li B, Yan J, Wu W, et al. High performance visual tracking with Siamese region proposal network[C]∥Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, 2018: 8971-8980.
15 Krizhevsky A, Sutskever I, Hinton G. ImageNet classification with deep convolutional neural networks[C]∥Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012: 1106-1164.
16 Fu J, Liu J, Tian H, et al. Dual attention network for scene segmentation[C]∥Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 3146-3154.
17 Chollet F. Xception: deep learning with depthwise separable convolutions[C]∥Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017: 1800-1807.
18 Lin T, Goyal P, Girshick R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2020, 42(2): 318-327.
19 Yu J, Jiang Y, Wang Z, et al. UnitBox: an advanced object detection network[C]∥ACM International Conference on Multimedia, Amsterdam, Netherlands, 2016: 516-520.
20 Li C, Cheng H, Hu S, et al. Learning collaborative sparse representation for grayscale-thermal tracking[J]. IEEE Transactions on Image Processing, 2016, 25(12): 5743-5756.
21 Li C, Liang X, Lu Y, et al. RGB-T object tracking: benchmark and baseline[J]. Pattern Recognition, 2019, 96: 106977.
22 Jung I, Son J, Baek M, et al. Real-time MDNet[C]∥European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 89-104.
23 Zhang Z, Peng H. Deeper and wider Siamese networks for real-time visual tracking[C]∥Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 4591-4600.