Anchor-free target tracking algorithm based on multiple attention mechanism

doi:10.13229/j.cnki.jdxbgxb.20220166

Abstract

Siamese-network-based trackers have two branches that are independent of each other and lack information interaction, so they cannot track accurately and robustly under challenges such as target occlusion and similar objects. To solve this problem, an anchor-free target tracking algorithm based on a multiple attention mechanism is proposed. The multiple attention mechanism is used to encode the features of the target template and the search area. After a self-attention mechanism improves feature saliency, a mutual attention mechanism aggregates the feature interaction between the target template and the search area, which strengthens the algorithm's ability to discriminate between target and background. At the same time, the anchor-free mechanism completes the end-to-end visual target tracking task pixel by pixel, avoiding the human intervention that the anchor box mechanism requires. Extensive experiments were conducted on challenging benchmarks including OTB50, OTB100, and GOT-10K. The results show that the proposed algorithm is strongly robust to target occlusion and similar objects, and that it effectively improves the precision rate and success rate of tracking.
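The self-attention and mutual-attention steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: it assumes plain scaled dot-product attention without learned projections, residual connections after each step, and hypothetical feature sizes (a 7×7 template map and a 31×31 search map with 64 channels). The function names `attention` and `multi_attention` are illustrative only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Scaled dot-product attention; each row is one spatial position."""
    d = query.shape[-1]
    weights = softmax(query @ key.T / np.sqrt(d), axis=-1)
    return weights @ value, weights


def multi_attention(template, search):
    """Self-attention on each branch, then mutual attention across branches.

    Assumption: residual additions and no learned projections; the paper's
    exact layer design is not given in the abstract.
    """
    # Self-attention: each branch attends to its own positions.
    t_self = template + attention(template, template, template)[0]
    s_self = search + attention(search, search, search)[0]
    # Mutual attention: each branch queries the other branch's features.
    t_out = t_self + attention(t_self, s_self, s_self)[0]
    s_out = s_self + attention(s_self, t_self, t_self)[0]
    return t_out, s_out


# Hypothetical flattened feature maps: (positions, channels).
rng = np.random.default_rng(0)
template = rng.standard_normal((7 * 7, 64))
search = rng.standard_normal((31 * 31, 64))
t_out, s_out = multi_attention(template, search)
```

In this sketch the mutual-attention step is what lets information flow between the two otherwise independent branches; both outputs keep the shape of their inputs, so they could feed an anchor-free per-pixel head unchanged.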

Key words: computer vision, object tracking, attention mechanism, anchor-free

CLC Number: TP391.4

Jing-hong LIU, An-ping DENG, Qi-qi CHEN, Jia-qi PENG, Yu-jia ZUO. Anchor-free target tracking algorithm based on multiple attention mechanism[J]. Journal of Jilin University (Engineering and Technology Edition), 2023, 53(12): 3518-3528.

Table 1

Experimental comparison results of different algorithms on the GOT-10K dataset

Metric	SiamFC	SiamRPN	SiamRPN++	ATOM	SiamCAR	Ours
AO	0.374	0.463	0.516	0.556	0.569	0.576
$SR_{0.50}$	0.404	0.549	0.620	0.634	0.670	0.672
$SR_{0.75}$	0.144	0.253	0.334	0.402	0.415	0.439
FPS	25.8	74	26	21	18	17
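For readers unfamiliar with the metrics in this table, the sketch below shows how average overlap (AO) and success rate (SR) at thresholds 0.50 and 0.75 are commonly computed from per-frame intersection-over-union values. It is a single-sequence illustration with made-up boxes in `(x, y, w, h)` form; the official GOT-10K evaluation may aggregate across sequences differently, and the helper names `box_iou` and `summarize` are hypothetical.

```python
import numpy as np


def box_iou(a, b):
    """IoU of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Width and height of the overlap rectangle, clipped at zero.
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def summarize(pred, gt):
    """AO is the mean IoU; SR_t is the fraction of frames with IoU above t."""
    ious = np.array([box_iou(p, g) for p, g in zip(pred, gt)])
    return {
        "AO": float(ious.mean()),
        "SR0.50": float((ious > 0.50).mean()),
        "SR0.75": float((ious > 0.75).mean()),
    }


# Made-up example: four frames with a fixed ground-truth box.
gt = [(0, 0, 10, 10)] * 4
pred = [(0, 0, 10, 10), (0, 0, 10, 8), (5, 0, 10, 10), (20, 20, 10, 10)]
scores = summarize(pred, gt)
```

Here the per-frame overlaps are 1.0, 0.8, 1/3, and 0, so two of the four frames pass both thresholds.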
