Multi-object pedestrian tracking based on deep learning

doi:10.13229/j.cnki.jdxbgxb20200509

Abstract

This paper surveys the mainstream tracking-by-detection methods for multi-object pedestrian tracking proposed in recent years. It introduces the concept of detection-based multi-object pedestrian tracking and summarizes the methods in three stages: object detection, feature extraction, and data association and tracking. The performance of these multi-object tracking (MOT) methods on the MOTChallenge series of datasets is compared and evaluated, and future research directions for multi-object tracking are discussed.

Key words: computer vision, multi-object tracking, object detection, feature extraction, data association

CLC number: TP391

Tao XU, Ke MA, Cai-hua LIU. Multi object pedestrian tracking based on deep learning[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(1): 27-38.

References

1	Fan L, Wang Z, Cail B, et al. A survey on multiple object tracking algorithm[C]∥IEEE International Conference on Information and Automation(ICIA), Ningbo, China, 2016: 1855-1862.
2	Chu Q, Ouyang W, Li H, et al. Online multi-object tracking using CNN-based single object tracker with spatial-temporal attention mechanism[C]∥Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 4836-4845.
3	Son J, Baek M, Cho M, et al. Multi-object tracking with quadruplet convolutional neural networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 5620-5629.
4	Fang K, Xiang Y, Li X, et al. Recurrent autoregressive networks for online multi-object tracking[C]∥IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, 2018: 466-475.
5	Kim C, Li F, Rehg J M. Multi-object tracking with neural gating using bilinear LSTM[C]∥European Conference on Computer Vision(ECCV), Munich, Germany, 2018: 208-224.
6	Xu Y, Zhou X, Chen S, et al. Deep learning for multiple object tracking: a survey[J]. IET Computer Vision, 2019, 13(4): 355-368.
7	Sun Z, Chen J, Liang C, et al. A survey of multiple pedestrian tracking based on tracking-by-detection framework[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020(99): 1-10.
8	Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
9	Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]∥Proceedings of the European Conference on Computer Vision(ECCV), Amsterdam, Netherlands, 2016: 21-37.
10	Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779-788.
11	Bewley A, Ge Z, Ott L, et al. Simple online and realtime tracking[C]∥IEEE International Conference on Image Processing (ICIP), Phoenix, USA, 2016: 3464-3468.
12	Yu F, Li W, Li Q, et al. POI: multiple object tracking with high performance detection and appearance feature[C]∥Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, Netherlands, 2016: 36-42.
13	Kieritz H, Hubner W, Arens M. Joint detection and online multi-object tracking[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, USA, 2018: 1540-1548.
14	Zhao D, Fu H, Xiao L, et al. Multi-object tracking with correlation filter for autonomous vehicle[J]. Sensors, 2018, 18(7): 2004-2011.
15	Redmon J, Farhadi A. YOLOv3: an incremental improvement[EB/OL]. [2018-04-08].
16	Jiang M, Hai T, Pan Z, et al. Multi-agent deep reinforcement learning for multi-object tracker[J]. IEEE Access, 2019, 7: 32400-32407.
17	He M, Luo H, Hui B, et al. Fast online multi-pedestrian tracking via integrating motion model and deep appearance model[J]. IEEE Access, 2019, 7: 89475-89486.
18	Zhou Q, Zhong B, Zhang Y, et al. Deep alignment network based multi-person tracking with occlusion and motion reasoning[J]. IEEE Transactions on Multimedia, 2018, 21(5): 1183-1194.
19	Ku J, Mozifian M, Lee J, et al. Joint 3D proposal generation and object detection from view aggregation[C]∥2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 2018: 1-8.
20	Baser E, Balasubramanian V, Bhattacharyya P, et al. FANTrack: 3D multi-object tracking with feature association network[C]∥IEEE Intelligent Vehicles Symposium(IV), Paris, France, 2019: 1426-1433.
21	Cao Z, Simon T, Wei S E, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 1302-1310.
22	Ristani E, Tomasi C. Features for multi-target multi-camera tracking and re-identification[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6036-6046.
23	Kim C, Li F, Ciptadi A, et al. Multiple hypothesis tracking revisited[C]∥Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4696-4704.
24	Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric[C]∥IEEE International Conference on Image Processing(ICIP), Beijing, 2017: 3645-3649.
25	Mahmoudi N, Ahadi S M, Rahmati M. Multi-target tracking using CNN-based features: CNNMTT[J]. Multimedia Tools and Applications, 2019, 78(6): 7077-7096.
26	Sheng H, Zhang Y, Chen J, et al. Heterogeneous association graph fusion for target association in multiple object tracking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 29(11): 3269-3280.
27	Chen L, Ai H, Shang C, et al. Online multi-object tracking with convolutional neural networks[C]∥IEEE International Conference on Image Processing(ICIP), Beijing, 2017: 645-649.
28	Peng J, Wang C. Chained-tracker: chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking[EB/OL]. [2020-10-13].
29	He Z, Li J, Liu D, et al. Tracking by animation: unsupervised learning of multi-object attentive trackers[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 1318-1327.
30	Brasó G, Leal-Taixé L. Learning a neural solver for multiple object tracking[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6247-6257.
31	Porzi L, Hofinger M, Ruiz I, et al. Learning multi-object tracking and segmentation from automatic annotations[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6846-6855.
32	Kim M, Alletto S. Similarity mapping with enhanced Siamese network for multi-object tracking[EB/OL]. [2020-10-13].
33	Lee S, Kim E. Multiple object tracking via feature pyramid Siamese networks[J]. IEEE Access, 2018, 7: 8181-8194.
34	Leal-Taixé L, Canton-Ferrer C, Schindler K. Learning by tracking: Siamese CNN for robust target association[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, USA, 2016: 418-425.
35	Zhu J, Yang H, Liu N, et al. Online multi-object tracking with dual matching attention networks[C]∥Proceedings of the European Conference on Computer Vision(ECCV), Munich, Germany, 2018: 366-382.
36	Tang S, Andriluka M, Andres B, et al. Multiple people tracking by lifted multicut and person re-identification[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 3539-3548.
37	Yin J, Wang W, Meng Q, et al. A unified object motion and affinity model for online multi-object tracking[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6768-6777.
38	Lu Y, Lu C, Tang C K. Online video object detection using association LSTM[C]∥Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2344-2352.
39	Sadeghian A, Alahi A, Savarese S. Tracking the untrackable: learning to track multiple cues with long-term dependencies[C]∥Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 300-311.
40	Chen L, Peng X, Ren M. Recurrent metric networks and batch multiple hypothesis for multi-object tracking[J]. IEEE Access, 2019, 7: 3093-3105.
41	Rosello P, Kochenderfer M J. Multi-agent reinforcement learning for multi-object tracking[C]∥Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Stockholm, Sweden, 2018: 1397-1404.
42	Munkres J. Algorithms for the assignment and transportation problems[J]. Journal of the Society for Industrial and Applied Mathematics, 1957, 5(1): 32-38.
43	Xu Y, Osep A, Ban Y, et al. How to train your deep multi-object tracker[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6787-6796.
44	Milan A, Rezatofighi S H, Dick A, et al. Online multi-target tracking using recurrent neural networks[C]∥Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, USA, 2017: 4225-4232.
45	Yoon K, Kim D Y, Yoon Y C, et al. Data association for multi-object tracking via deep neural networks[J]. Sensors, 2019, 19(3): 559-574.
46	Chu P, Fan H, Tan C C, et al. Online multi-object tracking with instance-aware tracker and dynamic model refreshment[C]∥IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, USA, 2019: 161-170.
47	Kuhn H W. The Hungarian method for the assignment problem[J]. Naval Research Logistics, 2005, 52(1): 7-21.
48	Wu B, Nevatia R. Tracking of multiple, partially occluded humans based on static body part detection[C]∥IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR'06), New York, USA, 2006: 951-958.
49	Bernardin K, Stiefelhagen R. Evaluating multiple object tracking performance: the CLEAR MOT metrics[J]. EURASIP Journal on Image and Video Processing, 2008(1): 246309.
50	Ristani E, Solera F, Zou R, et al. Performance measures and a data set for multi-target, multi-camera tracking[C]∥European Conference on Computer Vision, Amsterdam, Netherlands, 2016: 17-35.
51	Leal-Taixé L, Milan A, Reid I, et al. MOTChallenge 2015: towards a benchmark for multi-target tracking[EB/OL]. [2015-04-08].
52	Milan A, Leal-Taixé L, Reid I, et al. MOT16: a benchmark for multi-object tracking[EB/OL]. [2016-05-03].
53	Felzenszwalb P F, Girshick R B, McAllester D, et al. Object detection with discriminatively trained part-based models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627-1645.
54	Yang F, Choi W, Lin Y. Exploit all the layers: fast and accurate CNN object detector with scale dependent pooling and cascaded rejection classifiers[C]∥IEEE Conference on Computer Vision and Pattern Recognition(CVPR), Las Vegas, USA, 2016: 2129-2137.
55	Dendorfer P, Rezatofighi H, Milan A, et al. CVPR19 tracking and detection challenge: how crowded can it get?[EB/OL]. [2019-06-10].
56	Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]∥IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 2012: 3354-3361.
57	Ferryman J, Shahrokni A. PETS2009: dataset and challenge[C]∥Twelfth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, Snowbird, USA, 2009: 1-6.
58	Andriluka M, Roth S, Schiele B. Monocular 3D pose estimation and tracking by detection[C]∥IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, USA, 2010: 623-630.
59	Wen L, Du D, Cai Z, et al. UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking[J]. Computer Vision and Image Understanding, 2020, 193: 102907.
60	Leal-Taixé L, Milan A. Tracking the trackers: an analysis of the state of the art in multiple object tracking[EB/OL]. [2020-10-13].
61	Yoon K, Song Y, Jeon M. Multiple hypothesis tracking algorithm for multi-target multi-camera tracking with disjoint views[J]. IET Image Processing, 2018, 12(7): 1175-1184.
62	Liang Y, Zhou Y. Multi-camera tracking exploiting person re-id technique[C]∥The 24th International Conference on Neural Information Processing, Guangzhou, China, 2017: 397-404.
63	Yoo H, Kim K, Byeon M, et al. Online scheme for multiple camera multiple target tracking based on multiple hypothesis tracking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 27(3): 454-469.
64	Tang Z, Naphade M, Liu M Y, et al. CityFlow: a city-scale benchmark for multi-target multi-camera vehicle tracking and re-identification[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 8797-8806.
65	Imani Y, Teyfouri N, Ahmadzadeh M R, et al. A new method for multiple sperm cells tracking[J]. Journal of Medical Signals & Sensors, 2014, 4(1): 35-42.
66	Meirovitch Y, Mi L, Saribekyan H, et al. Cross-classification clustering: an efficient multi-object tracking technique for 3-D instance segmentation in connectomics[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 8425-8435.
67	Mittek M, Psota E T, Pérez L C, et al. Health monitoring of group-housed pigs using depth-enabled multi-object tracking[C]∥Proceedings of International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 9-12.

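Metric	Ideal value	Description
MOTA↑	100%	Multiple object tracking accuracy
MOTP↑	100%	Multiple object tracking precision
MT↑	100%	Mostly tracked targets
ML↓	0%	Mostly lost targets
Frag↓	0	Total number of times a trajectory is interrupted
IDSW↓	0	Total number of identity switches
FP↓	0	Number of false positives
FN↓	0	Number of false negatives
IDF1↑	100%	ID F1 score
Hz↑	∞	Processing speed (in FPS, excluding the detector's processing time)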
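
As a minimal sketch of how two of the metrics above are computed, the Python functions below follow the standard CLEAR MOT definition of MOTA [49] and the identity-based definition of IDF1 [50]. The function names and argument names are illustrative and do not come from the paper. The usage example plugs in the FP, FN and IDSW values reported for CTracker [28]; the ground-truth box count of 182 326 for the MOT16 test set is an assumed value that is not stated in this article.

```python
def mota(fp: int, fn: int, idsw: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over every frame."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (fp + fn + idsw) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), the F1 score of identity matches."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


if __name__ == "__main__":
    # CTracker row of the results table; 182326 ground-truth boxes is an
    # assumption about the MOT16 test set, not a figure from the paper.
    score = mota(fp=8934, fn=48305, idsw=1897, num_gt=182326)
    print(f"MOTA = {100 * score:.1f}%")  # prints "MOTA = 67.6%"
```

Under that assumption the computed value agrees with the MOTA reported for CTracker, which suggests the tabulated FP, FN and IDSW counts are absolute totals over the whole test set. MOTA can become negative when the total number of errors exceeds the number of ground-truth boxes.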

Dataset	Sequence	Length (frames)	Tracks	FPS	Camera	Viewpoint	Density	Weather
MOT2015	TUD-Crossing	201	13	25	static	medium	5.5	cloudy
	PETS2009-S2L2	436	42	7	static	high	22.1	cloudy
	ETH-Crossing	219	26	14	moving	low	4.9	cloudy
	ADL-Rundle-1	500	32	30	moving	medium	18.6	sunny
	KITTI-16	209	17	10	static	medium	8.1	sunny
MOT2016	MOT16-01	450	23	30	static	medium	14.2	cloudy
	MOT16-03	1500	148	30	static	high	69.7	night
	MOT16-06	1194	221	14	moving	low	9.7	sunny
	MOT16-12	900	86	30	moving	medium	9.2	indoor
MOT2017	MOT17-01	450	24	30	static	medium	14.3	sunny
	MOT17-03	1500	148	30	static	high	69.8	night
	MOT17-06	1194	222	14	moving	medium	9.9	sunny

Method	MOTA	MOTP	FP	FN	IDSW
RNN_LSTM^[44]	19.0	71.0	11 578	36 706	1 490
MARLMOT^[41]	27.7	72.5	6 092	21 976	767
SiameseCNN^[34]	29.0	71.2	5 160	37 798	639
MHT_DAM^[23]	32.4	71.8	9 064	32 060	435
QuadMOT^[3]	33.8	73.4	7 898	32 061	703
STAM^[2]	34.3	70.5	5 154	34 848	348
RAN^[4]	35.1	70.9	6 771	32 717	381
AMIR^[39]	37.6	71.7	7 933	29 397	1 026
AP_HWDPL^[27]	38.5	72.6	4 005	33 203	586
MPN^[30]	51.5	76.0	7 620	21 780	375

Method	MOTA	MOTP	FP	FN	IDSW
DAN^[18]	40.8	74.4	15 143	91 792	1 051
MHT_bLSTM^[5]	42.1	75.9	11 637	93 172	753
QuadMOT^[3]	44.1	76.4	6 388	94 775	745
MHT_DAM^[23]	45.8	76.3	6 412	91 758	590
STAM^[2]	46.0	74.9	6 895	91 117	473
DMAN^[35]	46.1	73.8	7 909	89 874	532
AMIR^[39]	47.2	75.8	2 681	92 856	774
LMP^[36]	48.8	79.0	6 654	86 245	481
MPN^[30]	58.6	78.9	4 949	70 252	354
DeepSORT^[24]	61.4	79.1	12 852	56 668	781
NSH^[17]	63.9	78.5	9 829	55 000	913
POI^[12]	66.1	79.5	5 061	55 914	805
CTracker^[28]	67.6	78.4	8 934	48 305	1 897
