Postprocessing of human pose heatmap based on sub-pixel location

doi:10.13229/j.cnki.jdxbgxb.20230268

Abstract

To improve the accuracy of joint-point prediction from heatmaps, this paper proposes a postprocessing method for human pose heatmaps based on sub-pixel localization. The method comprises two strategies. The first applies a sub-pixel shift to the heatmap of the flipped image, which eliminates its misalignment with the heatmap of the original image. The second decodes the heatmap by fitting a surface to a local region, which localizes the joint points with sub-pixel precision. The postprocessing method is independent of the network model and can be applied to existing heatmap-based human pose estimation models without any modification. To verify its effectiveness, experiments were carried out on two publicly available datasets, COCO2017 and MPII. On COCO2017, the average precision improves by 0.9 and 1.1 for the HRNet-W32-256×192 and Simple Baseline-W32-256×192 models, respectively.
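
As a rough illustration of the second strategy, the sketch below refines a heatmap peak to sub-pixel precision by fitting a quadratic surface to the logarithm of the 3×3 region around the maximum (a Gaussian is exactly quadratic in log space). This is one plausible reading of "local region surface fitting" and of the "Gaussian fitting" keyword, not the paper's published procedure; the function name `decode_subpixel`, the region radius, and the fallback rules are assumptions made for the example.

```python
import numpy as np


def decode_subpixel(heatmap, radius=1, eps=1e-10):
    """Return (x, y) of the heatmap peak, refined by a local quadratic fit.

    Hypothetical helper: fits log(heatmap) over a (2*radius+1)^2 window
    around the integer argmax and returns the stationary point of the fit.
    """
    h, w = heatmap.shape
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)

    # No full neighbourhood at the border: fall back to the integer peak.
    if px < radius or py < radius or px >= w - radius or py >= h - radius:
        return float(px), float(py)

    # Local coordinates centred on the integer peak (rows = y, columns = x).
    dy, dx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    patch = heatmap[py - radius:py + radius + 1, px - radius:px + radius + 1]
    z = np.log(np.maximum(patch, eps)).ravel()
    x = dx.ravel().astype(float)
    y = dy.ravel().astype(float)

    # Least-squares fit of z ~ a*x^2 + b*y^2 + c*x*y + d*x + e*y + g.
    design = np.stack([x * x, y * y, x * y, x, y, np.ones_like(x)], axis=1)
    a, b, c, d, e, _ = np.linalg.lstsq(design, z, rcond=None)[0]

    # The fitted surface has a maximum only if its Hessian is negative definite.
    hessian = np.array([[2.0 * a, c], [c, 2.0 * b]])
    if a >= 0 or np.linalg.det(hessian) <= 0:
        return float(px), float(py)

    # Stationary point: hessian @ offset = -gradient at the window centre.
    offset = np.linalg.solve(hessian, -np.array([d, e]))
    if np.any(np.abs(offset) > radius):
        return float(px), float(py)  # implausible jump, keep the integer peak
    return float(px + offset[0]), float(py + offset[1])
```

On a synthetic Gaussian heatmap centred at (10.3, 7.6), this fit recovers the centre almost exactly, whereas the integer argmax alone returns (10, 8).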

Key words: computer vision, human pose estimation, heatmap postprocessing, Gaussian fitting, heatmap decoding

CLC Number: TP391.4

Yu WANG, Kai ZHAO. Postprocessing of human pose heatmap based on sub-pixel location[J]. Journal of Jilin University (Engineering and Technology Edition), 2024, 54(5): 1385-1392.

Table 1 Experimental results on the COCO dataset

Network	Model	Resolution	Decoding method	AP	AP.5	AP.75	AP(M)	AP(L)	AR	AR.5	AR.75	AR(M)	AR(L)
HRNet	C-32	256×192	Standard shift	74.4	90.5	81.9	70.8	81.0	79.8	94.2	86.5	75.7	85.8
			SSFH	74.7	90.6	82.0	71.0	81.6	80.0	94.3	86.6	75.7	86.3
			DARK	74.8	90.4	82.0	71.4	81.6	80.2	94.1	86.7	76.1	86.2
			LRSF	75.0	90.5	82.0	71.4	81.7	80.2	94.1	86.7	76.1	86.2
			SSFH+LRSF	75.3	90.6	82.2	71.6	82.3	80.5	94.3	86.7	76.3	86.7
		384×288	Standard shift	75.8	90.6	82.5	72.0	82.7	80.9	94.3	86.9	76.7	87.1
			SSFH	75.9	90.6	82.8	72.0	82.9	81.0	94.3	87.1	76.6	87.4
			DARK	75.9	90.6	82.4	72.1	82.9	81.1	94.3	87.0	76.7	87.4
			LRSF	76.0	90.6	82.5	72.2	83.0	81.1	94.3	87.0	76.9	87.3
			SSFH+LRSF	76.2	90.6	82.8	72.3	83.2	81.2	94.3	87.1	76.8	87.6
	C-48	256×192	Standard shift	75.1	90.6	82.2	71.5	81.8	80.4	94.3	86.7	76.2	86.4
			SSFH	75.3	90.6	82.4	71.3	82.4	80.5	94.2	86.6	76.1	86.9
			DARK	75.5	90.5	82.5	71.9	82.3	80.6	94.2	86.7	76.4	87.0
			LRSF	75.6	90.6	82.5	71.9	82.4	80.7	94.2	86.8	76.5	86.9
			SSFH+LRSF	75.9	90.7	82.5	72.0	82.9	81.0	94.2	86.7	76.6	87.3
		384×288	Standard shift	76.3	90.8	82.9	72.3	83.4	81.2	94.2	87.1	76.7	87.6
			SSFH	76.4	90.8	83.1	72.3	83.6	81.3	94.2	87.3	76.8	87.8
			DARK	76.5	90.8	82.9	72.4	83.5	81.3	94.2	87.0	76.8	87.7
			LRSF	76.5	90.8	82.9	72.5	83.6	81.3	94.2	87.0	76.9	87.7
			SSFH+LRSF	76.6	90.8	83.1	72.5	83.9	81.5	94.2	87.3	76.9	88.0
Simple Baseline	R-50	256×192	Standard shift	70.4	88.6	78.3	67.1	77.2	76.3	92.9	83.4	72.1	82.4
			SSFH	70.7	88.6	78.1	67.3	77.6	76.6	92.9	83.2	72.2	72.2
			DARK	71.0	88.6	78.5	67.7	77.9	76.7	93.0	83.3	72.5	83.0
			LRSF	71.2	88.6	78.6	67.8	78.1	76.8	93.0	83.4	72.6	83.0
			SSFH+LRSF	71.5	88.6	78.7	67.9	78.3	77.2	93.1	83.5	72.7	83.4
		384×288	Standard shift	72.2	89.3	78.9	68.1	79.7	77.6	93.2	83.8	72.8	84.6
			SSFH	72.3	89.4	78.9	68.1	80.1	77.7	93.3	83.6	72.9	84.8
			DARK	72.4	89.3	79.0	68.2	80.1	77.8	93.2	83.8	73.0	84.8
			LRSF	72.5	89.3	79.1	68.4	80.1	77.9	93.2	83.9	73.0	84.8
			SSFH+LRSF	72.6	89.4	79.1	68.4	80.4	78.0	93.2	83.8	73.1	85.1
	R-101	256×192	Standard shift	71.4	89.3	79.3	68.1	78.1	77.1	93.4	84.0	73.0	83.2
			SSFH	71.7	89.3	79.4	68.2	78.7	77.3	93.3	84.1	73.0	83.6
			DARK	71.9	89.3	79.5	68.9	78.6	77.6	93.5	84.1	73.5	83.7
			LRSF	72.0	89.3	79.6	68.9	78.7	77.7	93.5	84.2	73.6	83.7
			SSFH+LRSF	72.3	89.3	79.5	68.8	79.4	77.9	93.3	84.2	73.6	84.2
		384×288	Standard shift	73.6	89.6	80.3	69.9	81.1	79.1	93.6	85.1	74.5	85.8
			SSFH	73.8	89.5	80.6	69.7	81.4	79.0	93.2	85.3	74.3	85.9
			DARK	73.9	89.4	80.6	69.9	81.4	79.1	93.2	85.3	74.4	85.9
			LRSF	73.9	73.9	80.7	70.0	81.5	79.2	93.3	85.3	74.5	86.0
			SSFH+LRSF	74.1	89.4	80.8	70.0	81.8	79.3	93.3	85.4	74.5	86.2
	R-152	256×192	Standard shift	72.0	89.3	79.8	68.7	78.9	77.8	93.4	84.6	73.6	83.9
			SSFH	72.4	89.4	79.7	68.9	79.5	78.0	93.5	84.5	73.7	84.3
			DARK	72.6	89.3	80.0	69.4	79.7	78.3	93.4	84.9	74.1	84.4
			LRSF	72.7	89.3	80.0	69.4	79.7	78.3	93.4	84.9	74.1	84.4
			SSFH+LRSF	72.9	89.3	80.0	69.4	80.2	78.5	93.4	84.8	74.2	84.8
		384×288	Standard shift	74.3	89.6	81.1	70.5	81.6	79.7	93.7	85.8	75.1	86.3
			SSFH	74.4	89.6	81.3	81.3	81.9	79.8	93.7	85.9	75.1	86.5
			DARK	74.6	89.6	81.4	70.9	82.0	79.8	93.6	86.0	75.2	86.5
			LRSF	74.6	89.6	81.5	70.8	82.0	79.9	93.6	86.0	75.3	86.5
			SSFH+LRSF	74.7	89.6	81.5	70.7	82.3	80.0	93.6	86.0	75.3	86.7
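
The AP gains quoted in the abstract can be checked against the table with a few lines of Python. The dictionary below transcribes only the AP column of the baseline decoding row and the SSFH+LRSF row for each network, model, and resolution; the variable names are illustrative and nothing here comes from the authors' code.

```python
# AP values copied from Table 1: (baseline decoding, SSFH+LRSF).
ap = {
    ("HRNet", "C-32", "256×192"): (74.4, 75.3),
    ("HRNet", "C-32", "384×288"): (75.8, 76.2),
    ("HRNet", "C-48", "256×192"): (75.1, 75.9),
    ("HRNet", "C-48", "384×288"): (76.3, 76.6),
    ("Simple Baseline", "R-50", "256×192"): (70.4, 71.5),
    ("Simple Baseline", "R-50", "384×288"): (72.2, 72.6),
    ("Simple Baseline", "R-101", "256×192"): (71.4, 72.3),
    ("Simple Baseline", "R-101", "384×288"): (73.6, 74.1),
    ("Simple Baseline", "R-152", "256×192"): (72.0, 72.9),
    ("Simple Baseline", "R-152", "384×288"): (74.3, 74.7),
}

# AP gain of SSFH+LRSF over the baseline, rounded to the table's precision.
gains = {key: round(new - base, 1) for key, (base, new) in ap.items()}


def mean_gain(resolution):
    """Average AP gain over all models evaluated at one input resolution."""
    values = [g for (_, _, res), g in gains.items() if res == resolution]
    return sum(values) / len(values)


for key, gain in gains.items():
    print(*key, f"+{gain:.1f} AP")
print(f"mean gain at 256×192: {mean_gain('256×192'):.2f}")
print(f"mean gain at 384×288: {mean_gain('384×288'):.2f}")
```

The 0.9 and 1.1 figures correspond to the HRNet C-32 and Simple Baseline R-50 rows at 256×192. Averaged over the five models, the gain is about 0.9 AP at 256×192 and 0.4 AP at 384×288, so the postprocessing helps more at the lower input resolution.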
