Journal of Jilin University (Engineering and Technology Edition) ›› 2024, Vol. 54 ›› Issue (5): 1385-1392. doi: 10.13229/j.cnki.jdxbgxb.20230268

Postprocessing of human pose heatmap based on sub-pixel location

Yu WANG¹, Kai ZHAO¹,²

  1. School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022, China
  2. School of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin 150022, China
  • Received: 2023-03-26 Online: 2024-05-01 Published: 2024-06-11

Abstract:

To improve the accuracy of joint locations predicted from heatmaps, this paper proposes a postprocessing method for human pose heatmaps based on sub-pixel localization. The method comprises two strategies. The first applies a sub-pixel shift to the heatmap of the flipped image, which eliminates its misalignment with the heatmap of the original image. The second decodes the heatmap by fitting a surface to a local region, which localizes the joint points with sub-pixel precision. The postprocessing method is independent of the network model and can be applied to existing heatmap-based human pose estimation models without modifying them. Its effectiveness was verified on two publicly available datasets, COCO2017 and MPII. On COCO2017, the average precision improves by 0.9 and 1.1 for the HRNet-W32-256×192 and Simple Baseline-W32-256×192 models, respectively.
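The first strategy in the abstract can be illustrated with a minimal sketch. The abstract does not give the size or direction of the shift, the interpolation scheme, or the fusion rule, so everything below is an assumption: the function names `shift_x` and `fuse_with_flipped` are hypothetical, the shift is a rightward move of `shift` pixels by linear interpolation, and the two heatmaps are fused by plain averaging.

```python
import numpy as np


def shift_x(heatmaps, shift):
    """Shift heatmaps of shape (K, H, W) to the right by `shift` pixels.

    Assumption: 0 <= shift <= 1, implemented as linear interpolation
    between each pixel and its left neighbour.
    """
    if not 0.0 <= shift <= 1.0:
        raise ValueError("shift must lie in [0, 1]")
    # Left neighbour of every column; the first column is replicated at the border.
    left = np.concatenate([heatmaps[..., :1], heatmaps[..., :-1]], axis=-1)
    return (1.0 - shift) * heatmaps + shift * left


def fuse_with_flipped(orig, flipped, flip_pairs, shift):
    """Average original-image heatmaps with realigned flipped-image heatmaps.

    orig, flipped: arrays of shape (K, H, W); `flipped` was predicted on the
    horizontally flipped image. flip_pairs lists (left, right) joint indices.
    """
    restored = flipped[..., ::-1].copy()       # undo the horizontal flip
    for left_idx, right_idx in flip_pairs:     # swap left/right joint channels
        restored[[left_idx, right_idx]] = restored[[right_idx, left_idx]]
    restored = shift_x(restored, shift)        # sub-pixel realignment (assumed form)
    return 0.5 * (orig + restored)
```

With `shift=1.0` this reduces to a whole-pixel shift of the flipped heatmap; a fractional value stands in for the sub-pixel correction, whose actual magnitude would have to come from the paper.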

Key words: computer vision, human pose estimation, heatmap postprocessing, Gaussian fitting, heatmap decoding

CLC Number: TP391.4

Fig.1 Heatmap postprocessing flow

Fig.2 Decoding effect of local region surface fitting
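Decoding by local region surface fitting can be sketched as follows. The surface model and region size used in the paper are not stated here, so this sketch assumes a quadratic surface fitted by least squares to the logarithm of a 3×3 neighbourhood around the integer peak (consistent with a Gaussian-shaped response). The function name `decode_subpixel` is hypothetical.

```python
import numpy as np


def decode_subpixel(heatmap, eps=1e-10):
    """Return a sub-pixel (x, y) peak estimate for a single 2-D heatmap."""
    h, w = heatmap.shape
    y0, x0 = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Without a full 3x3 neighbourhood, fall back to the integer peak.
    if not (1 <= x0 < w - 1 and 1 <= y0 < h - 1):
        return float(x0), float(y0)

    # The log of a Gaussian is exactly quadratic, so fit in the log domain.
    patch = np.log(np.maximum(heatmap[y0 - 1:y0 + 2, x0 - 1:x0 + 2], eps))
    dy, dx = np.mgrid[-1:2, -1:2]
    dx = dx.ravel().astype(float)
    dy = dy.ravel().astype(float)

    # Model: f = a*dx^2 + b*dy^2 + c*dx*dy + d*dx + e*dy + g
    design = np.stack([dx**2, dy**2, dx * dy, dx, dy, np.ones(9)], axis=1)
    coef, *_ = np.linalg.lstsq(design, patch.ravel(), rcond=None)
    a, b, c, d, e, _ = coef

    # A maximum requires a negative-definite Hessian [[2a, c], [c, 2b]].
    if not (a < 0 and 4 * a * b - c * c > 0):
        return float(x0), float(y0)

    hessian = np.array([[2 * a, c], [c, 2 * b]])
    offset = np.linalg.solve(hessian, -np.array([d, e]))
    offset = np.clip(offset, -1.0, 1.0)  # keep the estimate inside the patch
    return float(x0 + offset[0]), float(y0 + offset[1])
```

On a noise-free Gaussian the fit recovers the true centre up to floating-point error; on real network outputs the result depends on how closely the response follows a Gaussian.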

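Table 1 Experimental results on the COCO dataset

| Network model | Backbone | Resolution | Decoding method | AP | AP.5 | AP.75 | AP(M) | AP(L) | AR | AR.5 | AR.75 | AR(M) | AR(L) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HRNet | C-32 | 256×192 | Standard shift | 74.4 | 90.5 | 81.9 | 70.8 | 81.0 | 79.8 | 94.2 | 86.5 | 75.7 | 85.8 |
| HRNet | C-32 | 256×192 | SSFH | 74.7 | 90.6 | 82.0 | 71.0 | 81.6 | 80.0 | 94.3 | 86.6 | 75.7 | 86.3 |
| HRNet | C-32 | 256×192 | DARK | 74.8 | 90.4 | 82.0 | 71.4 | 81.6 | 80.2 | 94.1 | 86.7 | 76.1 | 86.2 |
| HRNet | C-32 | 256×192 | LRSF | 75.0 | 90.5 | 82.0 | 71.4 | 81.7 | 80.2 | 94.1 | 86.7 | 76.1 | 86.2 |
| HRNet | C-32 | 256×192 | SSFH+LRSF | 75.3 | 90.6 | 82.2 | 71.6 | 82.3 | 80.5 | 94.3 | 86.7 | 76.3 | 86.7 |
| HRNet | C-32 | 384×288 | Standard shift | 75.8 | 90.6 | 82.5 | 72.0 | 82.7 | 80.9 | 94.3 | 86.9 | 76.7 | 87.1 |
| HRNet | C-32 | 384×288 | SSFH | 75.9 | 90.6 | 82.8 | 72.0 | 82.9 | 81.0 | 94.3 | 87.1 | 76.6 | 87.4 |
| HRNet | C-32 | 384×288 | DARK | 75.9 | 90.6 | 82.4 | 72.1 | 82.9 | 81.1 | 94.3 | 87.0 | 76.7 | 87.4 |
| HRNet | C-32 | 384×288 | LRSF | 76.0 | 90.6 | 82.5 | 72.2 | 83.0 | 81.1 | 94.3 | 87.0 | 76.9 | 87.3 |
| HRNet | C-32 | 384×288 | SSFH+LRSF | 76.2 | 90.6 | 82.8 | 72.3 | 83.2 | 81.2 | 94.3 | 87.1 | 76.8 | 87.6 |
| HRNet | C-48 | 256×192 | Standard shift | 75.1 | 90.6 | 82.2 | 71.5 | 81.8 | 80.4 | 94.3 | 86.7 | 76.2 | 86.4 |
| HRNet | C-48 | 256×192 | SSFH | 75.3 | 90.6 | 82.4 | 71.3 | 82.4 | 80.5 | 94.2 | 86.6 | 76.1 | 86.9 |
| HRNet | C-48 | 256×192 | DARK | 75.5 | 90.5 | 82.5 | 71.9 | 82.3 | 80.6 | 94.2 | 86.7 | 76.4 | 87.0 |
| HRNet | C-48 | 256×192 | LRSF | 75.6 | 90.6 | 82.5 | 71.9 | 82.4 | 80.7 | 94.2 | 86.8 | 76.5 | 86.9 |
| HRNet | C-48 | 256×192 | SSFH+LRSF | 75.9 | 90.7 | 82.5 | 72.0 | 82.9 | 81.0 | 94.2 | 86.7 | 76.6 | 87.3 |
| HRNet | C-48 | 384×288 | Standard shift | 76.3 | 90.8 | 82.9 | 72.3 | 83.4 | 81.2 | 94.2 | 87.1 | 76.7 | 87.6 |
| HRNet | C-48 | 384×288 | SSFH | 76.4 | 90.8 | 83.1 | 72.3 | 83.6 | 81.3 | 94.2 | 87.3 | 76.8 | 87.8 |
| HRNet | C-48 | 384×288 | DARK | 76.5 | 90.8 | 82.9 | 72.4 | 83.5 | 81.3 | 94.2 | 87.0 | 76.8 | 87.7 |
| HRNet | C-48 | 384×288 | LRSF | 76.5 | 90.8 | 82.9 | 72.5 | 83.6 | 81.3 | 94.2 | 87.0 | 76.9 | 87.7 |
| HRNet | C-48 | 384×288 | SSFH+LRSF | 76.6 | 90.8 | 83.1 | 72.5 | 83.9 | 81.5 | 94.2 | 87.3 | 76.9 | 88.0 |
| Simple Baseline | R-50 | 256×192 | Standard shift | 70.4 | 88.6 | 78.3 | 67.1 | 77.2 | 76.3 | 92.9 | 83.4 | 72.1 | 82.4 |
| Simple Baseline | R-50 | 256×192 | SSFH | 70.7 | 88.6 | 78.1 | 67.3 | 77.6 | 76.6 | 92.9 | 83.2 | 72.2 | 72.2 |
| Simple Baseline | R-50 | 256×192 | DARK | 71.0 | 88.6 | 78.5 | 67.7 | 77.9 | 76.7 | 93.0 | 83.3 | 72.5 | 83.0 |
| Simple Baseline | R-50 | 256×192 | LRSF | 71.2 | 88.6 | 78.6 | 67.8 | 78.1 | 76.8 | 93.0 | 83.4 | 72.6 | 83.0 |
| Simple Baseline | R-50 | 256×192 | SSFH+LRSF | 71.5 | 88.6 | 78.7 | 67.9 | 78.3 | 77.2 | 93.1 | 83.5 | 72.7 | 83.4 |
| Simple Baseline | R-50 | 384×288 | Standard shift | 72.2 | 89.3 | 78.9 | 68.1 | 79.7 | 77.6 | 93.2 | 83.8 | 72.8 | 84.6 |
| Simple Baseline | R-50 | 384×288 | SSFH | 72.3 | 89.4 | 78.9 | 68.1 | 80.1 | 77.7 | 93.3 | 83.6 | 72.9 | 84.8 |
| Simple Baseline | R-50 | 384×288 | DARK | 72.4 | 89.3 | 79.0 | 68.2 | 80.1 | 77.8 | 93.2 | 83.8 | 73.0 | 84.8 |
| Simple Baseline | R-50 | 384×288 | LRSF | 72.5 | 89.3 | 79.1 | 68.4 | 80.1 | 77.9 | 93.2 | 83.9 | 73.0 | 84.8 |
| Simple Baseline | R-50 | 384×288 | SSFH+LRSF | 72.6 | 89.4 | 79.1 | 68.4 | 80.4 | 78.0 | 93.2 | 83.8 | 73.1 | 85.1 |
| Simple Baseline | R-101 | 256×192 | Standard shift | 71.4 | 89.3 | 79.3 | 68.1 | 78.1 | 77.1 | 93.4 | 84.0 | 73.0 | 83.2 |
| Simple Baseline | R-101 | 256×192 | SSFH | 71.7 | 89.3 | 79.4 | 68.2 | 78.7 | 77.3 | 93.3 | 84.1 | 73.0 | 83.6 |
| Simple Baseline | R-101 | 256×192 | DARK | 71.9 | 89.3 | 79.5 | 68.9 | 78.6 | 77.6 | 93.5 | 84.1 | 73.5 | 83.7 |
| Simple Baseline | R-101 | 256×192 | LRSF | 72.0 | 89.3 | 79.6 | 68.9 | 78.7 | 77.7 | 93.5 | 84.2 | 73.6 | 83.7 |
| Simple Baseline | R-101 | 256×192 | SSFH+LRSF | 72.3 | 89.3 | 79.5 | 68.8 | 79.4 | 77.9 | 93.3 | 84.2 | 73.6 | 84.2 |
| Simple Baseline | R-101 | 384×288 | Standard shift | 73.6 | 89.6 | 80.3 | 69.9 | 81.1 | 79.1 | 93.6 | 85.1 | 74.5 | 85.8 |
| Simple Baseline | R-101 | 384×288 | SSFH | 73.8 | 89.5 | 80.6 | 69.7 | 81.4 | 79.0 | 93.2 | 85.3 | 74.3 | 85.9 |
| Simple Baseline | R-101 | 384×288 | DARK | 73.9 | 89.4 | 80.6 | 69.9 | 81.4 | 79.1 | 93.2 | 85.3 | 74.4 | 85.9 |
| Simple Baseline | R-101 | 384×288 | LRSF | 73.9 | 73.9 | 80.7 | 70.0 | 81.5 | 79.2 | 93.3 | 85.3 | 74.5 | 86.0 |
| Simple Baseline | R-101 | 384×288 | SSFH+LRSF | 74.1 | 89.4 | 80.8 | 70.0 | 81.8 | 79.3 | 93.3 | 85.4 | 74.5 | 86.2 |
| Simple Baseline | R-152 | 256×192 | Standard shift | 72.0 | 89.3 | 79.8 | 68.7 | 78.9 | 77.8 | 93.4 | 84.6 | 73.6 | 83.9 |
| Simple Baseline | R-152 | 256×192 | SSFH | 72.4 | 89.4 | 79.7 | 68.9 | 79.5 | 78.0 | 93.5 | 84.5 | 73.7 | 84.3 |
| Simple Baseline | R-152 | 256×192 | DARK | 72.6 | 89.3 | 80.0 | 69.4 | 79.7 | 78.3 | 93.4 | 84.9 | 74.1 | 84.4 |
| Simple Baseline | R-152 | 256×192 | LRSF | 72.7 | 89.3 | 80.0 | 69.4 | 79.7 | 78.3 | 93.4 | 84.9 | 74.1 | 84.4 |
| Simple Baseline | R-152 | 256×192 | SSFH+LRSF | 72.9 | 89.3 | 80.0 | 69.4 | 80.2 | 78.5 | 93.4 | 84.8 | 74.2 | 84.8 |
| Simple Baseline | R-152 | 384×288 | Standard shift | 74.3 | 89.6 | 81.1 | 70.5 | 81.6 | 79.7 | 93.7 | 85.8 | 75.1 | 86.3 |
| Simple Baseline | R-152 | 384×288 | SSFH | 74.4 | 89.6 | 81.3 | 81.3 | 81.9 | 79.8 | 93.7 | 85.9 | 75.1 | 86.5 |
| Simple Baseline | R-152 | 384×288 | DARK | 74.6 | 89.6 | 81.4 | 70.9 | 82.0 | 79.8 | 93.6 | 86.0 | 75.2 | 86.5 |
| Simple Baseline | R-152 | 384×288 | LRSF | 74.6 | 89.6 | 81.5 | 70.8 | 82.0 | 79.9 | 93.6 | 86.0 | 75.3 | 86.5 |
| Simple Baseline | R-152 | 384×288 | SSFH+LRSF | 74.7 | 89.6 | 81.5 | 70.7 | 82.3 | 80.0 | 93.6 | 86.0 | 75.3 | 86.7 |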
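The AP gains of 0.9 and 1.1 quoted in the abstract match the differences between the SSFH+LRSF rows and the standard-shift (标准偏移) rows of Table 1 for HRNet C-32 and Simple Baseline R-50 at 256×192. A short check, with the four AP values copied from Table 1 and the dictionary layout and the helper `ap_gain` introduced only for illustration:

```python
# AP values copied from Table 1 (input resolution 256x192).
table1_ap = {
    ("HRNet C-32", "standard shift"): 74.4,
    ("HRNet C-32", "SSFH+LRSF"): 75.3,
    ("Simple Baseline R-50", "standard shift"): 70.4,
    ("Simple Baseline R-50", "SSFH+LRSF"): 71.5,
}


def ap_gain(model):
    """AP of SSFH+LRSF minus AP of the standard-shift baseline, to one decimal."""
    gain = table1_ap[(model, "SSFH+LRSF")] - table1_ap[(model, "standard shift")]
    return round(gain, 1)


for name in ("HRNet C-32", "Simple Baseline R-50"):
    print(name, ap_gain(name))
```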

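Table 2 Experimental results on the MPII dataset

| Decoding method | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean | PCKh@0.1 |
|---|---|---|---|---|---|---|---|---|---|
| Standard shift | 97.1 | 95.9 | 90.3 | 86.4 | 89.1 | 87.1 | 83.3 | 90.3 | 37.7 |
| SSFH | 97.1 | 96.1 | 90.3 | 86.5 | 89.1 | 87.0 | 83.3 | 90.4 | 38.9 |
| DARK | 97.2 | 96.0 | 90.4 | 86.5 | 89.3 | 87.1 | 83.3 | 90.5 | 40.3 |
| LRSF | 97.2 | 96.0 | 90.5 | 86.5 | 89.5 | 87.1 | 83.3 | 90.5 | 41.3 |
| SSFH+LRSF | 97.2 | 96.1 | 90.6 | 86.5 | 89.3 | 87.0 | 83.4 | 90.5 | 41.5 |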

Fig.3 Comparison of speed and accuracy of different heatmap postprocessing methods

Fig.4 Visualization analysis
