Journal of Jilin University (Engineering and Technology Edition), 2024, Vol. 54, Issue (1): 251-258. doi: 10.13229/j.cnki.jdxbgxb.20220280


3D human joint point recognition based on weakly supervised migration network

Zhi-yong SUN1, Hong-you LI2, Jun-yong YE1

  1. Key Laboratory of Optoelectronic Technology of the Ministry of Education, Chongqing University, Chongqing 400044, China
  2. Department of Basic Education, Chongqing Police College, Chongqing 401331, China
  • Received: 2022-03-21 Online: 2024-01-30 Published: 2024-03-28
  • Contact: Jun-yong YE E-mail: sunzhiyong@cqu.edu.cn; ygyocr@cqu.edu.cn

Abstract:

To address the lack of depth information and the incomplete spatial structure of actions and poses in 2D images, a 3D human joint point recognition method based on a weakly supervised migration network is proposed. First, an end-to-end 3D human pose estimation framework for real images is built, in which the deep neural network is trained on a mix of images with 2D labels and images with 3D labels. A depth regression module is added to the 2D human pose recognition sub-network to resolve the depth ambiguity in 3D human pose recognition. Second, 3D geometric constraints are introduced in the 3D human pose recognition sub-network to regularize the predicted pose. When no real depth label is available, the network can still learn depth features, and it handles human pose recognition under occlusion effectively. On the Human3.6M and MPII datasets, the average joint point prediction error is lower than that of the compared methods, giving better 3D human pose recognition results.
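
The abstract does not state the form of the 3D geometric constraint or of the training loss. As a rough illustration only, the sketch below assumes one common choice: bones in a symmetric group (for example the left and right upper arm) should have the same ratio to a canonical length, so the variance of those ratios is penalized when an image carries no depth label. The names `geometric_loss`, `depth_loss`, `bone_groups`, `canonical` and the default `weight` are hypothetical and are not taken from the paper.

```python
import math


def bone_length(pose, a, b):
    """Euclidean length of the bone joining joints a and b of a 3D pose."""
    return math.dist(pose[a], pose[b])


def geometric_loss(pose, bone_groups, canonical):
    """Sum over groups of the variance of the ratios length / canonical length.

    Assumption: bones in the same group should share one ratio, so a
    plausible skeleton yields a value close to zero.
    """
    total = 0.0
    for group in bone_groups:
        ratios = [bone_length(pose, a, b) / canonical[(a, b)] for a, b in group]
        mean = sum(ratios) / len(ratios)
        total += sum((r - mean) ** 2 for r in ratios) / len(ratios)
    return total


def depth_loss(pose, true_depth, bone_groups, canonical, weight=0.01):
    """Mixed-label loss for one image (hypothetical form).

    With a 3D label: squared error on the depth (z) of each joint.
    With a 2D label only (true_depth is None): weighted geometric constraint.
    """
    if true_depth is not None:
        return sum((joint[2] - z) ** 2 for joint, z in zip(pose, true_depth))
    return weight * geometric_loss(pose, bone_groups, canonical)
```

In this sketch a batch can mix both kinds of images: each sample contributes whichever branch its label allows, which is the sense in which the supervision is weak for the images that have only 2D labels.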

Key words: migration network, pose recognition, 3D joint points, geometric constraints, depth regression

CLC Number: 

  • TP391

Fig.1

Schematic illustration of deep neural network trained by experimental images with 3D labels (right) and real images with 2D labels (left)

Fig.2

Architecture of 3D HPE network based on weakly supervised transfer

Fig.3

Hourglass network structure

Table 1

2D joint accuracy results on Human3.6M dataset

Method   Ref. [10]  Ref. [11]  Ref. [24]  Ref. [25]  Without constraint  With constraint
acc/%    90.01      90.57      90.91      91.62      87.69               90.93

Fig.4

Results of 2D and 3D human pose estimation from various actions

Table 2

3D joint MPJPE results on Human3.6M dataset

Method              Sitting  Smoking  Walking  Discussion  Eating  Photo   Greeting  Phoning  Purchases  Waiting  WalkDog  Average
Ref. [10]           133.14   106.65   114.05   97.57       89.98   139.17  107.87    107.31   136.09     106.21   87.03    114.18
Ref. [11]           110.19   84.95    71.36    73.47       76.82   110.67  86.43     86.28    74.79      85.78    86.26    88.39
Ref. [24]           124.52   107.42   79.36    109.31      87.05   143.32  103.16    116.18   99.78      118.09   114.23   79.9
Ref. [25]           96.19    70.82    82.03    69.74       60.55   85.42   68.77     76.36    75.04      68.45    54.41    74.14
Without constraint  74.79    64.34    63.97    61.16       58.12   67.29   71.75     62.54    56.38      68.78    52.22    63.76
With constraint     75.20    64.15    63.22    60.70       58.22   65.53   71.41     62.03    55.58      66.05    51.43    63.05
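
Table 2 reports MPJPE (mean per joint position error), which is the average Euclidean distance between each predicted joint and its ground-truth position. A minimal sketch of that metric follows. It assumes both poses are already in the same coordinate frame and applies no root or rigid alignment, since this page does not say which evaluation protocol was used. The function name `mpjpe` is illustrative.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error between two poses.

    pred and target are sequences of (x, y, z) joints in the same order and
    the same units. The result is in those units.
    """
    if len(pred) != len(target):
        raise ValueError("poses must have the same number of joints")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Two joints: the first is off by a 3-4-5 triangle, the second is exact.
print(mpjpe([(0, 0, 0), (0, 0, 0)], [(3, 4, 0), (0, 0, 0)]))  # 2.5
```

A per-action score such as those in the table would then be the mean of this value over all test frames of that action.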

Table 3

Evaluation of left-right symmetry with and without constraint on MPII dataset

            Without constraint  With constraint
Upper arm   42.4 mm             37.8 mm
Forearm     60.4 mm             50.7 mm
Upper leg   43.5 mm             43.4 mm
Lower leg   59.4 mm             47.8 mm
Upper arm   6.27 px             4.80 px
Forearm     10.11 px            6.64 px
Upper leg   6.89 px             4.93 px
Lower leg   8.03 px             6.22 px
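
The page does not give the formula behind the symmetry values in Table 3. One plausible reading, sketched here as an assumption, is the mean absolute difference between the length of a bone on the left side and its counterpart on the right side over a set of predicted poses. The function name `symmetry_error` and the joint index pairs are hypothetical.

```python
import math


def symmetry_error(poses, left_bone, right_bone):
    """Mean absolute left-right bone length difference over predicted poses.

    left_bone and right_bone are (joint_a, joint_b) index pairs. Joints may be
    2D (pixel units) or 3D (for example mm); the result uses the same unit.
    """
    diffs = []
    for pose in poses:
        left = math.dist(pose[left_bone[0]], pose[left_bone[1]])
        right = math.dist(pose[right_bone[0]], pose[right_bone[1]])
        diffs.append(abs(left - right))
    return sum(diffs) / len(diffs)


# Joints 0-1 form the left bone and joints 2-3 the right bone in this toy data.
balanced = [(0, 0, 0), (1, 0, 0), (0, 0, 0), (0, 1, 0)]   # lengths 1 and 1
skewed = [(0, 0, 0), (3, 0, 0), (0, 0, 0), (3, 4, 0)]     # lengths 3 and 5
print(symmetry_error([balanced, skewed], (0, 1), (2, 3)))  # 1.0
```

Under this reading, a smaller value means the predicted skeleton is more symmetric.
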

References:

1 Insafutdinov E, Pishchulin L, Andres B, et al. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model[J/OL]. [2016-12-10].
2 Bulat A, Tzimiropoulos G. Human pose estimation via convolutional part heat map regression[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 717-732.
3 Chu X, Yang W, Ouyang W, et al. Multi-context attention for human pose estimation[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 1831-1840.
4 Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 483-499.
5 Ionescu C, Papava D, Olaru V, et al. Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2014, 36(7): 1325-1339.
6 Sigal L, Balan A O, Black M J. HumanEva: synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion[J]. International Journal of Computer Vision, 2010, 87(1/2): 4-27.
7 Zhou X, Sun X, Zhang W, et al. Deep kinematic pose regression[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 186-201.
8 Li S, Chan A B. 3D human pose estimation from monocular images with deep convolutional neural network[C]∥The 12th Asian Conference on Computer Vision, Singapore, 2014: 332-347.
9 Bogo F, Kanazawa A, Lassner C, et al. Keep it SMPL: automatic estimation of 3D human pose and shape from a single image[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 561-578.
10 Chen C H, Ramanan D. 3D human pose estimation = 2D pose estimation + matching[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 7035-7043.
11 Tome D, Russell C, Agapito L. Lifting from the deep: convolutional 3D pose estimation from a single image[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2500-2509.
12 Wu J, Xue T, Lim J J, et al. Single image 3D interpreter network[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 365-382.
13 Yasin H, Iqbal U, Kruger B, et al. A dual-source approach for 3D pose estimation from a single image[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4948-4956.
14 Zhou X, Zhu M, Leonardos S, et al. Sparseness meets deepness: 3D human pose estimation from monocular video[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4966-4975.
15 Wei S E, Ramakrishna V, Kanade T, et al. Convolutional pose machines[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4724-4732.
16 Akhter I, Black M J. Pose-conditioned joint angle limits for 3D human pose reconstruction[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1446-1455.
17 Ramakrishna V, Kanade T, Sheikh Y. Reconstructing 3D human pose from 2D image landmarks[C]∥The 12th European Conference on Computer Vision, Florence, Italy, 2012: 573-586.
18 Zhou X, Leonardos S, Hu X, et al. 3D shape estimation from 2D landmarks: a convex relaxation approach[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 4447-4455.
19 Wei S E, Ramakrishna V, Kanade T, et al. Convolutional pose machines[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4724-4732.
20 Tome D, Russell C, Agapito L. Lifting from the deep: convolutional 3D pose estimation from a single image[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2500-2509.
21 Zhou X, Zhu M, Leonardos S, et al. Sparseness meets deepness: 3D human pose estimation from monocular video[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4966-4975.
22 Zhang Z, Hu L, Deng X, et al. Weakly supervised adversarial learning for 3D human pose estimation from point clouds[J]. IEEE Transactions on Visualization and Computer Graphics, 2020, 26(5): 1851-1859.
23 Hoffman J, Wang D, Yu F, et al. FCNs in the wild: pixel-level adversarial and constraint-based adaptation[J/OL]. [2016-12-10].
24 Zhou X, Zhu M, Pavlakos G, et al. MonoCap: monocular human motion capture using a CNN coupled with a geometric prior[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(4): 901-914.
25 Mehta D, Rhodin H, Casas D, et al. Monocular 3D human pose estimation using transfer learning and improved CNN supervision[J/OL]. [2016-12-10].
26 Andriluka M, Pishchulin L, Gehler P, et al. Human pose estimation: new benchmark and state of the art analysis[C]∥Computer Vision and Pattern Recognition, Columbus, USA, 2014.