Journal of Jilin University (Engineering and Technology Edition) ›› 2024, Vol. 54 ›› Issue (1): 251-258. doi: 10.13229/j.cnki.jdxbgxb.20220280

• Communication and Control Engineering •

3D human joint point recognition based on weakly supervised migration network

Zhi-yong SUN1, Hong-you LI2, Jun-yong YE1

  1. Key Laboratory of Optoelectronic Technology of the Ministry of Education, Chongqing University, Chongqing 400044, China
  2. Department of Basic Education, Chongqing Police College, Chongqing 401331, China
  • Received: 2022-03-21 Online: 2024-01-30 Published: 2024-03-28
  • Contact: Jun-yong YE E-mail: sunzhiyong@cqu.edu.cn; ygyocr@cqu.edu.cn
  • About the author: Zhi-yong SUN (1979-), male, senior engineer, Ph.D. candidate. Research interests: machine vision, signal processing, pattern recognition and intelligent systems. E-mail: sunzhiyong@cqu.edu.cn
  • Funding:
    National Key Research and Development Program of China (2020YFC1522905); Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201901710); Chongqing Basic Research and Frontier Technology Research Program (cstc2018jcyjAX0633)

Abstract:

2D images lack depth information, so the spatial structure of human actions and poses cannot be fully recovered from them. To address this, a 3D human joint point recognition method based on a weakly supervised migration network is proposed. First, an end-to-end 3D human pose estimation framework for real images is proposed, in which a deep neural network is trained on images with mixed 2D and 3D labels. A depth regression module is added to the 2D human pose recognition sub-network to resolve the depth ambiguity that arises in 3D human pose recognition. Second, a 3D geometric constraint is introduced in the 3D human pose recognition sub-network to regularize the predicted poses. When no ground-truth depth labels are available, the constraint helps the network learn depth features and handle poses with occlusion. On the Human3.6M and MPII datasets, the average joint point prediction error is lower than that of the compared methods, indicating better 3D human pose recognition.

Key words: migration network, pose recognition, 3D joint points, geometric constraints, depth regression
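
The abstract describes two training signals: depth regression when a 3D label exists, and a geometric constraint when only a 2D label exists. The page does not give the loss formulas, so the following is only a minimal sketch of how such a switch could look. The six-joint skeleton, the `BONE_GROUPS` table, the variance-based constraint, and the `weight` value are all assumptions made for illustration, not the authors' definitions.

```python
import math

# Hypothetical skeleton for illustration: joint indices are
# 0-2 = left shoulder, elbow, wrist; 3-5 = right shoulder, elbow, wrist.
# Bones in one group are assumed to have equal length in a valid pose.
BONE_GROUPS = {
    "upper_arm": [(0, 1), (3, 4)],
    "forearm": [(1, 2), (4, 5)],
}


def bone_length(joints_2d, depths, bone):
    """3D length of one bone from 2D coordinates plus per-joint depth."""
    a, b = bone
    (xa, ya), (xb, yb) = joints_2d[a], joints_2d[b]
    return math.sqrt((xa - xb) ** 2 + (ya - yb) ** 2 + (depths[a] - depths[b]) ** 2)


def depth_regression_loss(pred_depths, gt_depths):
    """Mean squared error between predicted and labelled depths."""
    return sum((p - g) ** 2 for p, g in zip(pred_depths, gt_depths)) / len(pred_depths)


def geometric_constraint_loss(joints_2d, pred_depths, groups=BONE_GROUPS):
    """Sum over groups of the variance of bone lengths inside the group."""
    total = 0.0
    for bones in groups.values():
        lengths = [bone_length(joints_2d, pred_depths, b) for b in bones]
        mean = sum(lengths) / len(lengths)
        total += sum((length - mean) ** 2 for length in lengths) / len(lengths)
    return total


def depth_loss(joints_2d, pred_depths, gt_depths=None, weight=0.01):
    """Use the 3D label when it exists, otherwise fall back to the constraint."""
    if gt_depths is not None:
        return depth_regression_loss(pred_depths, gt_depths)
    return weight * geometric_constraint_loss(joints_2d, pred_depths)
```

In this sketch a sample with a depth label is scored only by regression, while a 2D-only sample is scored only by how consistent the implied limb lengths are, which is one plausible reading of "mixed 2D and 3D labels".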

CLC Number:

  • TP391

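Fig. 1

Training the deep neural network with ideal 3D-annotated images (right) and real 2D-annotated images (left)

Fig. 2

Architecture of the 3D human pose recognition network based on weakly supervised transfer

Fig. 3

Hourglass network structure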

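Table 1

2D joint point recognition accuracy on the Human3.6M dataset

Method      Ref. [10]   Ref. [11]   Ref. [24]   Ref. [25]   Without constraint  With constraint
acc/%       90.01       90.57       90.91       91.62       87.69               90.93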

Fig. 4

2D and 3D human pose recognition results for different actions

Table 2

MPJPE results for 3D joint points on the Human3.6M dataset (mm)

Method              Sitting     Smoking     Walking     Discussion  Eating      Photo       Greeting    Phoning     Purchases   Waiting     Walking dog Average
Ref. [10]           133.14      106.65      114.05      97.57       89.98       139.17      107.87      107.31      136.09      106.21      87.03       114.18
Ref. [11]           110.19      84.95       71.36       73.47       76.82       110.67      86.43       86.28       74.79       85.78       86.26       88.39
Ref. [24]           124.52      107.42      79.36       109.31      87.05       143.32      103.16      116.18      99.78       118.09      114.23      79.9
Ref. [25]           96.19       70.82       82.03       69.74       60.55       85.42       68.77       76.36       75.04       68.45       54.41       74.14
Without constraint  74.79       64.34       63.97       61.16       58.12       67.29       71.75       62.54       56.38       68.78       52.22       63.76
With constraint     75.20       64.15       63.22       60.70       58.22       65.53       71.41       62.03       55.58       66.05       51.43       63.05
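
MPJPE (mean per-joint position error) is the metric reported in the table above: the Euclidean distance between each predicted joint and its ground-truth position, averaged over joints. The sketch below shows the standard definition in plain Python. The optional `root` argument, which aligns both poses at one joint before measuring, is an assumption; the page does not say whether the authors apply root alignment.

```python
import math


def mpjpe(pred, gt, root=None):
    """Mean Euclidean distance between predicted and ground-truth joints.

    pred, gt: sequences of (x, y, z) joint positions in the same unit (e.g. mm).
    root: optional joint index; if given, both poses are translated so that
          this joint sits at the origin before distances are measured.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    if root is not None:
        pred = [tuple(c - r for c, r in zip(p, pred[root])) for p in pred]
        gt = [tuple(c - r for c, r in zip(g, gt[root])) for g in gt]
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Two joints: one exact, one off by a 3-4-5 triangle, so the mean is 2.5.
print(mpjpe([(0, 0, 0), (3, 4, 0)], [(0, 0, 0), (0, 0, 0)]))  # 2.5
```

A per-action score like those in the table would then be the average of this value over all test frames of that action.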

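Table 3

Left-right symmetry evaluation with and without the constraint on the MPII dataset

Limb        Without constraint  With constraint
Upper arm   42.4 mm             37.8 mm
Forearm     60.4 mm             50.7 mm
Thigh       43.5 mm             43.4 mm
Calf        59.4 mm             47.8 mm
Upper arm   6.27 px             4.80 px
Forearm     10.11 px            6.64 px
Thigh       6.89 px             4.93 px
Calf        8.03 px             6.22 px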
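
The page does not define how the symmetry values in the table above are computed. One plausible reading is the mean absolute difference between the lengths of the left and right instance of each limb, and the sketch below implements that reading only. The joint names in `LIMBS` and the function `symmetry_error` are hypothetical, introduced for illustration.

```python
import math

# Hypothetical joint names; each limb maps to its (left bone, right bone).
LIMBS = {
    "upper_arm": (("l_shoulder", "l_elbow"), ("r_shoulder", "r_elbow")),
    "forearm": (("l_elbow", "l_wrist"), ("r_elbow", "r_wrist")),
    "thigh": (("l_hip", "l_knee"), ("r_hip", "r_knee")),
    "calf": (("l_knee", "l_ankle"), ("r_knee", "r_ankle")),
}


def symmetry_error(poses, limbs=LIMBS):
    """Mean |left length - right length| per limb over a list of poses.

    Each pose is a dict mapping joint name to coordinates. The coordinates
    may be 2D (pixels) or 3D (millimetres); the result is in the same unit.
    """
    if not poses:
        raise ValueError("poses must not be empty")
    totals = {name: 0.0 for name in limbs}
    for pose in poses:
        for name, (left, right) in limbs.items():
            left_len = math.dist(pose[left[0]], pose[left[1]])
            right_len = math.dist(pose[right[0]], pose[right[1]])
            totals[name] += abs(left_len - right_len)
    return {name: total / len(poses) for name, total in totals.items()}
```

Under this reading, a lower value means the predicted left and right limbs are closer in length, which is consistent with the constrained model scoring lower in every row.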
References:

1 Insafutdinov E, Pishchulin L, Andres B, et al. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model[J/OL]. [2016-12-10].
2 Bulat A, Tzimiropoulos G. Human pose estimation via convolutional part heat map regression[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 717-732.
3 Chu X, Yang W, Ouyang W, et al. Multi-context attention for human pose estimation[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 1831-1840.
4 Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 483-499.
5 Ionescu C, Papava D, Olaru V, et al. Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2014, 36(7): 1325-1339.
6 Sigal L, Balan A O, Black M J. HumanEva: synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion[J]. International Journal of Computer Vision, 2010, 87(1/2): 4-27.
7 Zhou X, Sun X, Zhang W, et al. Deep kinematic pose regression[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 186-201.
8 Li S, Chan A B. 3D human pose estimation from monocular images with deep convolutional neural network[C]∥The 12th Asian Conference on Computer Vision, Singapore, 2014: 332-347.
9 Bogo F, Kanazawa A, Lassner C, et al. Keep it SMPL: automatic estimation of 3D human pose and shape from a single image[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 561-578.
10 Chen C H, Ramanan D. 3D human pose estimation = 2D pose estimation + matching[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 7035-7043.
11 Tome D, Russell C, Agapito L. Lifting from the deep: convolutional 3D pose estimation from a single image[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2500-2509.
12 Wu J, Xue T, Lim J J, et al. Single image 3D interpreter network[C]∥The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 365-382.
13 Yasin H, Iqbal U, Kruger B, et al. A dual-source approach for 3D pose estimation from a single image[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4948-4956.
14 Zhou X, Zhu M, Leonardos S, et al. Sparseness meets deepness: 3D human pose estimation from monocular video[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4966-4975.
15 Wei S E, Ramakrishna V, Kanade T, et al. Convolutional pose machines[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4724-4732.
16 Akhter I, Black M J. Pose-conditioned joint angle limits for 3D human pose reconstruction[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1446-1455.
17 Ramakrishna V, Kanade T, Sheikh Y. Reconstructing 3D human pose from 2D image landmarks[C]∥The 12th European Conference on Computer Vision, Florence, Italy, 2012: 573-586.
18 Zhou X, Leonardos S, Hu X, et al. 3D shape estimation from 2D landmarks: a convex relaxation approach[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 4447-4455.
19 Wei S E, Ramakrishna V, Kanade T, et al. Convolutional pose machines[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4724-4732.
20 Tome D, Russell C, Agapito L. Lifting from the deep: convolutional 3D pose estimation from a single image[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2500-2509.
21 Zhou X, Zhu M, Leonardos S, et al. Sparseness meets deepness: 3D human pose estimation from monocular video[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4966-4975.
22 Zhang Z, Hu L, Deng X, et al. Weakly supervised adversarial learning for 3D human pose estimation from point clouds[J]. IEEE Transactions on Visualization and Computer Graphics, 2020, 26(5): 1851-1859.
23 Hoffman J, Wang D, Yu F, et al. FCNs in the wild: pixel-level adversarial and constraint-based adaptation[J/OL]. [2016-12-10].
24 Zhou X, Zhu M, Pavlakos G, et al. MonoCap: monocular human motion capture using a CNN coupled with a geometric prior[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(4): 901-914.
25 Mehta D, Rhodin H, Casas D, et al. Monocular 3D human pose estimation using transfer learning and improved CNN supervision[J/OL]. [2016-12-10].
26 Andriluka M, Pishchulin L, Gehler P, et al. Human pose estimation: new benchmark and state of the art analysis[C]∥Computer Vision and Pattern Recognition, Columbus, USA, 2014.