Journal of Jilin University (Engineering and Technology Edition), 2020, Vol. 50, Issue (5): 1765-1770. doi: 10.13229/j.cnki.jdxbgxb20190755

• Computer Science and Technology •

Clothing classification algorithm based on landmark attention and channel attention

Hong-wei ZHAO1, Xiao-han LIU1, Yuan ZHANG1, Li-li FAN1, Man-li LONG2, Xue-bai ZANG1

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. School of Foreign Language Education, Jilin University, Changchun 130012, China
  • Received: 2019-07-25 Online: 2020-09-01 Published: 2020-09-16
  • Contact: Man-li LONG E-mail: zhaohw@jlu.edu.cn; Longml@jlu.edu.cn
  • About the author: Hong-wei ZHAO (1962-), male, professor, doctoral supervisor. Research interest: embedded artificial intelligence. E-mail: zhaohw@jlu.edu.cn
  • Funding: Jilin Province Science and Technology Innovation Special Project (20190302026GX); Natural Science Foundation of Jilin Province (20200201037JC); Higher Education Research Project of the Jilin Province Higher Education Society (JGJX2018D10)

Abstract:

A deep neural network that combines a landmark attention mechanism with a channel attention mechanism is proposed to address three clothing analysis tasks: landmark detection, category classification, and attribute prediction. To predict clothing landmarks, the network convolves the input feature map to extract features, applies deconvolution to restore the feature map size, and adds a non-local structure to capture the relationships between landmarks. The predicted landmarks are then used to derive the landmark attention. The landmark attention module emphasizes the features of the discriminative regions of the clothing and produces new feature maps. In addition, the channel attention module increases the weights of the feature maps that contribute more to category classification and attribute prediction. Experimental results on the DeepFashion dataset show that the proposed method improves the accuracy of category classification and the recall of attribute prediction compared with existing methods.

Key words: computer application, clothing category classification, clothing attribute prediction, deep learning, attention mechanism
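The abstract describes three building blocks: a non-local structure in the landmark branch, a landmark attention that emphasizes discriminative regions, and a channel attention that reweights feature maps. The NumPy sketch below illustrates one plausible reading of each step and is not the authors' implementation. Everything specific in it is an assumption: the parameter-free dot-product form of the non-local operation, merging the heatmaps by a per-pixel maximum, the `(1 + attention)` scaling, the squeeze-and-excitation style gate, and all names and tensor sizes (only the count of eight landmarks comes from Table 1).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def non_local(x):
    """Parameter-free non-local operation on features x of shape (C, H, W).

    Each position is updated with a softmax-weighted sum over all positions
    and the result is added back as a residual. Learned embeddings, which a
    trained network would use, are omitted here (assumption).
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                      # (C, N)
    sim = flat.T @ flat / np.sqrt(c)                # (N, N) pairwise similarity
    sim -= sim.max(axis=1, keepdims=True)           # stabilise the softmax
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)   # each row sums to 1
    out = flat @ weights.T                          # (C, N) aggregated features
    return x + out.reshape(c, h, w)

def landmark_attention(features, heatmaps):
    """Reweight features (C, H, W) with landmark heatmaps (K, H, W).

    The K heatmaps are merged by a per-pixel maximum, and features are scaled
    by (1 + map) so that regions near landmarks are amplified while no region
    is zeroed out. Both choices are assumptions.
    """
    attention = heatmaps.max(axis=0)                # (H, W), values in [0, 1]
    return features * (1.0 + attention[None, :, :])

def channel_attention(features, w1, b1, w2, b2):
    """Squeeze-and-excitation style channel gate for features (C, H, W)."""
    squeezed = features.mean(axis=(1, 2))           # (C,) global average pooling
    hidden = np.maximum(0.0, w1 @ squeezed + b1)    # (C // r,) ReLU bottleneck
    gate = sigmoid(w2 @ hidden + b2)                # (C,) per-channel weights
    return features * gate[:, None, None], gate

# Hypothetical sizes: 16 channels, 8 landmarks, 6x6 maps, reduction ratio 4.
rng = np.random.default_rng(0)
C, K, H, W, r = 16, 8, 6, 6, 4
features = rng.standard_normal((C, H, W))
w1, b1 = rng.standard_normal((C // r, C)), np.zeros(C // r)
w2, b2 = rng.standard_normal((C, C // r)), np.zeros(C)

branch = non_local(features)                 # landmark branch with global context
heatmaps = sigmoid(branch[:K])               # stand-in for the heatmap prediction layers
attended = landmark_attention(features, heatmaps)
refined, gate = channel_attention(attended, w1, b1, w2, b2)
```

In this sketch the spatial reweighting and the channel reweighting are applied in sequence, so `refined` keeps the shape of `features` while `gate` exposes the per-channel weights for inspection.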

CLC number:

  • TP391

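Fig. 1  Landmark detection, category classification and attribute prediction

Fig. 2  Overall structure of the proposed network

Fig. 3  Visualization of landmark detection

Fig. 4  Results of clothing classification and attribute prediction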

Table 1  Experimental results of landmark detection

Method      L. collar  R. collar  L. sleeve  R. sleeve  L. waistline  R. waistline  L. hem   R. hem   Average
Ref. [1]    0.0854     0.0902     0.0973     0.0935     0.0854        0.0845        0.0812   0.0823   0.0872
Ref. [2]    0.0628     0.0637     0.0658     0.0621     0.0726        0.0702        0.0658   0.0663   0.0660
Ref. [14]   0.0570     0.0611     0.0672     0.0647     0.0703        0.0694        0.0624   0.0627   0.0643
Ref. [3]    0.0332     0.0346     0.0487     0.0519     0.0442        0.0429        0.0620   0.0639   0.0474
Proposed    0.0385     0.0390     0.0546     0.0570     0.0489        0.0517        0.0552   0.0585   0.0504
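The page does not name the metric behind these numbers. Assuming they are normalized errors, that is, the Euclidean distance between predicted and ground-truth landmarks after dividing coordinates by the image width and height, averaged over test images (lower is better), a per-landmark computation could be sketched as follows. The function name, argument layout and the optional visibility mask are hypothetical and not taken from the paper.

```python
import numpy as np

def normalized_error(pred, gt, image_size, visible=None):
    """Per-landmark mean normalized error (assumed metric).

    pred, gt:   (N, K, 2) landmark coordinates in pixels, ordered as (x, y).
    image_size: (width, height) used to normalize coordinates to [0, 1].
    visible:    optional (N, K) boolean mask; masked-out landmarks are skipped.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    scale = np.asarray(image_size, dtype=float)            # (2,)
    dist = np.linalg.norm((pred - gt) / scale, axis=-1)    # (N, K)
    if visible is None:
        visible = np.ones(dist.shape, dtype=bool)
    visible = np.asarray(visible, dtype=bool)
    counts = np.maximum(visible.sum(axis=0), 1)            # avoid division by zero
    return (dist * visible).sum(axis=0) / counts           # (K,)

# Toy example: one image of 100x100 pixels with two landmarks.
pred = [[[0, 0], [10, 0]]]
gt = [[[30, 40], [10, 0]]]
errors = normalized_error(pred, gt, image_size=(100, 100))
```

Each entry of `errors` would correspond to one landmark column of the table, and their mean to the last column under this assumed definition.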

Table 2  Experimental results of category classification and attribute prediction

Method      Category        Texture         Fabric          Shape           Part
            top-3   top-5   top-3   top-5   top-3   top-5   top-3   top-5   top-3   top-5
Ref. [15]   43.73   66.26   24.21   32.65   25.38   36.06   23.39   31.26   26.31   33.24
Ref. [16]   59.48   79.58   36.15   48.15   36.64   48.52   35.89   46.93   39.17   50.14
Ref. [1]    82.58   90.17   37.46   49.52   39.30   49.84   39.37   48.59   44.13   54.02
Ref. [17]   86.30   92.80   53.60   63.20   39.10   48.80   50.10   59.50   38.80   48.90
Ref. [3]    91.16   96.12   56.17   65.83   43.20   53.52   58.28   67.80   46.97   57.42
Proposed    91.24   95.94   57.11   66.62   44.25   54.52   59.56   68.92   47.60   58.01
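According to the abstract, the category column reports classification accuracy and the attribute columns report recall, each at top-3 and top-5. The sketch below shows how such top-k figures are commonly computed. It is an illustration under assumptions rather than the evaluation code used in the paper: the function names are hypothetical, and pooling the recall counts over all samples is one possible averaging choice that the page does not confirm.

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true class is among the k highest scores.

    scores: (N, num_classes) class scores; labels: (N,) integer class ids.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    top_k = np.argsort(-scores, axis=1)[:, :k]             # (N, k) best classes
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

def top_k_recall(scores, targets, k):
    """Share of ground-truth attributes that appear in the top-k predictions.

    scores: (N, num_attrs) attribute scores; targets: (N, num_attrs) 0/1 matrix.
    Counts are pooled over all samples (assumption about the averaging).
    """
    scores = np.asarray(scores, dtype=float)
    targets = np.asarray(targets, dtype=bool)
    top_k = np.argsort(-scores, axis=1)[:, :k]
    picked = np.zeros_like(targets)                        # boolean selection mask
    np.put_along_axis(picked, top_k, True, axis=1)
    return float((picked & targets).sum() / max(targets.sum(), 1))

# Toy example with two samples and three classes or attributes.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
acc_top1 = top_k_accuracy(scores, labels=[2, 1], k=1)
acc_top2 = top_k_accuracy(scores, labels=[2, 1], k=2)
recall_top2 = top_k_recall(scores, targets=[[1, 0, 1], [0, 0, 1]], k=2)
```

Multiplying these fractions by 100 gives values on the same scale as the table entries.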

References:

[1] Liu Z, Luo P, Qiu S, et al. DeepFashion: powering robust clothes recognition and retrieval with rich annotations[C]∥IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1096-1104.
[2] Liu Z, Yan S, Luo P, et al. Fashion landmark detection in the wild[C]∥European Conference on Computer Vision, Amsterdam, Netherlands, 2016: 229-245.
[3] Liu J, Lu H. Deep fashion analysis with feature map upsampling and landmark-driven attention[C]∥European Conference on Computer Vision, Munich, Germany, 2018: 30-36.
[4] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]∥IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132-7141.
[5] Buades A, Coll B, Morel J M. A non-local algorithm for image denoising[C]∥2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 2005: 60-65.
[6] Wang X, Girshick R, Gupta A, et al. Non-local neural networks[C]∥IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7794-7803.
[7] Shih K J, Singh S, Hoiem D. Where to look: focus regions for visual question answering[C]∥IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4613-4621.
[8] Yang Z, He X, Gao J, et al. Stacked attention networks for image question answering[C]∥IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 21-29.
[9] Ji Chao, Liu Hui-ying, Sun Jing-feng, et al. Image salient region detection based on spatial and frequency domains[J]. Journal of Jilin University (Engineering and Technology Edition), 2014, 44(1): 177-183.
[10] Dong Chao, Liu Jing-hong, Xu Fang, et al. Fast ship detection in optical remote sensing images[J]. Journal of Jilin University (Engineering and Technology Edition), 2019, 49(4): 1369-1376.
[11] Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation[C]∥European Conference on Computer Vision, Amsterdam, Netherlands, 2016: 483-499.
[12] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J/OL]. [2015-04-10].
[13] Lin M, Chen Q, Yan S. Network in network[J]. arXiv preprint arXiv:1312.4400, 2013.
[14] Yan S, Liu Z, Luo P. Unconstrained fashion landmark detection via hierarchical recurrent transformer networks[C]∥ACM on Multimedia Conference, Silicon Valley, USA, 2017: 172-180.
[15] Chen H, Gallagher A, Girod B. Describing clothing by semantic attributes[C]∥European Conference on Computer Vision, Florence, Italy, 2012: 609-623.
[16] Huang J, Feris R S, Chen Q, et al. Cross-domain image retrieval with a dual attribute-aware ranking network[C]∥IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1062-1070.
[17] Corbiere C, Ben-Younes H, Rame A, et al. Leveraging weakly annotated data for fashion image retrieval and label prediction[C]∥IEEE International Conference on Computer Vision Workshop, Venice, Italy, 2017: 2268-2274.