Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue (3): 1026-1033. doi: 10.13229/j.cnki.jdxbgxb20200058

• Computer Science and Technology •

User interface components detection algorithm based on improved YOLOv3

Yuan-ning LIU (1,2), Di WU (1,2), Xiao-dong ZHU (1,2), Qi-xian ZHANG (1,3), Shuang-shuang LI (1,2), Shu-jun GUO (1,2), Chao WANG (1,3)

  1. Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun 130012, China
  2. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  3. College of Software, Jilin University, Changchun 130012, China
  • Received: 2020-01-23 Online: 2021-05-01 Published: 2021-05-07
  • Contact: Xiao-dong ZHU E-mail: lyn@jlu.edu.cn; zhuxd@jlu.edu.cn
  • About the author: Yuan-ning LIU (born 1962), male, professor, doctoral supervisor. Research interests: pattern recognition, machine learning. E-mail: lyn@jlu.edu.cn
  • Funding:
    Jilin Province Industrial Innovation Special Project (2019C053-2); National Natural Science Foundation of China (61471181)

Abstract:

Traditional methods for identifying user interface (UI) components cannot classify the components they find. To address this problem, and to give designers guidance when reconstructing complex UI screenshot examples, this paper proposes a UI component detection method, covering both identification and classification, based on an improved version of the classic object detection algorithm YOLOv3. First, the feature extraction network adopts the densely connected structure of DenseNet so that the extracted features are fully used. Second, a channel attention mechanism and a spatial attention mechanism are added to the dense layers and transition layers of the feature extraction network, and the weighted features replace the original features in the subsequent feature fusion. Third, a feature pyramid network with four scales is constructed to perform the component detection task. Finally, focal loss is used as the classification loss function. The data set consists of a large number of screenshots of real Android application interfaces together with their XML files. Experiments on this collected data set show that the method reaches a recall of 91.97% and a mean average precision (mAP) of 48.21%, outperforming traditional detection methods.

Key words: computer application, component detection, attention mechanism, focal loss

CLC number:

  • TP391.4

Fig. 1 Structure of the channel attention mechanism

Fig. 2 Structure of the spatial attention mechanism

Fig. 3 Connection pattern of a DenseBlock
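As a rough illustration of the dense connection pattern named in this caption, the sketch below tracks channel counts only: in a DenseNet-style block, each layer receives the concatenation of the block input and the outputs of all earlier layers. The function name `dense_block_channels` and the values for `in_channels`, `growth_rate` and `num_layers` are illustrative assumptions, not settings taken from the paper.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return (input channels seen by each layer, output channels of the block).

    Assumes every layer emits `growth_rate` new feature maps and that all
    earlier feature maps are concatenated along the channel axis.
    """
    features = [in_channels]      # channel counts of the feature maps available so far
    layer_inputs = []
    for _ in range(num_layers):
        layer_inputs.append(sum(features))  # a layer sees everything produced before it
        features.append(growth_rate)        # and contributes growth_rate new channels
    return layer_inputs, sum(features)


# Hypothetical block: 64 input channels, growth rate 32, 4 layers.
layer_inputs, out_channels = dense_block_channels(64, 32, 4)
print(layer_inputs)   # [64, 96, 128, 160]
print(out_channels)   # 192
```

The linear growth in input width is what lets later layers reuse every earlier feature map instead of only the most recent one.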

Fig. 4 DenseNet structure with three DenseBlocks

Fig. 5 Structure of the feature extraction network

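Table 1 Descriptions and samples of the 14 classes of UI components

Component class | Description
ImageView | Displays an image on the screen.
TextView | Displays text on the screen; it is the parent class of some components, such as Button and EditText.
ImageButton | A button displayed as an image; a click event can be attached.
CheckBox | A check box; a button that allows multiple items to be selected.
EditText | Displays text whose content can be edited.
RadioButton | A radio button; a kind of Button that allows only one item to be selected.
Button | Can display text, and a click event can be attached.
ProgressBar | Shows the user the completion progress of a time-consuming operation.
CompoundButton | Has a checked state and an unchecked state.
SeekBar | Lets the user drag a slider to change the corresponding value.
Switch | Changes its on/off state when clicked.
RatingBar | Usually used to show a rating to the user.
ToggleButton | A two-state toggle button; in the Android class hierarchy it derives from CompoundButton, as Switch does.
Spinner | A list box from which the user makes a selection.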

Fig. 6 A UI screenshot and the TextView component information in its XML file

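Table 2 Confusion matrix of classification results

Prediction | Label positive | Label negative
Predicted positive | TP | FP
Predicted negative | FN | TN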
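The precision and recall values reported on this page follow from the counts in the confusion matrix above using the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch, with a hypothetical function name and made-up counts that are not taken from the paper:

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts.

    Returns 0.0 for a metric whose denominator is zero (a convention
    assumed here, not something stated in the source).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 8 correct detections, 2 false alarms, 8 missed components.
p, r = precision_recall(tp=8, fp=2, fn=8)
print(p, r)  # 0.8 0.5
```

TN does not enter either formula, which is convenient for detection tasks where the number of true negatives (background regions) is not well defined.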

Fig. 7 Calculation of intersection over union (IoU)
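Since the figure itself is not reproduced here, the following sketch shows the usual IoU computation for two axis-aligned boxes: the area of their overlap divided by the area of their union. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so that disjoint boxes give zero intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
```

A predicted box is typically counted as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold; the threshold used in the paper is not stated on this page.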

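Table 3 Parameter settings in focal loss

α | γ | Precision | Recall | mAP
0.25 | 1 | 0.2598 | 0.6695 | 0.3776
0.50 | 1 | 0.2887 | 0.6934 | 0.4147
0.75 | 1 | 0.2957 | 0.7023 | 0.4289
0.25 | 2 | 0.2584 | 0.6459 | 0.3755
0.50 | 2 | 0.2770 | 0.7608 | 0.4251
0.75 | 2 | 0.3208 | 0.7187 | 0.4480
0.25 | 3 | 0.2554 | 0.6715 | 0.3754
0.50 | 3 | 0.2701 | 0.6926 | 0.4226
0.75 | 3 | 0.2879 | 0.7144 | 0.4484
0.75 | 4 | 0.2830 | 0.7290 | 0.4344
0.75 | 5 | 0.2772 | 0.7251 | 0.4273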
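To make the roles of α and γ in this table concrete, here is a minimal sketch of the commonly used binary form of focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). It assumes that standard form; the exact formulation used in the paper is not shown on this page, and the function name and default values (one of the settings listed in the table) are illustrative.

```python
import math


def focal_loss(p, y, alpha=0.75, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, y: label (1 or 0).
    alpha weights the positive class (1 - alpha weights the negative class);
    gamma down-weights examples that are already classified well.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident, correct prediction
hard = focal_loss(0.1, 1)  # confident, wrong prediction
print(easy < hard)  # True
```

With γ = 0 and α = 1 the expression reduces to ordinary cross-entropy for positive examples; increasing γ shrinks the contribution of easy examples so that training focuses on hard ones.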

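Table 4 Detection results of different UI component detection methods

Detection method | Input size | Precision | Recall | mAP
REMAUI (OCR+CV) | Original size | 0.1055 | 0.1359 | -
YOLOv2 (synthetic UI data set) | 416×416 | 0.3125 | 0.7083 | -
YOLOv2 (real UI data set) | 416×416 | 0.2213 | 0.5773 | 0.4157
YOLOv3 | 416×416 | 0.1704 | 0.6031 | 0.4231
Proposed method | 416×416 | 0.2315 | 0.9197 | 0.4821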

Fig. 8 Detection results

Fig. 9 Cases of incorrect detection
