Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue (3): 1026-1033. doi: 10.13229/j.cnki.jdxbgxb20200058

User interface components detection algorithm based on improved YOLOv3

Yuan-ning LIU1,2, Di WU1,2, Xiao-dong ZHU1,2, Qi-xian ZHANG1,3, Shuang-shuang LI1,2, Shu-jun GUO1,2, Chao WANG1,3

  1. Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun 130012, China
  2. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  3. College of Software, Jilin University, Changchun 130012, China
  • Received:2020-01-23 Online:2021-05-01 Published:2021-05-07
  • Contact: Xiao-dong ZHU E-mail:lyn@jlu.edu.cn;zhuxd@jlu.edu.cn

Abstract:

Traditional detection methods can identify User Interface (UI) components but cannot categorize them. To solve this problem and to give designers guidance when reconstructing complex UI screenshot examples, this paper proposes a UI component detection method based on an improved YOLOv3. First, the feature extraction network uses a densely connected network (DenseNet) to make full use of the extracted features. Second, a channel attention mechanism and a spatial attention mechanism are added to the dense layers and transition layers of the feature extraction network, and the weighted features replace the original features in the later feature fusion. Third, a feature pyramid network is constructed so that component detection is carried out at four scales. Finally, focal loss is used as the classification loss function. The data set consists of a large number of real Android application interface screenshots and their XML files. Experimental results on the collected data set show that the method achieves a recall of 91.97% and an mAP of 48.21%, which is better than the traditional detection methods.
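The channel and spatial attention weighting described in the abstract can be illustrated with a minimal NumPy sketch. It follows the general squeeze-and-excitation / CBAM pattern, not the paper's exact layers: the function names, the reduction ratio `r`, the random weights, and the per-pixel linear combination used in place of a convolution in the spatial branch are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Weight each channel of a (C, H, W) feature map.

    w1 with shape (C // r, C) and w2 with shape (C, C // r) are the weights
    of a shared two-layer bottleneck (hypothetical, randomly initialised below).
    """
    avg = feat.mean(axis=(1, 2))  # (C,) global average pooling
    mx = feat.max(axis=(1, 2))    # (C,) global max pooling

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # bottleneck with ReLU

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,) values in (0, 1)
    return feat * weights[:, None, None]


def spatial_attention(feat, w_avg, w_max, bias=0.0):
    """Weight each spatial position of a (C, H, W) feature map.

    Assumption: a per-pixel linear combination of the channel-wise average
    and max maps stands in for the convolution used in CBAM.
    """
    avg_map = feat.mean(axis=0)  # (H, W)
    max_map = feat.max(axis=0)   # (H, W)
    weights = sigmoid(w_avg * avg_map + w_max * max_map + bias)
    return feat * weights[None, :, :]


rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

# The weighted feature map has the same shape as the input and would
# replace it in the later fusion stage.
weighted = spatial_attention(channel_attention(feat, w1, w2), 0.5, 0.5)
print(weighted.shape)  # (8, 4, 4)
```

Because both attention maps lie in (0, 1), the weighted features never exceed the originals in magnitude; the mechanism only rescales what the backbone already extracted.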

Key words: computer application, components detection, attention mechanism, focal loss

CLC Number: 

  • TP391.4

Fig.1 Structure of channel attention mechanism

Fig.2 Structure of spatial attention mechanism

Fig.3 DenseBlock connection method

Fig.4 DenseNet structure of three DenseBlocks

Fig.5 Structure of the feature extraction network

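Table 1 Description information and samples of 14 UI components

Component class | Description
ImageView | Displays an image on the interface.
TextView | Displays text on the interface; it is the parent class of some components, such as Button and EditText.
ImageButton | A button displayed as an image; click events can be attached.
CheckBox | A check box: a button that allows multiple items to be selected.
EditText | Displays text whose content is editable.
RadioButton | A radio button, a kind of Button; only one item can be selected.
Button | Displays text; click events can be attached.
ProgressBar | Shows the user the completion progress of a time-consuming operation.
CompoundButton | Has a checked and an unchecked state.
SeekBar | Lets the user drag a slider to change the corresponding value.
Switch | Changes its on/off state when clicked.
RatingBar | Usually used to show a rating to the user.
ToggleButton | A state toggle button derived from Switch.
Spinner | A list box from which the user makes a selection.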

Fig.6 UI screenshot and TextView component information in its XML file
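Fig.6 pairs a screenshot with the component information stored in its XML file. The sketch below shows one way such annotations could be read into bounding boxes. The XML layout is an assumption modelled on a UI Automator style hierarchy dump (`node` elements with `class` and `bounds="[x1,y1][x2,y2]"` attributes); the schema of the paper's data set is not given here, and `SAMPLE_XML` and `extract_components` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; the real files may use a different schema.
SAMPLE_XML = """
<hierarchy>
  <node class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node class="android.widget.TextView" text="Settings" bounds="[48,96][400,160]"/>
    <node class="android.widget.Button" text="OK" bounds="[800,1700][1000,1800]"/>
  </node>
</hierarchy>
"""

# Matches "[x1,y1][x2,y2]" and captures the four integers.
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def extract_components(xml_text, wanted=("TextView",)):
    """Return (class_name, (x1, y1, x2, y2)) for nodes whose class is wanted."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter("node"):
        # Keep only the short class name, e.g. "TextView".
        short_name = node.get("class", "").rsplit(".", 1)[-1]
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if short_name in wanted and match:
            found.append((short_name, tuple(int(v) for v in match.groups())))
    return found


print(extract_components(SAMPLE_XML))
# [('TextView', (48, 96, 400, 160))]
```

Passing a longer `wanted` tuple, for example the 14 classes of Table 1, would collect labelled boxes for every component type of interest.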

Table 2 Confusion matrix for classification results

 | Labeled positive | Labeled negative
Predicted positive | TP | FP
Predicted negative | FN | TN
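Precision and recall follow from the confusion matrix entries in Table 2 by their standard definitions. The short sketch below computes both; the counts are made-up values for illustration and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up counts: 80 correct detections, 20 false alarms, 10 missed components.
p, r = precision_recall(tp=80, fp=20, fn=10)
print(f"precision={p:.4f} recall={r:.4f}")
# precision=0.8000 recall=0.8889
```

TN does not appear in either formula, which is why a detector can report high recall while its precision stays low.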

Fig.7 IoU calculation method
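The IoU (intersection over union) named in Fig.7 is the area shared by a predicted box and a ground-truth box divided by the area covered by either. A minimal sketch follows, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates; the box format and the function name are assumptions.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at 0 when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes overlapping in a 5x5 square: 25 / (100 + 100 - 25).
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, about 0.143
```

A detection is usually counted as a true positive only when its IoU with a ground-truth box of the same class exceeds a chosen threshold.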

Table 3 Parameter settings in focal loss

α | γ | Precision | Recall | mAP
0.25 | 1 | 0.2598 | 0.6695 | 0.3776
0.50 | 1 | 0.2887 | 0.6934 | 0.4147
0.75 | 1 | 0.2957 | 0.7023 | 0.4289
0.25 | 2 | 0.2584 | 0.6459 | 0.3755
0.50 | 2 | 0.2770 | 0.7608 | 0.4251
0.75 | 2 | 0.3208 | 0.7187 | 0.4480
0.25 | 3 | 0.2554 | 0.6715 | 0.3754
0.50 | 3 | 0.2701 | 0.6926 | 0.4226
0.75 | 3 | 0.2879 | 0.7144 | 0.4484
0.75 | 4 | 0.2830 | 0.7290 | 0.4344
0.75 | 5 | 0.2772 | 0.7251 | 0.4273
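The roles of α and γ in Table 3 can be seen in a small sketch of the binary focal loss as defined by Lin et al., FL(p_t) = -α_t (1 - p_t)^γ log(p_t). Whether the paper applies α in exactly this convention is an assumption, and the default values below are simply one row of Table 3 chosen for illustration.

```python
import math


def focal_loss(p, y, alpha=0.75, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class; y: label, 1 or 0.
    alpha weights the positive class; gamma down-weights easy examples.
    """
    p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confident and correct: loss is almost zero
hard = focal_loss(0.30, 1)  # poorly classified positive: loss stays large
print(easy < hard)  # True
```

With `gamma=0` the modulating factor disappears and the expression reduces to α-weighted cross-entropy, so larger γ values shift the training signal toward hard examples.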

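Table 4 Results of different UI components detection methods

Detection method | Input size | Precision | Recall | mAP
REMAUI (OCR+CV) | Original size | 0.1055 | 0.1359 | -
YOLOv2 (synthetic UI data set) | 416×416 | 0.3125 | 0.7083 | -
YOLOv2 (real UI data set) | 416×416 | 0.2213 | 0.5773 | 0.4157
YOLOv3 | 416×416 | 0.1704 | 0.6031 | 0.4231
Proposed method | 416×416 | 0.2315 | 0.9197 | 0.4821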

Fig.8 Result of detection

Fig.9 Wrong detection result
