User interface components detection algorithm based on improved YOLOv3

doi:10.13229/j.cnki.jdxbgxb20200058

Abstract

Traditional detection methods can identify user interface (UI) components but cannot categorize them. To solve this problem and to guide designers in reconstructing complex UI screenshot examples, this paper proposes a UI component detection method based on an improved YOLOv3. First, the feature extraction network uses a densely connected network (DenseNet) to make full use of the extracted features. Second, a channel attention mechanism and a spatial attention mechanism are added to the dense layers and transition layers of the feature extraction network, and the weighted features replace the original features in the later feature fusion. Third, a feature pyramid network is constructed to perform component detection at four scales. Finally, focal loss is used as the classification loss function. The data set consists of a large number of real Android application interface screenshots and XML files. Experimental results on the collected data set show that the recall of the method is 91.97% and the mAP is 48.21%, which is better than the traditional detection methods.

Key words: computer application, component detection, attention mechanism, focal loss

CLC Number: TP391.4

Yuan-ning LIU, Di WU, Xiao-dong ZHU, Qi-xian ZHANG, Shuang-shuang LI, Shu-jun GUO, Chao WANG. User interface components detection algorithm based on improved YOLOv3[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(3): 1026-1033.

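Component class	Description
ImageView	Displays an image on the interface.
TextView	Displays text on the interface; parent class of components such as Button and EditText.
ImageButton	A button rendered as an image; click events can be attached.
CheckBox	A check box that allows multiple items to be selected.
EditText	Displays text whose content can be edited.
RadioButton	A radio button, a kind of Button; only one item can be selected.
Button	Displays text; click events can be attached.
ProgressBar	Shows the user the progress of a time-consuming operation.
CompoundButton	A button with two states: checked and unchecked.
SeekBar	Lets the user drag a slider to change the corresponding value.
Switch	Changes its on/off state when clicked.
RatingBar	Typically used to show a rating to the user.
ToggleButton	A state toggle button; like Switch, it derives from CompoundButton.
Spinner	A list box from which the user selects an item.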

	Labeled positive	Labeled negative
Predicted positive	TP	FP
Predicted negative	FN	TN
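
The four cells of the confusion matrix above are the quantities from which the Precision and Recall columns in the result tables are normally computed: Precision = TP / (TP + FP) and Recall = TP / (TP + FN). The following is a minimal Python sketch, assuming these standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts.

    tp: predicted positive and labeled positive
    fp: predicted positive but labeled negative
    fn: predicted negative but labeled positive
    """
    # Guard against empty denominators (no predictions or no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, chosen only for illustration:
# 8 correct detections, 2 false alarms, 8 missed components.
print(precision_recall(8, 2, 8))  # (0.8, 0.5)
```

A detector can therefore have high recall and low precision at the same time: it finds most of the labeled components but also produces many false detections.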

α	γ	Precision	Recall	mAP
0.25	1	0.2598	0.6695	0.3776
0.50	1	0.2887	0.6934	0.4147
0.75	1	0.2957	0.7023	0.4289
0.25	2	0.2584	0.6459	0.3755
0.50	2	0.2770	0.7608	0.4251
0.75	2	0.3208	0.7187	0.4480
0.25	3	0.2554	0.6715	0.3754
0.50	3	0.2701	0.6926	0.4226
0.75	3	0.2879	0.7144	0.4484
0.75	4	0.2830	0.7290	0.4344
0.75	5	0.2772	0.7251	0.4273
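
The α and γ columns above are the two hyperparameters of focal loss. The following is a minimal sketch, assuming the standard binary formulation FL(p_t) = -α_t (1 - p_t)^γ log(p_t); how the loss is wired into the YOLOv3 classification branch is not shown here, and the function name and default values are illustrative (the defaults are simply one row of the table, not necessarily the setting the authors adopted).

```python
import math


def focal_loss(p, y, alpha=0.75, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class (0..1)
    y: ground-truth label, 1 (positive) or 0 (negative)
    alpha: weight for the positive class; negatives get 1 - alpha
    gamma: focusing parameter; larger values down-weight easy examples more
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A well-classified positive contributes far less than a badly classified one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(easy < hard)  # True
```

With gamma set to 0 the modulating factor disappears and the expression reduces to α-weighted cross-entropy, which is a convenient sanity check when tuning the two parameters.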

Detection method	Input size	Precision	Recall	mAP
REMAUI (OCR+CV)	Original size	0.1055	0.1359	-
YOLOv2 (synthetic UI dataset)	416×416	0.3125	0.7083	-
YOLOv2 (real UI dataset)	416×416	0.2213	0.5773	0.4157
YOLOv3	416×416	0.1704	0.6031	0.4231
Proposed method	416×416	0.2315	0.9197	0.4821
