Journal of Jilin University (Engineering and Technology Edition), 2022, Vol. 52, Issue (11): 2662-2668. doi: 10.13229/j.cnki.jdxbgxb20211013

Multi-object dish detection algorithm based on improved YOLOv4

Xiang-jiu CHE, He-yuan CHEN

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2021-10-06 Online: 2022-11-01 Published: 2022-11-16

Abstract:

Two-stage object detection algorithms suffer from slow inference, while lightweight models perform poorly on small datasets. To address both problems, an improved multi-object detection algorithm based on YOLOv4 is proposed. Taking dish detection as an example, multi-object detection is used to detect and classify empty dishes. The similar characteristics of the plate edges are key to identifying each finished plate. To preserve these salient features, an attention mechanism and pooling methods are added. Furthermore, a lightweight network speeds up inference, and a multi-scale fusion method improves the precision of the model. Experiments show that the proposed algorithm improves on previous work by 4.25%. Compared with classic detectors such as Faster RCNN, its FPS is 8 to 9 times that of the latter. The algorithm recovers the accuracy lost by lightweight models and allows more convenient and rapid deployment on mobile or embedded devices.
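The abstract mentions pooling methods added to preserve salient features, and the figure captions contrast max pooling with soft pooling. The following is a minimal NumPy sketch of that contrast, assuming the exponentially weighted (softmax) pooling commonly called SoftPool; the function names, the non-overlapping windows, and the kernel size are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def _windows(x, k):
    """Split a 2D map (H, W) into non-overlapping k x k windows,
    returned with shape (H // k, W // k, k * k)."""
    h, w = x.shape
    x = x[: h - h % k, : w - w % k]  # drop ragged border rows/columns
    x = x.reshape(h // k, k, w // k, k).transpose(0, 2, 1, 3)
    return x.reshape(h // k, w // k, k * k)

def max_pool(x, k=2):
    """Keep only the largest activation in each window."""
    return _windows(x, k).max(axis=-1)

def soft_pool(x, k=2):
    """Softmax-weighted sum of the activations in each window."""
    win = _windows(x, k)
    # Subtracting the per-window max before exp is for numerical
    # stability only; the resulting weights are unchanged.
    e = np.exp(win - win.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)
    return (weights * win).sum(axis=-1)
```

Calling `soft_pool(feature_map)` on a 2D array returns a downsampled map in which every activation in a window contributes in proportion to its exponential weight, whereas `max_pool` discards all but the largest one.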

Key words: computer application technology, dishes detection, convolutional neural network, multiple object detection

CLC Number: TP391.4

Fig. 1 Architecture of the proposed method

Fig. 2 Dataset of empty dishes

Fig. 3 Issue of mosaic

Fig. 4 Max pooling vs soft pooling 1

Fig. 5 Max pooling vs soft pooling 2

Fig. 6 Backbone structure of CA Module

Table 1 Performance comparison on dishes dataset

Model              mAP/%    Precision/%
Faster RCNN        84.14    61.87
CenterNet          50.03    87.68
Proposed method    76.80    80.75
+CA Module         80.14    82.01
+Soft Module       81.05    83.83
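The ablation rows of Table 1 can be checked with a few lines of arithmetic. The sketch below copies the table values and computes each row's gain over the proposed-method row; it assumes the "+CA Module" and "+Soft Module" rows are measured against that row, and the dictionary and function names are illustrative.

```python
# Values copied from Table 1: model -> (mAP %, precision %)
table1 = {
    "Faster RCNN": (84.14, 61.87),
    "CenterNet": (50.03, 87.68),
    "Proposed method": (76.80, 80.75),
    "+CA Module": (80.14, 82.01),
    "+Soft Module": (81.05, 83.83),
}

def gain(row, base="Proposed method"):
    """Return (mAP gain, precision gain) of `row` over `base`,
    in percentage points, rounded to two decimals."""
    m, p = table1[row]
    bm, bp = table1[base]
    return round(m - bm, 2), round(p - bp, 2)
```

Under these assumptions, `gain("+Soft Module")` gives an mAP gain of 4.25 percentage points, which appears to match the 4.25% figure quoted in the abstract if that figure refers to mAP.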

Table 2 Lightweight comparison of different models

Model              FLOPs/B    Params/M    FPS
Faster RCNN        126.5      137.1       17.6
CenterNet          34.9       32.6        16.7
Proposed method    6.8        5.9         148.0
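The speed claim in the abstract (FPS 8 to 9 times that of classic detectors) can be reproduced from Table 2. A small sketch, with the table values copied in and hypothetical names for the dictionary and helper:

```python
# Values copied from Table 2: model -> (FLOPs in billions, params in millions, FPS)
table2 = {
    "Faster RCNN": (126.5, 137.1, 17.6),
    "CenterNet": (34.9, 32.6, 16.7),
    "Proposed method": (6.8, 5.9, 148.0),
}

def compare(baseline, model="Proposed method"):
    """Ratios of `model` against `baseline`: how many times faster it runs
    and how many times fewer FLOPs and parameters it uses."""
    b_flops, b_params, b_fps = table2[baseline]
    m_flops, m_params, m_fps = table2[model]
    return {
        "fps_speedup": m_fps / b_fps,
        "flops_reduction": b_flops / m_flops,
        "params_reduction": b_params / m_params,
    }
```

For both baselines the `fps_speedup` value falls between 8 and 9, consistent with the abstract, while the FLOPs and parameter ratios indicate how much smaller the proposed model is.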

Fig. 7 Results of detection

Fig. 8 Comparison of feature maps

Fig. 9 Comparison of heatmaps
