Journal of Jilin University (Engineering and Technology Edition), 2024, Vol. 54, Issue (8): 2329-2337. doi: 10.13229/j.cnki.jdxbgxb.20230034


Multi-scale normalized detection method for airborne wide-area remote sensing images

Sheng-jie ZHU (1,2), Xuan WANG (1), Fang XU (1), Jia-qi PENG (3), Yuan-chao WANG (4)

  1. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
  2. Daheng College, University of Chinese Academy of Sciences, Beijing 100049, China
  3. First Military Representative Office in Changchun, Changchun 130033, China
  4. Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China
  • Received: 2023-01-10 Online: 2024-08-01 Published: 2024-08-30
  • Contact: Xuan WANG E-mail: shengjie_zhu@foxmail.com; ally637@163.com

Abstract:

Object detection in airborne wide-area remote sensing images is difficult because target sizes vary widely, background noise is complex, and targets are densely packed. To address this, the paper unifies the pixel size of targets in the input image by optimizing the image segmentation method, and proposes a multi-scale normalized convolutional neural network model (MNNet). To strengthen the feature correlation between local regions, it designs a space global connection (SGC) block, which improves detection accuracy. Because the parameters of existing non-maximum suppression (NMS) algorithms depend on empirical settings, it also proposes a self-adaptive non-maximum suppression method (DNMS), which makes the model easier to deploy. Tests on the RSF dataset show that the average precision (AP) of the proposed model exceeds that of the other models by more than 5.0%, and that its detection speed reaches 57.7 fps, which meets the requirements of remote sensing image detection tasks.
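The abstract does not give the details of the segmentation procedure, so the following is only an illustrative sketch of the general idea of unifying target pixel size, not the authors' method. It assumes that a typical target size in pixels is known for an image, and picks a crop size such that resizing each crop to the network input shows targets at a chosen reference size. The function names, the `reference_px` and `net_input` parameters, and the overlap-based tiling are all hypothetical.

```python
from typing import Iterator, Tuple


def crop_size_for_scale(target_px: float, reference_px: float, net_input: int) -> int:
    """Side length of a square crop which, once resized to net_input pixels,
    shows targets of size target_px at roughly reference_px (assumed scheme)."""
    if target_px <= 0 or reference_px <= 0 or net_input <= 0:
        raise ValueError("sizes must be positive")
    scale = reference_px / target_px          # magnification applied by the resize
    return max(1, round(net_input / scale))   # crop side in source pixels


def tile_origins(width: int, height: int, crop: int,
                 overlap: float = 0.2) -> Iterator[Tuple[int, int]]:
    """Top-left corners of square crops that cover a width x height image."""
    stride = max(1, int(crop * (1.0 - overlap)))
    xs = list(range(0, max(width - crop, 0) + 1, stride))
    ys = list(range(0, max(height - crop, 0) + 1, stride))
    # Add a final tile flush with the border if the regular grid stops short.
    if xs[-1] + crop < width:
        xs.append(width - crop)
    if ys[-1] + crop < height:
        ys.append(height - crop)
    for y in ys:
        for x in xs:
            yield (x, y)


# Hypothetical numbers: 8-pixel targets, 32-pixel reference size, 640-pixel input.
crop = crop_size_for_scale(target_px=8, reference_px=32, net_input=640)
origins = list(tile_origins(width=400, height=160, crop=crop, overlap=0.0))
```

With these assumed numbers, each crop is enlarged four times before it reaches the network, so small targets are presented at the same nominal size as larger targets from images cropped with a larger window.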

Key words: pattern recognition and intelligent system, computer vision, object detection, remote sensing image, convolutional neural network

CLC Number: TP391

Fig. 1 Overall architecture of the MNNet framework

Fig. 2 Schematic diagram of the multi-scale normalization process

Fig. 3 Space global connection block (SGC block)

Fig. 4 Schematic diagram of the target size distribution (RSF dataset)

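Table 1 Parameters of neural networks for target detection

| Model | Number of anchors | Number of detection heads | Number of parameters | Model size/MB |
| --- | --- | --- | --- | --- |
| YOLOv3 | 9 | 3 | 61 949 149 | 236.32 |
| YOLOv4 | 9 | 3 | 63 943 071 | 245.53 |
| YOLOv5m | 9 | 3 | 22 229 358 | 84.80 |
| YOLOv5l | 9 | 3 | 48 384 174 | 184.57 |
| YOLOv5x | 9 | 3 | 89 671 790 | 342.07 |
| SSD300 | 9 | 3 | 23 745 908 | 90.58 |
| Faster-RCNN | 9 | 3 | 137 078 239 | 522.91 |
| MNNet | 3 | 1 | 31 443 246 | 119.95 |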
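The parameter counts and model sizes in Table 1 appear to be related by a simple rule: for most rows, the size equals the parameter count times 4 bytes, divided by 2^20. This suggests, as an inference rather than something the source states, that the weights are stored as 32-bit floats and that "MB" here means mebibytes. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights stored as 32-bit floats


def model_size_mib(num_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Approximate storage size in MiB for a model with num_params parameters."""
    return num_params * bytes_per_param / 2**20


# Parameter counts copied from Table 1.
yolov3_size = round(model_size_mib(61_949_149), 2)
mnnet_size = round(model_size_mib(31_443_246), 2)
```

The YOLOv4 row is an exception: the same rule gives roughly 243.9 rather than the listed 245.53, so that file presumably contains something beyond the counted parameters.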

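Table 2 Comparison of different NMS methods on RSF dataset (AP@0.75)

| Threshold | Greedy NMS/% | Soft NMS/% | Softer NMS/% | DNMS/% |
| --- | --- | --- | --- | --- |
| 0.25 | 60.65 | 61.90 | 61.30 | 66.37 |
| 0.35 | 62.70 | 63.50 | 63.60 | |
| 0.45 | 64.10 | 65.50 | 65.60 | |
| 0.55 | 64.30 | 65.40 | 66.00 | |
| 0.65 | 64.11 | 65.55 | 66.25 | |
| 0.75 | 62.95 | 64.70 | 64.00 | |
| 0.85 | 61.39 | 62.00 | 62.70 | |

The DNMS column contains a single value, listed in the first row.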
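For context on the threshold column in Table 2, the following is a minimal sketch of the standard greedy NMS baseline, whose result depends on a hand-picked IoU threshold. It is not the paper's DNMS, whose details are not given here. The sketch assumes axis-aligned boxes in `(x1, y1, x2, y2)` form; the function names are illustrative.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept_strict = greedy_nms(boxes, scores, iou_threshold=0.45)
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.85)
```

The first two boxes overlap with an IoU of about 0.82, so the lower threshold suppresses the second box while the higher one keeps it. In dense scenes this sensitivity is what makes the threshold hard to set by hand.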

Table 3 Comparison of different network methods on RSF dataset

| Model | Backbone | Precision/% | Recall/% | F1 | AP@0.50 | AP@0.50:0.95 | Frame rate/fps |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HOG+SVM | / | 6.52 | 21.19 | 0.099 7 | / | / | 1.3 |
| SSD300 | VGG-16 | 25.55 | 47.34 | 0.331 8 | 0.294 6 | 0.124 5 | 45.5 |
| R-CenterNet | Hourglass | / | / | / | 0.464 0 | 0.202 1 | 50.2 |
| RRPN [23] | VGG-16 | 48.93 | 54.72 | 0.516 6 | 0.581 3 | 0.247 2 | 39.4 |
| SCRDet [24] | VGG-16 | 50.37 | 58.42 | 0.541 0 | 0.652 7 | 0.253 7 | 27.6 |
| YOLT | Darknet19 | 22.94 | 61.23 | 0.333 8 | 0.502 2 | 0.168 1 | 52.3 |
| R-YOLOv3 | Darknet53 | 21.18 | 68.65 | 0.323 7 | 0.533 4 | 0.203 4 | 51.3 |
| R-YOLOv4 | CSPDarknet53 | 39.72 | 79.25 | 0.529 1 | 0.653 1 | 0.253 8 | 56.4 |
| R-YOLOv5s | CSPDarknet53 | 34.28 | 80.96 | 0.481 7 | 0.659 9 | 0.272 0 | 71.4 |
| R-YOLOv5m | CSPDarknet53 | 36.82 | 79.68 | 0.503 6 | 0.635 6 | 0.283 6 | 62.1 |
| R-YOLOv5l | CSPDarknet53 | 40.11 | 73.21 | 0.518 2 | 0.603 3 | 0.235 4 | 48.5 |
| MNNet (without SGC) | CSPDarknet53 | 51.34 | 74.12 | 0.606 6 | 0.653 6 | 0.293 4 | 68.9 |
| MNNet | CSPDarknet53 | 59.79 | 71.17 | 0.649 8 | 0.717 9 | 0.341 2 | 57.7 |
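The F1 column in Table 3 is consistent with the usual harmonic mean of precision and recall. A short sketch of that check, assuming the standard definition F1 = 2PR / (P + R) with the percentages converted to fractions (the function name is illustrative):

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    p = precision_pct / 100.0
    r = recall_pct / 100.0
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


# Precision and recall copied from the MNNet rows of Table 3.
f1_mnnet = f1_score(59.79, 71.17)
f1_mnnet_no_sgc = f1_score(51.34, 74.12)
```

Recomputed values agree with the table to about three decimal places. The remaining differences in the fourth decimal are presumably due to rounding of the published precision and recall.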

Fig. 5 Curve of the loss value (Ltotal) on the RSF validation dataset

Fig. 6 Detection results of the MNNet model (ITCVD dataset, RSF dataset, DOTA dataset)

1 Cheng Li-bo, Chen Peng-yu, Li Zhe, Jia Xiao-ning. Remote sensing image denoising based on shearlet transform and goodness of fit test[J]. Journal of Jilin University (Science Edition), 2023, 61(5): 1187-1194.
2 Cheng Li-bo, Dong Lun, Li Zhe, et al. Remote sensing image deblurring method based on NSST and sparse prior[J]. Journal of Jilin University (Science Edition), 2024, 62(1): 106-115.
3 Viola P, Jones M J. Robust real-time face detection[J]. International Journal of Computer Vision, 2004, 57: 137-154.
4 Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886-893.
5 Girshick R B, Felzenszwalb P F, Mcallester D. Object detection with grammar models[C]//Neural Information Processing Systems, Granada, Spain, 2011: 442-450.
6 Dai J F, Li Y, He K M, et al. R-FCN: object detection via region-based fully convolutional networks[C]//Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016: 379-387.
7 Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
8 Gao Ming-hua, Yang Can. Traffic target detection method based on improved convolution neural network[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(6): 1353-1361.
9 Qu You, Li Wen-hui. Single-stage rotated object detection network based on anchor transformation[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(1): 162-173.
10 Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers[C]//16th European Conference on Computer Vision (ECCV), Glasgow, UK, 2020: 213-229.
11 Yang M Y, Liao W T, Li X B, et al. Vehicle detection in aerial images[J]. Photogrammetric Engineering and Remote Sensing, 2019, 85: 297-304.
12 Xia G S, Bai X, Ding J, et al. DOTA: a large-scale dataset for object detection in aerial images[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, 2018: 3974-3983.
13 Van Etten A. You only look twice: rapid multi-scale object detection in satellite imagery[DB/OL]. [2022-12-22].
14 Chen Sen, Xu Wei-feng, Wang Hong-tao, et al. Wheat ear detection algorithm based on improved YOLOv7[J]. Journal of Jilin University (Science Edition), 2024, 62(4): 886-894.
15 Huang Jian, Xu Wei-feng, Su Pan, et al. Car window state recognition algorithm based on YOLOX-S[J]. Journal of Jilin University (Science Edition), 2023, 61(4): 875-882.
16 Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//The 14th European Conference on Computer Vision (ECCV), Amsterdam, Netherlands, 2016: 21-37.
17 He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 2980-2988.
18 Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. Photogrammetric Engineering and Remote Sensing, 2020, 159: 296-307.
19 Lu X Q, Zhang Y L, Yuan Y, et al. Gated and axis-concentrated localization network for remote sensing object detection[J]. IEEE Transactions on Remote Sensing, 2020, 58: 179-192.
20 Zou Z X, Shi Z W. Random access memories: a new paradigm for target detection in high resolution aerial remote sensing images[J]. IEEE Transactions on Image Processing, 2018, 27(3): 1100-1111.
21 Silverman B W. Density estimation for statistics and data analysis[M]. London: Chapman and Hall/CRC, 2018.
22 Zheng Z H, Wang P, Ren D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574-8586.
23 Ma J Q, Shao W Y, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122.
24 Yang X, Yang J R, Yan J C, et al. SCRDet: towards more robust detection for small, cluttered and rotated objects[C]//IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, 2019: 8231-8240.