Multi-scale normalized detection method for airborne wide-area remote sensing images

doi:10.13229/j.cnki.jdxbgxb.20230034

Abstract

Object detection in airborne wide-area remote sensing images is difficult because target sizes vary widely, background noise is complex, and targets are densely packed. To address this, this paper unifies the target pixel size of the input images by optimizing the segmentation method and proposes a multi-scale normalized convolutional neural network model (MNNet). To strengthen the feature correlation between local regions, a space global connection (SGC) block is designed, which effectively improves detection accuracy. Because the parameters of existing non-maximum suppression (NMS) algorithms depend on empirical settings, a self-adaptive non-maximum suppression method (DNMS) is proposed, which reduces the difficulty of model deployment. Test results on the RSF dataset show that the average precision (AP) of the proposed model exceeds that of the other models by more than 5.0%, and the detection speed reaches 57.7 fps, which meets the requirements of remote sensing image detection tasks.
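The abstract does not spell out how the segmentation step unifies target pixel size. As a rough illustration only, the sketch below reads "segmentation" as cutting the wide-area frame into square tiles and assumes that each frame comes with a known ground sampling distance (GSD). The names `crop_side_for_gsd` and `tile_origins`, the reference GSD of 0.2 m/pixel, the 640-pixel network input and the overlap ratio are all hypothetical and are not taken from the paper.

```python
def crop_side_for_gsd(gsd_m_per_px, ref_gsd_m_per_px=0.2, net_input_px=640):
    """Side length, in source pixels, of a square crop that shows the scene at
    ref_gsd_m_per_px once it is resized to net_input_px (assumed values)."""
    # Ground footprint the network should see along one side, in metres.
    footprint_m = net_input_px * ref_gsd_m_per_px
    return max(1, round(footprint_m / gsd_m_per_px))


def tile_origins(length_px, side_px, overlap=0.2):
    """Start offsets of tiles of width side_px covering [0, length_px)."""
    if side_px >= length_px:
        return [0]
    stride = max(1, int(side_px * (1.0 - overlap)))
    origins = list(range(0, length_px - side_px, stride))
    origins.append(length_px - side_px)  # last tile sits flush with the border
    return origins


if __name__ == "__main__":
    # A finer GSD (0.1 m/pixel) needs a larger crop than a coarser one (0.4 m/pixel)
    # so that objects end up with a similar pixel size after resizing.
    for gsd in (0.1, 0.4):
        print(gsd, crop_side_for_gsd(gsd))
    print(tile_origins(1000, 400, overlap=0.25))
```

Under these assumptions, an object of fixed physical size occupies roughly the same number of pixels in every resized tile, which is one simple way to keep target scale consistent at the network input.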

Key words: pattern recognition and intelligent system, computer vision, object detection, remote sensing image, convolutional neural network

CLC Number: TP391

Sheng-jie ZHU,Xuan WANG,Fang XU,Jia-qi PENG,Yuan-chao WANG. Multi-scale normalized detection method for airborne wide-area remote sensing images[J].Journal of Jilin University(Engineering and Technology Edition), 2024, 54(8): 2329-2337.

References 24

1	Cheng Li-bo, Chen Peng-yu, Li Zhe, Jia Xiao-ning. Remote sensing image denoising based on shearlet transform and goodness of fit test[J]. Journal of Jilin University (Science Edition), 2023, 61(5): 1187-1194.
2	Cheng Li-bo, Dong Lun, Li Zhe, et al. Remote sensing image deblurring method based on NSST and sparse prior[J]. Journal of Jilin University (Science Edition), 2024, 62(1): 106-115.
3	Viola P, Jones M J. Robust real-time face detection[J]. International Journal of Computer Vision, 2004, 57: 137-154.
4	Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886-893.
5	Girshick R B, Felzenszwalb P F, Mcallester D. Object detection with grammar models[C]//Neural Information Processing Systems, Granada, Spain, 2011: 442-450.
6	Dai J F, Li Y, He K M, et al. R-FCN: object detection via region-based fully convolutional networks[C]//Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016: 379-387.
7	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
8	Gao Ming-hua, Yang Can. Traffic target detection method based on improved convolution neural network[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(6): 1353-1361.
9	Qu You, Li Wen-hui. Single-stage rotated object detection network based on anchor transformation[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(1): 162-173.
10	Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers[C]//16th European Conference on Computer Vision (ECCV), Glasgow, UK, 2020: 213-229.
11	Yang M Y, Liao W T, Li X B, et al. Vehicle detection in aerial images[J]. Photogrammetric Engineering and Remote Sensing, 2019, 85: 297-304.
12	Xia G S, Bai X, Ding J, et al. DOTA: a large-scale dataset for object detection in aerial images[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, 2018: 3974-3983.
13	Van Etten A. You only look twice: rapid multi-scale object detection in satellite imagery[DB/OL]. [2022-12-22].
14	Chen Sen, Xu Wei-feng, Wang Hong-tao, et al. Wheat ear detection algorithm based on improved YOLOv7[J]. Journal of Jilin University (Science Edition), 2024, 62(4): 886-894.
15	Huang Jian, Xu Wei-feng, Su Pan, et al. Car window state recognition algorithm based on YOLOX-S[J]. Journal of Jilin University (Science Edition), 2023, 61(4): 875-882.
16	Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//The 14th European Conference on Computer Vision (ECCV), Amsterdam, Netherlands, 2016: 21-37.
17	He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 2980-2988.
18	Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307.
19	Lu X Q, Zhang Y L, Yuan Y, et al. Gated and axis-concentrated localization network for remote sensing object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58: 179-192.
20	Zou Z X, Shi Z W. Random access memories: a new paradigm for target detection in high resolution aerial remote sensing images[J]. IEEE Transactions on Image Processing, 2018, 27(3): 1100-1111.
21	Silverman B W. Density estimation for statistics and data analysis[M]. London: Chapman and Hall/CRC, 2018.
22	Zheng Z H, Wang P, Ren D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574-8586.
23	Ma J Q, Shao W Y, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122.
24	Yang X, Yang J R, Yan J C, et al. SCRDet: towards more robust detection for small, cluttered and rotated objects[C]//IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, 2019: 8231-8240.

Model	Number of anchors	Number of detection heads	Number of parameters	Model size/MB
YOLOv3	9	3	61 949 149	236.32
YOLOv4	9	3	63 943 071	245.53
YOLOv5m	9	3	22 229 358	84.80
YOLOv5l	9	3	48 384 174	184.57
YOLOv5x	9	3	89 671 790	342.07
SSD300	9	3	23 745 908	90.58
Faster-RCNN	9	3	137 078 239	522.91
MNNet	3	1	31 443 246	119.95
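The model-size column appears to follow directly from the parameter count. The short check below assumes that every parameter is stored as a 32-bit float (4 bytes) and that "MB" means 2^20 bytes; neither assumption is stated in the source, and the helper name `float32_size_mib` is illustrative.

```python
def float32_size_mib(num_params):
    """Size of num_params float32 values, in units of 2**20 bytes."""
    return num_params * 4 / 2**20


# (parameter count, model size in MB) pairs copied from the table above.
TABLE = {
    "YOLOv3": (61_949_149, 236.32),
    "YOLOv4": (63_943_071, 245.53),
    "YOLOv5m": (22_229_358, 84.80),
    "YOLOv5l": (48_384_174, 184.57),
    "YOLOv5x": (89_671_790, 342.07),
    "SSD300": (23_745_908, 90.58),
    "Faster-RCNN": (137_078_239, 522.91),
    "MNNet": (31_443_246, 119.95),
}

if __name__ == "__main__":
    for name, (params, listed_mb) in TABLE.items():
        estimate = float32_size_mib(params)
        print(f"{name:12s} listed={listed_mb:7.2f} estimate={estimate:7.2f}")
```

Under these assumptions the estimate matches the listed size to within 0.01 MB for every row except YOLOv4, where the two differ by about 1.6 MB.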

Threshold	Greedy NMS AP@0.75/%	Soft NMS AP@0.75/%	Softer NMS AP@0.75/%	DNMS AP@0.75/%
0.25	60.65	61.90	61.30	66.37
0.35	62.70	63.50	63.60
0.45	64.10	65.50	65.60
0.55	64.30	65.40	66.00
0.65	64.11	65.55	66.25
0.75	62.95	64.70	64.00
0.85	61.39	62.00	62.70
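For orientation, the baseline in the first column can be sketched as follows. This is textbook greedy NMS, not the paper's DNMS, and it assumes axis-aligned boxes given as `(x1, y1, x2, y2)` and that the swept threshold is an IoU threshold; the detectors compared here may well use rotated boxes, which this sketch does not handle. DNMS is listed with a single value, presumably because it has no fixed threshold to sweep.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept,
        # higher-scoring box by more than the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The first two boxes overlap with IoU = 81 / 119, about 0.68.
    print(greedy_nms(boxes, scores, 0.50))
    print(greedy_nms(boxes, scores, 0.75))
```

With two heavily overlapping neighbours, the lower threshold suppresses the second box while the higher one keeps it, which illustrates why a hand-tuned threshold matters in dense scenes.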

Model	Backbone	Precision/%	Recall/%	F₁ score	AP@0.50	AP@0.50:0.95	Frame rate/(frames·s⁻¹)
HOG+SVM	/	6.52	21.19	0.0997	/	/	1.3
SSD300	VGG-16	25.55	47.34	0.3318	0.2946	0.1245	45.5
R-CenterNet	Hourglass	/	/	/	0.4640	0.2021	50.2
RRPN[23]	VGG-16	48.93	54.72	0.5166	0.5813	0.2472	39.4
SCRDet[24]	VGG-16	50.37	58.42	0.5410	0.6527	0.2537	27.6
YOLT	Darknet19	22.94	61.23	0.3338	0.5022	0.1681	52.3
R-YOLOv3	Darknet53	21.18	68.65	0.3237	0.5334	0.2034	51.3
R-YOLOv4	CSPDarknet53	39.72	79.25	0.5291	0.6531	0.2538	56.4
R-YOLOv5s	CSPDarknet53	34.28	80.96	0.4817	0.6599	0.2720	71.4
R-YOLOv5m	CSPDarknet53	36.82	79.68	0.5036	0.6356	0.2836	62.1
R-YOLOv5l	CSPDarknet53	40.11	73.21	0.5182	0.6033	0.2354	48.5
MNNet (without SGC)	CSPDarknet53	51.34	74.12	0.6066	0.6536	0.2934	68.9
MNNet	CSPDarknet53	59.79	71.17	0.6498	0.7179	0.3412	57.7
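The F₁ column can be cross-checked against the precision and recall columns with the standard definition F₁ = 2PR/(P + R). The snippet below is only such a sanity check; the helper name `f1_score` is illustrative, and the tolerance allows for the rounding or truncation used in the table.

```python
def f1_score(precision_pct, recall_pct):
    """Harmonic mean of precision and recall, both given in percent."""
    p, r = precision_pct / 100.0, recall_pct / 100.0
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# (precision/%, recall/%, listed F1) copied from four rows of the table above.
ROWS = {
    "HOG+SVM": (6.52, 21.19, 0.0997),
    "SCRDet": (50.37, 58.42, 0.5410),
    "R-YOLOv4": (39.72, 79.25, 0.5291),
    "MNNet": (59.79, 71.17, 0.6498),
}

if __name__ == "__main__":
    for name, (p, r, listed) in ROWS.items():
        print(f"{name:10s} listed={listed:.4f} recomputed={f1_score(p, r):.4f}")
```

The recomputed values agree with the listed ones to within about 0.0001, so the table is consistent with this definition.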
