Journal of Jilin University (Engineering and Technology Edition) ›› 2024, Vol. 54 ›› Issue (8): 2329-2337. doi: 10.13229/j.cnki.jdxbgxb.20230034

• Computer Science and Technology •

Multi-scale normalized detection method for airborne wide-area remote sensing images

Sheng-jie ZHU1,2, Xuan WANG1, Fang XU1, Jia-qi PENG3, Yuan-chao WANG4

  1. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
  2. Daheng College, University of Chinese Academy of Sciences, Beijing 100049, China
  3. First Military Representative Office in Changchun, Changchun 130033, China
  4. Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China
  • Received: 2023-01-10 Online: 2024-08-01 Published: 2024-08-30
  • Contact: Xuan WANG E-mail: shengjie_zhu@foxmail.com; ally637@163.com
  • About the author: Sheng-jie ZHU (1997-), male, doctoral candidate. Research interests: aerospace imaging, computer vision. E-mail: shengjie_zhu@foxmail.com
  • Funding:
    National Natural Science Foundation of China (61905240)

Abstract:

Object detection in airborne wide-area remote sensing images is difficult because target sizes vary widely, background noise is complex, and targets are locally dense. To address these problems, an optimized image segmentation method is used to unify the pixel size of targets in the input images, which in turn allows the model structure to be simplified, and a multi-scale normalized convolutional neural network model (MNNet) is proposed. To strengthen the feature correlation between local regions, a spatial global connection block (SGC) is designed, which improves detection accuracy. Because the hyperparameters of existing non-maximum suppression (NMS) algorithms are set empirically, a self-adaptive non-maximum suppression method (DNMS) is proposed, which makes the model easier to deploy. Test results on the RSF dataset show that the average precision (AP) of the proposed model is more than 5.0% higher than that of the other models and that the detection speed reaches 57.7 frames/s, which meets the requirements of remote sensing image detection tasks.

Key words: pattern recognition and intelligent system, computer vision, object detection, remote sensing image, convolutional neural network

CLC number:

  • TP391

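Fig. 1 Overall architecture of the MNNet framework

Fig. 2 Schematic of the target scale normalization process

Fig. 3 Structure of the spatial global connection block (SGC)

Fig. 4 Target size distribution (RSF dataset)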

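Table 1 Comparison of parameter counts of detection models

| Model | Anchors | Detection heads | Parameters | Model size/MB |
|---|---|---|---|---|
| YOLOv3 | 9 | 3 | 61,949,149 | 236.32 |
| YOLOv4 | 9 | 3 | 63,943,071 | 245.53 |
| YOLOv5m | 9 | 3 | 22,229,358 | 84.80 |
| YOLOv5l | 9 | 3 | 48,384,174 | 184.57 |
| YOLOv5x | 9 | 3 | 89,671,790 | 342.07 |
| SSD300 | 9 | 3 | 23,745,908 | 90.58 |
| Faster-RCNN | 9 | 3 | 137,078,239 | 522.91 |
| MNNet | 3 | 1 | 31,443,246 | 119.95 |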
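The model sizes in Table 1 appear consistent with storing each parameter as a 4-byte (float32) value and reporting the size in units of 2^20 bytes. This is an inference from the numbers, not something the source states. A minimal sketch under that assumption (the function name `model_size_mib` is illustrative):

```python
def model_size_mib(num_params, bytes_per_param=4):
    """Estimate model size, assuming float32 weights and 1 MB = 2**20 bytes."""
    return num_params * bytes_per_param / (1024 ** 2)


# Parameter counts and reported sizes copied from Table 1.
table_1 = {
    "YOLOv3": (61_949_149, 236.32),
    "YOLOv5m": (22_229_358, 84.80),
    "Faster-RCNN": (137_078_239, 522.91),
    "MNNet": (31_443_246, 119.95),
}

for name, (params, reported) in table_1.items():
    estimated = model_size_mib(params)
    print(f"{name}: estimated {estimated:.2f} MB, reported {reported:.2f} MB")
```

Under this assumption the estimates agree with the reported sizes to within 0.01 MB for the rows checked here.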

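Table 2 AP@0.75 comparison of NMS algorithms (RSF dataset)

| Threshold | Greedy NMS/% | Soft NMS/% | Softer NMS/% | DNMS/% |
|---|---|---|---|---|
| 0.25 | 60.65 | 61.90 | 61.30 | 66.37 |
| 0.35 | 62.70 | 63.50 | 63.60 | |
| 0.45 | 64.10 | 65.50 | 65.60 | |
| 0.55 | 64.30 | 65.40 | 66.00 | |
| 0.65 | 64.11 | 65.55 | 66.25 | |
| 0.75 | 62.95 | 64.70 | 64.00 | |
| 0.85 | 61.39 | 62.00 | 62.70 | |

DNMS is reported as a single value, 66.37%, which is not tied to a threshold row.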
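For context on the threshold column of Table 2, the sketch below shows plain greedy NMS, the baseline whose hand-set threshold DNMS is meant to avoid. It is not the authors' DNMS. It assumes the threshold is an IoU threshold and uses axis-aligned boxes `(x1, y1, x2, y2)` for simplicity; the function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, visiting boxes from highest to lowest score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, iou_threshold=0.45))  # [0, 2]
print(greedy_nms(boxes, scores, iou_threshold=0.75))  # [0, 1, 2]
```

The first two boxes overlap with an IoU of about 0.68, so the second is suppressed at a threshold of 0.45 but kept at 0.75. This sensitivity to a manually chosen value is what the threshold sweep in Table 2 measures.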

Table 3 Comparison of detection performance of different models (RSF dataset)

| Model | Backbone | Precision/% | Recall/% | F1 | AP@0.50 | AP@0.50:0.95 | Frame rate/fps |
|---|---|---|---|---|---|---|---|
| HOG+SVM | / | 6.52 | 21.19 | 0.0997 | / | / | 1.3 |
| SSD300 | VGG-16 | 25.55 | 47.34 | 0.3318 | 0.2946 | 0.1245 | 45.5 |
| R-CenterNet | Hourglass | / | / | / | 0.4640 | 0.2021 | 50.2 |
| RRPN [23] | VGG-16 | 48.93 | 54.72 | 0.5166 | 0.5813 | 0.2472 | 39.4 |
| SCRDet [24] | VGG-16 | 50.37 | 58.42 | 0.5410 | 0.6527 | 0.2537 | 27.6 |
| YOLT | Darknet19 | 22.94 | 61.23 | 0.3338 | 0.5022 | 0.1681 | 52.3 |
| R-YOLOv3 | Darknet53 | 21.18 | 68.65 | 0.3237 | 0.5334 | 0.2034 | 51.3 |
| R-YOLOv4 | CSPDarknet53 | 39.72 | 79.25 | 0.5291 | 0.6531 | 0.2538 | 56.4 |
| R-YOLOv5s | CSPDarknet53 | 34.28 | 80.96 | 0.4817 | 0.6599 | 0.2720 | 71.4 |
| R-YOLOv5m | CSPDarknet53 | 36.82 | 79.68 | 0.5036 | 0.6356 | 0.2836 | 62.1 |
| R-YOLOv5l | CSPDarknet53 | 40.11 | 73.21 | 0.5182 | 0.6033 | 0.2354 | 48.5 |
| MNNet (without SGC) | CSPDarknet53 | 51.34 | 74.12 | 0.6066 | 0.6536 | 0.2934 | 68.9 |
| MNNet | CSPDarknet53 | 59.79 | 71.17 | 0.6498 | 0.7179 | 0.3412 | 57.7 |
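The F1 column of Table 3 can be checked against the precision and recall columns using the standard definition F1 = 2PR / (P + R). The sketch below assumes that definition (the source does not spell it out) and recomputes a few rows; `f1_score` is an illustrative name.

```python
def f1_score(precision_pct, recall_pct):
    """Harmonic mean of precision and recall, both given in percent."""
    p, r = precision_pct / 100.0, recall_pct / 100.0
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# (precision/%, recall/%, reported F1) copied from Table 3.
rows = {
    "HOG+SVM": (6.52, 21.19, 0.0997),
    "SSD300": (25.55, 47.34, 0.3318),
    "MNNet (without SGC)": (51.34, 74.12, 0.6066),
    "MNNet": (59.79, 71.17, 0.6498),
}

for name, (precision, recall, reported) in rows.items():
    recomputed = f1_score(precision, recall)
    print(f"{name}: recomputed {recomputed:.4f}, reported {reported:.4f}")
```

Because the table lists rounded precision and recall, the recomputed values can differ from the reported ones in the last digit.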

Fig. 5 Validation loss (L_total) curve (RSF validation set)

Fig. 6 Detection results of the MNNet model (ITCVD, RSF, and DOTA datasets)

References:

[1] Cheng Li-bo, Chen Peng-yu, Li Zhe, Jia Xiao-ning. Remote sensing image denoising based on shearlet transform and goodness of fit test[J]. Journal of Jilin University (Science Edition), 2023, 61(5): 1187-1194.
[2] Cheng Li-bo, Dong Lun, Li Zhe, et al. Remote sensing image deblurring method based on NSST and sparse prior[J]. Journal of Jilin University (Science Edition), 2024, 62(1): 106-115.
[3] Viola P, Jones M J. Robust real-time face detection[J]. International Journal of Computer Vision, 2004, 57: 137-154.
[4] Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886-893.
[5] Girshick R B, Felzenszwalb P F, McAllester D. Object detection with grammar models[C]//Neural Information Processing Systems, Granada, Spain, 2011: 442-450.
[6] Dai J F, Li Y, He K M, et al. R-FCN: object detection via region-based fully convolutional networks[C]//Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016: 379-387.
[7] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[8] Gao Ming-hua, Yang Can. Traffic target detection method based on improved convolution neural network[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(6): 1353-1361.
[9] Qu You, Li Wen-hui. Single-stage rotated object detection network based on anchor transformation[J]. Journal of Jilin University (Engineering and Technology Edition), 2022, 52(1): 162-173.
[10] Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers[C]//16th European Conference on Computer Vision (ECCV), Glasgow, UK, 2020: 213-229.
[11] Yang M Y, Liao W T, Li X B, et al. Vehicle detection in aerial images[J]. Photogrammetric Engineering and Remote Sensing, 2019, 85: 297-304.
[12] Xia G S, Bai X, Ding J, et al. DOTA: a large-scale dataset for object detection in aerial images[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, 2018: 3974-3983.
[13] Van Etten A. You only look twice: rapid multi-scale object detection in satellite imagery[DB/OL]. [2022-12-22].
[14] Chen Sen, Xu Wei-feng, Wang Hong-tao, et al. Wheat ear detection algorithm based on improved YOLOv7[J]. Journal of Jilin University (Science Edition), 2024, 62(4): 886-894.
[15] Huang Jian, Xu Wei-feng, Su Pan, et al. Car window state recognition algorithm based on YOLOX-S[J]. Journal of Jilin University (Science Edition), 2023, 61(4): 875-882.
[16] Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//The 14th European Conference on Computer Vision (ECCV), Amsterdam, Netherlands, 2016: 21-37.
[17] He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 2980-2988.
[18] Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307.
[19] Lu X Q, Zhang Y L, Yuan Y, et al. Gated and axis-concentrated localization network for remote sensing object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58: 179-192.
[20] Zou Z X, Shi Z W. Random access memories: a new paradigm for target detection in high resolution aerial remote sensing images[J]. IEEE Transactions on Image Processing, 2018, 27(3): 1100-1111.
[21] Silverman B W. Density estimation for statistics and data analysis[M]. London: Chapman and Hall/CRC, 2018.
[22] Zheng Z H, Wang P, Ren D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574-8586.
[23] Ma J Q, Shao W Y, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122.
[24] Yang X, Yang J R, Yan J C, et al. SCRDet: towards more robust detection for small, cluttered and rotated objects[C]//IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 2019: 8231-8240.