Journal of Jilin University (Engineering and Technology Edition), 2025, Vol. 55, Issue (8): 2681-2692. doi: 10.13229/j.cnki.jdxbgxb.20231299

• Computer Science and Technology •

Small target semantic segmentation method based on MFF-STDC network in complex outdoor environments

Qing-lin AI, Yuan-xiao LIU, Jia-hao YANG

  1. Key Laboratory of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education and Zhejiang Province, Zhejiang University of Technology, Hangzhou 310023, China
  • Received: 2023-11-24 Online: 2025-08-01 Published: 2025-11-14
  • Contact: Qing-lin AI E-mail: aqlaql@163.com
  • Supported by:
    National Natural Science Foundation of China (52075488); Natural Science Foundation of Zhejiang Province (LY20E050023)

Abstract:

To address the poor segmentation of small target category objects by lightweight networks in complex environments, this paper builds the MFF-STDC network model based on multi-level feature fusion. First, feature extraction modules based on group convolution are stacked multiple times to improve the feature extraction capability of the network. Second, a hierarchical weight attention optimization module and the coordinate attention (CA) mechanism improve the fusion of multi-scale feature information. Finally, the A-Cityscapes, A-IDD, and Field datasets are built with an adaptive replication algorithm, which increases the number of small target category objects in the datasets, and training and testing are completed on them. Compared with STDC, the MFF-STDC network improves mIoU by 4.01%, 3.65%, and 2.94% on the A-Cityscapes, A-IDD, and Field datasets, respectively, and its segmentation of small target categories in complex environments is much better than that of the other networks tested. A real-scene test platform is also built, and the test results show that the MFF-STDC network effectively improves the semantic segmentation accuracy and classification ability for small target categories while meeting real-time requirements.

Key words: computer applications, small target category detection, multi-level feature fusion, coordinate attention mechanism, adaptive replication algorithm

CLC number: TP391

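Fig. 1  Overall structure of the MFF-STDC network

Fig. 2  Structure of the DLPRM

Fig. 3  Structure of the HAM

Fig. 4  Structure of the MIC module

Fig. 5  Improved CA-FFM

Fig. 6  Projection of a spatial target onto a plane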

Fig. 7  Copy-based data augmentation within the same image

Fig. 8  Copy-based data augmentation across images
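
The same-image and cross-image copying shown in Figs. 7 and 8 can be illustrated with a plain copy-paste step on an image and its label map. This is only a minimal sketch: the adaptive part of the authors' replication algorithm (how paste positions and scales are chosen) is not described on this page, so the fixed offset `(dy, dx)`, the function name `copy_paste`, and the list-of-lists image layout are all assumptions made for illustration.

```python
def copy_paste(src_img, src_lbl, dst_img, dst_lbl, class_id, dy=0, dx=0):
    """Paste all pixels labelled class_id from the source pair into copies
    of the destination pair, shifted by (dy, dx). Hypothetical helper."""
    out_img = [row[:] for row in dst_img]   # copy so inputs stay unchanged
    out_lbl = [row[:] for row in dst_lbl]
    h, w = len(out_lbl), len(out_lbl[0])
    for y, row in enumerate(src_lbl):
        for x, cls in enumerate(row):
            ty, tx = y + dy, x + dx
            # Only copy target-class pixels that land inside the destination.
            if cls == class_id and 0 <= ty < h and 0 <= tx < w:
                out_img[ty][tx] = src_img[y][x]
                out_lbl[ty][tx] = class_id
    return out_img, out_lbl


# Toy 3x3 data: class 7 stands in for a small target category.
src_img = [[10, 10, 10], [10, 99, 10], [10, 10, 10]]
src_lbl = [[0, 0, 0], [0, 7, 0], [0, 0, 0]]
dst_img = [[1, 1, 1] for _ in range(3)]
dst_lbl = [[0, 0, 0] for _ in range(3)]

# Cross-image copy: the target from src appears in dst at an offset.
cross_img, cross_lbl = copy_paste(src_img, src_lbl, dst_img, dst_lbl, 7, dy=1, dx=1)

# Same-image copy: the target is duplicated inside its own image.
same_img, same_lbl = copy_paste(src_img, src_lbl, src_img, src_lbl, 7, dy=-1, dx=0)
same_count = sum(row.count(7) for row in same_lbl)
```

In the same-image case the label map ends up with two pixels of class 7 instead of one, which is the sense in which copying raises the share of small-target pixels in the training data.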

Fig. 9  Sample images and annotations from the field terrain dataset

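Table 1  Confusion matrix of classification results and its parameters

| Actual class | Predicted positive | Predicted negative |
|---|---|---|
| Positive | TP (true positive) | FN (false negative) |
| Negative | FP (false positive) | TN (true negative) |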
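
The confusion-matrix entries TP, FN, FP, and TN are the usual starting point for the metrics reported in the tables. The sketch below uses the standard definitions, IoU = TP / (TP + FP + FN) per class, mIoU as the mean over classes, and pixel accuracy as the fraction of correctly labelled pixels. It assumes that `picAcc` in Table 3 denotes pixel accuracy and that the paper uses these standard formulas; the function names and the toy labels are illustrative only.

```python
def confusion_counts(y_true, y_pred, cls):
    """Return (TP, FN, FP, TN) for one class, treating it as the positive class."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == cls and p == cls:
            tp += 1
        elif t == cls:
            fn += 1
        elif p == cls:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def class_iou(y_true, y_pred, cls):
    """Intersection over union for one class: TP / (TP + FP + FN)."""
    tp, fn, fp, _ = confusion_counts(y_true, y_pred, cls)
    denom = tp + fp + fn
    return tp / denom if denom else float("nan")


def mean_iou(y_true, y_pred, classes):
    """Unweighted mean of the per-class IoU values."""
    return sum(class_iou(y_true, y_pred, c) for c in classes) / len(classes)


def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Flattened toy label maps with three classes (made-up values).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

counts_class0 = confusion_counts(y_true, y_pred, 0)
miou = mean_iou(y_true, y_pred, [0, 1, 2])
acc = pixel_accuracy(y_true, y_pred)
```

Because mIoU averages over classes rather than pixels, a class that covers few pixels counts as much as a large one, which is why it is a sensitive measure for small target categories.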

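Table 2  mIoU of different networks tested on different datasets

| Model | A-Cityscapes mIoU/% | A-IDD mIoU/% | Field mIoU/% |
|---|---|---|---|
| SegNet | 44.36 | 30.72 | 29.49 |
| ENet | 57.93 | 43.12 | 38.64 |
| BiSeNet | 69.23 | 54.31 | 44.91 |
| DeepLabV3+(MV2) | 72.14 | 57.71 | 47.79 |
| Segformer | 71.32 | 56.13 | 47.12 |
| STDC | 71.85 | 56.34 | 46.84 |
| Proposed network (MFF-STDC) | 75.86 | 59.99 | 49.78 |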
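
The mIoU gains quoted in the abstract follow directly from the STDC and MFF-STDC rows of Table 2. The short check below recomputes them as absolute differences in mIoU; the dictionary layout and variable names are just one convenient way to hold the table values.

```python
# mIoU/% values copied from the STDC and MFF-STDC rows of Table 2.
miou_table = {
    "STDC": {"A-Cityscapes": 71.85, "A-IDD": 56.34, "Field": 46.84},
    "MFF-STDC": {"A-Cityscapes": 75.86, "A-IDD": 59.99, "Field": 49.78},
}

# Absolute gain of MFF-STDC over STDC on each dataset, rounded to 2 decimals.
gains = {
    dataset: round(miou_table["MFF-STDC"][dataset] - baseline, 2)
    for dataset, baseline in miou_table["STDC"].items()
}

print(gains)  # {'A-Cityscapes': 4.01, 'A-IDD': 3.65, 'Field': 2.94}
```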

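Fig. 10  Prediction results of different networks on the A-Cityscapes dataset

Fig. 11  Prediction results of different networks on the A-IDD dataset

Fig. 12  Prediction results of different networks on the Field dataset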

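Table 3  Accuracy and model size of different networks

| Model | mIoU/% | picAcc/% | Params/M |
|---|---|---|---|
| SegNet | 44.36 | 59.32 | 29.5 |
| ENet | 57.93 | 66.73 | 0.4 |
| BiSeNet | 69.23 | 78.65 | 59.24 |
| DeepLabV3+(MV2) | 71.64 | 83.82 | 45.57 |
| Segformer | 70.32 | 81.14 | 3.72 |
| STDC | 71.85 | 81.32 | 7.08 |
| Proposed network (MFF-STDC) | 75.86 | 83.87 | 5.43 |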

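Table 4  Ablation experiments

| Group | DW-STDC | CA-FFM | HAM | MIC | mIoU/% | Params/M | FLOPs/G |
|---|---|---|---|---|---|---|---|
| Exp 1 | × | × | × | × | 71.85 | 8.57 | 16.95 |
| Exp 2 | √ | × | × | × | 72.67 | 5.36 | 13.72 |
| Exp 3 | × | √ | × | × | 72.31 | 8.576 | 16.953 |
| Exp 4 | × | × | √ | × | 72.09 | 8.561 | 16.95 |
| Exp 5 | × | × | × | √ | 72.35 | 8.661 | 17.702 |
| Exp 6 | √ | √ | √ | √ | 75.86 | 5.43 | 14.48 |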

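Table 5  Model parameters and accuracy for different numbers of groups and layers

| Number of groups | Stage3 layers | Stage4 layers | Stage5 layers | Params/M | FLOPs/G | mIoU/% |
|---|---|---|---|---|---|---|
| 1 | 2 | 2 | 2 | 8.57 | 16.95 | 71.85 |
| Cout/16 | 2 | 2 | 2 | 4.34 | 12.03 | 71.33 |
| Cout/16 | 4 | 5 | 3 | 5.36 | 13.72 | 72.67 |
| Cout/16 | 7 | 9 | 5 | 7.08 | 16.26 | 73.02 |
| Cout/4 | 1 | 2 | 2 | 4.43 | 12.20 | 70.89 |
| Cout/4 | 2 | 4 | 3 | 5.51 | 14.08 | 71.98 |
| Cout/4 | 2 | 8 | 6 | 7.35 | 16.90 | 72.42 |
| Cout | 1 | 2 | 2 | 4.34 | 12.03 | 69.83 |
| Cout | 2 | 4 | 3 | 5.36 | 13.72 | 71.04 |
| Cout | 2 | 8 | 6 | 7.08 | 16.26 | 71.78 |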
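
The trade-off explored in Table 5 rests on a general property of grouped convolution: splitting a layer into `groups` groups divides its weight count by `groups`, which frees parameter budget for stacking more layers. The sketch below computes the weight count of a single convolution layer under that standard formula. The channel width of 256 and the 3x3 kernel are hypothetical values chosen for illustration, and the function does not model the DW-STDC module itself, whose internal layout is not given on this page.

```python
def grouped_conv_params(c_in, c_out, kernel_size, groups=1, bias=False):
    """Weight count of one 2D convolution layer with the given group count.

    Each output channel only sees c_in / groups input channels, so the
    weight tensor has c_out * (c_in / groups) * k * k entries.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both c_in and c_out")
    weights = c_out * (c_in // groups) * kernel_size * kernel_size
    return weights + (c_out if bias else 0)


# Hypothetical layer: 256 input and output channels, 3x3 kernel.
C = 256
standard = grouped_conv_params(C, C, 3, groups=1)          # ordinary convolution
groups_16 = grouped_conv_params(C, C, 3, groups=C // 16)   # 16 groups
groups_64 = grouped_conv_params(C, C, 3, groups=C // 4)    # 64 groups
depthwise = grouped_conv_params(C, C, 3, groups=C)         # one group per channel

for name, n in [("groups=1", standard), ("groups=C/16", groups_16),
                ("groups=C/4", groups_64), ("groups=C", depthwise)]:
    print(f"{name:12s} {n:7d} weights ({standard // n}x fewer than standard)")
```

More groups make each layer cheaper but also restrict how channels mix, which is consistent with Table 5 reporting the best mIoU at an intermediate setting rather than at the largest group count.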

Fig. 13  Real-scene test system and its working site

Fig. 14  Test results in a real environment
