Small target semantic segmentation method based on MFF-STDC network in complex outdoor environments

doi:10.13229/j.cnki.jdxbgxb.20231299

Abstract

To address the poor segmentation of small-target categories by lightweight networks in complex environments, this paper builds the MFF-STDC network model based on multi-level feature fusion. First, feature extraction modules based on group convolution are stacked repeatedly to strengthen the network's feature extraction capability. Second, a hierarchical weight attention optimization module and the coordinate attention (CA) mechanism improve the fusion of multi-scale feature information. Finally, the A-Cityscapes, A-IDD, and Field datasets are built with an adaptive replication algorithm, which increases the number of small-target instances in each dataset, and are used for training and testing. Compared with STDC, the MFF-STDC network raises mIoU by 4.01, 3.65, and 2.94 percentage points on A-Cityscapes, A-IDD, and Field, respectively, and segments small-target categories in complex environments better than the other compared networks. A real-scene test platform was also built; the test results show that MFF-STDC effectively improves the semantic segmentation accuracy and classification ability for small-target categories while meeting real-time requirements.

Key words: computer applications, small target category detection, multi-level feature fusion, coordinate attention (CA) mechanism, adaptive replication algorithm

CLC number: TP391

Qing-lin AI, Yuan-xiao LIU, Jia-hao YANG. Small target semantic segmentation method based on MFF-STDC network in complex outdoor environments[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(8): 2681-2692.

Actual class	Predicted positive	Predicted negative
Positive	TP (true positive)	FN (false negative)
Negative	FP (false positive)	TN (true negative)
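
The confusion-matrix entries above are the usual inputs to the segmentation metrics reported in the following tables. As a minimal sketch, assuming the standard definitions IoU = TP / (TP + FP + FN) per class and pixel accuracy = correctly labelled pixels / all pixels (the page itself does not spell out the formulas, and reading picAcc as pixel accuracy is an assumption), the metrics could be computed as follows. The function names and the toy matrix are hypothetical.

```python
def iou(tp, fp, fn):
    """Intersection over union for one class: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    # Assumption: a class with no pixels at all scores 0.0 rather than being skipped.
    return tp / denom if denom else 0.0


def mean_iou(confusion):
    """Mean IoU over all classes.

    confusion[r][c] = number of pixels of actual class r predicted as class c.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # rest of row k
        fp = sum(confusion[r][k] for r in range(n)) - tp   # rest of column k
        scores.append(iou(tp, fp, fn))
    return sum(scores) / n


def pixel_accuracy(confusion):
    """Fraction of pixels whose predicted class equals the actual class."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


# Toy two-class matrix (made-up counts, not data from the paper).
toy = [[50, 10],
       [5, 35]]
print(f"mIoU = {mean_iou(toy):.4f}, pixel accuracy = {pixel_accuracy(toy):.2f}")
```

Because each class contributes equally to the mean, a small-target class with few pixels weighs as much in mIoU as a large class such as road or sky, which is why mIoU is sensitive to small-target segmentation quality.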

Model	mIoU/% (A-Cityscapes)	mIoU/% (A-IDD)	mIoU/% (Field)
SegNet	44.36	30.72	29.49
ENet	57.93	43.12	38.64
BiSeNet	69.23	54.31	44.91
DeepLabV3+ (MV2)	72.14	57.71	47.79
SegFormer	71.32	56.13	47.12
STDC	71.85	56.34	46.84
Proposed network (MFF-STDC)	75.86	59.99	49.78
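
The improvements quoted in the abstract can be reproduced from the STDC and MFF-STDC rows of this table. The short sketch below subtracts the two rows, which shows that the quoted figures are absolute differences in mIoU (percentage points); the relative gain is printed alongside for comparison. Variable names are illustrative only.

```python
DATASETS = ("A-Cityscapes", "A-IDD", "Field")
STDC = (71.85, 56.34, 46.84)       # mIoU/% of the STDC baseline, from the table
MFF_STDC = (75.86, 59.99, 49.78)   # mIoU/% of the proposed network, from the table

# Absolute gain in percentage points, rounded to the table's two decimals.
gains = {
    name: round(ours - base, 2)
    for name, base, ours in zip(DATASETS, STDC, MFF_STDC)
}

for name, base in zip(DATASETS, STDC):
    relative = gains[name] / base * 100.0   # gain relative to the baseline, in %
    print(f"{name}: +{gains[name]:.2f} points ({relative:.2f}% relative)")
```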

Model	mIoU/%	picAcc/%	Params/M
SegNet	44.36	59.32	29.5
ENet	57.93	66.73	0.4
BiSeNet	69.23	78.65	59.24
DeepLabV3+ (MV2)	71.64	83.82	45.57
SegFormer	70.32	81.14	3.72
STDC	71.85	81.32	7.08
Proposed network (MFF-STDC)	75.86	83.87	5.43

Experiment	DW-STDC	CA-FFM	HAM	MIC	mIoU/%	Params/M	FLOPs/G
Exp 1	×	×	×	×	71.85	8.57	16.95
Exp 2	√	×	×	×	72.67	5.36	13.72
Exp 3	×	√	×	×	72.31	8.576	16.953
Exp 4	×	×	√	×	72.09	8.561	16.95
Exp 5	×	×	×	√	72.35	8.661	17.702
Exp 6	√	√	√	√	75.86	5.43	14.48
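
To read the ablation table in terms of cost as well as accuracy, the sketch below compares the full model (Exp 6) with the baseline (Exp 1) using only the numbers in those two rows. The helper name `relative_change` is hypothetical.

```python
def relative_change(new, old):
    """Signed change of `new` relative to `old`, in percent."""
    return (new - old) / old * 100.0


exp1 = {"miou": 71.85, "params_m": 8.57, "flops_g": 16.95}   # no module enabled
exp6 = {"miou": 75.86, "params_m": 5.43, "flops_g": 14.48}   # all four modules enabled

miou_gain = round(exp6["miou"] - exp1["miou"], 2)             # percentage points
params_change = relative_change(exp6["params_m"], exp1["params_m"])
flops_change = relative_change(exp6["flops_g"], exp1["flops_g"])

print(f"mIoU: +{miou_gain:.2f} points")
print(f"Params: {params_change:.2f}%  FLOPs: {flops_change:.2f}%")
```

Under these figures the full model gains accuracy while using roughly a third fewer parameters than the baseline row, which matches the drop already visible in Exp 2, the only single-module experiment that reduces the parameter count.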

Number of groups	Stage3 layers	Stage4 layers	Stage5 layers	Params/M	FLOPs/G	mIoU/%
1	2	2	2	8.57	16.95	71.85
C_out/16	2	2	2	4.34	12.03	71.33
C_out/16	4	5	3	5.36	13.72	72.67
C_out/16	7	9	5	7.08	16.26	73.02
C_out/4	1	2	2	4.43	12.20	70.89
C_out/4	2	4	3	5.51	14.08	71.98
C_out/4	2	8	6	7.35	16.90	72.42
C_out	1	2	2	4.34	12.03	69.83
C_out	2	4	3	5.36	13.72	71.04
C_out	2	8	6	7.08	16.26	71.78
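
The group settings in this table trade parameters for depth: splitting a convolution into g groups divides its weight count by g, which leaves room to stack more layers within a similar budget. The sketch below counts the weights of a single k×k convolution under the standard grouped-convolution formula (C_in / g) · C_out · k². The channel width of 256 is a hypothetical example, not a value taken from the paper, and the function name is illustrative.

```python
def conv_weight_count(c_in, c_out, kernel_size, groups=1, bias=False):
    """Number of learnable parameters in one 2-D convolution layer.

    Each of the c_out filters sees only c_in / groups input channels.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both c_in and c_out")
    weights = (c_in // groups) * c_out * kernel_size * kernel_size
    return weights + (c_out if bias else 0)


c = 256  # hypothetical channel width, with c_in == c_out
for label, g in (("1", 1), ("C_out/16", c // 16), ("C_out/4", c // 4), ("C_out", c)):
    print(f"groups = {label:>8}: {conv_weight_count(c, c, 3, groups=g)} weights")
```

With one group the layer is an ordinary convolution, and with as many groups as channels it becomes a depthwise convolution; the intermediate settings in the table sit between these two extremes.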