Journal of Jilin University (Information Science Edition) ›› 2025, Vol. 43 ›› Issue (2): 276-287.


  • Corresponding author: LI Wenhui (1961— ), male, born in Changchun, professor and doctoral supervisor at Jilin University; research interests include computer graphics, image processing and multimedia technology. (Tel) 86-431-85166855 (E-mail) liwh@jlu.edu.cn
  • About the author: PU Wei (1999— ), male, born in Baishan, Jilin, master's student at Jilin University; his research focuses on computer vision. (Tel) 86-13596715670 (E-mail) 1723095751@qq.com
  • Funding:
    Supported by the Science and Technology Development Plan Project of Jilin Province (20230201082GX)

Feature Fusion Method Based on ResNet

PU Wei, LI Wenhui   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received:2024-01-10 Online:2025-04-08 Published:2025-04-10


Abstract: ResNet (Residual Neural Network) is the most widely adopted backbone network in classification, object detection and instance segmentation tasks, and its representation capability has gained wide recognition. However, certain limitations still hinder its representation ability, including feature redundancy and an inadequate effective receptive field. To address these problems, a feature fusion block is proposed that fuses features of different scales during channel expansion, constructing multi-scale features with richer information and improving channel utilization. The block employs a small number of large-kernel convolutions, which help enlarge the model's effective receptive field while balancing performance and computational efficiency. A lightweight downsampling block and a shuffle compression block are also proposed; they effectively reduce the model's parameters and make the entire method more efficient. Introducing the feature fusion, downsampling and shuffle compression blocks into ResNet yields FFNet (Feature Fusion Network), which has faster convergence, a larger effective receptive field and better performance. Extensive experiments on the CIFAR (Canadian Institute for Advanced Research), ImageNet and COCO (Microsoft Common Objects in Context) datasets demonstrate that the feature fusion network brings significant performance improvements in classification, object detection and instance segmentation tasks while adding only a small number of parameters and FLOPs (Floating Point Operations).
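The shuffle compression block itself is not detailed on this page. As an illustration only, the channel-shuffle operation that such blocks typically build on (in the style of ShuffleNet) can be sketched in plain Python; the function name and group count below are illustrative assumptions, not the paper's implementation:

```python
def channel_shuffle(channels, groups):
    """Reorder a flat list of channel indices so that channels from
    different groups are interleaved (ShuffleNet-style shuffle):
    reshape (groups, n/groups) -> transpose -> flatten.
    """
    n = len(channels)
    assert n % groups == 0, "channel count must be divisible by groups"
    per_group = n // groups
    # channel i of group g moves to position i * groups + g
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

# Example: 6 channels in 2 groups [0,1,2 | 3,4,5] interleave to [0,3,1,4,2,5]
print(channel_shuffle([0, 1, 2, 3, 4, 5], 2))
```

Interleaving channels across groups lets information flow between grouped convolutions at negligible cost, which is consistent with the abstract's emphasis on channel utilization and efficiency.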

Key words: feature fusion, ResNet, convolutional neural network, large kernel convolution
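For context on the large-kernel claim, the theoretical receptive field of a stack of convolutions follows the standard recurrence r_l = r_{l-1} + (k_l − 1) · jump, where jump is the product of the preceding strides. A minimal sketch (illustrative only, not the paper's code):

```python
def receptive_field(layers):
    """Theoretical receptive field of stacked conv layers.

    `layers` is a list of (kernel_size, stride) pairs; `jump` accumulates
    the product of strides of all preceding layers.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Three stacked 3x3 convolutions cover the same 7x7 window
# that a single 7x7 kernel covers in one layer.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(7, 1)]))                  # 7
```

A single large kernel reaches a wide theoretical receptive field in one layer, which is why a small number of large-kernel convolutions can enlarge the effective receptive field without deepening the network.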

CLC Number: 

  • TP183