Journal of Jilin University (Science Edition), 2024, Vol. 62, Issue (2): 391-398.

Deep Neural Network Image Restoration Method Based on Multimodal Fusion

LI Weiwei1, WANG Liyan2, FU Bo2, WANG Juan1, HUANG Hong1

  1. School of Information Engineering, Shandong Youth University of Political Science, Jinan 250103, China; 2. School of Computer and Artificial Intelligence, Liaoning Normal University, Dalian 116081, Liaoning Province, China
  • Received: 2022-08-07 Online: 2024-03-26 Published: 2024-03-26
  • Corresponding author: FU Bo E-mail: fubo@lnnu.edu.cn

Abstract: Underwater images are captured in complex imaging environments and are often degraded by color casts and other factors, which impairs subsequent image analysis. To address this problem, we propose a deep convolutional neural network image restoration method based on multi-scale features and triple-attention multimodal fusion. First, in addition to extracting spatial features of the image, the network introduces multi-scale transform features of the image. Second, channel attention, supervised attention, and non-local attention are used to mine the inter-scale and inter-feature correlations of the image features. Finally, a multimodal feature fusion mechanism is designed to fuse the two types of features effectively. The method was tested on a public underwater image test set and compared with current mainstream methods. The results show that it outperforms the compared methods both quantitatively, in terms of peak signal-to-noise ratio and structural similarity, and qualitatively, in terms of color and detail.

Key words: multimodal fusion, deep neural network, triple attention, image restoration

CLC number: 

  • TP391
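
The abstract reports peak signal-to-noise ratio as one of its quantitative metrics. As a reminder of what that number measures, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), in plain Python. The function name `psnr`, the flat-list image representation, and the default 8-bit peak value of 255 are assumptions made for illustration; the paper's own evaluation code is not shown on this page.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the peak signal-to-noise ratio (in dB) between two images.

    Both images are given as flat sequences of pixel intensities
    (an illustrative simplification) and must have the same length.
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error between corresponding pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference; identical images give infinity, and the value drops as the mean squared error grows.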