RAUGAN: infrared image colorization method based on cycle generative adversarial networks

doi:10.13229/j.cnki.jdxbgxb.20240012

Abstract

To address color distortion, semantic ambiguity, and unclear texture and shape in near-infrared image colorization, we propose an infrared image colorization method, RAUGAN. First, the CycleGAN generator is improved by designing and integrating a Res-ASPP-UNet network, in which atrous spatial pyramid pooling (ASPP) is attached to the skip connections of the original UNet, so that the output feature maps at each scale in the decoding branch can be combined with the corresponding output feature maps of the encoder. Second, a deep bottleneck block composed of residual blocks and the convolutional block attention module (CBAM, channel and spatial attention) replaces the bottleneck layer of the UNet to enhance local-region features, improve recognition ability, and prevent gradient explosion. Finally, a perceptual loss function is introduced in the discriminator network to address color restoration distortion. Experimental results show that the proposed method colorizes clearly better than the original CycleGAN network and the other compared methods.

Key words: computer application, infrared image colorization, cycle-consistent generative adversarial networks, atrous spatial pyramid pooling, attention mechanism
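
The ASPP idea named in the abstract can be illustrated in one dimension. The following is a minimal sketch, not the authors' Res-ASPP-UNet: the function names, the dilation rates (1, 2, 4), and the shared fixed kernel are all assumptions made for illustration. In a real ASPP module each branch has its own learned weights and the stacked outputs are fused by a further convolution.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-size 1D atrous convolution: kernel taps sit `rate` samples apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal act as zero padding
                acc += weight * signal[j]
        out.append(acc)
    return out


def aspp1d(signal, kernel, rates=(1, 2, 4)):
    """Run parallel atrous branches and stack their outputs per position.

    Simplification (assumption): every branch reuses the same fixed kernel.
    """
    branches = [dilated_conv1d(signal, kernel, rate) for rate in rates]
    return [list(values) for values in zip(*branches)]


# A unit impulse shows how far each branch "sees" from a given position.
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = aspp1d(signal, kernel=[1, 1, 1])
for position, feature in enumerate(features):
    print(position, feature)
```

Each position ends up with one value per dilation rate, so the branch with the largest rate responds to the impulse from four samples away while the rate-1 branch only responds from adjacent samples. This is the multi-scale context that the abstract attaches to the skip connections.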

CLC number:

TP391.41

Yan PIAO, Ji-yuan KANG. RAUGAN: infrared image colorization method based on cycle generative adversarial networks[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(8): 2722-2731.


Table 1 Generator network configuration

Network layer	Output size	Convolution (Conv), stride (s), padding (p)
Input	256×256
ConvBlock 1	256×256	3×3 Conv + BatchNorm + ReLU, s=1, p=1
DBlock 1	64×64	3×3 Conv + BatchNorm + LeakyReLU, s=2, p=1
DBlock 2	128×128	3×3 Conv + BatchNorm + LeakyReLU, s=2, p=1
DBlock 3	256×256	3×3 Conv + BatchNorm + LeakyReLU, s=2, p=1
Output	256×256	3×3 Conv + BatchNorm + ReLU, s=2, p=2
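
The relation between kernel size, stride, padding, and output size in the generator configuration table above can be checked with the standard convolution size formulas. The sketch below is illustrative only: the helper names are hypothetical, and treating the DBlock layers as transposed convolutions with `output_padding=1` is an assumption, since the table does not say how those blocks resample.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a standard convolution along one axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def transposed_conv_out(size, kernel, stride, padding, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# 3x3 kernel, s=1, p=1 keeps the spatial size, as in the ConvBlock 1 row.
print(conv_out(256, kernel=3, stride=1, padding=1))  # 256

# The same kernel with s=2, p=1 halves the size in a standard convolution.
print(conv_out(256, kernel=3, stride=2, padding=1))  # 128

# Assumption: as a transposed convolution, s=2, p=1 with output_padding=1 doubles it.
print(transposed_conv_out(64, kernel=3, stride=2, padding=1, output_padding=1))  # 128
```

Under that assumption, stride-2 blocks double the spatial size at each step, whereas a plain stride-2 convolution would halve it.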




Algorithm	PSNR/dB	SSIM
CycleGAN	27.64	0.82
CycleGAN+ASPP	31.34	0.91
CycleGAN+CBAM	29.65	0.87
CycleGAN+ASPP+CBAM	33.27	0.93
CycleGAN+ASPP+CBAM+perceptual loss	33.12	0.96
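
For readers unfamiliar with the PSNR column, the metric is defined as 10·log10(MAX² / MSE) between a reference image and the generated one. The following is a minimal sketch of that definition, not the authors' evaluation code: it assumes 8-bit single-channel images stored as nested lists, and the pixel values are made up for illustration. SSIM is not covered here.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_ref = [pixel for row in reference for pixel in row]
    flat_est = [pixel for row in estimate for pixel in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 3x3 images (illustrative values only, not data from the paper).
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
estimate = [[54, 55, 60], [59, 75, 61], [76, 65, 59]]
print(f"PSNR = {psnr(reference, estimate):.2f} dB")
```

Because the scale is logarithmic, a higher value means a smaller mean squared error, and halving the MSE adds about 3 dB.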