Journal of Jilin University (Science Edition) ›› 2022, Vol. 60 ›› Issue (4): 897-905.

Face Sketch Synthesis Based on Cycle-Generative Adversarial Networks

GE Yanliang (葛延良), SUN Xiaoxiao (孙笑笑), ZHANG Qiao (张乔), WANG Dongmei (王冬梅), WANG Xiaoxiao (王肖肖)

  1. School of Electrical and Information Engineering, Northeast Petroleum University, Daqing 163318, Heilongjiang Province, China
  • Received: 2021-08-23 Online: 2022-07-26 Published: 2022-07-26
  • Corresponding author: SUN Xiaoxiao E-mail: 3076266954@qq.com

Abstract: Current convolutional neural networks usually obtain multi-scale image features at the cost of a reduced receptive field, and they have difficulty capturing the important relationships among feature channels. To address these problems, and drawing on the structure of cycle-generative adversarial networks, we propose a new cycle-generative adversarial network with a multi-scale self-attention mechanism. First, VGG16 modules are used to build a U-Net-structured network in the generator to strengthen the extraction of image feature information, and the down-sampling and up-sampling in the network are modified to raise the feature resolution and retain more detail. Second, a multi-scale feature fusion block is designed: multiple parallel dilated convolutions with different sampling rates integrate spatial information at different scales, capturing image information at multiple scales while maintaining a large receptive field. Finally, to capture feature dependencies in the spatial and channel dimensions, a pixel self-attention module is designed to model the semantic dependencies in both dimensions, which enhances the representational power of the image features and improves the quality of the generated sketch images.
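
The idea of parallel dilated convolutions with different sampling rates can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: it works on a 1-D signal rather than an image, and the function names, the rates (1, 2, 4), and the fusion by averaging are all assumptions chosen for illustration, since the abstract does not specify them. It shows only how a larger rate widens the receptive field without adding weights or reducing resolution.

```python
def effective_kernel_size(kernel_size, rate):
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def dilated_conv1d(signal, kernel, rate):
    """Zero-padded 'same' 1-D dilated cross-correlation (odd-length kernel)."""
    k = len(kernel)
    center = k // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Taps are spaced `rate` samples apart around position i.
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def multi_scale_block(signal, kernel, rates=(1, 2, 4)):
    """Hypothetical fusion: run parallel branches, then average them.

    The rates and the averaging step are illustrative assumptions.
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) / len(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for r in (1, 2, 4):
        print(r, effective_kernel_size(3, r), dilated_conv1d(impulse, [1, 1, 1], r))
    print(multi_scale_block(impulse, [1, 1, 1]))
```

With a 3-tap kernel, rates 1, 2, and 4 cover spans of 3, 5, and 9 samples respectively, while the output keeps the same length as the input in every branch.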

Key words: deep learning, cycle-generative adversarial networks, dilated convolution, multi-scale feature fusion block, pixel self-attention module
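
Modeling dependencies in both the spatial and the channel dimension, as the pixel self-attention module is described as doing, can be sketched roughly as follows. The abstract gives no details of the module, so this is a generic dot-product self-attention written under stated assumptions: there are no learned projections, the feature map is flattened to a list of pixel vectors, and the function names are hypothetical. It should not be read as the authors' design.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_self_attention(features):
    """features: N pixel vectors of length C. Returns N attended vectors.

    Each output pixel is a weighted sum of all pixels, with weights taken
    from the softmax of pairwise dot products (assumed similarity measure).
    """
    n = len(features)
    c = len(features[0])
    scores = [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]
    weights = [softmax(row) for row in scores]
    return [
        [sum(weights[i][j] * features[j][ch] for j in range(n)) for ch in range(c)]
        for i in range(n)
    ]


def channel_self_attention(features):
    """Same mechanism applied across channels by transposing N x C to C x N."""
    channels = [list(col) for col in zip(*features)]
    attended = spatial_self_attention(channels)
    return [list(row) for row in zip(*attended)]


if __name__ == "__main__":
    fmap = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 pixels, 2 channels
    print(spatial_self_attention(fmap))
    print(channel_self_attention(fmap))
```

Both functions preserve the N x C shape of the input, so in a real network their outputs could be combined with the original features; how the paper combines them is not stated in the abstract.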

CLC number: 

  • TP391