Journal of Jilin University (Information Science Edition), 2025, Vol. 43, Issue (2): 258-264.


Text Matching Image Generation Model Based on Improved GAN Algorithm

XU Yiwei1, CHEN Gang2

  1. School of Sussex Artificial Intelligence Institute, Zhejiang Gongshang University, Hangzhou 310018, China; 2. School of Cyber Science and Engineering-WHU, Wuhan University, Wuhan 430072, China
  • Received: 2023-06-01 Online: 2025-04-08 Published: 2025-04-09
  • Corresponding author: CHEN Gang (b. 1970), male, from Wuhan, professor and Ph.D. at Wuhan University, whose research covers network security, Web technology and applications, and artificial intelligence, (Tel) 86-15925786868 (E-mail) chenzuolin@whu.edu.cn.
  • About the author: XU Yiwei (b. 1995), male, from Lishui, Zhejiang, master's student at Zhejiang Gongshang University, whose research covers artificial intelligence, computer vision and image processing, and AIGC, (Tel) 86-15925786868 (E-mail) eddyxu82@gmail.com.
  • Supported by:
    Natural Science Foundation of Hubei Province (2021CFB299)

Abstract: To improve both the visual quality of images generated from text and how closely they match that text, a text matching image generation model based on an improved GAN (Generative Adversarial Network) algorithm is proposed. Text and images are first coarsely matched through a hybrid index tree. The GAN is then improved into an adversarial generative network based on cross-attention encoding, and this improved GAN is used to build the text matching image generation model. The cross-attention encoder in the model, optimized with a bidirectional LSTM (Long Short-Term Memory) network, translates and aligns textual and visual information to obtain the cross-modal mapping between text and images and to complete fine-grained text-image matching, so that the model finally generates images that meet the requirements of the text. Experimental results show that the proposed model generates higher-quality images whose details match the text.

Key words: improved generative adversarial network (GAN) algorithm, text matching, image generation model

CLC number:

  • TP183
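
The abstract describes a cross-attention encoder that aligns textual and visual information. As a rough illustration of that idea only, the sketch below applies generic scaled dot-product cross-attention from word vectors to image-region vectors. It is not the authors' encoder: the bidirectional LSTM, the hybrid index tree, and the GAN itself are omitted, and the function names (`cross_attention`, `softmax`, `dot`) and the toy feature values are assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross_attention(word_feats, region_feats):
    """For each word vector, attend over all image-region vectors.

    Returns (contexts, weights): weights[i] is a probability
    distribution over regions for word i, and contexts[i] is the
    corresponding weighted average of the region vectors.
    """
    scale = math.sqrt(len(word_feats[0]))
    contexts, weights = [], []
    for w in word_feats:
        # Similarity of this word to every region, then normalize.
        attn = softmax([dot(w, r) / scale for r in region_feats])
        # Weighted average of region vectors, one dimension at a time.
        ctx = [sum(a * r[k] for a, r in zip(attn, region_feats))
               for k in range(len(region_feats[0]))]
        contexts.append(ctx)
        weights.append(attn)
    return contexts, weights

# Toy, made-up features: 2 words and 3 image regions in a 2-D space.
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
contexts, weights = cross_attention(words, regions)
```

In this toy input each word places its largest attention weight on the region whose vector points in the same direction, which is the kind of word-to-region correspondence a fine-grained matching step relies on.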