Journal of Jilin University (Information Science Edition), 2026, Vol. 44, Issue (3): 598-608.


Iris Recognition with Eyelid Occlusion Based on Vision Transformer

XIA Zhicheng1a, LIU Yuanning1a,1b, ZHU Xiaodong1a,1b, LIU Zhen1a,2, CHEN Ying3, GUO Zhimin1a

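  1. Jilin University, a. College of Computer Science and Technology; b. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Changchun 130012, China; 2. Graduate School of Engineering, Nagasaki Institute of Applied Science, Nagasaki 851-0193, Japan; 3. School of Software, Nanchang Hangkong University, Nanchang 330036, China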
  • Received: 2025-04-11 Online: 2026-06-02 Published: 2026-06-02
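  • About the authors: XIA Zhicheng (born 2000), male, from Xinyang, Henan, is a master's student at Jilin University whose research focuses on biometric recognition, (Tel) 86-18240525708, (E-mail) xzc201908@163.com; LIU Yuanning (born 1962), male, from Changchun, is a professor and doctoral supervisor at Jilin University whose research focuses on biometric recognition, (Tel) 86-13904336786, (E-mail) liuyn@jlu.edu.cn.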
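  • Funding:
    National Natural Science Foundation of China (61471181); National Key Research and Development Program of China (国科发资[2020]151); Natural Science Foundation of Jiangxi Province (20242BAB26015)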

Iris Recognition with Eyelid Occlusion Based on Vision Transformer

XIA Zhicheng1a, LIU Yuanning1a,1b, ZHU Xiaodong1a,1b, LIU Zhen1a,2, CHEN Ying3, GUO Zhimin1a   

  1a. College of Computer Science and Technology; 1b. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; 2. Graduate School of Engineering, Nagasaki Institute of Applied Science, Nagasaki 851-0193, Japan; 3. School of Software, Nanchang Hangkong University, Nanchang 330036, China
  • Received: 2025-04-11 Online: 2026-06-02 Published: 2026-06-02

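Abstract: Eyelid occlusion degrades iris recognition performance, and a solution based on the Vision Transformer (ViT) is proposed to address this problem. First, a feature fusion module (FFM) is proposed that extracts and fuses features at different scales, addressing the loss of information during feature extraction. Second, the local feature encoder is pre-trained by minimizing the reconstruction loss so that irises of different classes with the same dominant features do not form triplets; this prior knowledge gives the adjustment of model parameters a degree of interpretability. Meanwhile, an interactive encoding structure is built with ViT and residual blocks at its core, efficiently fusing information from different iris blocks into a comprehensive feature representation. Finally, the traditional triplet loss is improved by incorporating the concept of a threshold, giving the model a clearer learning direction during training. Experimental results show that the proposed method effectively removes the negative impact of occlusion on iris recognition and significantly improves recognition performance.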

Key words: iris recognition, Vision Transformer, feature fusion, triplet loss

Abstract: To address the degradation of iris recognition performance caused by eyelid occlusion, a solution based on the Vision Transformer (ViT) is proposed. First, a feature fusion module (FFM) is proposed to extract and fuse features at different scales, addressing the loss of information during feature extraction. Second, the local feature encoder is pre-trained by minimizing a reconstruction loss, so that irises of different classes that share the same dominant features are not grouped into triplets; this prior knowledge lends a degree of interpretability to the adjustment of model parameters. In addition, an interactive encoding structure is built around ViT and residual blocks, efficiently fusing information from different iris blocks into a comprehensive feature representation. Finally, the traditional triplet loss is improved by incorporating a threshold, giving model training a clearer learning direction. Experimental results show that the proposed method effectively removes the negative impact of occlusion on iris recognition and significantly improves recognition performance.

Key words: iris recognition, Vision Transformer, feature fusion, triplet loss

CLC number:

  • TP391
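
The abstract says the traditional triplet loss is improved with a threshold but does not give the formula. The following is only a rough sketch of the general idea, not the authors' formulation: `triplet_loss` is the standard margin-based triplet loss, while `thresholded_triplet_loss`, its `threshold` parameter, and the way the extra penalty terms are combined are assumptions introduced here for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def thresholded_triplet_loss(anchor, positive, negative,
                             threshold=1.0, margin=0.5):
    """Hypothetical variant (an assumption, not the paper's formula).

    On top of the relative margin term, it penalizes same-class
    distances above `threshold` and different-class distances below it,
    so both distances are pushed toward fixed sides of one decision value.
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    relative = max(0.0, d_ap - d_an + margin)
    absolute = max(0.0, d_ap - threshold) + max(0.0, threshold - d_an)
    return relative + absolute


# Toy 2-D "feature vectors"; real iris features would be high-dimensional.
anchor = [0.0, 0.0]
near = [0.3, 0.4]   # about 0.5 away from the anchor
far = [3.0, 4.0]    # 5.0 away from the anchor

well_separated = thresholded_triplet_loss(anchor, near, far)
badly_separated = thresholded_triplet_loss(anchor, far, near)
```

In this sketch a well-separated triplet incurs no loss, while swapping the positive and negative samples is penalized by both the relative and the absolute terms. A fixed threshold of this kind would match the decision rule used at verification time, which is one plausible reading of the "clearer learning direction" mentioned in the abstract.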