Journal of Jilin University (Science Edition), 2024, Vol. 62, Issue (6): 1447-1454.


Fine-Grained Image Classification Based on Multi Granularity Fusion and Dual Attention

LI Pengsong1, ZHOU Bingqian1, JI Zhiyi1, YU Yongping2

  1. School of Science, Northeast Electric Power University, Jilin 132012, Jilin Province, China; 2. College of Construction Engineering, Jilin University, Changchun 130021, China
  • Received: 2023-10-13 Online: 2024-11-26 Published: 2024-11-26
  • Corresponding author: ZHOU Bingqian, E-mail: 1099480882@qq.com

Abstract: Existing models struggle to accurately identify the key information in fine-grained images, rely on a single classification criterion, and make insufficient use of features. To address these problems, we propose a new fine-grained image classification network model. During network training, the model embeds a dual attention network to strengthen the correlation between middle-level features and deep features. Because different layers of the network have receptive fields of different sizes, the data are cropped and then spliced into new sample data that serve as the input for the next layer. A support vector machine classifier takes the outputs of the middle-level and deep features together as the final classification criterion. Experimental results on three classic datasets, CUB-200-2011, Stanford Cars, and 102 Category Flower, show that the classification accuracy reaches 89.56%, 95.00%, and 96.05%, respectively. Compared with other network models, the proposed model has better classification accuracy and generalization ability.

Key words: fine-grained image classification, attention mechanism, data augmentation, multi granularity feature fusion

CLC number:

  • TP391.41
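
The crop-and-splice step mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify how the cropped pieces are reordered or which grid sizes are used, so the code below makes two assumptions: the pieces are shuffled at random (a jigsaw-style augmentation), and the grid size `n` is a hypothetical parameter chosen per network stage. The function name `crop_and_splice` and the toy 8 x 8 image are illustrative only and are not taken from the paper.

```python
import random
from typing import List, Sequence

Image = List[List[int]]  # toy grayscale image stored as rows of pixel values


def crop_and_splice(image: Sequence[Sequence[int]], n: int, rng: random.Random) -> Image:
    """Cut an image into an n x n grid of patches, shuffle the patches,
    and splice them back into an image of the same size."""
    height = len(image)
    width = len(image[0])
    if height % n or width % n:
        raise ValueError("image sides must be divisible by n")
    ph, pw = height // n, width // n  # patch height and width

    # Crop: collect the patches in row-major order.
    patches = [
        [list(row[c * pw:(c + 1) * pw]) for row in image[r * ph:(r + 1) * ph]]
        for r in range(n)
        for c in range(n)
    ]

    # Assumption: patches are reordered by a random shuffle.
    rng.shuffle(patches)

    # Splice: lay the shuffled patches back onto the n x n grid.
    out: Image = []
    for r in range(n):
        for y in range(ph):
            row: List[int] = []
            for c in range(n):
                row.extend(patches[r * n + c][y])
            out.append(row)
    return out


if __name__ == "__main__":
    toy = [[r * 8 + c for c in range(8)] for r in range(8)]
    rng = random.Random(0)
    # Hypothetical schedule: finer patches for stages with smaller
    # receptive fields, coarser patches for deeper stages.
    for n in (8, 4, 2):
        spliced = crop_and_splice(toy, n, rng)
        print(f"n={n}: first row {spliced[0]}")
```

Each spliced image keeps the original size and pixel content, so it can stand in for the original sample as training input; only the spatial arrangement of the patches changes.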