Unbalanced image classification algorithm based on fine-grained analysis

doi:10.13229/j.cnki.jdxbgxb.20230991

Abstract

Fine-grained image attributes are complex and diverse, and traditional image classification methods both attend poorly to these attributes and perform poorly on imbalanced datasets. To address these problems, a threshold-based fine-grained image classification algorithm built on deep metric learning is proposed. Metric learning is introduced to strengthen the model's focus on fine-grained image attributes. Combining a pairwise loss with a proxy loss improves classification accuracy and speeds up convergence. To handle data imbalance, a classifier based on threshold analysis is designed; it performs multi-level classification of fine-grained images and thereby improves the low accuracy on minority classes in imbalanced datasets. Experimental results show that the proposed algorithm achieves significantly higher classification accuracy than the compared methods.

Key words: computer application, deep metric learning, fine-grained classification, unbalanced data, threshold classifier
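The abstract names a pairwise loss and a proxy loss but does not give their formulas. The following is a minimal NumPy sketch of one possible pairing, assuming Multi-similarity as the pairwise loss and Proxy-NCA as the proxy loss (both appear in the result tables on this page). The function names, the hyperparameters `alpha`, `beta`, `lam`, and the mixing `weight` are illustrative assumptions, not the authors' settings, and the pair-mining step of the published Multi-similarity loss is omitted.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def proxy_nca_loss(emb, labels, proxies):
    """Proxy-NCA: pull each embedding toward its class proxy, push it from the rest.

    emb: (batch, dim), labels: (batch,) int class ids, proxies: (classes, dim).
    Needs at least two classes so that every sample has a negative proxy.
    """
    emb, proxies = l2_normalize(emb), l2_normalize(proxies)
    # Squared Euclidean distance from every embedding to every proxy.
    dist = ((emb[:, None, :] - proxies[None, :, :]) ** 2).sum(axis=-1)
    rows = np.arange(len(labels))
    pos = dist[rows, labels]
    neg = -dist
    neg[rows, labels] = -np.inf  # the positive proxy is excluded from the denominator
    top = neg.max(axis=1, keepdims=True)
    log_sum = top[:, 0] + np.log(np.exp(neg - top).sum(axis=1))  # stable log-sum-exp
    return float((pos + log_sum).mean())

def multi_similarity_loss(emb, labels, alpha=2.0, beta=50.0, lam=0.5):
    """Multi-similarity loss over all pairs in the batch (no pair mining)."""
    emb = l2_normalize(emb)
    sim = emb @ emb.T  # cosine similarities
    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)  # same class, not self
    neg_mask = ~same
    pos_term = np.log1p((np.exp(-alpha * (sim - lam)) * pos_mask).sum(axis=1)) / alpha
    neg_term = np.log1p((np.exp(beta * (sim - lam)) * neg_mask).sum(axis=1)) / beta
    return float((pos_term + neg_term).mean())

def combined_loss(emb, labels, proxies, weight=1.0):
    """Hypothetical combination: pairwise term plus a weighted proxy term."""
    return multi_similarity_loss(emb, labels) + weight * proxy_nca_loss(emb, labels, proxies)
```

In a training loop these terms would be computed with an automatic-differentiation framework so that both the embedding network and the proxies receive gradients; the NumPy version only shows the arithmetic of the two terms.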

CLC number: TP391

Ping-ping LIU, Wen-li SHANG, Xiao-yu XIE, Xiao-kang YANG. Unbalanced image classification algorithm based on fine-grained analysis[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(6): 2122-2130.

Embedding size	Multi-similarity (ODIR5K)	Multi-similarity (COVID Radiography)	Proxy-NCA (ODIR5K)	Proxy-NCA (COVID Radiography)
16	94.63	93.82	90.87	92.85
32	94.24	93.37	91.03	93.82
64	93.77	94.24	91.10	93.96
128	94.94	95.62	91.13	94.35
256	93.31	94.56	90.28	94.21
512	93.85	94.27	91.59	93.62
1024	94.01	93.66	90.86	93.24

Loss function	Accuracy/%	Iterations to convergence
Triplet+CE	91.05	2.0 k
Contrastive+CE	93.31	1.9 k
Multi-similarity+CE	93.52	2.5 k
Proxy-NCA+CE	92.81	1.3 k
Proxy-Anchor+CE	91.13	1.0 k
Proposed method	95.63	1.5 k

Loss function	Accuracy/%	F₁/%	Precision/%	Recall/%
Cross Entropy	92.22	92.50	93.20	92.00
Triplet+CE	91.05	90.50	95.50	87.00
Contrastive+CE	93.31	93.70	95.20	92.30
Multi-similarity+CE	93.52	94.73	97.32	92.82
Proxy-NCA+CE	92.81	93.60	96.80	91.40
Proxy-Anchor+CE	91.13	91.60	93.40	90.10
Proposed method	95.63	95.30	97.60	93.30

Loss function	Accuracy/%	F₁/%	Precision/%	Recall/%
Cross Entropy	94.19	93.47	96.01	94.10
Triplet+CE	92.94	91.98	96.93	90.12
Contrastive+CE	95.21	94.97	96.24	95.64
Multi-similarity+CE	95.62	96.52	98.17	95.79
Proxy-NCA+CE	94.35	95.62	97.78	95.92
Proxy-Anchor+CE	92.66	93.61	95.21	93.62
Proposed method	96.57	93.40	95.60	91.30

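Category	Accuracy/% (proposed method)	Accuracy/% (MLP)	Accuracy/% (NN)
Age-related macular degeneration	99.9	95.9	94.8
Cataract	100	99.1	96.5
Diabetes	96.8	88.2	91.5
Glaucoma	99.5	93.0	88.7
Hypertension	99.8	83.7	92.7
Myopia	99.9	97.0	99.0
Normal	95.2	93.6	93.8
Other diseases	98.2	89.3	92.5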
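
The per-category comparison above contrasts the threshold classifier with MLP and NN classifiers, but this page does not describe how the thresholds are chosen. As a generic illustration only, and not the authors' procedure, the sketch below picks one decision threshold per class by maximizing F₁ on held-out data, which is a common way to keep rare classes from being suppressed by a fixed 0.5 cutoff. The function names, the score range, and the threshold grid are assumptions.

```python
import numpy as np

def fit_class_thresholds(scores, y_true, grid=None):
    """Choose, for each class, the threshold with the best F1 on held-out data.

    scores: (samples, classes) per-class scores in [0, 1] (hypothetical model output)
    y_true: (samples, classes) binary indicator matrix
    """
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)  # assumed search grid
    n_classes = scores.shape[1]
    thresholds = np.full(n_classes, 0.5)
    for c in range(n_classes):
        truth = y_true[:, c] == 1
        best_f1 = -1.0
        for t in grid:
            pred = scores[:, c] >= t
            tp = np.sum(pred & truth)
            fp = np.sum(pred & ~truth)
            fn = np.sum(~pred & truth)
            f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
            if f1 > best_f1:  # keep the lowest threshold among ties
                best_f1, thresholds[c] = f1, t
    return thresholds

def predict_with_thresholds(scores, thresholds):
    """Apply the per-class thresholds; returns a binary indicator matrix."""
    return (scores >= thresholds).astype(int)
```

With a minority class whose positive scores rarely exceed 0.5, the fitted threshold for that class drops below 0.5, so its positives are recovered without changing the decisions for well-represented classes.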