Journal of Jilin University (Engineering and Technology Edition), 2025, Vol. 55, Issue (6): 2122-2130. doi: 10.13229/j.cnki.jdxbgxb.20230991


Unbalanced image classification algorithm based on fine-grained analysis

Ping-ping LIU (1,2), Wen-li SHANG (3), Xiao-yu XIE (1), Xiao-kang YANG (3)

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  3. College of Software, Jilin University, Changchun 130012, China
  • Received: 2023-09-15  Online: 2025-06-01  Published: 2025-07-23

Abstract:

Fine-grained images are complex and diverse, and traditional image classification methods both attend poorly to fine-grained attributes and perform poorly on imbalanced datasets. To address these limitations, a threshold-based fine-grained image classification algorithm built on deep metric learning is proposed. A metric learning approach strengthens the focus on fine-grained image attributes, and combining a pairwise loss with a proxy-based loss improves classification accuracy and speeds up model convergence. To handle data imbalance, a classifier based on threshold analysis is designed. This classifier performs multi-level classification of fine-grained images, which raises the accuracy of categories that are otherwise poorly classified in an imbalanced dataset. Experiments on the ODIR5K and COVID Radiography datasets show that the proposed algorithm reaches classification accuracies of 95.63% and 96.57%, respectively, higher than those of the compared methods.
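The combination of a pairwise loss and a proxy-based ("agent") loss named in the abstract can be illustrated with a minimal sketch. This page does not give the paper's loss formulas, so the code below is an assumption: it pairs a standard contrastive loss with the Proxy-NCA formulation of reference [11], and the names `combined_loss`, `alpha`, and `margin` are hypothetical. It is meant only to show how the two terms interact, not to reproduce the authors' method.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def proxy_nca_loss(embeddings, labels, proxies):
    """Proxy-NCA: pull each sample toward its class proxy, push it from the others.

    Per sample: -log(exp(-d(x, p_y)) / sum_{z != y} exp(-d(x, p_z))).
    The value can be negative because the positive proxy is excluded
    from the denominator. Requires at least two classes.
    """
    unit_proxies = [l2_normalize(p) for p in proxies]
    total = 0.0
    for emb, y in zip(embeddings, labels):
        e = l2_normalize(emb)
        pos = sq_dist(e, unit_proxies[y])
        neg = sum(math.exp(-sq_dist(e, p))
                  for c, p in enumerate(unit_proxies) if c != y)
        total += pos + math.log(neg)
    return total / len(embeddings)


def contrastive_loss(embeddings, labels, margin=0.5):
    """Pairwise loss: same-class pairs are pulled together, others pushed past a margin."""
    unit = [l2_normalize(e) for e in embeddings]
    pairs = list(combinations(range(len(unit)), 2))
    total = 0.0
    for i, j in pairs:
        d = math.sqrt(sq_dist(unit[i], unit[j]))
        if labels[i] == labels[j]:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
    return total / len(pairs)


def combined_loss(embeddings, labels, proxies, alpha=0.5):
    """Hypothetical weighting of the pairwise and proxy terms (alpha is an assumption)."""
    return (alpha * contrastive_loss(embeddings, labels)
            + (1 - alpha) * proxy_nca_loss(embeddings, labels, proxies))


# Toy batch: two classes, two samples each, one proxy per class.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_proxies = [[1.0, 0.0], [0.0, 1.0]]
print(combined_loss(toy_embeddings, [0, 0, 1, 1], toy_proxies))
```

The proxy term needs only one comparison per sample and class, while the pairwise term compares every pair in the batch, which is the usual motivation for combining the two: proxies tend to speed up convergence and pairs supply finer sample-to-sample structure.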

Key words: computer application, deep metric learning, fine-grained classification, unbalanced data, threshold classifier

CLC Number: TP391

Fig.1 Overall training structure of the threshold classification framework

Fig.2 Schematic diagram of threshold classification method

Fig.3 Different types of images in the dataset

Fig.4 Number of images of each type in the dataset

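Table 1 Classification performance of different embedding sizes under each metric loss function

| Embedding size | Multi-similarity: ODIR5K | Multi-similarity: COVID Radiography | Proxy-NCA: ODIR5K | Proxy-NCA: COVID Radiography |
|---|---|---|---|---|
| 16 | 94.63 | 93.82 | 90.87 | 92.85 |
| 32 | 94.24 | 93.37 | 91.03 | 93.82 |
| 64 | 93.77 | 94.24 | 91.10 | 93.96 |
| 128 | 94.94 | 95.62 | 91.13 | 94.35 |
| 256 | 93.31 | 94.56 | 90.28 | 94.21 |
| 512 | 93.85 | 94.27 | 91.59 | 93.62 |
| 1024 | 94.01 | 93.66 | 90.86 | 93.24 |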

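Table 2 Performance and convergence rate of different metric losses on ODIR5K dataset

| Loss function | Accuracy/% | Iterations to convergence |
|---|---|---|
| Triplet+CE | 91.05 | 2.0 k |
| Contrastive+CE | 93.31 | 1.9 k |
| Multi-similarity+CE | 93.52 | 2.5 k |
| Proxy-NCA+CE | 92.81 | 1.3 k |
| Proxy-Anchor+CE | 91.13 | 1.0 k |
| Proposed method | 95.63 | 1.5 k |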

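Table 3 Performance of different metric loss functions on the ODIR5K dataset

| Loss function | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| Cross Entropy | 92.22 | 92.50 | 93.20 | 92.00 |
| Triplet+CE | 91.05 | 90.50 | 95.50 | 87.00 |
| Contrastive+CE | 93.31 | 93.70 | 95.20 | 92.30 |
| Multi-Similarity+CE | 93.52 | 94.73 | 97.32 | 92.82 |
| Proxy-NCA+CE | 92.81 | 93.60 | 96.80 | 91.40 |
| Proxy-Anchor+CE | 91.13 | 91.60 | 93.40 | 90.10 |
| Proposed method | 95.63 | 95.30 | 97.60 | 93.30 |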

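Table 4 Performance of different metric loss functions on the COVID Radiography dataset

| Loss function | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| Cross Entropy | 94.19 | 93.47 | 96.01 | 94.10 |
| Triplet+CE | 92.94 | 91.98 | 96.93 | 90.12 |
| Contrastive+CE | 95.21 | 94.97 | 96.24 | 95.64 |
| Multi-Similarity+CE | 95.62 | 96.52 | 98.17 | 95.79 |
| Proxy-NCA+CE | 94.35 | 95.62 | 97.78 | 95.92 |
| Proxy-Anchor+CE | 92.66 | 93.61 | 95.21 | 93.62 |
| Proposed method | 96.57 | 93.40 | 95.60 | 91.30 |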

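Table 5 Performance of different classifiers on the ODIR5K dataset

| Category | Accuracy: Proposed | Accuracy: MLP | Accuracy: NN |
|---|---|---|---|
| Age-related macular degeneration | 99.9 | 95.9 | 94.8 |
| Cataract | 100 | 99.1 | 96.5 |
| Diabetes | 96.8 | 88.2 | 91.5 |
| Glaucoma | 99.5 | 93.0 | 88.7 |
| Hypertension | 99.8 | 83.7 | 92.7 |
| Myopia | 99.9 | 97.0 | 99.0 |
| Normal | 95.2 | 93.6 | 93.8 |
| Other diseases | 98.2 | 89.3 | 92.5 |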

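Table 6 Performance of different classifiers on the COVID Radiography dataset

| Category | Accuracy: Proposed | Accuracy: MLP | Accuracy: NN |
|---|---|---|---|
| Normal | 95.58 | 93.3 | 97.3 |
| Lung opacity | 96.86 | 96.5 | 97.8 |
| COVID-19 | 97.50 | 90.7 | 94.6 |
| Viral pneumonia | 96.60 | 96.8 | 89.8 |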

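Table 7 Classifier performance under different thresholds on the ODIR5K dataset

| Threshold | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| 0.46 | 87.50 | 89.29 | 85.00 | 92.31 |
| 0.47 | 89.36 | 90.41 | 87.94 | 93.65 |
| 0.48 | 95.63 | 95.30 | 97.60 | 93.30 |
| 0.49 | 92.60 | 94.32 | 91.87 | 96.21 |
| 0.50 | 90.96 | 93.62 | 91.18 | 96.43 |
| 0.51 | 80.97 | 85.64 | 80.15 | 92.67 |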

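Table 8 Classifier performance under different thresholds on the COVID Radiography dataset

| Threshold | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| 0.46 | 89.51 | 90.61 | 87.20 | 94.30 |
| 0.47 | 91.38 | 92.61 | 89.87 | 95.54 |
| 0.48 | 96.57 | 93.40 | 95.60 | 91.30 |
| 0.49 | 94.56 | 95.56 | 92.85 | 98.44 |
| 0.50 | 92.91 | 95.10 | 92.21 | 98.19 |
| 0.51 | 82.82 | 87.91 | 82.13 | 94.58 |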
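Tables 7 and 8 report how accuracy peaks at one threshold (0.48) and falls off on either side. The sketch below shows the general shape of such a sweep. It is an illustration under stated assumptions, not the paper's classifier: this page does not say which score the threshold applies to, so the cosine similarity to class prototypes, the fallback class, and the names `threshold_classify` and `sweep_thresholds` are all hypothetical, and the toy data is made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def threshold_classify(embedding, prototypes, threshold, fallback):
    """Return the most similar class if its similarity clears the threshold.

    Otherwise return the fallback label (assumed here to be a default class).
    """
    best_label, best_sim = max(
        ((label, cosine(embedding, proto)) for label, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    return best_label if best_sim >= threshold else fallback


def sweep_thresholds(samples, prototypes, thresholds, fallback):
    """Accuracy of the threshold rule at each candidate threshold."""
    results = {}
    for t in thresholds:
        correct = sum(
            threshold_classify(emb, prototypes, t, fallback) == label
            for emb, label in samples
        )
        results[t] = correct / len(samples)
    return results


# Made-up prototypes and samples, used only to exercise the functions.
toy_prototypes = {"cataract": [1.0, 0.0], "glaucoma": [0.0, 1.0]}
toy_samples = [
    ([0.9, 0.1], "cataract"),
    ([0.2, 0.8], "glaucoma"),
    ([0.7, 0.7], "normal"),
]
toy_results = sweep_thresholds(toy_samples, toy_prototypes, [0.6, 0.8, 0.99], "normal")
best = max(toy_results, key=toy_results.get)
print(toy_results, "best threshold:", best)
```

In this toy setup a threshold that is too low assigns ambiguous samples to a disease class, and one that is too high rejects genuine matches, which is the same rise-then-fall pattern the two tables show around 0.48.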

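Table 9 Outcomes of metrics under different categories in the fundus dataset

| Category | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| Age-related macular degeneration | 99.9 | 98.9 | 100.0 | 97.9 |
| Cataract | 100.0 | 100 | 100.0 | 100.0 |
| Diabetes | 96.8 | 92.3 | 96.9 | 88.2 |
| Glaucoma | 99.5 | 91.8 | 100.0 | 84.8 |
| Hypertension | 99.8 | 92.7 | 95.0 | 90.5 |
| Myopia | 99.9 | 99.0 | 100.0 | 98.0 |
| Normal | 95.2 | 95.2 | 91.6 | 99.0 |
| Other diseases | 98.2 | 92.6 | 97.3 | 88.3 |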
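The per-category F1 values in Table 9 can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below recomputes F1 for three rows of the table. The English category names are translations, and `f1_score` is a helper written for this illustration rather than code from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 9, all in percent.
table9_rows = {
    "Diabetes": (96.9, 88.2, 92.3),
    "Glaucoma": (100.0, 84.8, 91.8),
    "Normal": (91.6, 99.0, 95.2),
}

for category, (precision, recall, reported) in table9_rows.items():
    recomputed = round(f1_score(precision, recall), 1)
    print(f"{category}: recomputed F1 = {recomputed}, reported F1 = {reported}")
```

The glaucoma row is a useful reminder of why F1 is reported alongside accuracy on imbalanced data: accuracy is 99.5 while recall is only 84.8, and the F1 of 91.8 reflects that gap.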

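Table 10 Outcomes of metrics under different categories in the pneumonia dataset

| Category | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| Normal | 95.58 | 97.83 | 99.12 | 96.58 |
| Lung opacity | 96.86 | 96.36 | 95.60 | 97.15 |
| COVID-19 | 97.50 | 98.94 | 98.33 | 99.63 |
| Viral pneumonia | 96.60 | 93.33 | 92.72 | 93.96 |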

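Table 11 Performance of different methods on the ODIR5K dataset

| Method | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| EfficientNetB3 [15] | 92.00 | - | 71.00 | 66.00 |
| MCGS-Net [16] | - | 89.66 | 65.88 | 61.60 |
| BFPC-Net [17] | 94.23 | 94.16 | 97.09 | 93.23 |
| DSRACNN [18] | 87.90 | 88.16 | 88.50 | - |
| Proposed | 95.63 | 95.30 | 97.60 | 93.30 |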

Table 12 Performance of different methods on the COVID Radiography dataset

| Method | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| COVID-CAPS [19] | 95.70 | - | - | - |
| DRNN [20] | 92.1 | 91.1 | 93.01 | - |
| Proposed | 96.57 | 93.40 | 95.60 | 91.30 |

[1] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770-778.
[2] Huang Z Z, Zhang J P, Shan H M. When age-invariant face recognition meets face age synthesis: a multi-task learning framework[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 7282-7291.
[3] Ji R, Wen L, Zhang L, et al. Attention convolutional binary neural tree for fine-grained visual categorization[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 10468-10477.
[4] Wei X S, Xie C W, Wu J, et al. Mask-CNN: localizing parts and selecting descriptors for fine-grained bird species categorization[J]. Pattern Recognition, 2018, 76: 704-714.
[5] Zheng H, Fu J, Zha Z J, et al. Learning deep bilinear transformation for fine-grained image representation[J]. Advances in Neural Information Processing Systems, 2019, 32: No.03621.
[6] Chang D, Ding Y, Xie J, et al. The devil is in the channels: Mutual-channel loss for fine-grained image classification[J]. IEEE Transactions on Image Processing, 2020, 29: 4683-4695.
[7] Bera A, Wharton Z, Liu Y, et al. SR-GNN: spatial relation-aware graph neural network for fine-grained image categorization[J]. IEEE Transactions on Image Processing, 2022, 31: 6017-6031.
[8] Sundgaard J V, Harte J, Bray P, et al. Deep metric learning for otitis media classification[J]. Medical Image Analysis, 2021, 71: No.102034.
[9] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16x16 words: transformers for image recognition at scale[J/OL].[2023-08-11].
[10] Guo H, Wang S. Long-tailed multi-label visual recognition by collaborative training on uniform and re-balanced samplings[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 15089-15098.
[11] Movshovitz-Attias Y, Toshev A, Leung T K, et al. No fuss distance metric learning using proxies[C]∥Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 360-368.
[12] Wang X, Han X, Huang W, et al. Multi-similarity loss with general pair weighting for deep metric learning[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 5022-5030.
[13] International Competition on Ocular Disease Intelligent Recognition[EB/OL]. [2021-11-18].
[14] Rahman T, Khandakar A, Qiblawey Y, et al. Exploring the effect of image enhancement techniques on COVID-19 detection using chest X-ray images[J]. Computers in Biology and Medicine, 2021, 132: No.104319.
[15] Wang J, Yang L, Huo Z, et al. Multi-label classification of fundus images with efficientnet[J]. IEEE Access, 2020, 8: 212499-212508.
[16] Lin J, Cai Q, Lin M. Multi-label classification of fundus images with graph convolutional network and self-supervised learning[J]. IEEE Signal Processing Letters, 2021, 28: 454-458.
[17] Li Z, Xu M, Yang X, et al. Multi-label fundus image classification using attention mechanisms and feature fusion[J]. Micromachines, 2022, 13(6): No.947.
[18] Yang X, Yi S. Multi-classification of fundus diseases based on DSRA-CNN[J]. Biomedical Signal Processing and Control, 2022, 77: No.103763.
[19] Afshar P, Heidarian S, Naderkhani F, et al. Covid-caps: a capsule network-based framework for identification of COVID-19 cases from X-ray images[J]. Pattern Recognition Letters, 2020, 138: 638-643.
[20] Panahi A, Askari M R, Akrami M, et al. Deep residual neural network for COVID-19 detection from chest X-ray images[J]. SN Computer Science, 2022, 3(2): No.169.