Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue 5: 1785-1791. doi: 10.13229/j.cnki.jdxbgxb20200532

• Computer Science and Technology •

Few-shot image classification method based on sliding feature vectors

Jie CAO1,2, Xue QU3, Xiao-xu LI1

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  2. Engineering Research Center of Urban Railway Transportation of Gansu Province, Lanzhou 730050, China
  3. School of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China
  • Received: 2020-07-15 Online: 2021-09-01 Published: 2021-09-16
  • Contact: Xiao-xu LI E-mail: caoj@lut.edu.cn; lixiaoxu@lut.edu.cn
  • About the author: Jie CAO (1966-), female, professor, master's degree. Research interest: artificial intelligence. E-mail: caoj@lut.edu.cn
  • Funding: National Natural Science Foundation of China (61906080)

Abstract:

In few-shot image classification, the feature maps of a handful of labeled samples per class cannot describe the whole class feature space, which leads to misclassification. To address this problem, we propose the Sliding Feature Vectors Neural Network (SFV). The method assembles the local sliding feature vectors of all samples from the same class to construct a class-level feature space, and then classifies each query sample with an image-to-class measure; that is, at the measurement stage SFV compares the query sample with the class as a whole. By incorporating the edge information of feature blocks and the correlation of their positions and structures, SFV expands the class feature space and makes fuller use of the deep feature maps when samples are extremely limited, which can ease the overfitting caused by small sample sizes. Experiments on benchmark datasets show that SFV performs well on every dataset tested and, on the fine-grained datasets, reaches the best accuracy among the compared methods in every setting except Stanford Cars 5-way 5-shot (Table 2).

Key words: computer application technology, computer vision, few-shot learning, local features, metric learning
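
As a rough illustration of the idea summarized in the abstract, the sketch below extracts sliding feature vectors from a deep feature map and scores a query image against a class with a nearest-neighbour image-to-class measure. The window size `a` mirrors the hyperparameter varied in Table 4. The stride of 1, the absence of padding, the cosine similarity, the top-`k` neighbour sum, and all function names are assumptions made for illustration; this page does not confirm any of them.

```python
import numpy as np


def sliding_feature_vectors(fmap: np.ndarray, a: int = 3) -> np.ndarray:
    """Flatten every a x a window of a (C, H, W) feature map into one vector.

    Assumes stride 1 and no padding, giving (H-a+1)*(W-a+1) vectors of
    dimension C*a*a. With a=1 this reduces to per-position local descriptors.
    """
    c, h, w = fmap.shape
    vectors = [
        fmap[:, i:i + a, j:j + a].reshape(-1)
        for i in range(h - a + 1)
        for j in range(w - a + 1)
    ]
    return np.stack(vectors)


def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def image_to_class_score(query_fmap, support_fmaps, a=3, k=1):
    """Sum, over query vectors, of the top-k cosine similarities to the class pool."""
    q = l2_normalize(sliding_feature_vectors(query_fmap, a))
    # The class feature space: vectors from all support samples of one class.
    pool = l2_normalize(np.concatenate(
        [sliding_feature_vectors(f, a) for f in support_fmaps]))
    sims = q @ pool.T                      # (num_query_vecs, num_pool_vecs)
    topk = np.sort(sims, axis=1)[:, -k:]   # k largest similarities per query vector
    return float(topk.sum())


def classify(query_fmap, class_to_support_fmaps, a=3, k=1):
    """Return the class label with the highest image-to-class score."""
    scores = {c: image_to_class_score(query_fmap, s, a, k)
              for c, s in class_to_support_fmaps.items()}
    return max(scores, key=scores.get)
```

Pooling the vectors of all support samples before the nearest-neighbour search is what makes the comparison image-to-class: adding shots enlarges the pool without changing the scoring code.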

CLC number:

  • TP183

Fig. 1

Architecture of the sliding feature vector neural network (SFV) in the 5-way 1-shot setting

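Table 1

Mean accuracy (%) of each algorithm for 5-way K-shot classification on Mini-ImageNet

| Category | Model | Embedding module | 5-way 1-shot | 5-way 5-shot |
|---|---|---|---|---|
| Baseline experiments | DN4 | Conv-64F | 51.24±0.74 | 71.02±0.64 |
| Baseline experiments | DN4* | Conv-64F | 51.78±0.80 | 70.15±0.73 |
| Baseline experiments | Global-feature KNN | Conv-64F | 44.54±0.76 | 49.52±0.75 |
| Metric-learning-based | Matching Networks* | Conv-64F | 43.56±0.84 | 55.31±0.73 |
| Metric-learning-based | Prototypical Networks* | Conv-64F | 48.45±0.96 | 66.53±0.51 |
| Metric-learning-based | Relation Network* | Conv-64F | 50.44±0.82 | 65.32±0.70 |
| Metric-learning-based | SFV | Conv-64F | 53.81±0.81 | 71.98±0.68 |
| Meta-learning-based | Baseline* | Conv-64F | 36.34±0.58 | 54.88±0.67 |
| Meta-learning-based | Meta-Learner LSTM | Conv-32 | 43.44±0.77 | 60.60±0.71 |
| Meta-learning-based | MAML* | Conv-32F | 48.70±1.84 | 63.11±0.92 |
| Meta-learning-based | TADAM | ResNet-12 | 58.50±0.30 | 76.70±0.30 |
| Meta-learning-based | MM-Net | Conv-32F | 53.37±0.48 | 66.97±0.35 |
| Meta-learning-based | LEO | WRN-28-10 | 61.76±0.08 | 77.59±0.12 |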
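
The page does not say how the ± values in the tables were computed. If they are 95% confidence intervals over test episodes, a common convention in few-shot evaluation, they could be obtained roughly as in the sketch below. The normal approximation, the function name, and the sample accuracies are all assumptions for illustration.

```python
import math
import statistics


def mean_with_ci95(accuracies):
    """Return (mean, half-width) of a normal-approximation 95% interval."""
    n = len(accuracies)
    mean = statistics.fmean(accuracies)
    std = statistics.stdev(accuracies)       # sample standard deviation
    half_width = 1.96 * std / math.sqrt(n)   # 1.96 is the 97.5th normal percentile
    return mean, half_width


# Hypothetical per-episode accuracies (%), for illustration only.
episode_acc = [60.0, 40.0, 60.0, 40.0]
mean, hw = mean_with_ci95(episode_acc)
print(f"{mean:.2f}±{hw:.2f}")
```

Under this assumption the half-width shrinks with the square root of the number of episodes, so intervals are only comparable between methods evaluated on a similar number of episodes.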

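Table 2

Mean accuracy (%) of each algorithm for 5-way K-shot classification on the fine-grained datasets

| Category | Model | Stanford Cars, 5-way 1-shot | Stanford Cars, 5-way 5-shot | Stanford Dogs, 5-way 1-shot | Stanford Dogs, 5-way 5-shot | CUB-200, 5-way 1-shot | CUB-200, 5-way 5-shot |
|---|---|---|---|---|---|---|---|
| Metric-learning-based | Global KNN | 39.08±0.77 | 41.61±0.71 | 41.92±0.82 | 45.91±0.82 | 41.53±0.81 | 49.01±0.80 |
| Metric-learning-based | DN4 | 59.84±0.80 | 88.65±0.44 | 45.41±0.76 | 63.51±0.62 | 46.84±0.81 | 74.92±0.64 |
| Metric-learning-based | DN4* | 60.73±0.85 | 88.45±0.49 | 48.90±0.82 | 69.63±0.74 | 54.28±0.92 | 74.66±0.75 |
| Metric-learning-based | Matching Networks* | 34.80±0.98 | 44.70±1.03 | 35.80±0.99 | 47.50±1.03 | 45.30±1.03 | 59.50±1.01 |
| Metric-learning-based | Prototypical Networks* | 40.90±1.01 | 52.93±1.03 | 37.59±1.00 | 48.19±1.03 | 37.36±1.00 | 45.28±1.03 |
| Metric-learning-based | SFV | 65.18±0.85 | 88.39±0.48 | 52.19±0.86 | 69.86±0.72 | 57.81±0.93 | 75.59±0.74 |
| Meta-learning-based | MAML* | 44.34±0.81 | 61.42±0.71 | 43.64±0.85 | 56.62±0.75 | 46.55±0.88 | 63.20±0.79 |
| Meta-learning-based | Baseline* | 28.53±0.52 | 40.41±0.58 | 29.58±0.49 | 42.24±0.63 | 32.00±0.58 | 51.01±0.69 |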

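Table 3

Ablation study of SFV: mean accuracy (%) for 5-way K-shot classification on CUB-200

| Model | 5-way 1-shot | 5-way 5-shot |
|---|---|---|
| Global KNN network (I) | 41.53±0.99 | 49.01±0.80 |
| Global KNN with global average pooling network (II) | 41.04±0.99 | 47.35±0.92 |
| Feature block network (III) | 54.55±0.94 | 69.75±0.80 |
| Sliding feature block network (IV) | 56.25±0.92 | 74.37±0.76 |
| Sliding feature vector network SFV (V) | 57.81±0.93 | 75.59±0.74 |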

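Table 4

Mean accuracy (%) of SFV for 5-way K-shot classification on CUB-200 under different values of the hyperparameter a

| Value of a | 5-way 1-shot | 5-way 5-shot |
|---|---|---|
| a=1 | 54.28±0.92 | 74.66±0.75 |
| a=3 | 57.81±0.93 | 75.59±0.74 |
| a=5 | 57.58±0.95 | 73.46±0.75 |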

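Table 5

Mean accuracy (%) of each framework for 5-way K-shot classification on CUB-200 with different numbers of support samples

| Model | 5-way 1-shot | 5-way 3-shot | 5-way 5-shot |
|---|---|---|---|
| Global KNN* | 41.53±0.81 | 48.71±0.91 | 49.01±0.80 |
| DN4* | 54.28±0.92 | 68.72±0.82 | 74.66±0.75 |
| SFV | 57.81±0.93 | 70.90±0.83 | 75.59±0.74 |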
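
The 5-way K-shot settings reported in the tables are built from episodes: N classes are drawn, then K labeled support samples and some query samples per class. A minimal sketch of such a sampler follows. The function name, the dictionary layout, and the default of 15 query samples per class are assumptions for illustration and are not taken from this page.

```python
import random


def sample_episode(labels_to_items, n_way=5, k_shot=1, n_query=15, rng=None):
    """Draw one N-way K-shot episode as (support, query) lists of (item, label)."""
    rng = rng or random.Random()
    # Pick the N classes that make up this episode.
    classes = rng.sample(sorted(labels_to_items), n_way)
    support, query = [], []
    for label in classes:
        # Sample without replacement so support and query never overlap.
        picked = rng.sample(labels_to_items[label], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query
```

Calling it with `k_shot=1`, `3`, or `5` would produce the three settings compared in Table 5; every class needs at least `k_shot + n_query` items.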