Supervised Contrastive Learning Text Classification Model Based on Double-Layer Data Augmentation

Abstract

To address the non-selective expansion and training deficiencies of the DoubleMix algorithm during data augmentation, we propose a supervised contrastive learning text classification model based on double-layer data augmentation, which improves text classification accuracy when training data are scarce. First, keyword-based data augmentation is applied to the original data at the input layer, selectively augmenting the data without regard to sentence structure. Second, the original and augmented data are interpolated in the BERT hidden layers and then sent to a TextCNN for further feature extraction. Finally, the model is trained with the Wasserstein distance and a double contrastive loss to improve classification accuracy. Comparative experiments on the SST-2, CR, TREC, and PC datasets show that the proposed method achieves classification accuracies of 93.41%, 93.55%, 97.61%, and 95.27%, respectively, which are higher than those of the classical algorithms used for comparison.
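The abstract does not give formulas for the interpolation or the loss, so the following is only a minimal sketch of two of the ideas it names: linear interpolation of an original and an augmented hidden vector, and a generic supervised contrastive loss. The function names, the mixing weight `lam`, and the `temperature` parameter are assumptions made for illustration, not the authors' implementation; the Wasserstein term and the exact form of the "double" contrastive loss are not reproduced here.

```python
import math


def interpolate(h_orig, h_aug, lam):
    """Mix two hidden vectors as lam * h_orig + (1 - lam) * h_aug.

    `lam` is a hypothetical mixing weight in [0, 1]; the abstract does not
    state how the weight is chosen.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(h_orig, h_aug)]


def _normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of feature vectors.

    For each anchor, samples with the same label are positives and all other
    samples form the denominator. Anchors without any positive are skipped.
    This is the textbook form, assumed here for illustration only.
    """
    feats = [_normalize(f) for f in features]
    n = len(feats)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0
```

In this sketch, a batch whose same-label features point in similar directions yields a smaller loss than a batch where same-label features are far apart, which is the behaviour a contrastive objective is meant to reward.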

Keywords: data augmentation, text classification, contrastive learning, supervised learning


WU Liang, ZHANG Fangfang, CHENG Chao, SONG Shinan. Supervised Contrastive Learning Text Classification Model Based on Double-Layer Data Augmentation [J]. Journal of Jilin University Science Edition, 2024, 62(5): 1179-1187.


