Journal of Jilin University Science Edition ›› 2024, Vol. 62 ›› Issue (5): 1179-1187.
WU Liang, ZHANG Fangfang, CHENG Chao, SONG Shinan
Abstract: To address the non-selective expansion and training deficiencies of the DoubleMix algorithm during data augmentation, we propose a supervised contrastive learning text classification model based on double-layer data augmentation, which effectively improves text classification accuracy when training data are scarce. First, keyword-based data augmentation is applied to the original data at the input layer, selectively augmenting the data without considering sentence structure. Second, the original and augmented data are interpolated in the BERT hidden layers and then sent to a TextCNN for further feature extraction. Finally, the model is trained with the Wasserstein distance and a double contrastive loss to improve classification accuracy. Comparative experiments on the SST-2, CR, TREC, and PC datasets show that the proposed method achieves classification accuracies of 93.41%, 93.55%, 97.61%, and 95.27%, respectively, outperforming classical algorithms.
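The hidden-layer interpolation and contrastive training described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the mixing weight `lam`, the `temperature` value, and the use of plain Python lists in place of BERT hidden states are all assumptions made for illustration. The loss shown is the commonly used supervised contrastive formulation, which may differ from the double contrastive loss in the paper. The Wasserstein term is omitted.

```python
import math


def interpolate_hidden(h_orig, h_aug, lam=0.7):
    """Mix two equally sized hidden vectors: lam * h_orig + (1 - lam) * h_aug.

    `lam` is an assumed mixing weight, not a value taken from the paper.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(h_orig, h_aug)]


def _normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Average, over anchors that have at least one same-label partner,
    of the negative mean log-probability assigned to those partners."""
    z = [_normalize(f) for f in features]
    n = len(z)
    losses = []
    for i in range(n):
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # this anchor has no same-class partner in the batch
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        losses.append(
            -sum(sims[j] - log_denom for j in positives) / len(positives)
        )
    return sum(losses) / len(losses) if losses else 0.0
```

In this sketch, a batch whose same-label samples lie close together in feature space yields a lower loss than the same features with mismatched labels, which is the behaviour a supervised contrastive objective is meant to reward.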
Key words: data augmentation, text classification, contrastive learning, supervised learning
WU Liang, ZHANG Fangfang, CHENG Chao, SONG Shinan. Supervised Contrastive Learning Text Classification Model Based on Double-Layer Data Augmentation[J]. Journal of Jilin University Science Edition, 2024, 62(5): 1179-1187.
URL: http://xuebao.jlu.edu.cn/lxb/EN/
http://xuebao.jlu.edu.cn/lxb/EN/Y2024/V62/I5/1179