Journal of Jilin University Science Edition ›› 2024, Vol. 62 ›› Issue (5): 1179-1187.

Supervised Contrastive Learning Text Classification Model Based on Double-Layer Data Augmentation

WU Liang, ZHANG Fangfang, CHENG Chao, SONG Shinan   

  1. College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
  • Received: 2023-08-04 Online: 2024-09-26 Published: 2024-09-26

Abstract: To address the non-selective data expansion and training deficiencies of the DoubleMix algorithm during data augmentation, we propose a supervised contrastive learning text classification model based on double-layer data augmentation, which effectively improves text classification accuracy when training data are scarce. First, keyword-based data augmentation is applied to the original data at the input layer, selectively augmenting the data without considering sentence structure. Second, the original and augmented data are interpolated in the BERT hidden layers and then passed to a TextCNN for further feature extraction. Finally, the model is trained with the Wasserstein distance and a double contrastive loss to improve classification accuracy. Comparative experiments on the SST-2, CR, TREC, and PC datasets show that the proposed method achieves classification accuracies of 93.41%, 93.55%, 97.61%, and 95.27%, respectively, outperforming classical algorithms.
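
The two core operations named in the abstract, hidden-layer interpolation and a supervised contrastive objective, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the mixing weight `lam`, the `temperature` value, and the use of a standard supervised contrastive loss over a single set of features are all assumptions made for illustration. The abstract does not specify the form of the "double contrastive loss" or how the Wasserstein distance enters training, so neither is reproduced here.

```python
import numpy as np

def interpolate_hidden(h_orig, h_aug, lam=0.7):
    """Linearly mix hidden states of an original and an augmented sample.

    lam is a hypothetical mixing weight on the original sample.
    """
    return lam * h_orig + (1.0 - lam) * h_aug

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Standard supervised contrastive loss over L2-normalized features.

    Samples sharing a label are treated as positives for each other.
    Assumes at least one sample has a same-label partner in the batch.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Exclude each sample's similarity with itself from the softmax.
    sim = np.where(self_mask, -np.inf, sim)
    sim_max = sim.max(axis=1, keepdims=True)
    shifted = sim - sim_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Positives: same label, different sample.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())

# Toy usage: two classes whose features form two tight clusters.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
mixed = interpolate_hidden(feats, feats[::-1])
loss = supervised_contrastive_loss(feats, labels)
```

In this toy setup the loss is lower when the labels agree with the feature clusters than when they are shuffled, which is the behavior a contrastive objective is meant to reward.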

Key words: data augmentation, text classification, contrastive learning, supervised learning

