Journal of Jilin University (Information Science Edition) ›› 2023, Vol. 41 ›› Issue (6): 1048-1053.


Research on Short Text Classification Based on BERT-BiGRU-CNN Model

CHEN Xuesong, ZOU Meng

  1. School of Electrical and Information Engineering, Northeast Petroleum University, Daqing 163318, China
  • Received: 2022-11-11; Online: 2023-11-30; Published: 2023-12-01
  • About the author: CHEN Xuesong (1972—), female, born in Daqing, Heilongjiang, is a professor and master's supervisor at Northeast Petroleum University. Her research interests include information security, information hiding, digital watermarking, and signal and information processing. (Tel) 86-13946990816; (E-mail) cxsnepu@163.com
  • Supported by: National Natural Science Foundation of China (61402099; 61702093)

Abstract: To address the problems that traditional language models cannot learn deep bidirectional representations and that classification models cannot adequately capture the salient features of text, a text classification model based on BERT-BiGRU-CNN (Bidirectional Encoder Representations from Transformers-Bidirectional Gated Recurrent Unit-Convolutional Neural Networks) is proposed. First, the BERT pre-trained language model is used for text representation. Second, the output of BERT is fed into a BiGRU to capture the global semantic information of the text. The output of the BiGRU layer is then fed into a CNN to capture local semantic features. Finally, the resulting feature vectors are passed to a Softmax layer to obtain the classification result. Experiments on a Chinese news headline dataset show that the BERT-BiGRU-CNN model achieves an F1 score of 0.9485, outperforming the other baseline models and demonstrating that BERT-BiGRU-CNN improves short text classification performance.
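
The four-stage pipeline described in the abstract maps naturally onto a small PyTorch module. The sketch below is a minimal illustration, not the authors' implementation: the bert-base-chinese checkpoint, GRU hidden size, convolution kernel sizes, filter count, and class count are all illustrative assumptions rather than the paper's reported settings.

```python
# Minimal sketch of the BERT-BiGRU-CNN pipeline described in the abstract.
# Assumptions (not from the paper): bert-base-chinese weights, GRU hidden
# size 128, kernel sizes (2, 3, 4), 64 filters, 10 news categories.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel

class BertBiGRUCNN(nn.Module):
    def __init__(self, num_classes=10, gru_hidden=128,
                 num_filters=64, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # Step 1: BERT yields deep bidirectional token representations.
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # Step 2: BiGRU captures global semantic information.
        self.bigru = nn.GRU(self.bert.config.hidden_size, gru_hidden,
                            batch_first=True, bidirectional=True)
        # Step 3: parallel 1-D convolutions capture local n-gram features.
        self.convs = nn.ModuleList(
            [nn.Conv1d(2 * gru_hidden, num_filters, k) for k in kernel_sizes])
        # Step 4: linear layer + Softmax produces the class distribution.
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, 768) contextual embeddings from BERT
        h = self.bert(input_ids=input_ids,
                      attention_mask=attention_mask).last_hidden_state
        h, _ = self.bigru(h)     # (batch, seq_len, 2 * gru_hidden)
        h = h.transpose(1, 2)    # Conv1d expects (batch, channels, seq_len)
        # Global max-pool each convolution's output over the time axis.
        feats = [F.relu(conv(h)).max(dim=2).values for conv in self.convs]
        logits = self.fc(torch.cat(feats, dim=1))
        return F.softmax(logits, dim=1)
```

In use, a BertTokenizer for the same checkpoint supplies input_ids and attention_mask. For training one would normally return the raw logits and use nn.CrossEntropyLoss, which applies log-Softmax internally; probabilities are returned here only to mirror the abstract's final Softmax step.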

Key words: text classification, bidirectional encoder representations from transformers (BERT) pre-trained model, bidirectional gated recurrent unit (BiGRU), convolutional neural networks (CNN)
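
An F1 score such as the 0.9485 reported above can be computed with a standard metrics library once predictions are available. The toy sketch below uses made-up labels, and macro averaging is an assumption, since the abstract does not state the averaging scheme.

```python
# Toy F1 computation with scikit-learn; labels are illustrative placeholders,
# and macro averaging is an assumption (the abstract does not specify it).
from sklearn.metrics import f1_score

y_true = [0, 2, 1, 1, 0]   # gold labels (illustrative)
y_pred = [0, 2, 1, 0, 0]   # model predictions (illustrative)
print(f1_score(y_true, y_pred, average="macro"))
```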

CLC number: TP391.1