Journal of Jilin University (Information Science Edition) ›› 2023, Vol. 41 ›› Issue (5): 866-875.


  • Author biography: CHEN Xuesong (1972— ), female, born in Daqing, Heilongjiang; professor at Northeast Petroleum University. Her research interests include information security, information hiding, digital watermarking, and signal and information processing. (Tel) 86-13946990816; (E-mail) cxsnepu@163.com
  • Funding:
    Supported by the National Natural Science Foundation of China (61402099; 61702093)

Ancient Chinese Named Entity Recognition Based on SikuBERT Model and MHA

CHEN Xuesong a, ZHAN Ziyi a, WANG Haochang b

  1. a. School of Electrical and Information Engineering; b. School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China
  • Received:2022-09-28 Online:2023-10-09 Published:2023-10-10


Abstract:

Traditional named entity recognition methods cannot fully learn the complex sentence structure of ancient Chinese, and long-sequence feature extraction tends to lose information. To address these problems, an ancient Chinese named entity recognition method that fuses the SikuBERT (Siku Bidirectional Encoder Representation from Transformers) model with MHA (Multi-Head Attention) is proposed. First, the SikuBERT model is pre-trained on an ancient Chinese corpus, and the resulting information vectors are fed into a BiLSTM (Bidirectional Long Short-Term Memory) network to extract features. The output features of the BiLSTM layer are then assigned different weights through MHA to reduce the information loss of long sequences, and finally the predicted sequence labels are obtained through CRF (Conditional Random Field) decoding. Experiments show that, compared with commonly used models such as BiLSTM-CRF and BERT-BiLSTM-CRF, the F1 score of this method is significantly higher, verifying that the method effectively improves ancient Chinese named entity recognition.
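The MHA re-weighting step described in the abstract can be illustrated with a minimal NumPy sketch of scaled dot-product multi-head self-attention applied to a sequence of BiLSTM output features. This is not the paper's implementation: the head count, feature dimension, and randomly initialised projection matrices are illustrative assumptions only.

```python
import numpy as np

def multi_head_attention(x, num_heads=4, rng=None):
    """Scaled dot-product multi-head self-attention over a feature sequence.

    x: (seq_len, d_model) array, e.g. BiLSTM output features.
    Returns a (seq_len, d_model) array of re-weighted features.
    Projection weights are randomly initialised here for illustration only.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_k = d_model // num_heads

    head_outputs = []
    for _ in range(num_heads):
        # Per-head query/key/value projections (hypothetical initialisation).
        w_q = rng.normal(scale=d_model ** -0.5, size=(d_model, d_k))
        w_k = rng.normal(scale=d_model ** -0.5, size=(d_model, d_k))
        w_v = rng.normal(scale=d_model ** -0.5, size=(d_model, d_k))
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        scores = q @ k.T / np.sqrt(d_k)               # (seq_len, seq_len)
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=-1, keepdims=True)      # softmax over positions
        head_outputs.append(attn @ v)                 # (seq_len, d_k)

    # Concatenate heads back to d_model and mix with an output projection.
    concat = np.concatenate(head_outputs, axis=-1)
    w_o = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
    return concat @ w_o

x = np.random.default_rng(1).normal(size=(10, 64))  # 10 tokens, 64-dim features
y = multi_head_attention(x, num_heads=4)
print(y.shape)  # (10, 64)
```

Each head attends over all positions, so distant tokens in a long sequence can still contribute to every output feature, which is the mechanism the method relies on to reduce long-sequence information loss.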

Key words: ancient Chinese, named entity recognition, Siku Bidirectional Encoder Representation from Transformers (SikuBERT) model, multi-head attention mechanism
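The final CRF decoding step mentioned in the abstract finds the highest-scoring tag sequence given per-token emission scores and tag-transition scores, which Viterbi decoding computes exactly. A minimal NumPy sketch follows; the emission and transition matrices here are toy inputs, not the paper's trained parameters.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """CRF decoding: return the highest-scoring tag sequence.

    emissions: (seq_len, num_tags) per-token tag scores.
    transitions: (num_tags, num_tags) score of moving from tag i to tag j.
    """
    seq_len, num_tags = emissions.shape
    score = emissions[0].copy()                    # best score ending in each tag
    backpointers = np.zeros((seq_len, num_tags), dtype=int)
    for t in range(1, seq_len):
        # total[i, j] = score of ending at tag j via previous tag i.
        total = score[:, None] + transitions + emissions[t][None, :]
        backpointers[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Trace the best path backwards from the highest-scoring final tag.
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        best.append(int(backpointers[t, best[-1]]))
    return best[::-1]

# Toy example: transitions strongly favour alternating between the two tags.
em = np.array([[5.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
tr = np.array([[0.0, 10.0], [10.0, 0.0]])
print(viterbi_decode(em, tr))  # [0, 1, 0]
```

Unlike greedy per-token argmax over emissions, the transition scores let the decoder enforce label-sequence constraints, which is why CRF decoding is the standard final layer in such tagging models.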

CLC number: 

  • TP391.1