Journal of Jilin University (Engineering and Technology Edition), 2023, Vol. 53, Issue 5: 1427-1434. doi: 10.13229/j.cnki.jdxbgxb.20210856

Chinese named entity recognition method based on Transformer and hidden Markov model

Jian LI (1, 2), Qi XIONG (1), Ya-ting HU (1), Kong-yu LIU (1)

  1. College of Information Technology, Jilin Agricultural University, Changchun 130118, China
  2. Jilin Bioinformatics Research Center, Changchun 130118, China
  • Received: 2021-08-31 Online: 2023-05-01 Published: 2023-05-25
  • Contact: Ya-ting HU E-mail: liemperor@163.com; huyating79@163.com

Abstract:

This paper proposes a word-level Chinese named entity recognition method based on the Transformer and a hidden Markov model (HMM). The positional encoding function of the Transformer is modified so that it expresses both the relative position and the direction between characters. The character sequence encoded by the Transformer is used to compute the transition matrix and the emission matrix of an HMM, which generates a set of soft named entity labels. These soft labels are fed into a Bert-NER model, whose parameters are updated with a divergence loss function, and the model outputs the final strong named entity labels that identify the entities. In comparative experiments, the proposed method reaches F1 values of 75.11% on the CLUENER-2020 dataset and 68.00% on the Weibo dataset, which improves Chinese named entity recognition performance.
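The soft-label step described in the abstract can be illustrated with a minimal NumPy sketch. It rests on two assumptions that the abstract does not confirm: that the HMM soft labels are the per-character posterior marginals from the forward-backward algorithm, and that the divergence loss is the KL divergence between those soft labels and the tagger's predicted distribution. The function names `forward_backward` and `kl_loss` and the toy matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Posterior label marginals (soft labels) for one sequence.

    pi: (K,)   initial label probabilities
    A:  (K, K) transition matrix, A[i, j] = P(label j at t+1 | label i at t)
    B:  (T, K) emission likelihoods, B[t, k] = P(observation t | label k)
    Returns gamma of shape (T, K); each row sums to 1.
    """
    T, K = B.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    # Forward pass, normalised at each step for numerical stability.
    alpha[0] = pi * B[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        alpha[t] /= alpha[t].sum()
    # Backward pass, normalised the same way.
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    # Row-wise normalisation removes the arbitrary scaling factors.
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

def kl_loss(soft, pred, eps=1e-12):
    """Mean KL(soft || pred) over sequence positions (assumed loss form)."""
    soft = np.clip(soft, eps, 1.0)
    pred = np.clip(pred, eps, 1.0)
    return float(np.mean(np.sum(soft * (np.log(soft) - np.log(pred)), axis=1)))

# Toy example with two labels (for instance O and ENTITY) and four characters.
pi = np.array([0.8, 0.2])
A = np.array([[0.9, 0.1],
              [0.3, 0.7]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.3, 0.7],
              [0.9, 0.1]])

soft_labels = forward_backward(pi, A, B)
uniform = np.full_like(soft_labels, 0.5)  # stand-in for an untrained tagger
print(soft_labels.round(3))
print("KL against a uniform prediction:", kl_loss(soft_labels, uniform))
```

In a full pipeline the `pred` argument would be the softmax output of the tagger for each character, and the loss would be minimised by gradient descent rather than evaluated once as here.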

Key words: artificial intelligence, Chinese named entity recognition, hidden Markov model (HMM), Transformer encoder, positional encoding

CLC Number: TP391.1

Fig. 1 Transformer-HMM-Bert-NER model

Fig. 2 Architecture of the HMM

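Table 1 CLUENER-2020 dataset

Dataset | Organization | Name | Address | Company | Government | Book | Game | Movie | Position | Scene | Total
Training set | 51124963315442116211547321165416799820
Validation set | 67 | 178 | 70 | 22 | 61 | 201 | 221 | 45 | 226 | 219 | 1310
Test set | 81 | 192 | 81 | 16 | 71 | 243 | 231 | 51 | 262 | 254 | 1482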

Table 2 Weibo dataset

Entity category | Named | Nominal | Total
Geo-political | 243 | 0 | 243
Location | 88 | 38 | 126
Organization | 224 | 31 | 255
Person | 721 | 636 | 1357

Table 3 Parameter settings

Parameter | Value
HMM epochs | 5
Transformer encoder layers | 6
Transformer self-attention heads | 12
Bert layers | 12
Bert self-attention heads | 12
Batch size | 32
Optimizer | Adam
Dropout rate | 0.4
Maximum sentence length | 180

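Table 4 Comparison of recognition results on the CLUENER-2020 dataset (F1 value by entity category, %)

Model | Organization | Name | Address | Company | Government | Book | Game | Movie | Position | Scene | Mean
BERT-NER | 68.76 | 79.45 | 75.64 | 68.14 | 69.18 | 79.65 | 67.76 | 71.56 | 78.65 | 77.56 | 73.63
Bert-NER* | 69.12 | 79.69 | 77.45 | 69.14 | 69.54 | 80.12 | 68.45 | 72.56 | 79.45 | 78.56 | 74.41
BiLSTM-CRF | 67.64 | 75.49 | 68.59 | 65.48 | 66.56 | 72.59 | 67.56 | 70.25 | 75.36 | 74.65 | 70.41
Transformer-CRF | 67.23 | 75.69 | 68.12 | 64.59 | 64.14 | 71.58 | 64.23 | 68.56 | 70.45 | 72.56 | 68.71
RoBERTa | 70.21 | 80.48 | 80.59 | 69.15 | 69.98 | 80.12 | 67.49 | 70.98 | 79.87 | 80.11 | 74.89
ALBERT | 70.25 | 79.58 | 80.69 | 70.25 | 70.03 | 80.24 | 67.59 | 71.48 | 80.16 | 80.12 | 75.03
Ours | 70.12 | 80.45 | 81.25 | 69.75 | 70.12 | 80.26 | 67.79 | 71.59 | 80.13 | 79.65 | 75.11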
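The mean column of Table 4 can be checked with a short Python sketch. It assumes the reported mean is the unweighted arithmetic mean of the ten per-category F1 values, which the page does not state explicitly; the name `macro_mean` is hypothetical, and the numbers are copied from the table.

```python
# Per-category F1 values (%) from Table 4, in the column order
# organization, name, address, company, government, book, game, movie,
# position, scene, paired with the reported mean.
table4 = {
    "BERT-NER": ([68.76, 79.45, 75.64, 68.14, 69.18, 79.65, 67.76, 71.56, 78.65, 77.56], 73.63),
    "Bert-NER*": ([69.12, 79.69, 77.45, 69.14, 69.54, 80.12, 68.45, 72.56, 79.45, 78.56], 74.41),
    "BiLSTM-CRF": ([67.64, 75.49, 68.59, 65.48, 66.56, 72.59, 67.56, 70.25, 75.36, 74.65], 70.41),
    "Transformer-CRF": ([67.23, 75.69, 68.12, 64.59, 64.14, 71.58, 64.23, 68.56, 70.45, 72.56], 68.71),
    "RoBERTa": ([70.21, 80.48, 80.59, 69.15, 69.98, 80.12, 67.49, 70.98, 79.87, 80.11], 74.89),
    "ALBERT": ([70.25, 79.58, 80.69, 70.25, 70.03, 80.24, 67.59, 71.48, 80.16, 80.12], 75.03),
    "Ours": ([70.12, 80.45, 81.25, 69.75, 70.12, 80.26, 67.79, 71.59, 80.13, 79.65], 75.11),
}

def macro_mean(values):
    """Unweighted mean over entity categories (assumed averaging scheme)."""
    return sum(values) / len(values)

for model, (per_category, reported) in table4.items():
    computed = macro_mean(per_category)
    print(f"{model}: computed {computed:.3f}, reported {reported:.2f}")
```

Under this assumption every recomputed mean falls within about 0.01 of the reported value, so the column is consistent with a simple average over categories.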

Table 5 Comparison of recognition results on the Weibo dataset (%)

Model | P | R | F1
Bert-NER* | 67.25 | 65.74 | 66.48
RoBERTa | 68.37 | 66.94 | 67.64
ALBERT | 68.34 | 67.01 | 67.66
Ours | 68.75 | 67.27 | 68.00
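The F1 column of Table 5 follows from the precision and recall columns through the standard harmonic-mean definition, F1 = 2PR / (P + R). The Python sketch below recomputes it from the table's P and R values; the name `f1_score` is a hypothetical helper written for this check, not a library call.

```python
# (precision, recall, reported F1) in percent, copied from Table 5.
table5 = {
    "Bert-NER*": (67.25, 65.74, 66.48),
    "RoBERTa": (68.37, 66.94, 67.64),
    "ALBERT": (68.34, 67.01, 67.66),
    "Ours": (68.75, 67.27, 68.00),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

for model, (p, r, reported) in table5.items():
    print(f"{model}: computed F1 {f1_score(p, r):.3f}, reported {reported:.2f}")
```

The recomputed values agree with the reported ones to within about 0.01, a difference small enough to come from rounding of the published P and R figures.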

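Table 6 Examples of named entity recognition (the tag in parentheses follows the recognized entity; Com: company, Pos: position)

Sentence | 小强同学即将成为微软中国的算法工程师 (Xiaoqiang is about to become an algorithm engineer at Microsoft China)
Bert-NER | 小强同学即将成为微软(Com)中国的算法工程师(Pos)
BiLSTM-CRF | 小强同学即将成为微软(Com)中国的算法工程师(Pos)
Ours | 小强同学即将成为微软中国(Com)的算法工程师(Pos)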