Journal of Jilin University (Engineering and Technology Edition), 2022, Vol. 52, Issue 6: 1428-1433. doi: 10.13229/j.cnki.jdxbgxb20210443

Automatic completion algorithm for missing links in knowledge graph considering data sparsity

Wen-jun WANG1, Yin-feng YU2,3,4

  1. College of Computer and Network Engineering, Shanxi Datong University, Datong 037009, China
  2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  3. State Key Laboratory of Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China
  4. School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  • Received: 2021-05-19 Online: 2022-06-01 Published: 2022-06-02

Abstract:

For the Yungang Grottoes corpus, whose data are sparse, completion of the sparse data is performed through manual interpretation of semantic rules, which limits the accuracy of subsequent work to some extent. To address this, an automatic completion algorithm for missing links in knowledge graphs that takes data sparsity into account is proposed. First, the neighborhood structure of the data is defined, and a sparsity-aware knowledge graph embedding model is constructed to extract unknown relations in the data. Then, a long short-term memory (LSTM) network is used to automatically fill in missing data in variable-length sequences, thereby completing the missing links of the knowledge graph. Finally, the algorithm is applied to the construction of the Yungang Grottoes knowledge graph. Experimental results show that, on the same data set and with data sparsity considered, the accuracy of the algorithm reaches up to 95.4%, much higher than that of the traditional algorithms compared.
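The abstract does not specify the embedding model, so the following is only a minimal sketch of the general idea of using an entity's neighborhood to support link completion when data are sparse. It swaps in a TransE-style translational score (head + relation ≈ tail) as an assumption, and it does not cover the LSTM step. All entity names, relation names, vectors, and the helper functions `aggregate_neighbors` and `rank_tails` are hypothetical.

```python
import numpy as np

def aggregate_neighbors(entity, triples, ent_vecs, rel_vecs):
    """Average the entity's own vector with estimates implied by its known triples."""
    estimates = [ent_vecs[entity]]
    for h, r, t in triples:
        if h == entity:
            # Under the translational assumption h + r ≈ t, so h ≈ t - r.
            estimates.append(ent_vecs[t] - rel_vecs[r])
        elif t == entity:
            # Likewise t ≈ h + r.
            estimates.append(ent_vecs[h] + rel_vecs[r])
    return np.mean(estimates, axis=0)

def rank_tails(head, relation, candidates, triples, ent_vecs, rel_vecs):
    """Rank candidate tails for the missing link (head, relation, ?), best first."""
    h = aggregate_neighbors(head, triples, ent_vecs, rel_vecs)
    target = h + rel_vecs[relation]
    # Higher score means the candidate lies closer to the predicted tail position.
    scores = {c: -np.linalg.norm(target - ent_vecs[c]) for c in candidates}
    return sorted(candidates, key=lambda c: scores[c], reverse=True)

# Hypothetical toy embeddings in two dimensions.
ent_vecs = {
    "cave": np.array([0.0, 0.0]),
    "statue": np.array([1.0, 0.0]),
    "dynasty": np.array([0.0, 1.0]),
    "city": np.array([1.0, 1.0]),
}
rel_vecs = {
    "contains": np.array([1.0, 0.0]),
    "built_in": np.array([0.0, 1.0]),
}
known_triples = [("cave", "contains", "statue")]

# Query the missing link ("cave", "built_in", ?).
ranking = rank_tails("cave", "built_in", ["statue", "dynasty", "city"],
                     known_triples, ent_vecs, rel_vecs)
print(ranking)
```

In this toy setup the single known triple is consistent with the stored vector for "cave", so the aggregated head vector is unchanged and the candidate closest to head + relation is ranked first. With sparse real data, the averaging step is where neighborhood evidence would compensate for a poorly trained entity vector.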

Key words: computer application, data sparsity, knowledge graph, completion algorithm, entity relationship

CLC Number: 

  • TP391.1

Fig. 1 Construction flow of entity vector structure

Fig. 2 Overall structure of attention network model

Fig. 3 Structure diagram of relation extraction model in knowledge graph

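Table 1 Related parameter values of experimental data set

Parameter                      Value      Parameter                  Value
Number of entity relations     24 987     Average path length        4.5
Number of query relations      32         Missing data items         320 000
Number of entities             389 115    Experimental data items    640 000
Number of entity attributes    2 137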

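Table 2 Comparison of test results of different completion algorithms

Algorithm                          Accuracy/%   Recall/%   F1-score
Proposed (this paper)              95.4         10.9       0.0277
Traditional LFM-based method       79.7         7.9        0.0253
Traditional SVD-based method       80.2         6.9        0.0231
Traditional UserCF-based method    81.1         9.7        0.0195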