Journal of Jilin University (Engineering and Technology Edition) ›› 2025, Vol. 55 ›› Issue (11): 3673-3685. doi: 10.13229/j.cnki.jdxbgxb.20240177

• Computer Science and Technology •

Knowledge graph alignment based on entity reliable path and semantic aggregates

Hong-bin WANG1,2, Hao-dong TANG1,2, Yan-tuan XIAN1,2, Bo LIU3, Xin-liang GU3

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  2. Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming University of Science and Technology, Kunming 650500, China
  3. Fawer Automotive Parts Limited Company, Changchun 130012, China
  • Received: 2024-02-23 Online: 2025-11-01 Published: 2026-02-03
  • Contact: Yan-tuan XIAN E-mail: whbin2007@126.com; 195426286@qq.com
  • About the author: Hong-bin WANG (1983-), male, professor, Ph.D. Research interests: natural language processing, information retrieval, and machine learning. E-mail: whbin2007@126.com
  • Supported by:
    Key Research and Development Program of Yunnan Province (202202AD080003)

Abstract:

Knowledge graphs contain numerous multi-step relation paths that indicate the semantic relationships between entities, yet existing methods rarely exploit this path information and also overlook the neighborhood heterogeneity between the relation structure and the attribute structure of a knowledge graph. To address these problems, this paper proposes an entity reliable path information semantic augmentation model that integrates the relation and path structure information of the knowledge graph with its heterogeneous information. An initial reliable path reasoning algorithm is designed to capture and aggregate multi-source information from aligned entities and their heterogeneous neighbors. The model then aggregates each entity's relation structure, attribute structure, and entity name path set information for semantic augmentation, which addresses the neighborhood heterogeneity problem in knowledge graph alignment. The model was evaluated on three datasets (WK31-15K, DBP-15K and DWY-100K). Compared with other state-of-the-art entity alignment methods, Hits@1 improved by 1.5%~3.2%, indicating that the proposed method performs better.

Key words: knowledge graphs, entity alignment, reliable path information, attribute structures, semantic augmentation
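To make the notion of a multi-step relation path concrete, the following is a minimal sketch that enumerates relation paths of bounded length between two entities in a toy graph. It is not the paper's reliable path reasoning algorithm, which this page does not describe; the function name `relation_paths`, the `max_hops` parameter, and the toy triples are all hypothetical.

```python
from collections import defaultdict


def relation_paths(triples, source, target, max_hops=3):
    """Enumerate relation paths of at most max_hops steps from source to target.

    Illustrative only: a plain depth-first search over (head, relation, tail)
    triples, with no notion of path reliability.
    """
    # Adjacency list: head entity -> list of (relation, tail entity).
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))

    paths = []
    # Each stack item: (current entity, relations so far, visited entities).
    stack = [(source, [], {source})]
    while stack:
        entity, relations, visited = stack.pop()
        if entity == target and relations:
            paths.append(tuple(relations))
            continue
        if len(relations) == max_hops:
            continue
        for relation, neighbor in graph[entity]:
            if neighbor not in visited:  # avoid cycles
                stack.append((neighbor, relations + [relation], visited | {neighbor}))
    return sorted(paths)


# Hypothetical toy triples, not taken from any of the evaluated datasets.
toy_triples = [
    ("Kunming", "capital_of", "Yunnan"),
    ("Yunnan", "province_of", "China"),
    ("Kunming", "located_in", "China"),
]
print(relation_paths(toy_triples, "Kunming", "China"))
```

In this toy graph the one-hop path `located_in` and the two-hop path `capital_of`, `province_of` both connect the same pair of entities, which is the kind of redundancy a path-aware alignment model can draw on.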

CLC Number:

  • TP391.1

Fig. 1

Comparison of the same entity in different knowledge graphs

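Table 1

Symbols and descriptions

| Symbol | Description |
| --- | --- |
| Xe_init | Initial embedding matrix of entity names |
| Xpt | Output embedding matrix of the entity path set |
| Xv_init | Initial embedding matrix of attribute values |
| Xat | Output embedding matrix based on attention-aware attribute triples |
| R^d | d-dimensional matrix space |
| L | Linear connection function |
| Xe | Entity name embedding matrix |
| Xr | Relation embedding matrix |
| Xv | Attribute value embedding matrix |
| Xein | Input matrix for semantic augmentation of relation triples and attribute triples |
|  | Multiplication operation |
| σ | ReLU function |
| Xp | Entity path embedding matrix |
| Xrt | Output embedding matrix based on attention-aware relation triples |
| Xm | Attribute embedding matrix |
| Xeout | Semantic augmentation output matrix based on multi-head attention-aware aggregation |
|  | Vector concatenation |
| η | LeakyReLU nonlinear function |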

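Fig. 2

Overall architecture of EPSA

Fig. 3

Architecture of entity path matching

Fig. 4

Schematic of path neighborhood matching between pre-aligned entities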

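Table 2

Dataset statistics

| Dataset | Pair | KG | Entities | Relations | Relation triples | Attributes | Attribute values | Attribute triples |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WK31-15K | EN-DE(V1) | EN | 15 000 | 215 | 47 676 | 286 | 49 956 | 837 555 |
| | | DE | 15 000 | 131 | 50 419 | 194 | 57 661 | 156 150 |
| | EN-DE(V2) | EN | 15 000 | 169 | 84 867 | 171 | 45 805 | 81 988 |
| | | DE | 15 000 | 96 | 92 632 | 116 | 57 652 | 186 335 |
| | EN-FR(V1) | EN | 15 000 | 267 | 47 334 | 308 | 45 783 | 73 121 |
| | | FR | 15 000 | 210 | 40 864 | 404 | 39 772 | 67 167 |
| | EN-FR(V2) | EN | 15 000 | 193 | 96 318 | 189 | 36 391 | 66 899 |
| | | FR | 15 000 | 166 | 80 112 | 221 | 32 412 | 68 779 |
| DBP-15K | JA-EN(DBP) | JA | 19 814 | 1 299 | 77 214 | 5 311 | 84 814 | 216 841 |
| | | EN | 19 780 | 1 153 | 93 484 | 4 791 | 71 948 | 199 917 |
| | FR-EN(DBP) | FR | 19 661 | 903 | 105 998 | 4 136 | 93 680 | 212 609 |
| | | EN | 19 993 | 1 208 | 115 722 | 5 844 | 88 202 | 259 496 |
| DWY-100K | DBP-WD | DBpedia | 100 000 | 330 | 463 294 | 356 | 310 728 | 622 331 |
| | | WikiPedia | 100 000 | 220 | 448 774 | 730 | 765 610 | 990 517 |
| | DBP-YG | DBpedia | 100 000 | 302 | 428 952 | 341 | 365 617 | 760 062 |
| | | YAGO3 | 100 000 | 31 | 502 563 | 33 | 567 597 | 827 671 |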

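Table 3

Cross-validation results of all models on the WK31-15K and DBP-15K datasets (%)

| Model | EN-DE(V1) Hits@1 | EN-DE(V1) Hits@5 | EN-DE(V1) MRR | EN-DE(V2) Hits@1 | EN-DE(V2) Hits@5 | EN-DE(V2) MRR | EN-FR(V1) Hits@1 | EN-FR(V1) Hits@5 | EN-FR(V1) MRR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MTransE | 30.70 | 51.85 | 40.73 | 19.37 | 35.21 | 27.42 | 24.77 | 46.77 | 35.16 |
| IPTransE | 35.00 | 51.54 | 43.00 | 47.64 | 67.82 | 57.15 | 16.92 | 32.00 | 24.32 |
| JAPE | 28.80 | 51.27 | 39.45 | 16.72 | 32.93 | 25.00 | 26.24 | 49.71 | 37.20 |
| BootEA | 67.50 | 82.00 | 74.00 | 83.35 | 91.25 | 86.99 | 50.70 | 71.82 | 60.34 |
| AttrE | 57.12 | 68.74 | 59.74 | 65.09 | 81.65 | 72.62 | 48.16 | 67.12 | 56.90 |
| RDGCN | 81.98 | 87.56 | 84.60 | 81.61 | 86.98 | 84.10 | 80.53 | 87.66 | 83.70 |
| NMN | 85.57 | 90.45 | 87.70 | 85.18 | 89.57 | 87.10 | 85.12 | 90.74 | 87.66 |
| RAGA | 87.90 | 94.28 | 90.80 | 81.34 | 89.15 | 84.9 | 82.71 | 91.55 | 86.70 |
| RHGT | 92.18 | 96.32 | 94.40 | 93.80 | 97.20 | 95.30 | 90.92 | 95.54 | 93.00 |
| EPSA | 94.93 | 97.17 | 96.09 | 97.06 | 98.84 | 97.88 | 93.57 | 96.22 | 94.84 |

| Model | EN-FR(V2) Hits@1 | EN-FR(V2) Hits@5 | EN-FR(V2) MRR | JA-EN(DBP) Hits@1 | JA-EN(DBP) Hits@5 | JA-EN(DBP) MRR | FR-EN(DBP) Hits@1 | FR-EN(DBP) Hits@5 | FR-EN(DBP) MRR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MTransE | 24.02 | 43.60 | 33.69 | 20.41 | 40.52 | 30.36 | 19.74 | 40.37 | 29.72 |
| IPTransE | 23.63 | 44.91 | 33.97 | 27.92 | 52.72 | 39.61 | 31.22 | 57.42 | 43.46 |
| JAPE | 29.24 | 52.48 | 40.23 | 23.86 | 44.59 | 34.00 | 22.98 | 45.22 | 33.66 |
| BootEA | 66.00 | 85.00 | 74.54 | 52.71 | 71.89 | 61.68 | 57.61 | 77.27 | 66.62 |
| AttrE | 53.51 | 74.62 | 63.10 | 35.96 | 60.31 | 47.52 | 40.21 | 66.09 | 52.22 |
| RDGCN | 87.12 | 92.88 | 89.80 | 81.22 | 87.98 | 84.40 | 80.88 | 88.08 | 84.20 |
| NMN | 89.29 | 94.28 | 91.57 | 84.29 | 90.47 | 87.00 | 83.46 | 90.10 | 86.40 |
| RAGA | 88.95 | 95.36 | 91.90 | 79.29 | 89.12 | 83.80 | 85.27 | 93.17 | 88.90 |
| RHGT | 94.95 | 98.00 | 96.30 | 88.64 | 94.30 | 91.20 | 88.92 | 95.59 | 91.90 |
| EPSA | 96.57 | 98.60 | 97.66 | 89.56 | 95.93 | 92.28 | 89.82 | 95.64 | 92.25 |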
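The results are reported as Hits@1, Hits@5, and MRR in percent. As a reminder of how these ranking metrics are usually defined, here is a minimal sketch that computes them from the rank of the correct counterpart for each test entity. The standard definitions are assumed; the paper's exact evaluation protocol is not shown on this page, and the function names and the example ranks are hypothetical.

```python
def hits_at_k(ranks, k):
    """Percentage of test entities whose correct match is ranked within the top k."""
    return 100.0 * sum(1 for rank in ranks if rank <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all test entities, expressed as a percentage."""
    return 100.0 * sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical 1-based ranks of the true counterpart for five test entities.
example_ranks = [1, 1, 2, 5, 10]

print("Hits@1:", hits_at_k(example_ranks, 1))
print("Hits@5:", hits_at_k(example_ranks, 5))
print("MRR:", mean_reciprocal_rank(example_ranks))
```

Under these definitions Hits@5 can never be lower than Hits@1, and MRR always lies at or above Hits@1, which is a quick sanity check when reading result tables.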

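Table 4

Overall performance of all models on the DWY-100K dataset (%)

| Model | DBP-WD Hits@1 | DBP-WD Hits@5 | DBP-WD MRR | DBP-YG Hits@1 | DBP-YG Hits@5 | DBP-YG MRR |
| --- | --- | --- | --- | --- | --- | --- |
| MultiKE | 91.86 | 96.26 | 93.55 | 88.03 | 95.32 | 90.68 |
| RDGCN | 97.90 | 99.13 | - | 94.75 | 97.34 | - |
| NMN | 98.12 | 99.20 | - | 96.00 | 98.27 | - |
| COTSAE | 92.68 | 97.86 | 94.57 | 94.39 | 98.74 | 96.14 |
| RSA | 98.54 | 99.29 | - | 97.20 | 97.90 | - |
| RHGT | 99.26 | 99.86 | 99.50 | 96.58 | 98.86 | 97.40 |
| EPSA | 99.42 | 99.27 | 99.77 | 98.72 | 98.91 | 99.25 |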

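Table 5

Ablation experiments on different modules of EPSA (%)

| Model | EN-DE(V1) Hits@1 | EN-DE(V1) Hits@5 | EN-DE(V1) MRR | EN-DE(V2) Hits@1 | EN-DE(V2) Hits@5 | EN-DE(V2) MRR | EN-FR(V1) Hits@1 | EN-FR(V1) Hits@5 | EN-FR(V1) MRR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EPSA | 94.93 | 97.17 | 96.09 | 97.06 | 98.84 | 97.88 | 93.57 | 96.22 | 94.84 |
| RHGT | 92.18 | 96.32 | 94.40 | 93.80 | 97.20 | 95.30 | 90.92 | 95.54 | 93.00 |
| (w/o EP) | 93.31 | 96.51 | 95.50 | 95.21 | 97.80 | 96.90 | 92.21 | 96.15 | 94.50 |
| (w/o SA) | 93.75 | 96.85 | 95.08 | 95.73 | 98.47 | 96.96 | 92.45 | 95.89 | 94.62 |

| Model | EN-FR(V2) Hits@1 | EN-FR(V2) Hits@5 | EN-FR(V2) MRR | JA-EN(DBP) Hits@1 | JA-EN(DBP) Hits@5 | JA-EN(DBP) MRR | FR-EN(DBP) Hits@1 | FR-EN(DBP) Hits@5 | FR-EN(DBP) MRR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EPSA | 96.57 | 98.60 | 97.66 | 89.56 | 95.93 | 92.28 | 89.82 | 95.64 | 92.25 |
| RHGT | 94.95 | 98.00 | 96.30 | 88.64 | 94.30 | 91.20 | 88.92 | 95.59 | 91.90 |
| (w/o EP) | 95.10 | 98.14 | 96.80 | 86.95 | 92.69 | 89.65 | 87.23 | 93.31 | 90.50 |
| (w/o SA) | 95.25 | 98.58 | 97.11 | 87.32 | 92.86 | 89.82 | 87.65 | 93.72 | 91.25 |

| Model | DBP-WD Hits@1 | DBP-WD Hits@5 | DBP-WD MRR | DBP-YG Hits@1 | DBP-YG Hits@5 | DBP-YG MRR |
| --- | --- | --- | --- | --- | --- | --- |
| EPSA | 99.42 | 99.27 | 99.77 | 98.72 | 98.91 | 99.25 |
| RHGT | 99.26 | 99.86 | 99.50 | 96.58 | 98.86 | 97.40 |
| (w/o EP) | 98.49 | 99.27 | 99.60 | 98.94 | 98.99 | 98.45 |
| (w/o SA) | 98.21 | 99.59 | 99.57 | 98.55 | 99.26 | 98.36 |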
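One way to read the ablation results is to compute how much Hits@1 drops when a module is removed. The sketch below does this arithmetic for the WK31-15K and DBP-15K language pairs, using the Hits@1 values reported for EPSA, (w/o EP), and (w/o SA). Reading EP and SA as the entity path and semantic augmentation modules is an assumption, and the helper name `ablation_drop` is hypothetical.

```python
# Hits@1 (%) copied from the ablation results for six language pairs.
hits1 = {
    "EN-DE(V1)": {"EPSA": 94.93, "w/o EP": 93.31, "w/o SA": 93.75},
    "EN-DE(V2)": {"EPSA": 97.06, "w/o EP": 95.21, "w/o SA": 95.73},
    "EN-FR(V1)": {"EPSA": 93.57, "w/o EP": 92.21, "w/o SA": 92.45},
    "EN-FR(V2)": {"EPSA": 96.57, "w/o EP": 95.10, "w/o SA": 95.25},
    "JA-EN(DBP)": {"EPSA": 89.56, "w/o EP": 86.95, "w/o SA": 87.32},
    "FR-EN(DBP)": {"EPSA": 89.82, "w/o EP": 87.23, "w/o SA": 87.65},
}


def ablation_drop(scores, variant):
    """Hits@1 lost (in percentage points) when the given variant replaces full EPSA."""
    return {
        dataset: round(row["EPSA"] - row[variant], 2)
        for dataset, row in scores.items()
    }


for variant in ("w/o EP", "w/o SA"):
    print(variant, ablation_drop(hits1, variant))
```

On these six pairs the drop from removing EP is larger than the drop from removing SA, and both drops are largest on the two DBP-15K pairs.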

Fig. 5

Hits@k comparison of EPSA and all other models on EN-DE(V2)

Fig. 6

Hits@[1,5] performance of EPSA on EN-FR(V2) with different proportions of training data
