Knowledge graph embedding with adaptive sampling

doi:10.13229/j.cnki.jdxbgxb20180791

Journal of Jilin University (Engineering and Technology Edition), 2020, Vol. 50, Issue (2): 685-691.

• Computer Science and Technology •

Dan-tong OUYANG^1,2, Cong MA^1,2, Jing-pei LEI^1,2, Sha-sha FENG^1,2

^1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
^2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China

Received: 2018-07-29 Online: 2020-03-01 Published: 2020-03-08
Contact: Jing-pei LEI E-mail: ouyd@jlu.edu.cn; 378666306@qq.com
About the author: Dan-tong OUYANG (1968-), female, professor, doctoral supervisor. Research interests: model-based diagnosis, Semantic Web. E-mail: ouyd@jlu.edu.cn
Funding:
National Natural Science Foundation of China (61872159)

Abstract:

Knowledge graph data are imbalanced across relation categories, and training instances differ in difficulty, so sampling training data uniformly at random can keep an embedding model from converging quickly. To address this problem, an adaptive method for sampling training data is proposed. The training data are grouped by relation category. During sampling, a relation category is first chosen according to a probability, and an instance is then selected at random from the chosen group for training. The probability with which each group is selected is adjusted adaptively according to the training effect. Experimental results show that adaptive group sampling achieves better results on the link prediction task and enables the embedding model to converge faster and reach better results.

Key words: artificial intelligence, knowledge graph embedding, translation-based embedding models, adaptive sampling, link prediction
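
The two-stage sampling described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the group probabilities are adjusted, so the update rule here (an exponential moving average of the most recent loss per group) and the names `AdaptiveGroupSampler`, `smoothing`, and `floor` are assumptions made for the example.

```python
import random
from collections import defaultdict


class AdaptiveGroupSampler:
    """Pick a relation group by probability, then a triple uniformly inside it."""

    def __init__(self, triples, smoothing=0.9, floor=1e-3, seed=0):
        # Group (head, relation, tail) triples by their relation.
        self.groups = defaultdict(list)
        for head, relation, tail in triples:
            self.groups[relation].append((head, relation, tail))
        self.relations = list(self.groups)
        # Every relation group starts with the same weight.
        self.weights = {r: 1.0 for r in self.relations}
        self.smoothing = smoothing  # assumed: moving-average factor for updates
        self.floor = floor          # assumed: keeps every group reachable
        self.rng = random.Random(seed)

    def probabilities(self):
        # Normalize the weights into a probability per relation group.
        total = sum(self.weights.values())
        return {r: w / total for r, w in self.weights.items()}

    def sample(self):
        # Step 1: choose a relation group according to the current weights.
        weights = [self.weights[r] for r in self.relations]
        relation = self.rng.choices(self.relations, weights=weights, k=1)[0]
        # Step 2: choose one triple uniformly at random from that group.
        return self.rng.choice(self.groups[relation])

    def update(self, relation, loss):
        # Assumed rule: move the group's weight toward its latest loss, so
        # groups that are still hard to fit are sampled more often.
        old = self.weights[relation]
        new = self.smoothing * old + (1.0 - self.smoothing) * loss
        self.weights[relation] = max(new, self.floor)
```

In a training loop, one would call `sample()` to draw a triple, compute the embedding model's loss on it, and pass that loss to `update()` for the triple's relation. Other feedback signals, such as the change in loss, would fit the same interface.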

CLC number:

TP391

Dan-tong OUYANG,Cong MA,Jing-pei LEI,Sha-sha FENG. Knowledge graph embedding with adaptive sampling[J]. Journal of Jilin University(Engineering and Technology Edition), 2020, 50(2): 685-691.

Figures and tables (4): Table 1, Table 2, Figure 1, Figure 2

Dataset	Entities	Relations	Training set	Validation set	Test set
FB15k	14 951	1 345	483 142	50 000	59 071
WN18	40 943	18	141 442	5 000	5 000
FB15k-237	14 541	237	272 115	17 535	20 466

Dataset	Metric	TransE	AST	TransE_NZL	AST_NZL
FB15k	Mean Rank	142	134	144	117
FB15k	Hits@10	0.714	0.716	0.733	0.795
WN18	Mean Rank	490	457	456	425
WN18	Hits@10	0.932	0.939	0.926	0.946
FB15k-237	Mean Rank	252	255	319	308
FB15k-237	Hits@10	0.422	0.423	0.443	0.458
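
For readers unfamiliar with the two metrics in the table above, the sketch below shows how Mean Rank and Hits@10 are usually computed in link prediction. It assumes the common definitions (rank of the correct entity among all candidates, with lower scores meaning more plausible triples, as in distance-based models); the page does not say whether the reported numbers use the raw or the filtered setting, and the function names are illustrative.

```python
def rank_of_correct(scores, correct_index):
    """Rank (1 = best) of the correct candidate; lower score is more plausible."""
    correct = scores[correct_index]
    # Count candidates that score strictly better than the correct one.
    return 1 + sum(1 for s in scores if s < correct)


def mean_rank(ranks):
    """Average rank of the correct entity over all test triples (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example: ranks of the correct entity for four test triples.
ranks = [1, 3, 12, 250]
print(mean_rank(ranks))        # 66.5
print(hits_at_k(ranks, k=10))  # 0.5
```

A single badly ranked triple (250 in the toy example) dominates Mean Rank but barely affects Hits@10, which is one reason both metrics are reported side by side.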
