Journal of Jilin University (Engineering and Technology Edition) ›› 2019, Vol. 49 ›› Issue (3): 912-919. doi: 10.13229/j.cnki.jdxbgxb20180042

Distant supervision for relation extraction with weak constraints of entity pairs

Dan-tong OUYANG 1,2, Jun XIAO 1,2, Yu-xin YE 2

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun 130012, China
  • Received: 2018-01-12 Online: 2019-05-01 Published: 2019-07-12
  • Contact: Yu-xin YE E-mail: ouyd@jlu.edu.cn; yeyx@jlu.edu.cn
  • About the author: Dan-tong OUYANG (1968-), female, professor, doctoral supervisor. Research interests: model-based diagnosis, semantic web. E-mail: ouyd@jlu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61672261, 61502199, 61402196); Natural Science Foundation of Zhejiang Province (LY16F020004)

Abstract:

To alleviate the false-positive problem in distantly supervised relation extraction and to further improve precision and recall, this paper proposes a distant supervision model for relation extraction with weak constraints of entity pairs. The approach first obtains constraint information for each entity pair from a knowledge base and plain text; this information consists of entity-pair keywords and entity types. A neural network is then trained to learn automatically the features of the constraint information that correspond to different relations. Finally, these features serve as weak constraints and are combined with sentence features for relation prediction. In comparative experiments, the model with weak constraints of entity pairs achieves higher precision and recall than the compared methods, which indicates that weak constraints of entity pairs effectively alleviate the false-positive problem and strengthen relation extraction.

Key words: artificial intelligence, distant supervision for relation extraction, neural networks, weak constraints of entity pairs, attention mechanism
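The abstract does not give the network's equations, so the following is only a minimal sketch of one plausible reading of "weak constraints combined with sentence features": sentences in a bag are weighted by their similarity to a constraint vector, and the relation is scored from the concatenated bag and constraint features. Every name here (`bag_representation`, `predict_relation`, `relation_matrix`) and every number is a hypothetical illustration, not the authors' model.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bag_representation(sentence_vecs, constraint_vec):
    """Weight each sentence vector by its similarity to the constraint vector.

    Assumption: sentence and constraint vectors share one dimension.
    """
    weights = softmax([dot(s, constraint_vec) for s in sentence_vecs])
    dim = len(sentence_vecs[0])
    bag = [sum(w * s[d] for w, s in zip(weights, sentence_vecs))
           for d in range(dim)]
    return bag, weights

def predict_relation(bag_vec, constraint_vec, relation_matrix):
    """Score each relation from the joint (bag + constraint) features."""
    joint = bag_vec + constraint_vec  # list concatenation, not addition
    return softmax([dot(row, joint) for row in relation_matrix])

# Toy inputs: two sentences mentioning one entity pair, two candidate relations.
sentence_vecs = [[1.0, 0.0], [0.0, 1.0]]
constraint_vec = [2.0, 0.0]          # hypothetical keyword/type features
relation_matrix = [[1.0, 0.0, 0.5, 0.0],
                   [0.0, 1.0, 0.0, 0.5]]

bag, weights = bag_representation(sentence_vecs, constraint_vec)
probs = predict_relation(bag, constraint_vec, relation_matrix)
```

In this toy setup the first sentence aligns with the constraint vector and therefore receives the larger attention weight, which is the intuition behind using constraints to down-weight false-positive sentences.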

CLC number:

  • TP391

Fig. 1

Definition of entity-pair keywords

Table 1

Entity type information

[Neville Chamberlain]:/base/uk_parliament/topic/people/person/soccer/football_player/government/politician
[Germany]:/sports/sports_team_location/film/film_location/base/languages_for_domain_names/topic/location/country/government/government/location/statistical_region/location/location
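The type strings above appear to be several Freebase type identifiers run together without separators. As a rough sketch, they can be split under the assumption that an identifier has the form `/domain/type`, except for user bases, which take the form `/base/name/type`. The function name `split_freebase_types` is hypothetical, and the heuristic may not hold for every identifier.

```python
def split_freebase_types(path):
    """Split a concatenated type string into individual type identifiers.

    Assumption: identifiers have two segments, or three when the first
    segment is "base".
    """
    segments = [s for s in path.split("/") if s]
    types = []
    i = 0
    while i < len(segments):
        width = 3 if segments[i] == "base" else 2
        types.append("/" + "/".join(segments[i:i + width]))
        i += width
    return types

chamberlain = split_freebase_types(
    "/base/uk_parliament/topic/people/person"
    "/soccer/football_player/government/politician"
)
germany = split_freebase_types(
    "/sports/sports_team_location/film/film_location"
    "/base/languages_for_domain_names/topic/location/country"
    "/government/government/location/statistical_region/location/location"
)
```

Under this assumption the first string yields four types and the second yields seven, each of which could then be embedded as one entity-type feature.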

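Fig. 2

Processing of weak constraints

Fig. 3

Processing of sentences

Fig. 4

Structure of the attention mechanism based on weak constraints

Fig. 5

Held-out evaluation results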

Table 2

Manual evaluation results

Method        Precision
              Top 100   Top 200   Top 500   Average
Mintz         0.77      0.71      0.55      0.676
MultiR        0.83      0.74      0.59      0.720
MIML          0.85      0.75      0.61      0.737
PCNN+ONE      0.86      0.80      0.69      0.783
PCNN+ATT      0.86      0.81      0.71      0.793
PCNN+ATT+D    0.86      0.82      0.74      0.806
PCNN+ATT+C    0.87      0.83      0.75      0.816
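The table above reports precision over the top 100, 200 and 500 predictions. A minimal sketch of how such a figure is commonly computed, assuming predictions are sorted by confidence and each has been manually judged correct or incorrect; the helper names and the toy judgement list are hypothetical.

```python
def precision_at_n(ranked_labels, n):
    """Fraction of the n highest-ranked predictions judged correct."""
    top = ranked_labels[:n]
    if not top:
        raise ValueError("no predictions to evaluate")
    return sum(top) / len(top)

def mean_precision(values):
    """Arithmetic mean of several precision values."""
    return sum(values) / len(values)

# Toy judgements for four ranked predictions (True = correct).
toy = [True, True, False, True]
p_at_2 = precision_at_n(toy, 2)
p_at_4 = precision_at_n(toy, 4)

# Top-100/200/500 values for PCNN+ATT+C, copied from the table.
average = mean_precision([0.87, 0.83, 0.75])
```

Recomputing the mean of the three columns this way agrees with each listed average to within 0.001.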