Journal of Jilin University (Science Edition) ›› 2023, Vol. 61 ›› Issue (2): 317-324.

Lightweight Relation Extraction Based on Positive Soft Labels

SONG Hanyu1,2, OUYANG Dantong1,3, YE Yuxin1,3

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China;
    2. FAW-Volkswagen Automotive Co., Ltd., Changchun 130011, China;
    3. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Received: 2022-01-04 Online: 2023-03-26 Published: 2023-03-26
  • Corresponding author: OUYANG Dantong E-mail: ouyd@jlu.edu.cn

Abstract: To address the growing size and running time of relation extraction models, we propose a knowledge filtering mechanism that uses the selected positive soft labels to construct a lightweight relation extraction model. First, knowledge distillation is used to extract knowledge and store it in soft labels; to avoid the difficulty a student has in absorbing knowledge when the gap between teacher and student is large, we adopt the teacher assistant knowledge distillation scheme. Second, the cosine similarity of labels is used to select the positive soft labels, which are dynamically given higher weights at each distillation step, weakening the influence of wrong labels during knowledge transfer. Experimental results on the SemEval-2010 Task 8 dataset show that the proposed model not only accomplishes lightweight relation extraction but also improves extraction accuracy.

Key words: lightweight relation extraction, knowledge filtering, positive soft label, knowledge distillation, cosine similarity

CLC number:

  • TP391.1
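
The weighting idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything concrete here is an assumption made for illustration: a teacher soft label is treated as "positive" when its cosine similarity with the one-hot gold label reaches a hypothetical `threshold`, and the weights `positive_weight` and `other_weight`, the temperature, and the way the soft and hard losses are combined are all invented values, not the authors' settings.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_label_weight(soft_label, gold_index, threshold=0.5,
                      positive_weight=0.7, other_weight=0.2):
    """Assumed filtering rule: a soft label close to the gold label is
    'positive' and receives the higher weight."""
    one_hot = [1.0 if i == gold_index else 0.0 for i in range(len(soft_label))]
    similarity = cosine_similarity(soft_label, one_hot)
    return positive_weight if similarity >= threshold else other_weight


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0):
    """Weighted sum of a soft-label loss and a hard-label loss for one sample.
    The weight is recomputed per call, mimicking a per-step dynamic weight."""
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    weight = soft_label_weight(teacher_soft, gold_index)
    # Cross-entropy between the teacher's and the student's soft labels.
    soft_ce = -sum(t * math.log(max(s, 1e-12))
                   for t, s in zip(teacher_soft, student_soft))
    # Ordinary cross-entropy against the gold (hard) label.
    hard_ce = -math.log(max(softmax(student_logits)[gold_index], 1e-12))
    return weight * soft_ce + (1.0 - weight) * hard_ce


if __name__ == "__main__":
    teacher = [4.0, 0.0, 0.0]   # hypothetical teacher logits for 3 relations
    student = [1.0, 0.5, 0.2]   # hypothetical student logits
    print(distillation_loss(student, teacher, gold_index=0))  # teacher agrees
    print(distillation_loss(student, teacher, gold_index=1))  # teacher is wrong
```

Under the teacher assistant scheme mentioned in the abstract, a loss of this kind would presumably be applied twice, first from the teacher to an intermediate-sized assistant and then from the assistant to the student, although the sketch covers only a single stage.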