Journal of Jilin University (Information Science Edition) ›› 2021, Vol. 39 ›› Issue (4): 403-408.

Method and System for Automatically Generating Data Products Based on Mapping Rules

LI Ziheng1a,1b , YE Yuxin1a,1b , CAO Lingling2 , LIU Sipei2

  1a. College of Computer Science and Technology; 1b. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; 2. Overall Department, North Information Control Research Academy Group Company Limited, Nanjing 211111, China
  • Received:2021-01-08 Online:2021-07-24 Published:2021-07-26
  • About the authors: LI Ziheng (b. 1997), male, from Yuncheng, Shanxi, master's student at Jilin University, working on intelligent information processing, (Tel)86-18946520499 (E-mail)ziheng19@mails.jlu.edu.cn; YE Yuxin (b. 1981), male, from Changchun, professor at Jilin University, Ph.D., working on intelligent information processing, (Tel)86-15943014160 (E-mail)yeyx@jlu.edu.cn
  • Supported by:
    "13th Five-Year" National High-Tech Common Basic Pre-research Project Fund of the Equipment Development Department (31510040201)

Abstract: Knowledge graphs are now widely used, and extracting knowledge data and product data from them needs to be both accurate and efficient. To that end, the proposed method and system take a knowledge graph as the data source and formulate business data extraction and organization rules according to actual business requirements. These extraction rules are the mapping rules named in the title: we design how the rules are expressed and which constraints they must satisfy, and the business requester fills in the concrete rules to be applied. The system then extracts from the knowledge graph the subgraphs that satisfy these rules. Because such a subgraph conforms to the rules set by the business requester, it contains the data and organizational structure that the business requires. Data product generation rules are then applied to the extracted subgraph, whose structure is relatively fixed and which carries actual business meaning, to produce the data products that business users ultimately need, such as report documents and statistical tables. Using the SPARQL query language, natural language generation, and related technologies, the system generates text, charts, report documents, and other data products from the knowledge graph quickly and automatically, which substantially improves efficiency.

Key words: knowledge graph, ontology, SPARQL query language, generation rule, natural language generation
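
The pipeline summarized in the abstract (mapping rule, subgraph extraction, generation rule, data product) can be illustrated with a minimal sketch. It is not the authors' implementation: the `MappingRule` class, the `ex:` prefix, the property names, and the sample data are all hypothetical, the SPARQL CONSTRUCT query is only built as text, and a pure-Python filter stands in for running that query on a triple store. Template filling stands in for natural language generation.

```python
from dataclasses import dataclass

RDF_TYPE = "rdf:type"
Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical extraction rule: one target class and the properties to keep."""
    target_class: str
    properties: tuple[str, ...]


def rule_to_sparql(rule: MappingRule, prefix: str = "http://example.org/") -> str:
    """Render the rule as SPARQL CONSTRUCT text (built here, not executed)."""
    patterns = [f"?s a ex:{rule.target_class} ."]
    patterns += [f"?s ex:{p} ?o{i} ." for i, p in enumerate(rule.properties)]
    body = " ".join(patterns)
    return f"PREFIX ex: <{prefix}>\nCONSTRUCT {{ {body} }}\nWHERE {{ {body} }}"


def extract_subgraph(graph: list[Triple], rule: MappingRule) -> list[Triple]:
    """Pure-Python stand-in for running the CONSTRUCT query on a triple store."""
    wanted = set(rule.properties)
    type_of = lambda s: (s, RDF_TYPE, rule.target_class)
    subjects = {s for s, p, o in graph if (s, p, o) == type_of(s)}
    subgraph: list[Triple] = []
    for s in sorted(subjects):
        present = {p for s2, p, _ in graph if s2 == s}
        if not wanted <= present:
            continue  # the WHERE clause is conjunctive: every property must exist
        subgraph += [t for t in graph
                     if t[0] == s and (t[1] in wanted or t == type_of(s))]
    return subgraph


def generate_sentences(subgraph: list[Triple], template: str) -> list[str]:
    """Template-based generation: one sentence per subject in the subgraph.

    If a property has several values, the last one seen is used.
    """
    records: dict[str, dict[str, str]] = {}
    for s, p, o in subgraph:
        if p != RDF_TYPE:
            records.setdefault(s, {})[p] = o
    return [template.format(**records[s]) for s in sorted(records)]


# Hypothetical sample data, for illustration only.
graph = [
    ("dev1", RDF_TYPE, "Device"), ("dev1", "name", "Pump A"),
    ("dev1", "status", "operational"),
    ("dev2", RDF_TYPE, "Device"), ("dev2", "name", "Pump B"),  # no status
    ("doc1", RDF_TYPE, "Manual"), ("doc1", "name", "Handbook"),
]
rule = MappingRule("Device", ("name", "status"))
print(rule_to_sparql(rule))
print(generate_sentences(extract_subgraph(graph, rule), "{name} is currently {status}."))
```

In this sketch the rule author only names a class and its properties, and both the query text and the filter are derived from that one declaration, which mirrors the idea of business users filling in rules rather than writing queries.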

CLC number: 

  • TP182