Journal of Jilin University (Science Edition), 2023, Vol. 61, Issue 5: 1103-1111.


• Corresponding author: WANG Canyu, E-mail: 736559039@qq.com

Multi-hop Question Generation Based on Contrastive Learning Ideas

WANG Hongbin1,2,3, YANG Hezhenmin1,2,3, WANG Canyu4   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; 
    2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China; 
    3. Yunnan Key Laboratory of Computer Technology Application, Kunming University of Science and Technology, Kunming 650500, China;
    4. Faculty of Big Data, Yunnan Agricultural University, Kunming 650201, China
  • Received: 2022-10-24; Online: 2023-09-26; Published: 2023-09-26



Abstract: To address the time- and labor-intensive problem of obtaining large-scale multi-hop question answering training datasets, we propose a multi-hop question generation model based on the idea of contrastive learning. The model is divided into a generation phase and a contrastive learning scoring phase. In the generation phase, candidate multi-hop questions are generated by executing an inference graph. In the scoring phase, candidate questions are scored and ranked by a reference-free candidate question scoring model based on contrastive learning, and the best candidate question is selected. The model narrows, to some extent, the gap between unsupervised methods and manually annotated methods, effectively alleviating the shortage of multi-hop question answering datasets. Experimental results on the HotpotQA dataset show that the proposed model can effectively expand the training data and greatly reduce the cost of manual annotation.

Key words: multi-hop question generation, machine reading comprehension, contrastive learning

CLC number: TP391