Journal of Jilin University (Information Science Edition) ›› 2025, Vol. 43 ›› Issue (4): 736-746.


Method for Extracting Events of Financial Announcement by Integrating Paragraph and Document Features

LI Jiajing1,2, DONG Zexin1, LI Sheng1, MENG Tao2, LUO Xiaoqing3a, YAN Hongfei3b

  1. School of Mechanical Electrical and Information Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China; 2. Wangganzhicha Information Technology Incorporated, Nanjing 210001, China; 3a. School of Economics; 3b. School of Computer Science, Peking University, Beijing 100871, China
  • Received: 2024-02-27 Online: 2025-08-15 Published: 2025-08-14
  • About the author: LI Jiajing (born 1979), female, from Daqing, Heilongjiang; associate professor and master's supervisor at China University of Mining and Technology (Beijing) and senior engineer at Wangganzhicha Information Technology Incorporated, Nanjing. Her research covers cognitive computing, artificial intelligence and natural language processing. (Tel) 86-18600514982 (E-mail) lijj@cumtb.edu.cn.
  • Supported by:
    Peking University "New Ideas in Teaching 2.0" Key Fund Project (2023ZD03)

Abstract: Financial announcements are the channel through which enterprises publicly disclose major financial events, and the information they carry is of great significance to financial practitioners. However, the arguments of financial events are highly specialized and widely dispersed across a document, so traditional event extraction methods struggle to extract them accurately. An event extraction method that fuses the local features of paragraphs with the global features of documents is therefore proposed. The method first segments the financial announcement document into paragraphs and processes all paragraphs in parallel with the Fin-BERT (Financial Bidirectional Encoder Representations from Transformers) pre-trained model, a convolutional neural network and a self-attention mechanism to obtain the local features of the document. A Bi-LSTM (Bi-directional Long Short-Term Memory) network then learns the semantic information of the whole document to obtain its global features. Finally, the paragraph-level local features and the document-level global features are fused to output event arguments and event types. Experiments on the public financial dataset ChFinAnn show that the method achieves an average F1 of 80.2%, outperforming the baseline model and demonstrating its effectiveness.

Key words: event extraction, financial announcements, scattering of event elements

CLC Number:

  • TP391
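
The data flow described in the abstract (per-paragraph local features, a bidirectional pass over the paragraph sequence for global features, then fusion) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. Every name here is hypothetical, and each component is a deliberately simplified stand-in: `embed` replaces the Fin-BERT encoder with deterministic pseudo-random vectors, `conv1d` is a fixed averaging window rather than a learned convolution, `self_attention` uses identity projections, and `recurrent_pass` is a plain tanh recurrence rather than an LSTM cell. Splitting paragraphs on blank lines, tokenizing on whitespace and fusing by concatenation are also assumptions, since the abstract does not specify them.

```python
import math
import random

DIM = 8  # toy feature size (assumption); a real encoder outputs far more dimensions


def embed(token):
    """Placeholder for the Fin-BERT encoder: a deterministic pseudo-random vector per token."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def conv1d(vectors, width=3):
    """Toy 1-D convolution: average each window of `width` neighbouring token vectors."""
    half = width // 2
    out = []
    for i in range(len(vectors)):
        window = vectors[max(0, i - half): i + half + 1]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections (no learned weights)."""
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(DIM) for k in vectors]
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[d] for w, v in zip(weights, vectors)) for d in range(DIM)])
    return out


def local_feature(paragraph):
    """Paragraph-level (local) feature: embed, convolve, attend, then max-pool over tokens."""
    tokens = paragraph.split()
    hidden = self_attention(conv1d([embed(t) for t in tokens]))
    return [max(col) for col in zip(*hidden)]


def recurrent_pass(vectors):
    """Simplified stand-in for one LSTM direction: h_t = tanh(0.5 * h_{t-1} + x_t)."""
    state, states = [0.0] * DIM, []
    for x in vectors:
        state = [math.tanh(0.5 * h + xi) for h, xi in zip(state, x)]
        states.append(state)
    return states


def global_features(local_vectors):
    """Document-level (global) features: forward and backward passes over the paragraph sequence."""
    forward = recurrent_pass(local_vectors)
    backward = recurrent_pass(local_vectors[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]  # list concatenation, 2 * DIM values


def fuse(document):
    """Split on blank lines, then concatenate each paragraph's local and global features."""
    paragraphs = [p for p in document.split("\n\n") if p.strip()]
    local = [local_feature(p) for p in paragraphs]
    return [l + g for l, g in zip(local, global_features(local))]


doc = "Company A pledged 1000 shares\n\nThe pledgee is Bank B\n\nThe pledge starts in 2020"
features = fuse(doc)
print(len(features), len(features[0]))  # 3 paragraphs, each with 3 * DIM = 24 values
```

In a full system the fused vectors would feed classification layers that predict event arguments and event types; that step is omitted here because the abstract gives no detail on it. Chinese announcements would also need a proper tokenizer in place of `str.split`.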