Journal of Jilin University (Science Edition) ›› 2025, Vol. 63 ›› Issue (2): 528-536.

Text Clustering Algorithm Based on Dynamic Theme Emotion Model

HU Ping

  1. School of Data Science, Tongren University, Tongren 554300, Guizhou Province, China
  • Received: 2023-11-07 Online: 2025-03-26 Published: 2025-03-26
  • Corresponding author: HU Ping E-mail: 290222350@qq.com

Abstract: Existing topic models give insufficient consideration to public sentiment, which makes that sentiment hard to mine accurately, and their limited treatment of the real-time dynamic evolution of social text weakens their clustering ability. To address these problems, the author proposes a text clustering algorithm based on a dynamic theme emotion (topic-sentiment) model, which adds a sentiment layer to the model to extract the sentiment polarity features of social text and introduces a prior distribution function. Experiments on a real-world Twitter dataset of COVID-19 texts show that the model outperforms the baseline models and improves the discriminability of sentiment features. Because each text topic and its corresponding sentiment polarity jointly generate the time node, the model is able to handle temporal evolution.

Key words: dynamic topic emotion model; text mining; emotional label; time stamp; text clustering; perplexity
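
The keywords list perplexity, the usual measure for comparing a topic model against baselines on held-out text (lower is better). The following is a minimal sketch of the standard definition, perplexity = exp(-(1/N) Σ log p(w_n)); it is not taken from the paper, and the function name `perplexity` and the per-token probabilities are hypothetical values chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of held-out text from natural-log per-token probabilities.

    Standard definition: exp of the average negative log-likelihood per token.
    How each log-probability is obtained (e.g. by summing over topics and
    sentiment labels) depends on the model and is assumed to be done elsewhere.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_neg_log_lik = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_lik)


# Hypothetical per-token probabilities assigned by two models to the same
# four held-out tokens (illustrative numbers, not experimental results).
baseline_log_probs = [math.log(p) for p in (0.01, 0.02, 0.01, 0.05)]
proposed_log_probs = [math.log(p) for p in (0.02, 0.04, 0.02, 0.05)]

print(perplexity(baseline_log_probs))
print(perplexity(proposed_log_probs))
```

A model that assigns higher probability to the held-out tokens obtains the lower perplexity, which is the sense in which one model is said to outperform another on this metric.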

CLC Number:

  • TP391.1