Journal of Jilin University (Information Science Edition) ›› 2024, Vol. 42 ›› Issue (3): 406-420.


Adaptive Detection Method for Concept Evolution Based on Weakly Supervised Ensemble

WANG Jing a, GUO Husheng a,b, WANG Wenjian a,b

  1. Shanxi University: a. School of Computer and Information Technology; b. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan 030006
  • Received: 2023-05-26 Online: 2024-06-18 Published: 2024-06-17
  • Corresponding author: WANG Wenjian (1968- ), female, from Taiyuan, professor and doctoral supervisor at Shanxi University, working mainly on machine learning and data mining, (Tel) 86-13099070737, (E-mail) wjwang@sxu.edu.cn.
  • About author: WANG Jing (1999- ), female, from Lishi, Shanxi, master's student at Shanxi University, working mainly on data mining, (Tel) 86-15525088854, (E-mail) 1756645158@qq.com.
  • Supported by:
    National Natural Science Foundation of China (62276157; U21A20513; 62076154); Key Research and Development Program of Shanxi Province (202202020101003)

Abstract: Most existing detection methods for concept evolution are essentially based on supervised learning and usually assume that only one novel class appears within a given time period; they cannot handle class disappearance or class recurrence in data streams. To address these problems, an adaptive detection method for concept evolution based on weakly supervised ensemble (AD_WE) is proposed. A weakly supervised ensemble strategy is used to construct an ensemble learner that makes local predictions on the training samples in each data block. On this basis, similar data with strong cohesion in the feature space are identified and clustered using local density and relative distance, and the similarity of the clustering results is compared to detect novel class instances and to distinguish between different novel classes. In addition, a dynamic decay model is established according to how the data change over time, so that vanished classes are eliminated promptly and recurring classes are detected through similarity comparison. Experiments show that the proposed method responds to concept evolution in a timely manner, effectively identifies vanished and recurring classes, and improves the generalization performance of the learner.

Key words: concept evolution, weakly supervised ensemble, adaptive model, dynamic decay model, vanished class, recurring class
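
The abstract names local density and relative distance as the basis for clustering, and a dynamic decay model for removing vanished classes, but gives no formulas for them. The sketch below is only an illustration under stated assumptions: it uses the cutoff-kernel definitions familiar from density-peak clustering and an exponential form for the decay. The function names and the parameters `d_c` and `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np

def density_and_relative_distance(X, d_c):
    """Return (rho, delta) for each row of X.

    rho[i]   : number of other points closer than the cutoff d_c (assumed kernel).
    delta[i] : distance to the nearest point with strictly higher rho; for points
               with no higher-density neighbour, the largest distance to any point.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]          # pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    rho = (dist < d_c).sum(axis=1) - 1            # subtract 1 to exclude the point itself
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta

def decayed_weight(age, lam):
    """Assumed exponential decay for a class last seen `age` data blocks ago."""
    return float(np.exp(-lam * age))

# Two well-separated groups of points.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11]]
rho, delta = density_and_relative_distance(X, d_c=1.5)
```

Points with both a high `rho` and a large `delta` are candidate cluster centres in this formulation, and a class whose decayed weight falls below some threshold could be treated as vanished; how AD_WE actually sets these quantities is described in the full paper, not here.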

CLC Number: 

  • TP181