An Adaptive Ensemble Algorithm Based on Clustering and AdaBoost

Journal of Jilin University (Science Edition)

WANG Lingdi, XU Hua

School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu Province, China

Received: 2017-05-12; Online: 2018-07-26; Published: 2018-07-31
Contact: XU Hua; E-mail: joanxh2003@163.com

Abstract: To ensure both the accuracy and the diversity of the base classifiers, we propose an adaptive ensemble algorithm based on clustering and AdaBoost. First, a clustering algorithm divides the training samples into multiple clusters. Second, AdaBoost is trained on each cluster separately to obtain a set of classifiers. Finally, these classifiers are combined by weighted voting. The weight of each classifier is adaptive: it is computed from the similarity between the test sample and each cluster and from the classifier's classification confidence for that test sample. Experimental results show that the algorithm achieves higher classification accuracy than representative ensemble algorithms such as AdaBoost, Bagging (bootstrap aggregating), and Random Forest.
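The weighted-voting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything specific here is an assumption made for illustration: the function name `adaptive_weighted_vote`, the similarity measure (inverse Euclidean distance to a cluster centroid), the confidence measure (the classifier's highest class probability), and the rule that multiplies the two to form a weight. The per-cluster AdaBoost classifiers are represented only by the class probabilities they would output.

```python
import math


def adaptive_weighted_vote(x, centroids, cluster_probas):
    """Return (winning label, per-label vote totals) for one test sample.

    x              -- feature vector of the test sample
    centroids      -- one centroid per training cluster
    cluster_probas -- cluster_probas[k] maps each class label to the
                      probability that cluster k's classifier assigns to x
    """
    votes = {}
    for centroid, proba in zip(centroids, cluster_probas):
        # Assumed similarity: inverse Euclidean distance to the centroid.
        similarity = 1.0 / (1.0 + math.dist(x, centroid))
        # Assumed confidence: the classifier's highest class probability.
        label = max(proba, key=proba.get)
        confidence = proba[label]
        # Assumed combination rule: weight = similarity * confidence.
        weight = similarity * confidence
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get), votes


# The sample sits on cluster 0's centroid and 5 units from cluster 1's.
label, votes = adaptive_weighted_vote(
    x=(0.0, 0.0),
    centroids=[(0.0, 0.0), (3.0, 4.0)],
    cluster_probas=[{"a": 0.6, "b": 0.4}, {"a": 0.1, "b": 0.9}],
)
print(label)  # "a": the nearby cluster's classifier outweighs the distant one
```

In this toy case an unweighted vote would tie and a confidence-only vote would pick "b"; the similarity term lets the classifier trained on the cluster closest to the sample decide, which is the kind of per-sample adaptivity the abstract describes.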

Key words: clustering, adaptive weight, AdaBoost algorithm, ensemble learning

CLC number: TP181

WANG Lingdi, XU Hua. An Adaptive Ensemble Algorithm Based on Clustering and AdaBoost[J]. Journal of Jilin University (Science Edition), 2018, 56(4): 917-924.
