Journal of Jilin University (Science Edition), 2025, Vol. 63, Issue (2): 472-478.

Heart Disease Prediction Method Based on Bayesian Hyperparameter Optimization Gradient Boosting Trees

WANG Haiyan, JIAO Zengchen, ZHAO Jian, AN Tianbo, JU Yi

  1. Key Laboratory of Intelligent Rehabilitation and Accessibility for People with Disabilities of Ministry of Education, College of Computer Science and Technology, Changchun University, Changchun 130022, China
  • Received: 2024-06-04 Online: 2025-03-26 Published: 2025-03-26
  • Corresponding author: WANG Haiyan E-mail: wanghy80@ccu.edu.cn

Abstract: To address the low prediction accuracy of traditional machine learning algorithms on the Cleveland and Hungary datasets, we propose a heart disease prediction method based on gradient boosting trees with Bayesian hyperparameter optimization. First, the K-nearest neighbor algorithm is used to fill in the missing values in the datasets, the data are processed with Min-Max normalization and One-Hot encoding, and a gradient boosting tree algorithm is used to predict heart disease. Second, Bayesian optimization combined with ten-fold cross-validation is used to search for the best combination of hyperparameters for the algorithm. Experimental results show that the optimized gradient boosting tree algorithm reaches a prediction accuracy of up to 90.2% on the Cleveland heart disease dataset and up to 81.4% on the Hungary heart disease dataset, outperforming traditional machine learning methods such as decision trees, support vector machines, and K-nearest neighbors. The method can assist doctors in diagnosing heart disease.

Key words: heart disease prediction, K-nearest neighbor algorithm, gradient boosting tree, Bayesian optimization

CLC Number:

  • TP181
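
The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name a library, a search space, or an iteration budget, so the use of scikit-learn, the synthetic stand-in data, the three tuned hyperparameters and their ranges, and the Gaussian-process surrogate with an expected-improvement criterion are all assumptions made here for illustration. For simplicity the sketch also assumes that only the numeric columns contain missing values.

```python
import numpy as np
from scipy.stats import norm
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.impute import KNNImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the Cleveland or Hungary datasets):
# three numeric columns, one categorical column, and a binary label.
n = 150
X_num = rng.normal(size=(n, 3))
X_cat = rng.integers(0, 4, size=(n, 1)).astype(float)
y = (X_num[:, 0] + 0.5 * X_num[:, 1] + 0.3 * (X_cat[:, 0] == 2) > 0).astype(int)
X_num[rng.random(X_num.shape) < 0.1] = np.nan  # inject roughly 10% missing values
X = np.hstack([X_num, X_cat])

# Preprocessing: KNN imputation and Min-Max scaling for the numeric columns,
# One-Hot encoding for the categorical column.
preprocess = ColumnTransformer(
    [
        ("num", Pipeline([("impute", KNNImputer(n_neighbors=5)),
                          ("scale", MinMaxScaler())]), [0, 1, 2]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [3]),
    ],
    sparse_threshold=0,  # always return a dense array
)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)


def cv_accuracy(p):
    """Mean ten-fold cross-validation accuracy for one hyperparameter setting."""
    model = Pipeline([
        ("prep", preprocess),
        ("gbt", GradientBoostingClassifier(learning_rate=float(p[0]),
                                           n_estimators=int(p[1]),
                                           max_depth=int(p[2]),
                                           random_state=0)),
    ])
    return cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()


# Assumed search space: learning_rate, n_estimators, max_depth.
low = np.array([0.01, 20.0, 1.0])
high = np.array([0.30, 120.0, 5.0])


def sample(k):
    """Draw k random settings; tree count and depth are rounded to integers."""
    p = rng.uniform(low, high, size=(k, 3))
    p[:, 1:] = np.round(p[:, 1:])
    return p


# Bayesian optimization: fit a Gaussian-process surrogate to the scores seen
# so far, then evaluate the candidate with the highest expected improvement.
params = sample(4)  # initial random design
scores = np.array([cv_accuracy(p) for p in params])

for _ in range(6):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4, normalize_y=True)
    gp.fit((params - low) / (high - low), scores)
    cand = sample(200)
    mu, sigma = gp.predict((cand - low) / (high - low), return_std=True)
    best = scores.max()
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    nxt = cand[np.argmax(ei)]
    params = np.vstack([params, nxt])
    scores = np.append(scores, cv_accuracy(nxt))

best_idx = int(np.argmax(scores))
best_params = params[best_idx]
best_score = float(scores[best_idx])
print(f"best ten-fold CV accuracy: {best_score:.3f}")
print(f"learning_rate={best_params[0]:.3f}, "
      f"n_estimators={int(best_params[1])}, max_depth={int(best_params[2])}")
```

Placing the imputer and scaler inside the cross-validated pipeline means they are refit on each training fold, so no information from the held-out fold leaks into preprocessing. The accuracies printed by this sketch come from synthetic data and are not comparable to the 90.2% and 81.4% reported in the abstract.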