Journal of Jilin University (Information Science Edition) ›› 2022, Vol. 40 ›› Issue (4): 638-643.

Hepatitis C Prediction Based on Machine Learning Algorithms

MIAO Xinfang 1 , LIU Ming 1 , JIANG Yang 2

  1. College of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China; 2. Department of Vehicle Manufacturing Digital, CEC Gientech Technology Group Company, Dalian, Liaoning 116000, China
  • Received: 2021-09-06 Online: 2022-08-16 Published: 2022-08-17
  • Corresponding author: LIU Ming (born 1979), male, from Baishan, Jilin; professor and master's supervisor at Changchun University of Technology; research interests: machine learning, big data analysis, and data mining. (Tel) 86-15843108878, (E-mail) jlcclm@163.com
  • About the author: MIAO Xinfang (born 1997), female, from Gongzhuling, Jilin; master's student at Changchun University of Technology; research interests: machine learning, big data analysis, and data mining. (Tel) 86-15590190707, (E-mail) 2433556586@qq.com
  • Funding:
    Natural Science Foundation of Jilin Province (2020021157JC); Science and Technology Foundation of the Education Department of Jilin Province (JJKH20191295KJ)

Abstract: Approximately 3% to 10% of hepatitis C cases progress to hepatocellular carcinoma after infection with the hepatitis C virus. Worldwide, about 27% of cirrhosis cases and 25% of hepatocellular carcinoma cases are attributable to hepatitis C, so accurate prediction of hepatitis C infection and better detection of the virus are urgently needed. Machine learning is fast and accurate, yet past hepatitis research mostly relied on time series analysis or pathological analysis and did not use machine learning algorithms as an auxiliary diagnostic method for hepatitis C. This paper therefore applies ensemble machine learning algorithms to hepatitis C prediction. To select the best model for detecting hepatitis C, different machine learning models are compared on the UCI (University of California Irvine) hepatitis C data set. The experimental results show that the gradient boosting tree, random forest, and light gradient boosting machine perform relatively well; among them, the gradient boosting tree predicts hepatitis C infection most accurately, with an accuracy of 0.9351.

Key words: hepatitis C; machine learning; gradient boosting decision tree; light gradient boosting machine

CLC Number:

  • TP181