Journal of Jilin University (Engineering and Technology Edition) ›› 2016, Vol. 46 ›› Issue (3): 884-889. doi: 10.13229/j.cnki.jdxbgxb201603031

• Original Article •

Novel naive Bayes classification algorithm based on semi-supervised learning

DONG Li-yan1, SUI Peng1, SUN Peng1, LI Yong-li2   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China;
  2. School of Computer Science and Technology, Northeast Normal University, Changchun 130117, China
  • Received: 2013-08-01 Online: 2016-06-20 Published: 2016-06-20

Abstract: A novel naive Bayes classification algorithm based on semi-supervised learning is proposed. First, to retain high-quality samples in the labeled training set, the unlabeled training set is used to obtain k high-confidence samples. Then, these high-confidence samples are combined with the labeled training samples, and the process iterates until training is complete. The experimental results show that, as the proportion of unlabeled samples increases, the predictive accuracy of the proposed algorithm is significantly higher than that of naive Bayes classification. In addition, the effectiveness and performance of the algorithm are improved compared with the traditional semi-supervised learning algorithm.

Key words: computer application, semi-supervised learning, naive Bayes, unknown label classification

CLC Number: TP301.6
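
The iterative procedure summarized in the abstract can be illustrated with a minimal self-training sketch. This is not the authors' implementation: the categorical naive Bayes model with Laplace smoothing, the use of the maximum posterior probability as the confidence score, the stopping rule (an empty unlabeled pool or `max_iter` rounds), and all names (`train_nb`, `predict_proba`, `self_train`) are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_nb(X, y):
    """Fit a categorical naive Bayes model by counting (assumed model form)."""
    class_counts = Counter(y)
    feat_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)           # feature index -> distinct values seen
    for row, label in zip(X, y):
        for j, v in enumerate(row):
            feat_counts[(label, j)][v] += 1
            values[j].add(v)
    return class_counts, feat_counts, values


def predict_proba(model, row):
    """Return normalized class posteriors for one sample."""
    class_counts, feat_counts, values = model
    n = sum(class_counts.values())
    log_scores = {}
    for c, nc in class_counts.items():
        s = math.log(nc / n)
        for j, v in enumerate(row):
            # Laplace (add-one) smoothing over the values seen for feature j.
            s += math.log((feat_counts[(c, j)][v] + 1) / (nc + len(values[j])))
        log_scores[c] = s
    m = max(log_scores.values())
    exp = {c: math.exp(s - m) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: e / z for c, e in exp.items()}


def self_train(X_lab, y_lab, X_unlab, k=2, max_iter=10):
    """Generic self-training loop: move the k most confident samples per round."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = train_nb(X_lab, y_lab)
    for _ in range(max_iter):
        if not pool:
            break
        scored = []
        for i, row in enumerate(pool):
            proba = predict_proba(model, row)
            label = max(proba, key=proba.get)
            scored.append((proba[label], i, label))
        # Keep the k predictions with the highest posterior (assumed confidence).
        top = sorted(scored, reverse=True)[:k]
        for _, i, label in top:
            X_lab.append(pool[i])
            y_lab.append(label)
        chosen = {i for _, i, _ in top}
        pool = [row for i, row in enumerate(pool) if i not in chosen]
        # Retrain on the enlarged labeled set before the next round.
        model = train_nb(X_lab, y_lab)
    return model
```

Calling `self_train(X_lab, y_lab, X_unlab, k=2)` with samples given as tuples of categorical feature values returns a model that can be passed to `predict_proba`. Choosing a small `k` keeps each round conservative, since a wrongly pseudo-labeled sample stays in the training set for all later rounds.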