Journal of Jilin University(Engineering and Technology Edition) ›› 2023, Vol. 53 ›› Issue (10): 2952-2963.doi: 10.13229/j.cnki.jdxbgxb.20211348


SVM parameters and feature selection optimization based on improved whale algorithm

Hui GUO1,2, Jie-di FU1,2, Zhen-dong LI1,2, Yan YAN3, Xiao LI1,2

  1. School of Information Engineering, Ningxia University, Yinchuan 750021, China
    2. Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Yinchuan 750021, China
    3. Electric Power Research Institute, State Grid Ningxia Power Co., Ltd., Yinchuan 750011, China
  • Received: 2021-12-07  Online: 2023-10-01  Published: 2023-12-13
  • Contact: Zhen-dong LI  E-mail: gh606@163.com; lizhendong@nxu.edu.cn

Abstract:

Support vector machines (SVM) often suffer from low recognition accuracy in data classification. To address this problem, this paper proposes a feature selection model that synchronously optimizes the SVM with an improved whale optimization algorithm. First, the Lévy flight strategy was used to perturb the spiral position update of the whale optimization algorithm, and the reflection operation of the simplex method was applied to improve the reflection-point solutions of the elite individuals in the population; extensive experiments on standard benchmark functions show that the proposed method effectively improves convergence speed and computational accuracy. Second, the SVM kernel parameters and the feature selection targets were jointly optimized to obtain the optimal kernel parameters and the corresponding optimal feature subset. Finally, feature selection experiments were conducted on UCI standard data sets and a real breast cancer data set, and the average classification accuracy, average fitness value, standard deviation of fitness, and number of selected features were evaluated and analyzed. Compared with the traditional support vector machine, the classification accuracy on the real breast cancer data set is improved by 11.053%.
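The abstract's Lévy flight perturbation is not spelled out on this page. A minimal sketch of how a Lévy step is commonly drawn (Mantegna's algorithm, using the exponent τ = 1.5 listed in Table 2; the function name and β parameterization are illustrative, not the paper's code):

```python
import math
import random

def levy_step(beta: float = 1.5) -> float:
    """Draw one heavy-tailed Levy step via Mantegna's algorithm."""
    # Standard deviation of the numerator Gaussian for the chosen exponent
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0.0, sigma_u)
    v = random.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)
```

In the proposed LSWOA such a step would scale a perturbation of the spiral position update; the step scaling factor (η = 1 in Table 2) and the exact combination are the paper's design.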

Key words: whale optimization algorithm, feature selection, simplex method, Lévy flight, SVM, data classification

CLC Number: TP301.6

Fig.1  Lévy flight trajectory

Fig.2  Flow chart of LSWOA algorithm

Fig.3  Dimension representation of search individuals

Fig.4  Flow chart of hybrid algorithm
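The hybrid algorithm of Fig.4 evaluates each candidate (kernel parameters plus a feature subset) with a fitness function; Tables 4 and 5 report its averages and standard deviations. The paper's exact formula is not reproduced on this page, but a commonly used wrapper-style objective (weight α assumed, e.g. 0.99) combines the classification error with the feature-subset ratio:

```python
def fitness(error_rate: float, n_selected: int, n_total: int,
            alpha: float = 0.99) -> float:
    """Weighted sum of classification error and selected-feature ratio.

    Lower is better: alpha weights predictive error, (1 - alpha) penalizes
    keeping more features. The weight value is an assumption, not the paper's.
    """
    return alpha * error_rate + (1 - alpha) * (n_selected / n_total)
```

Under this kind of objective, a subset that classifies equally well with fewer features always scores lower (better), which is why the tables track both accuracy and the number of selected features.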

Table 1  Experimental data set description

| No. | Dataset | Features | Samples | Classes |
|-----|------------|----------|---------|---------|
| 1 | Abalone | 8 | 4177 | 3 |
| 2 | Planning | 12 | 182 | 2 |
| 3 | Glass | 10 | 214 | 6 |
| 4 | SETAP | 10 | 274 | 2 |
| 5 | Arrhythmia | 279 | 452 | 16 |
| 6 | SCADI | 205 | 70 | 6 |

Table 2  Initialization parameters of different algorithms

| Algorithm | Parameters |
|-----------|------------|
| WOA | random number rp ∈ [0,1], convergence factor a ∈ [0,2], constant b = 1, random number l ∈ [-1,1] |
| LSWOA | random number rp ∈ [0,1], convergence factor a ∈ [0,2], constant b = 1, random number l ∈ [-1,1], reflection coefficient α = 1, expansion coefficient γ = 2, contraction coefficient β = 0.5, exponent coefficient ζ ∈ (1,3], constant τ = 1.5, step scaling factor η = 1 |
| GWO | random numbers r1, r2 ∈ [0,1], convergence factor a ∈ [0,2] |
| PSO | learning factors c1 = c2 = 2, inertia factor ω = 0.6, velocity v ∈ [-1,1] |
| DE | crossover probability CR = 0.2, scaling factor β ∈ [0.2,0.8] |
| RSO | exploitation parameter C ∈ [0,2], exploration parameter A ∈ [0,2], random variable R ∈ [1,2] |
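Two of the building blocks parameterized in Table 2 can be sketched directly: the standard WOA position update (constant b = 1, random l ∈ [-1,1]) and the simplex reflection used on elite individuals (reflection coefficient α = 1). This is a single-agent illustration under those parameter values, not the paper's implementation; the WOA exploration branch that moves toward a random population member is omitted here.

```python
import math
import random

def woa_update(x, best, a, b=1.0):
    """One WOA position update for one individual (lists of floats).

    a is the convergence factor that decreases from 2 to 0 over iterations.
    """
    p = random.random()                  # random number rp in [0, 1]
    A = 2 * a * random.random() - a      # coefficient A
    C = 2 * random.random()              # coefficient C
    if p < 0.5:
        # shrinking-encircling mechanism around the current best
        return [bi - A * abs(C * bi - xi) for xi, bi in zip(x, best)]
    l = random.uniform(-1, 1)            # random number l in [-1, 1]
    # logarithmic-spiral update toward the best solution
    return [abs(bi - xi) * math.exp(b * l) * math.cos(2 * math.pi * l) + bi
            for xi, bi in zip(x, best)]

def simplex_reflect(centroid, worst, alpha=1.0):
    """Simplex reflection x_r = x_c + alpha * (x_c - x_w)."""
    return [c + alpha * (c - w) for c, w in zip(centroid, worst)]
```

The reflection mirrors the worst point through the centroid of the remaining points, which is how the simplex method proposes a candidate on the opposite, hopefully better, side of the search region.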

Table 3  Average classification accuracy(%) of different algorithms

| Dataset | LSWOA-SVM | WOA-SVM | DE-SVM | RSO-SVM | GWO-SVM | PSO-SVM | SVM |
|---------|-----------|---------|--------|---------|---------|---------|-----|
| Abalone | 58.231 536 93 | 57.828 343 31 | 58.143 712 60 | 56.918 163 67 | 58.099 800 40 | 57.788 423 20 | 53.640 718 56 |
| Planning | 79.166 666 67 | 73.981 481 48 | 73.703 703 70 | 72.129 629 63 | 76.111 111 11 | 75.092 592 59 | 57.222 222 22 |
| Glass | 76.744 186 05 | 74.418 604 65 | 74.883 720 93 | 73.643 410 85 | 74.418 604 65 | 72.945 736 43 | 66.589 147 29 |
| SETAP | 93.111 111 11 | 78.000 000 00 | 79.111 111 11 | 74.222 222 22 | 80.666 666 67 | 53.266 666 67 | 54.222 222 22 |
| Arrhythmia | 79.074 074 10 | 73.555 555 56 | 70.777 777 78 | 70.148 148 15 | 75.910 000 00 | 76.400 000 00 | 33.615 047 62 |
| SCADI | 84.047 619 05 | 82.619 047 62 | 80.238 095 24 | 78.333 333 33 | 81.428 571 43 | 37.380 952 38 | 46.904 761 90 |

Fig.5  Average number of selected features for different algorithms on three low-dimensional datasets

Fig.6  Average number of selected features for different algorithms on three high-dimensional datasets

Table 4  Average fitness values of different algorithms on six datasets

| Dataset | LSWOA-SVM | WOA-SVM | DE-SVM | RSO-SVM | GWO-SVM | PSO-SVM |
|---------|-----------|---------|--------|---------|---------|---------|
| Abalone | 0.413 534 868 | 0.417 540 235 | 0.414 421 412 | 0.426 546 846 | 0.414 843 643 | 0.417 951 277 |
| Planning | 0.206 288 056 | 0.257 623 889 | 0.260 375 833 | 0.275 943 333 | 0.236 542 778 | 0.246 637 222 |
| Glass | 0.230 259 966 | 0.253 290 999 | 0.248 697 089 | 0.260 945 788 | 0.253 295 073 | 0.267 877 209 |
| SETAP | 0.083 609 547 | 0.217 824 272 | 0.206 853 107 | 0.255 206 602 | 0.191 426 440 | 0.420 250 421 |
| Arrhythmia | 0.207 214 421 | 0.261 833 118 | 0.289 359 379 | 0.295 536 010 | 0.238 520 717 | 0.233 689 842 |
| SCADI | 0.157 937 271 | 0.174 459 791 | 0.195 695 085 | 0.214 508 130 | 0.212 160 776 | 0.619 976 506 |

Table 5  Average of the standard deviation of fitness values for different algorithms on six datasets

| Dataset | LSWOA-SVM | WOA-SVM | DE-SVM | RSO-SVM | GWO-SVM | PSO-SVM |
|---------|-----------|---------|--------|---------|---------|---------|
| Abalone | 0.011 671 204 | 0.015 791 320 | 0.003 578 256 | 0.012 751 977 | 0.011 699 578 | 0.007 107 340 |
| Planning | 0.056 415 932 | 0.057 459 254 | 0.074 715 987 | 0.057 451 324 | 0.056 623 350 | 0.062 293 555 |
| Glass | 0.033 731 524 | 0.049 733 523 | 0.054 607 824 | 0.050 202 946 | 0.051 482 396 | 0.034 393 867 |
| SETAP | 0.053 713 194 | 0.096 625 241 | 0.083 018 551 | 0.110 092 820 | 0.070 986 415 | 0.080 800 224 |
| Arrhythmia | 0.009 876 514 | 0.019 347 746 | 0.030 610 446 | 0.063 172 914 | 0.020 095 432 | 0.014 849 456 |
| SCADI | 0.070 778 772 | 0.062 548 304 | 0.085 021 358 | 0.071 844 929 | 0.100 004 803 | 0.096 068 102 |

Fig.7  Convergence curves of fitness functions for different algorithms

Table 6  Classification accuracy of ten runs of different algorithms on breast cancer dataset

| Algorithm | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Average |
|-----------|---|---|---|---|---|---|---|---|---|----|---------|
| SVM | 71.92 | 70.17 | 68.42 | 71.92 | 72.50 | 68.62 | 69.93 | 67.67 | 73.81 | 71.24 | 70.62 |
| LSWOA-SVM | 80.70 | 80.19 | 78.94 | 80.70 | 85.96 | 78.43 | 80.70 | 84.21 | 85.96 | 80.94 | 81.67 |
| WOA-SVM | 68.42 | 82.45 | 80.70 | 68.42 | 77.56 | 77.35 | 78.35 | 70.46 | 66.35 | 80.72 | 75.078 |
| DE-SVM | 82.45 | 77.19 | 68.42 | 80.70 | 70.17 | 75.43 | 73.68 | 71.92 | 75.46 | 71.98 | 74.74 |
| RSO-SVM | 71.92 | 77.19 | 73.68 | 78.94 | 71.92 | 77.19 | 78.94 | 68.42 | 73.68 | 68.56 | 74.04 |
| GWO-SVM | 78.94 | 77.19 | 75.43 | 71.92 | 77.19 | 78.94 | 78.94 | 75.43 | 71.92 | 71.92 | 75.78 |
| PSO-SVM | 64.91 | 77.19 | 70.17 | 80.70 | 75.43 | 76.56 | 72.13 | 66.14 | 81.56 | 76.42 | 74.12 |
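The 11.053% improvement quoted in the abstract can be checked directly from the per-run accuracies in Table 6: it is the difference between the LSWOA-SVM and plain SVM row averages.

```python
# Per-run accuracies (%) transcribed from Table 6
svm   = [71.92, 70.17, 68.42, 71.92, 72.50, 68.62, 69.93, 67.67, 73.81, 71.24]
lswoa = [80.70, 80.19, 78.94, 80.70, 85.96, 78.43, 80.70, 84.21, 85.96, 80.94]

svm_avg   = sum(svm) / len(svm)      # about 70.620
lswoa_avg = sum(lswoa) / len(lswoa)  # about 81.673
gain = lswoa_avg - svm_avg           # about 11.053 percentage points
print(round(gain, 3))
```

Strictly speaking this is an absolute gain of 11.053 percentage points of accuracy, which is how the abstract's figure is obtained.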

Table 7  Ten run results of LSWOA-SVM algorithm

| No. | c/q | Classification accuracy/% | Selected features | Fitness value |
|-----|-----|---------------------------|-------------------|---------------|
| 1 | 8.4514/0.1957 | 80.70 | 4 | 0.1911 |
| 2 | 1.2511/1.8087 | 77.19 | 5 | 0.2259 |
| 3 | 0.8659/0.4986 | 78.94 | 4 | 0.2085 |
| 4 | 5.3472/5.5680 | 80.70 | 4 | 0.1911 |
| 5 | 0.3845/0.5476 | 85.96 | 5 | 0.1391 |
| 6 | 0.3015/0.2212 | 75.43 | 4 | 0.2433 |
| 7 | 36.0582/1.2499 | 80.70 | 4 | 0.1911 |
| 8 | 23.498/0.9358 | 84.21 | 4 | 0.1564 |
| 9 | 24.5493/0.0141 | 85.96 | 6 | 0.1391 |
| 10 | 20.2506/2.6024 | 78.94 | 4 | 0.2085 |