Journal of Jilin University (Engineering and Technology Edition), 2022, Vol. 52, Issue (12): 2984-2993. doi: 10.13229/j.cnki.jdxbgxb20210538

Multiple sequence alignment of proteins based on slime mold algorithm

Xin-lu WANG1,2,3, Da-you LIU1,2, Si-han LIU1,3, Zheng WANG1,2, Li-wei ZHANG1,2, Sa DONG1,2

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun 130012, China
  3. College of International Education, Jilin University, Changchun 130012, China
  • Received:2021-06-18 Online:2022-12-01 Published:2022-12-08
  • Contact: Da-you LIU E-mail:xinlu@jlu.edu.cn;liudy@jlu.edu.cn

Abstract:

Based on the slime mold algorithm (SMA), an alignment model (SMA_MSA) was developed to help bioscientists judge whether different sequences are homologous and thereby predict protein structure. SMA_MSA and several well-known competing algorithms were tested on 31 test cases drawn from six reference sets of BAliBASE 3.0. The results show that SMA_MSA aligns well across the 31 test cases, indicating that the proposed model is promising for protein multiple sequence alignment.

Key words: computer application, protein sequence alignment, slime mold algorithm, BAliBASE

CLC Number: 

  • TP391

Fig.1

Description of gap insertion in MSA

Table 1

Effect of gap penalty settings on sequence alignment

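| Gap opening penalty | Gap extension penalty | Effect |
|---|---|---|
| | | Very few gap insertions; used for aligning closely related protein sequences |
| | | A small number of long gaps; used where gaps may span an entire functional domain |
| | | A large number of short gaps; used for aligning distantly related protein sequences |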
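The trade-off that Table 1 describes can be illustrated with an affine gap penalty. The sketch below is not taken from the paper. It assumes one common convention, in which the first position of a gap run pays the opening penalty and each further position pays the extension penalty. The function names and the numeric penalties are hypothetical.

```python
def affine_gap_penalty(gap_length: int, gap_open: float, gap_extend: float) -> float:
    """Penalty for one run of `gap_length` consecutive gap characters."""
    if gap_length <= 0:
        return 0.0
    # Assumed convention: the first gap position pays the opening penalty,
    # every further position in the same run pays the extension penalty.
    return gap_open + gap_extend * (gap_length - 1)


def total_gap_penalty(aligned_row: str, gap_open: float, gap_extend: float) -> float:
    """Sum the affine penalties of every gap run in one aligned sequence."""
    total = 0.0
    run = 0  # length of the gap run currently being scanned
    for ch in aligned_row:
        if ch == "-":
            run += 1
        else:
            total += affine_gap_penalty(run, gap_open, gap_extend)
            run = 0
    # Account for a gap run that reaches the end of the row.
    total += affine_gap_penalty(run, gap_open, gap_extend)
    return total
```

With a large opening penalty and a small extension penalty (for example 10 and 1), the single long gap in `"AC---GT"` costs 12 while the three short gaps in `"A-C-G-T"` cost 30, so long gaps are favored. Swapping the two values (1 and 10) gives 21 and 3, so many short gaps become cheaper.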

Fig.2

BLOSUM62 scoring matrix

Fig.3

Illustration of gap opening and gap extension

Fig.4

Example of protein multiple sequence alignment

Table 2

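SP values obtained by each algorithm on the multiple sequence alignment test cases

| Data set | S | N | min L | max L | SP (DE) | SP (GWO) | SP (PSO) | SP (SMA) |
|---|---|---|---|---|---|---|---|---|
| BB11 | BB11001 | 4 | 83 | 91 | 1 | 1 | 1 | 1 |
| BB11 | BB11008 | 4 | 104 | 540 | 0.25 | 0.75 | 0.25 | 1 |
| BB11 | BB11010 | 4 | 490 | 492 | 0.91 | 0.91 | 0.91 | 0.91 |
| BB11 | BB11012 | 4 | 320 | 397 | 1 | 1 | 1 | 1 |
| BB11 | BB11013 | 5 | 51 | 101 | 1 | 1 | 1 | 1 |
| BB11 | BB11015 | 4 | 297 | 327 | 0.92 | 0.92 | 0.75 | 0.92 |
| BB11 | BB11017 | 4 | 247 | 264 | 1 | 1 | 0.86 | 0.93 |
| BB11 | BB11021 | 4 | 102 | 139 | 0.56 | 0.56 | 0.44 | 0.56 |
| BB11 | BB11025 | 4 | 64 | 103 | 1 | 1 | 1 | 1 |
| BB11 | BB11035 | 5 | 71 | 138 | 1 | 1 | 0.67 | 1 |
| Mean | - | - | - | - | 0.86 | 0.91 | 0.79 | 0.93 |
| BB12 | BB12002 | 6 | 165 | 231 | 1 | 0.88 | 1 | 0.88 |
| BB12 | BB12003 | 8 | 58 | 85 | 0.88 | 0.75 | 0.63 | 0.88 |
| BB12 | BB12005 | 9 | 197 | 234 | 1 | 0.9 | 0.9 | 0.8 |
| BB12 | BB12006 | 4 | 220 | 242 | 1 | 1 | 1 | 1 |
| BB12 | BB12009 | 5 | 67 | 201 | 1 | 1 | 1 | 1 |
| BB12 | BB12012 | 4 | 298 | 548 | 0.97 | 0.97 | 0.89 | 1 |
| BB12 | BB12018 | 4 | 752 | 974 | 0.98 | 0.98 | 0.97 | 0.98 |
| BB12 | BB12020 | 4 | 118 | 129 | 0.94 | 0.94 | 0.76 | 0.94 |
| BB12 | BB12021 | 6 | 71 | 85 | 1 | 1 | 1 | 1 |
| BB12 | BB12022 | 5 | 79 | 475 | 0.89 | 1 | 0.72 | 0.89 |
| Mean | - | - | - | - | 0.97 | 0.94 | 0.89 | 0.94 |
| BB20 | BB20006 | 51 | 224 | 293 | 0.33 | 0.22 | 0.44 | 0.56 |
| BB20 | BB20030 | 47 | 76 | 155 | 1 | 1 | 1 | 1 |
| Mean | - | - | - | - | 0.67 | 0.61 | 0.72 | 0.78 |
| BB30 | BB30010 | 50 | 503 | 1293 | 0.69 | 0.82 | 0.85 | 0.88 |
| BB30 | BB30018 | 78 | 372 | 688 | 0.78 | 0.72 | 0.72 | 0.81 |
| BB30 | BB30024 | 69 | 226 | 982 | 0.61 | 0.59 | 0.60 | 0.76 |
| Mean | - | - | - | - | 0.69 | 0.71 | 0.72 | 0.82 |
| BB40 | BB40010 | 9 | 67 | 214 | 1 | 1 | 1 | 1 |
| BB40 | BB40025 | 14 | 247 | 527 | 0.92 | 0.92 | 0.92 | 0.92 |
| BB40 | BB40028 | 22 | 247 | 759 | 0.52 | 0.33 | 0 | 0.67 |
| Mean | - | - | - | - | 0.81 | 0.75 | 0.64 | 0.86 |
| BB50 | BB50004 | 9 | 386 | 505 | 0.71 | 0.76 | 0.68 | 0.71 |
| BB50 | BB50010 | 17 | 372 | 688 | 0.25 | 0.25 | 0.25 | 0 |
| BB50 | BB50013 | 18 | 230 | 318 | 0.73 | 0.6 | 0.6 | 0.27 |
| Mean | - | - | - | - | 0.56 | 0.54 | 0.51 | 0.33 |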
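This page does not define the SP value. Because every entry in Table 2 lies between 0 and 1, one plausible reading is a BAliBASE-style sum-of-pairs score: the fraction of residue pairs aligned in the reference alignment that the tested alignment also aligns. The sketch below works under that assumption and is not the authors' code. The names `aligned_pairs` and `sp_score` are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length strings using '-' for gaps.
    Each pair is (seq_i, residue_index_i, seq_j, residue_index_j).
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows of an alignment must have equal length")
    counters = [0] * len(alignment)  # next residue index for each sequence
    pairs = set()
    for col in range(n_cols):
        present = []
        for i, row in enumerate(alignment):
            if row[col] != "-":
                present.append((i, counters[i]))
                counters[i] += 1
        # Every two residues in the same column form one aligned pair.
        for (i, a), (j, b) in combinations(present, 2):
            pairs.add((i, a, j, b))
    return pairs


def sp_score(test_alignment, reference_alignment):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref = aligned_pairs(reference_alignment)
    if not ref:
        return 0.0
    test = aligned_pairs(test_alignment)
    return len(test & ref) / len(ref)
```

For the reference `["AC-GT", "ACTGT"]`, an identical test alignment scores 1.0, while `["ACG-T", "ACTGT"]` misplaces one of the four reference pairs and scores 0.75.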

Fig.5

Convergence curves of SP values obtained by the four algorithms on different data sets
