Journal of Jilin University (Engineering and Technology Edition) ›› 2016, Vol. 46 ›› Issue (2): 585-594. doi: 10.13229/j.cnki.jdxbgxb201602037


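Ab-initio dynamic-step-size searching of protein conformational space
(Chinese title: 动态步长蛋白质构象空间搜索方法)

ZHANG Gui-jun (张贵军), HAO Xiao-hu (郝小虎), ZHOU Xiao-gen (周晓根), QIN Chuan-qing (秦传庆)

  1. College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
  • Received: 2014-07-04 Online: 2016-02-20 Published: 2016-02-20
  • About the author: ZHANG Gui-jun (b. 1974), male, professor, Ph.D. Research interests: intelligent information processing, global optimization theory and algorithm design, bioinformatics. E-mail: zgj@zjut.edu.cn
  • Funding:
    National Natural Science Foundation of China (61075062, 61573317); Zhejiang Provincial Natural Science Foundation (LY13F030008); Public Welfare Project of the Science and Technology Department of Zhejiang Province (2014C33088); Open Fund of the Zhejiang Provincial Top Priority Discipline (20120811); Hangzhou Industry-University-Research Cooperation Project (20131631E31)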

Abstract: To address the problem of sampling protein conformational space, an ab initio dynamic-step-size search method built on an energy-guided tree-search framework is proposed. Feature extraction maps the high-dimensional dihedral-angle optimization space onto a low-dimensional space of structural feature vectors, which avoids the curse of dimensionality. The feature space is discretized by energy and temperature into energy layers and temperature layers, and the layers are systematically partitioned into conformational cells, which narrows the search. Each energy layer is assigned its own fragment-assembly (FA) step size and Monte Carlo perturbation step size, and each temperature layer accepts the current conformation under its own Metropolis criterion. Replica exchange is used as an auxiliary technique to enhance sampling of stable structures in the conformational space. Tests on 12 proteins show that the method samples near-native conformations quickly and effectively.

Key words: artificial intelligence, ab initio prediction, tree-based search, dynamic step size, fragment assembly, Monte Carlo

CLC number: TP301.6
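
The abstract names four ingredients: energy layers, a step size that depends on the layer, Metropolis acceptance at a given temperature, and replica exchange. The Python sketch below illustrates how these pieces are commonly written; it is not the authors' implementation. The function names, the linear step-size schedule, the choice of smaller steps at lower energy, and the unit convention k_B = 1 are all assumptions made for illustration, since the abstract does not give the paper's actual parameters.

```python
import math
import random


def energy_layer(energy, e_min, e_max, n_layers):
    """Map an energy to a layer index in [0, n_layers - 1] (0 = lowest energy).

    Uniform binning is an assumption; the paper's discretization is not
    specified in the abstract.
    """
    if e_max <= e_min:
        raise ValueError("e_max must exceed e_min")
    frac = (energy - e_min) / (e_max - e_min)
    frac = min(max(frac, 0.0), 1.0)  # clamp energies outside the range
    return min(int(frac * n_layers), n_layers - 1)


def step_size(layer, n_layers, min_step, max_step):
    """Hypothetical linear schedule: small steps in low-energy layers,
    larger steps in high-energy layers."""
    if n_layers == 1:
        return min_step
    return min_step + (max_step - min_step) * layer / (n_layers - 1)


def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Standard Metropolis criterion, in units where k_B = 1."""
    if e_new <= e_old:
        return True  # downhill moves are always accepted
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


def swap_probability(e_i, t_i, e_j, t_j):
    """Standard replica-exchange acceptance probability for swapping the
    conformations held at temperatures t_i and t_j."""
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)
```

In a search loop, one would look up the layer of the current conformation, perturb it with the corresponding step size, and pass the old and new energies to `metropolis_accept` with that replica's temperature; `swap_probability` would be evaluated periodically for neighboring temperature pairs.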