J4

• Computer Science

Protein Domains Prediction Method Based on Support Vector Machines

ZOU Shuxue1, HUANG Yanxin1, LI Yanwen2, ZHOU Chunguang1

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. School of Computer Science, Northeast Normal University, Changchun 130024, China
  • Received: 2008-02-29  Revised: 1900-01-01  Online: 2008-09-26  Published: 2008-09-26
  • Contact: ZHOU Chunguang

Abstract: Predicting the boundaries of structural domains is an important and challenging problem in experimental and computational structural biology. This paper presents a support vector machine (SVM) based method for predicting protein domain boundaries from sequence information alone. The method analyzes multiple sequence alignments derived from a database search. Several new measures that directly or indirectly reflect protein structural properties and domain information are defined to quantify the domain information content of each position along the sequence, and they are combined into a single predictor using SVMs. On a dataset of single-chain proteins, the method identifies boundary signals with an overall accuracy of about 85% and generalizes well. The results suggest that the method can help not only in predicting the complete 3D structure of a protein but also in studying proteins' building blocks for functional analysis.

Key words: protein domains, sequence, support vector machine, bioinformatics
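
The abstract does not list the individual measures, so the following is only a minimal sketch of the general idea of scoring each alignment position and turning a window of scores into a feature vector. The two measures used here (column entropy and gap fraction), the toy alignment, and all function names are illustrative assumptions, not the measures or code of the paper.

```python
import math
from collections import Counter

GAP = "-"


def column_entropy(column):
    """Shannon entropy (bits) of the non-gap residues in one alignment column."""
    residues = [c for c in column if c != GAP]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def gap_fraction(column):
    """Fraction of sequences that have a gap in this column."""
    return sum(1 for c in column if c == GAP) / len(column)


def position_measures(alignment):
    """Return one (entropy, gap_fraction) tuple per alignment column.

    Both measures are hypothetical stand-ins for the paper's measures.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [(column_entropy(col), gap_fraction(col)) for col in zip(*alignment)]


def window_features(measures, center, half_width):
    """Flatten the measures in a window around `center`.

    Positions that fall outside the sequence are zero-padded.
    """
    features = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(measures):
            features.extend(measures[i])
        else:
            features.extend((0.0, 0.0))
    return features


# Toy alignment (made up for illustration): four aligned sequences, seven columns.
alignment = [
    "MKTA-LV",
    "MKSAGLV",
    "MRTA-IV",
    "MKTAGLV",
]
measures = position_measures(alignment)
vector = window_features(measures, center=3, half_width=2)
```

Vectors such as `vector`, each paired with a boundary or non-boundary label for the central position, would then serve as training input to an SVM implementation. The kernel and its parameters are not given in the abstract, so that step is left out here.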

CLC Number: 

  • TP391.4