Journal of Jilin University (Information Science Edition)

• Article

Recognition of Gene Co-Expression Patterns Based on an Ensemble Feature Selection Strategy

王浩畅 1, 李钰 2, 李斌 1, 吴旻 1

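  1. College of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318; 2. Department of Life Science and Engineering, Harbin Institute of Technology, Harbin 150001
  • Received: 2017-04-25 Online: 2017-09-29 Published: 2017-10-23
  • About the author: WANG Haochang (born 1974), female, from Jilin City, Jilin Province; professor and master's supervisor at Northeast Petroleum University; her main research areas are natural language processing, information extraction, and bioinformatics. Tel: 86-13199099111; E-mail: kinghaosing@gmail.com.
  • Funding:
     Project supported by the National Natural Science Foundation of China (61402099)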

Ensemble Feature Selection for Recognizing Co-Expression Patterns of Genes

WANG Haochang 1, LI Yu 2, LI Bin 1, WU Min 1

  1. College of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China;
    2. School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  • Received: 2017-04-25 Online: 2017-09-29 Published: 2017-10-23

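Abstract: To effectively recognize the co-expression patterns of intronic miRNAs and their host genes, a
recognition method based on ensemble feature selection is proposed. First, a support-based ensemble feature
selection algorithm is used to obtain a feature subset with high relevance and stability. Then a wrapper
feature selection method, combined with the FCBF (Fast Correlation-Based Filter) search strategy, further
removes redundant and weakly relevant features to obtain the optimal feature subset. Experimental results
show that the method combines the advantages of multiple feature selection methods, improves the
generalization ability of the learning model, and effectively recognizes the co-expression patterns of
intronic miRNAs and their host genes.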
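
The first stage described in the abstract can be illustrated with a minimal sketch. The abstract does not define "support", so this example assumes it means the fraction of base feature selectors (or resampled runs) that chose a given feature. The function name `support_based_selection`, the `min_support` threshold, and the feature names are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def support_based_selection(selections, min_support=0.5):
    """Keep features chosen by at least `min_support` of the base selectors.

    `selections` is a list of sets, one per base selector or resampled run.
    "Support" here is an assumed definition: votes / number of selectors.
    """
    if not selections:
        return []
    n_selectors = len(selections)
    # Count how many selectors picked each feature (one vote per selector).
    votes = Counter(f for chosen in selections for f in set(chosen))
    support = {f: c / n_selectors for f, c in votes.items()}
    kept = [f for f, s in support.items() if s >= min_support]
    # Highest support first; ties broken alphabetically for determinism.
    return sorted(kept, key=lambda f: (-support[f], f))


# Hypothetical outputs of three base selectors on made-up feature names.
runs = [
    {"gc_content", "intron_length", "conservation"},
    {"gc_content", "conservation", "tss_distance"},
    {"gc_content", "intron_length"},
]
selected = support_based_selection(runs, min_support=2 / 3)
print(selected)
```

Features that only one selector picks fall below the threshold, which is one plausible way to favor subsets that are stable across selection methods.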

Keywords: support, co-expression, ensemble feature selection, feature extraction, intronic miRNA

Abstract: A new method based on ensemble feature selection is proposed to recognize the co-expression
patterns of intronic miRNAs and their host genes. First, a support-based ensemble feature selection
algorithm is used to obtain a feature subset with high relevance and stability. Then a wrapper method,
combined with the FCBF (Fast Correlation-Based Filter) search strategy, removes redundant and weakly
relevant features to obtain the optimal feature subset. The experimental results show that the proposed
method combines the advantages of multiple feature selection methods and effectively recognizes the
co-expression patterns of intronic miRNAs and their host genes.
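
To make the redundancy-removal step more concrete, the following is a rough sketch of the standard FCBF filter on discrete features, using symmetric uncertainty as the correlation measure. It shows only the plain filter, not the wrapper combination the paper proposes, and the threshold `delta`, the toy data, and the feature names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def fcbf(features, labels, delta=0.0):
    """Return feature names kept by FCBF, most relevant first.

    `features` maps a feature name to its column of discrete values.
    """
    relevance = {n: symmetric_uncertainty(col, labels) for n, col in features.items()}
    # Keep features whose relevance to the class exceeds delta, ranked descending.
    ranked = sorted((n for n in relevance if relevance[n] > delta),
                    key=lambda n: (-relevance[n], n))
    selected = []
    while ranked:
        best = ranked.pop(0)
        selected.append(best)
        # Drop any feature that correlates with `best` at least as strongly
        # as it correlates with the class (i.e. it is redundant).
        ranked = [n for n in ranked
                  if symmetric_uncertainty(features[best], features[n]) < relevance[n]]
    return selected


# Toy data: f_a_copy duplicates f_a, and f_noise is independent of the labels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f_a": [0, 0, 0, 1, 1, 1, 1, 1],
    "f_a_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "f_b": [0, 1, 0, 0, 1, 1, 0, 1],
    "f_noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
kept = fcbf(features, labels)
print(kept)
```

In this toy case the duplicate column is discarded as redundant and the noise column as irrelevant, while the weaker but complementary `f_b` survives. Continuous features would need discretization before symmetric uncertainty can be computed this way.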

Keywords: support, co-expression, ensemble feature selection, feature extraction, intronic miRNA

CLC Number:

  • TP391