Journal of Jilin University (Medicine Edition) ›› 2021, Vol. 47 ›› Issue (6): 1570-1580. doi: 10.13481/j.1671-587X.20210631

• Methodology •

基于骨肉瘤核心驱动基因筛选的生物信息学分析和患者生存期预测基因模型的构建

李苇航1,丁子毅1,王栋1,潘益凯2,刘玉辉3,张世磊1,李靖1,闫铭1

    1.中国人民解放军空军军医大学西京医院骨科,陕西 西安 710032
    2.中国人民解放军空军军医大学航空航天医学系航空航天医学训练教研室,陕西 西安 710032
    3.中国人民解放军空军军医大学 航空航天医学系航空航天临床医学中心 教育部航空航天医学重点实验室,陕西 西安 710032
  • 收稿日期:2021-04-20 出版日期:2021-11-28 发布日期:2021-12-14
  • 通讯作者: 闫铭 E-mail:yanming_spine@163.com
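  • About the author: Weihang LI (1997-), male, from Xi'an, Shaanxi Province; master's degree candidate, mainly engaged in research on bone tumors and weightlessness.
  • Supported by:
    General Program of the National Natural Science Foundation of China (82072475)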

Bioinformatics analysis based on screening of core driving genes in osteosarcoma and construction of gene model for prediction of survival time of patients

Weihang LI1, Ziyi DING1, Dong WANG1, Yikai PAN2, Yuhui LIU3, Shilei ZHANG1, Jing LI1, Ming YAN1

    1.Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi’an 710032, China
    2.Department of Aerospace Medical Training, School of Aerospace Medicine, Air Force Medical University, Xi’an 710032, China
    3.School of Aerospace Medicine, Center of Clinical Aerospace Medicine, Key Laboratory of Aerospace Medicine of Ministry of Education, Air Force Medical University, Xi’an 710032, China
  • Received:2021-04-20 Online:2021-11-28 Published:2021-12-14
  • Contact: Ming YAN E-mail:yanming_spine@163.com

摘要: 目的

筛选骨肉瘤(OS)发生发展的核心驱动基因,从分子水平探讨OS的致病机制,并构建基因模型用于患者生存期的预测。

方法

采用基因表达汇编(GEO)数据库下载OS芯片对应矩阵数据GSE12865、GSE14359和GSE36001。采用生物信息学方法筛选OS与正常组织的差异表达基因(DEGs)。通过基因本体论(GO)、京都基因和基因组百科全书(KEGG)分析全面了解DEGs富集的分子功能及通路,采用STRING数据库构建蛋白-蛋白相互作用(PPI)网络,采用Cytoscape软件对DEGs进行相关性分析,找出与OS进展最相关的基因集,明确OS核心致病基因。采用肿瘤基因组图谱(TCGA)数据库下载OS的379个样本相关的临床记录信息和转录组数据,进行Kaplan-Meier(K-M)生存分析以进一步明确和验证核心基因与OS患者预后之间的关系,并寻找性别和种族等与预后相关的因素。对6个基因特征集的表达量进行建模以预测OS患者的生存时间。

结果

MCC算法获得的排名前十的DEGs为TYROBP、LAPTM5、FCER1G、CD74、HCLS1、ARHGDIB、HLA-DPA1、CD93、GIMAP4和LYZ,其表达水平在骨肉瘤患者与正常患者中比较差异有统计学意义(P<0.05)。GO和KEGG分析,DEGs在PI3K-AKT和Notch信号通路显著富集。K-M生存分析,6个基因(ARHGDIB、CD74、FCER1G、HCLS1、HLA-DPA1和TYROBP)表达量更低的OS患者较高表达患者的总生存时间更长(P<0.05)。由该6个基因组成的基因集在预测模型的构建中C指数为0.71。

结论

筛选出的OS的核心驱动基因高表达与OS的发生发展相关。OS发生发展的异常信号通路为PI3K-AKT和Notch信号通路。6个核心驱动基因组成OS的特征基因集构建的预测模型有良好的预测能力。

关键词: 骨肉瘤, 癌症基因组图谱数据库, 分子机制, 肿瘤标志物, 肿瘤预后模型

Abstract: Objective

To screen the core driving genes involved in the occurrence and development of osteosarcoma (OS), to explore the pathogenic mechanism of OS at the molecular level, and to construct a gene model for predicting the survival time of OS patients.

Methods

The expression matrix data of three OS microarray datasets (GSE12865, GSE14359, and GSE36001) were downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) between OS tissue and normal tissue were screened by bioinformatics methods. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to characterize the molecular functions and pathways in which the DEGs were enriched. The protein-protein interaction (PPI) network was constructed with the STRING database, and Cytoscape software was used to analyze the correlations among the DEGs, to identify the gene set most closely related to the progression of OS, and to determine the core pathogenic genes of OS. The clinical records and transcriptome data of 379 OS samples were downloaded from The Cancer Genome Atlas (TCGA) database, and Kaplan-Meier (K-M) survival analysis was performed to further clarify and verify the relationship between the core genes and the prognosis of the OS patients; other prognosis-related factors, such as gender and race, were also examined. The expression levels of the 6-gene signature were modeled to predict the survival time of the OS patients.
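As an illustration of the Kaplan-Meier analysis named in the Methods, the following is a minimal sketch of the product-limit estimator in plain Python. The abstract does not state which software was used, so this is not the authors' implementation; the function name `kaplan_meier` and the two toy groups (`high`, `low`) are hypothetical and the numbers are invented for demonstration only.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up duration of each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths + censorings at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:                         # the curve only steps down at event times
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]         # drop everyone who left at time t
    return curve

# Hypothetical toy data (months): a high-expression and a low-expression group.
high = kaplan_meier([5, 8, 12, 20, 24], [1, 1, 1, 0, 1])
low = kaplan_meier([10, 18, 30, 36, 40], [1, 0, 0, 1, 0])

for label, curve in (("high", high), ("low", low)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

In practice the two curves would be compared with a log-rank test to obtain the P value; that step is omitted here to keep the sketch short.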

Results

The top ten DEGs ranked by the MCC algorithm were TYROBP, LAPTM5, FCER1G, CD74, HCLS1, ARHGDIB, HLA-DPA1, CD93, GIMAP4, and LYZ, and their expression levels differed significantly between OS patients and normal controls (P<0.05). The GO and KEGG analyses showed that the DEGs were significantly enriched in the PI3K-AKT and Notch signaling pathways. The K-M survival analysis showed that the OS patients with lower expression of 6 genes (ARHGDIB, CD74, FCER1G, HCLS1, HLA-DPA1, and TYROBP) had longer overall survival time than those with higher expression (P<0.05). The prediction model built on the gene set composed of these 6 genes had a C-index of 0.71.
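To make the reported C-index concrete, here is a minimal sketch of Harrell's concordance index in plain Python. It assumes the model outputs one risk score per patient (higher meaning worse predicted prognosis); the abstract does not describe how the authors computed the value, and the function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable patient pairs that the
    risk score orders correctly (tied scores count as half)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                  # a censored patient cannot be the earlier member
        for j in range(n):
            if times[i] < times[j]:   # i died before j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: survival in months, event flags, model risk scores.
times = [5, 8, 12, 20, 24]
events = [1, 1, 1, 0, 1]
scores = [0.9, 0.4, 0.7, 0.2, 0.3]
c = concordance_index(times, events, scores)
print(round(c, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.71 should be read.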

Conclusion

High expression of the screened core driving genes is correlated with the occurrence and development of OS. The abnormal signaling pathways in the occurrence and development of OS are the PI3K-AKT and Notch signaling pathways. The prediction model built on the characteristic gene set of OS, composed of the 6 core driving genes, has good predictive ability.

Key words: osteosarcoma, The Cancer Genome Atlas, molecular mechanism, tumor biomarker, tumor prognostic model

CLC Number: 

  • R738.1