Journal of Jilin University (Medicine Edition) ›› 2023, Vol. 49 ›› Issue (6): 1491-1503. doi: 10.13481/j.1671-587X.20230612

• Clinical Research •

Bioinformatics analysis on differentially expressed genes in lung adenocarcinoma based on GEO and TCGA Databases

Hui YE, Zhe SUN, Liting ZHOU, Wen QI, Lin YE

  1. Department of Labor Health and Environmental Sanitation, School of Public Health, Jilin University, Changchun 130021, China
  • Received: 2022-12-14 Online: 2023-11-28 Published: 2023-12-22
  • Contact: Wen QI, Lin YE E-mail: qiwen@jlu.edu.cn; yel@jlu.edu.cn
  • About the author: Hui YE (born 1998), female, from Nanyang, Henan Province, master's student; her research focuses on thyroid damage caused by decabromodiphenyl ethane.
  • Funding:
    National Natural Science Foundation of China (81972996)

Abstract:

Objective To screen the key genes affecting lung adenocarcinoma (LUAD) by bioinformatics methods, and to analyze their biological functions and their influence on LUAD prognosis.

Methods The GSE118370 and GSE136043 microarray datasets were downloaded from the Gene Expression Omnibus (GEO) database, and LUAD-related data were selected from The Cancer Genome Atlas (TCGA) database. R software was used to identify the differentially expressed genes (DEGs) shared across the datasets. The clusterProfiler R package was used for Gene Ontology (GO) functional enrichment analysis, the DAVID database for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, and the STRING database to construct the protein-protein interaction (PPI) network. Cytoscape was used to screen the 10 key genes with the highest degree of connectivity. The GEPIA database and the Human Protein Atlas (HPA) database were used to analyze the mRNA and protein expression of the key genes in normal lung tissue and LUAD tissue, and their expression in LUAD tissue at different stages. Immune infiltration analysis and survival analysis of the key genes were performed to examine the relationship between key gene expression and patient survival time.

Results A total of 428 DEGs were screened. GO analysis showed that the DEGs of LUAD were mainly enriched in biological processes (BP) such as epithelial-mesenchymal transition (EMT), cellular components (CC) such as the basal part of the cell, and molecular functions (MF) such as extracellular matrix (ECM) structure formation. KEGG analysis showed that the DEGs of LUAD were mainly enriched in pathways such as cytokine receptor interaction. DNA topoisomerase Ⅱ alpha (TOP2A), abnormal spindle microtubule assembly (ASPM), cyclin B1 (CCNB1), cell division cycle associated 8 (CDCA8), baculoviral IAP repeat containing 5 (BIRC5), aurora kinase A (AURKA), kinesin family member 20A (KIF20A), centrosomal protein 55 (CEP55), centromere protein F (CENPF), and targeting protein for Xklp2 (TPX2) were selected as the key genes. Compared with normal lung tissue, the mRNA expression levels of TOP2A, CCNB1, CDCA8, BIRC5, AURKA, KIF20A, CEP55, CENPF, and TPX2 in lung tissue of the LUAD patients were increased (P<0.01), and the protein expression of the same nine genes was also increased. The mRNA expression levels of CCNB1, CDCA8, BIRC5, AURKA, KIF20A, CEP55, and TPX2 differed significantly among LUAD stages (P<0.01). Compared with the LUAD patients at stages Ⅰ, Ⅱ, and Ⅲ, the mRNA expression levels of CCNB1, CDCA8, AURKA, KIF20A, CEP55, and TPX2 in lung tissue of the patients at stage Ⅳ were increased (P<0.01); compared with the LUAD patients at stages Ⅰ, Ⅱ, and Ⅳ, the mRNA expression level of BIRC5 in lung tissue of the patients at stage Ⅲ was increased (P<0.01). The expression of all 10 key genes was negatively correlated with B-lymphocyte infiltration (-0.253≤r≤-0.014, P<0.01); the expression of TOP2A, ASPM, CDCA8, BIRC5, CEP55, CENPF, and TPX2 was positively correlated with neutrophil infiltration (0.049≤r≤0.165, P<0.01); the expression of CCNB1 and AURKA was negatively correlated with CD4 T-lymphocyte, macrophage, and dendritic cell infiltration (-0.210≤r≤-0.100, P<0.01). High expression of CDCA8 increased the risk of LUAD deterioration (P<0.01), and high expression of TOP2A, CCNB1, CDCA8, BIRC5, AURKA, KIF20A, CEP55, CENPF, and TPX2 increased the patients' risk of death (P<0.01).

Conclusion TOP2A, ASPM, CCNB1, CDCA8, BIRC5, AURKA, KIF20A, CEP55, CENPF, and TPX2 are key genes involved in the occurrence and progression of LUAD. They may promote the development of LUAD by accelerating the EMT process, and their high expression suggests poor prognosis and an elevated risk of death in LUAD patients.

Key words: Lung adenocarcinoma, Key gene, Epithelial-mesenchymal transition, Immune infiltration, Prognostic factor, Adenocarcinoma

CLC number:

  • R737.9
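
The abstract describes two screening steps: keeping only the DEGs shared across the GEO and TCGA datasets, and ranking genes by their degree of connectivity in the PPI network. The abstract does not state the thresholds, Cytoscape plugin, or settings used, so the following is only a minimal sketch of the general idea, written in Python rather than the R and Cytoscape tools the authors used. The gene names, the edge list, and the helper functions `common_degs` and `top_hubs` are all hypothetical placeholders, not data or code from the study.

```python
def common_degs(*deg_sets):
    """Return the genes that appear in every per-dataset DEG list."""
    return set.intersection(*map(set, deg_sets))


def top_hubs(edges, genes, k=10):
    """Rank genes by degree (number of distinct interaction partners).

    Only edges whose two endpoints are both in `genes` are counted.
    Ties are broken alphabetically so the result is deterministic.
    """
    neighbours = {}
    for a, b in edges:
        if a != b and a in genes and b in genes:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    degree = {gene: len(partners) for gene, partners in neighbours.items()}
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical toy input: placeholder gene names, NOT results from the paper.
geo_1 = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
geo_2 = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_F"}
tcga = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_G"}

# Hypothetical PPI edge list, standing in for an exported interaction table.
ppi_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_B", "GENE_A"),  # duplicate of the first edge, counted once
    ("GENE_D", "GENE_E"),  # dropped: GENE_E is not a shared DEG
]

shared = common_degs(geo_1, geo_2, tcga)
hubs = top_hubs(ppi_edges, shared, k=2)

print(sorted(shared))
print(hubs)
```

Using sets of partners instead of raw edge counts keeps duplicated or reversed edges from inflating a gene's degree, which matters when an interaction table lists each pair more than once.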