Integration-based co-expression network analysis to investigate tumor-associated modules across three cancer types

Fund Project:

Open Project Program of the State Key Laboratory of Proteomics (No. SKLP-O2020005), National Key Research and Development Program of China (No. 2016YFB0201702), National High Technology Research and Development Program of China (863 Program) (No. 2012AA020409), National Basic Research Program of China (973 Program) (No. 2011CB910601).

    Abstract:

    In case/control gene expression data, differential expression (DE) describes changes in gene expression levels between biological conditions, whereas differential co-expression (DC) describes changes in the correlation coefficients between gene pairs. DC and DE analyses have each been applied widely in studies of human disease, but effective approaches for integrating the two are still lacking. Here, we report a novel analytical framework, DC&DEmodule, that integrates DC and DE features at the level of co-expression modules and combines information from multiple case/control expression datasets to identify disease-related gene co-expression modules. These comprise activated modules (up-regulated and gaining co-expression in tumor samples) and dysfunctional modules (down-regulated and losing co-expression in tumor samples). Applying the framework to two microarray datasets each for liver, gastric, and colorectal cancer, we identified two, five, and two activated modules and five, five, and one dysfunctional module(s), respectively. Pathway enrichment analysis showed that, compared with similar methods, our method is more sensitive in detecting both known cancer-related pathways and pathways not previously reported. We further identified 17, 69, and 11 hub genes from the activated modules of the three cancers; these include 53 previously reported prognostic biomarkers and three novel prognostic markers, each significantly associated with survival in one of the three cancers. Random forest classifiers trained on the hub genes distinguished tumor from adjacent normal samples in The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases with an average accuracy of 93%. Comparison of the three cancers provided new insights into shared and tissue-specific cancer mechanisms. A series of evaluations demonstrated that the DC&DEmodule framework can integrate the rapidly accumulating expression profiles in public databases and facilitate the discovery of dysregulated biological processes in more diseases.
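The distinction between DE and DC, and the definitions of activated and dysfunctional modules, can be illustrated with a minimal Python sketch. This is not the DC&DEmodule implementation: the abstract does not specify the framework's statistics, thresholds, or how multiple datasets are combined. The function names (`pearson`, `de_score`, `dc_score`, `classify_module`), the sign-based labeling rule, and the toy expression values are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def de_score(tumor, normal):
    """DE of one gene: mean expression in tumor minus mean in normal."""
    return mean(tumor) - mean(normal)


def dc_score(tumor_x, tumor_y, normal_x, normal_y):
    """DC of one gene pair: correlation in tumor minus correlation in normal."""
    return pearson(tumor_x, tumor_y) - pearson(normal_x, normal_y)


def classify_module(tumor, normal, genes):
    """Toy rule (an assumption, not the paper's criterion): label a module
    by the signs of its average DE and its average pairwise DC."""
    avg_de = mean(de_score(tumor[g], normal[g]) for g in genes)
    avg_dc = mean(
        dc_score(tumor[a], tumor[b], normal[a], normal[b])
        for a, b in combinations(genes, 2)
    )
    if avg_de > 0 and avg_dc > 0:
        return "activated"      # up-regulated and gaining co-expression
    if avg_de < 0 and avg_dc < 0:
        return "dysfunctional"  # down-regulated and losing co-expression
    return "unclassified"


# Hypothetical log-scale expression values: one list of samples per gene.
genes = ["g1", "g2", "g3"]
tumor = {
    "g1": [5.0, 6.0, 7.0, 8.0],
    "g2": [5.1, 6.2, 6.9, 8.1],
    "g3": [4.9, 6.1, 7.2, 7.8],
}
normal = {
    "g1": [2.0, 3.0, 2.0, 3.0],
    "g2": [3.0, 3.0, 2.0, 2.0],
    "g3": [2.0, 2.0, 3.0, 3.0],
}

print(classify_module(tumor, normal, genes))
```

In this toy data the three genes are both more highly expressed and more tightly correlated in the tumor samples, so the rule labels the module as activated; swapping the two conditions yields a dysfunctional label. A real analysis would replace the sign rule with significance testing on far more samples.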

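Cite this article

王梦男, 韩明飞, 刘炳辉, 田春艳, 朱云平. Integration-based co-expression network analysis to investigate tumor-associated modules across three cancer types [J]. 生物工程学报 (Chinese Journal of Biotechnology), 2021, 37(11): 4111-4123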

History
  • Received: 2021-02-10
  • Published online: 2021-11-25