
Chinese Journal of Obstetrics & Gynecology and Pediatrics (Electronic Edition), 2021, Vol. 17, Issue (02): 181-189. doi: 10.3877/cma.j.issn.1673-5250.2021.02.009


Original Article

Construction of model for prognosis prediction based on TCGA database for differentially expressed genes of endometrial cancer

Qirong Hao, Yiting Ren, Jingjing Hu, Xiaoyang Liu, Xiaochun Liu

  • Received: 2020-04-13 Revised: 2021-03-10 Published: 2021-04-01
  • Corresponding author: Xiaochun Liu
  • Supported by:
    National Natural Science Foundation of China (81971365)
Cite this article:

Qirong Hao, Yiting Ren, Jingjing Hu, Xiaoyang Liu, Xiaochun Liu. Construction of model for prognosis prediction based on TCGA database for differentially expressed genes of endometrial cancer[J]. Chinese Journal of Obstetrics & Gynecology and Pediatrics (Electronic Edition), 2021, 17(02): 181-189.

Objective

To screen prognosis-related differentially expressed genes in patients with endometrial cancer (EC) and to construct a prognostic prediction model for these patients.

Methods

The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) was searched with the keywords "Uteri", "TCGA-UCEC", "transcriptome profiling", and "gene expression quantification and HTSeq-FPKM" to retrieve RNA-seq gene expression data from the endometrial tissue of EC patients and normal female subjects, together with the associated clinical information. The search covered the period from the establishment of the TCGA database to January 15, 2021. A total of 542 EC patients and 35 normal female subjects who met the inclusion criteria of this study were selected as research subjects. The prognostic prediction model was constructed in four steps. ① The R package linear models for microarray data (LIMMA) was used for differential expression analysis of the TCGA RNA-seq gene expression data, to screen candidate differential genes that affect the occurrence and development of EC. ② The Kaplan-Meier method, LASSO regression, and univariate Cox proportional hazards regression analysis were used to screen survival-related differential genes in EC patients, and multivariate Cox proportional hazards regression analysis was used to determine the prognosis-related differential genes. ③ A prognostic prediction model for EC patients was constructed from the EC differential genes. ④ A survival receiver operating characteristic (ROC) curve software package was used to assess the accuracy of the prediction model, and a nomogram was drawn.
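
Step ② relies on the Kaplan-Meier method. As a rough illustration only, the product-limit estimate can be sketched in Python as follows. The study itself worked in R; the function name `kaplan_meier` and the five-subject follow-up data are hypothetical and are not taken from the TCGA cohort.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times  -- follow-up time of each subject
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data (months), not from the study.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t}: S(t)={s:.3f}")
```

Censored subjects contribute to the at-risk count until they leave observation but never trigger a drop in the curve, which is the property that makes the estimator suitable for follow-up data with incomplete outcomes.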

Results

① A total of 466 differential genes were identified in the EC patients of this study, of which 179 were up-regulated and 287 were down-regulated.

② Of the 96 survival-related differential genes in these EC patients, 7 were prognosis-related: progesterone receptor (PGR), sushi repeat containing protein X-linked (SRPX), gamma-glutamyl hydrolase (GGH), secretoglobin family 2A member 1 (SCGB2A1), insulin like growth factor binding protein 5 (IGFBP5), cyclin dependent kinase inhibitor 2A (CDKN2A), and neuromedin U (NMU). Univariate Cox proportional hazards regression analysis showed that all 7 genes were factors influencing the prognosis of EC patients (P<0.05). Multivariate Cox proportional hazards regression analysis showed that GGH, IGFBP5, and CDKN2A were independent risk factors for the prognosis of EC patients (P<0.05): the higher the expression levels of GGH, IGFBP5, and CDKN2A, the worse the patient's prognosis.

③ The overall survival (OS) prediction model for EC patients was $\ln[h(t, X)/h_0(t)] = 1.300\,x_{\mathrm{GGH}} + 1.200\,x_{\mathrm{IGFBP5}} + 1.200\,x_{\mathrm{CDKN2A}}$, where $h(t, X)$ is the hazard rate function of a subject at time $t$; $h_0(t)$ is the baseline hazard rate function at time $t$, that is, the hazard rate function when $x_{\mathrm{GGH}}$, $x_{\mathrm{IGFBP5}}$, and $x_{\mathrm{CDKN2A}}$ are all 0; and $x_{\mathrm{GGH}}$, $x_{\mathrm{IGFBP5}}$, and $x_{\mathrm{CDKN2A}}$ denote the expression levels of GGH, IGFBP5, and CDKN2A, respectively.

④ The survival risk of the 542 EC patients was scored with this model, and the patients were divided by the median risk score into a high-risk subgroup (n=271, risk score higher than the median) and a low-risk subgroup (n=271, risk score lower than the median). OS was significantly longer in the low-risk subgroup than in the high-risk subgroup (χ²=33.000, P<0.001).
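
The scoring and median split described in ③ and ④ can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the coefficients are those reported in the model, while the function names, the patient identifiers, and the expression values are hypothetical.

```python
from statistics import median

# Coefficients as reported in the OS prediction model.
COEFFICIENTS = {"GGH": 1.300, "IGFBP5": 1.200, "CDKN2A": 1.200}


def risk_score(expression):
    """Linear predictor ln[h(t, X)/h0(t)] for one patient."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the median, else 'low'.

    Assigning a score equal to the median to 'low' is an assumption; the
    source does not say how ties were handled.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values, for illustration only.
toy_patients = {
    "P1": {"GGH": 1.0, "IGFBP5": 2.0, "CDKN2A": 0.5},
    "P2": {"GGH": 0.2, "IGFBP5": 0.5, "CDKN2A": 0.1},
    "P3": {"GGH": 2.0, "IGFBP5": 1.0, "CDKN2A": 1.0},
    "P4": {"GGH": 0.5, "IGFBP5": 0.5, "CDKN2A": 0.5},
}
toy_scores = {pid: risk_score(expr) for pid, expr in toy_patients.items()}
toy_groups = split_by_median(toy_scores)
for pid in toy_patients:
    print(pid, round(toy_scores[pid], 3), toy_groups[pid])
```

Because all three coefficients are positive, a higher expression level of any of the three genes raises the score, which matches the reported direction of effect.
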
ROC curve analysis of the model for predicting the OS of EC patients showed an area under the curve (AUC) of 0.700 (95%CI: 0.673-0.751, P<0.001). A nomogram was also constructed to quantitatively predict the 1-, 3-, and 5-year OS rates of EC patients.
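
To make the AUC figure concrete, the sketch below computes an AUC for a binary outcome as the probability that a randomly chosen patient with the event has a higher risk score than a randomly chosen patient without it. This is a simplification: the study used a survival ROC package in R, which handles follow-up time and censoring, whereas this version does not. The function name `auc` and the toy scores and labels are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    scores -- risk score of each patient
    labels -- 1 if the patient had the event, 0 otherwise
    Ties between an event and a non-event score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical scores and outcomes, for illustration only.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every patient with the event above every patient without it.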

Conclusion

The prognostic prediction model for EC patients built on the differential genes GGH, IGFBP5, and CDKN2A can provide data support for the clinical prediction of prognosis in EC patients and for the search for corresponding targeted therapy drugs.

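Figure 1 Hierarchical clustering heat map of EC differential genes (clustered by differential gene expression)
Table 1 Univariate and multivariate Cox proportional hazards regression analysis of prognosis-related differential genes in EC patients
Figure 2 LASSO regression and multivariate Cox proportional hazards regression analysis of EC differential genes (Figures 2A, 2B: LASSO regression model of the 96 EC differential genes; Figure 2C: forest plot of the multivariate Cox proportional hazards regression model for the differential genes GGH, IGFBP5, and CDKN2A)
Figure 3 Analysis of the prediction accuracy of the OS prediction model for EC patients (Figure 3A: OS curves of EC patients in the high-risk and low-risk subgroups; Figure 3B: ROC curve of the model for predicting the OS of EC patients; Figures 3C-3E: distribution of high-risk and low-risk EC patients in the model)
Figure 4 Nomogram of the OS prediction model for EC patients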