Chinese Journal of Obstetrics & Gynecology and Pediatrics (Electronic Edition), 2021, Vol. 17, Issue 02: 181-189. doi: 10.3877/cma.j.issn.1673-5250.2021.02.009

Topic: Literature

Original Article

Construction of model for prognosis prediction based on TCGA database for differentially expressed genes of endometrial cancer

Qirong Hao, Yiting Ren, Jingjing Hu, Xiaoyang Liu, Xiaochun Liu

  • Received: 2020-04-13 Revised: 2021-03-10 Published: 2021-04-01
  • Corresponding author: Xiaochun Liu
  • Supported by:
    National Natural Science Foundation of China (81971365)
Cite this article:

Qirong Hao, Yiting Ren, Jingjing Hu, Xiaoyang Liu, Xiaochun Liu. Construction of model for prognosis prediction based on TCGA database for differentially expressed genes of endometrial cancer[J/OL]. Chinese Journal of Obstetrics & Gynecology and Pediatrics (Electronic Edition), 2021, 17(02): 181-189.

Objective

To screen prognosis-related differentially expressed genes in patients with endometrial cancer (EC) and to construct a prognostic prediction model for these patients.

Methods

RNA-seq gene expression data and the associated clinical information for endometrial tissue from EC patients and normal female subjects were retrieved from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) using the search terms "Uteri", "TCGA-UCEC", "transcriptome profiling", and "gene expression quantification and HTSeq-FPKM". The search covered the period from the establishment of the TCGA database to January 15, 2021. A total of 542 EC patients and 35 normal women who met the inclusion criteria were selected as research subjects. The prognostic prediction model was constructed in four steps. ① The linear models for microarray data (LIMMA) package in R was used to perform differential expression analysis on the TCGA RNA-seq gene expression data and to screen candidate differentially expressed genes (DEGs) that affect the occurrence and development of EC. ② The Kaplan-Meier method, LASSO regression, and univariate Cox proportional hazards regression were used to screen survival-related DEGs in EC patients; multivariate Cox proportional hazards regression was then used to determine the prognosis-related DEGs. ③ A prognostic prediction model for EC patients was constructed from these DEGs. ④ A survival receiver operating characteristic (ROC) curve software package was used to assess the accuracy of the prediction model, and a nomogram was drawn.
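
As an illustration of the Kaplan-Meier method named in step ②, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' R code: the function name `kaplan_meier` and the toy follow-up data are hypothetical and are used only to show how the survival curve is built from event and censoring times.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients at t both leave the risk set afterwards.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (months, event flag); not taken from the study.
toy_times = [5, 8, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

In a screening setting, one such curve would be estimated per expression group (for example, high versus low expression of a candidate gene) and the groups compared with a log-rank test, which this sketch does not implement.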

Results

① A total of 466 differentially expressed genes (DEGs) were identified in EC, of which 179 were up-regulated and 287 were down-regulated. ② Of the 96 survival-related DEGs, 7 were prognosis-related: progesterone receptor (PGR), sushi repeat containing protein X-linked (SRPX), gamma-glutamyl hydrolase (GGH), secretoglobin family 2A member 1 (SCGB2A1), insulin like growth factor binding protein 5 (IGFBP5), cyclin dependent kinase inhibitor 2A (CDKN2A), and neuromedin U (NMU). Univariate Cox proportional hazards regression showed that all 7 genes were prognostic factors in EC patients (P<0.05). Multivariate Cox proportional hazards regression showed that GGH, IGFBP5, and CDKN2A were independent risk factors for the prognosis of EC patients (P<0.05): the higher the expression of these three genes, the worse the prognosis. ③ The overall survival (OS) prediction model for EC patients was $\ln[h(t, X)/h_0(t)] = 1.300\,x_{\mathrm{GGH}} + 1.200\,x_{\mathrm{IGFBP5}} + 1.200\,x_{\mathrm{CDKN2A}}$, where $h(t, X)$ is the hazard function of a subject at time $t$, $h_0(t)$ is the baseline hazard function at time $t$ (the hazard when $x_{\mathrm{GGH}}$, $x_{\mathrm{IGFBP5}}$, and $x_{\mathrm{CDKN2A}}$ are all 0), and $x_{\mathrm{GGH}}$, $x_{\mathrm{IGFBP5}}$, and $x_{\mathrm{CDKN2A}}$ are the expression levels of GGH, IGFBP5, and CDKN2A, respectively. ④ The 542 EC patients were scored with this model and divided at the median risk score into a high-risk subgroup (n=271, risk score above the median) and a low-risk subgroup (n=271, risk score below the median). OS was significantly longer in the low-risk subgroup than in the high-risk subgroup (χ²=33.000, P<0.001). ROC curve analysis of the model for predicting OS of EC patients gave an area under the curve (AUC) of 0.700 (95%CI: 0.673-0.751, P<0.001). A nomogram was also constructed to quantitatively predict the 1-, 3-, and 5-year OS rates of EC patients.
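
To make the scoring in ③ and ④ concrete, here is a small sketch that evaluates the reported linear predictor and splits patients at the median score. The coefficients are the ones given in the abstract; everything else is an assumption for illustration: the expression values are invented, the abstract does not state how expression was scaled, and assigning a score exactly equal to the median to the low-risk group is a choice made here, not one stated by the authors.

```python
import math
from statistics import median

# Coefficients as reported in the abstract's OS prediction model.
COEFS = {"GGH": 1.300, "IGFBP5": 1.200, "CDKN2A": 1.200}


def risk_score(expr):
    """Linear predictor ln[h(t, X) / h0(t)] for one patient."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def split_by_median(scores):
    """Label each score 'high' if above the median, otherwise 'low'.

    Assumption: ties with the median go to 'low'; the abstract does not say.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical expression levels for four patients; not study data.
patients = [
    {"GGH": 0.2, "IGFBP5": 0.1, "CDKN2A": 0.3},
    {"GGH": 1.0, "IGFBP5": 0.5, "CDKN2A": 0.8},
    {"GGH": 0.5, "IGFBP5": 0.4, "CDKN2A": 0.1},
    {"GGH": 1.5, "IGFBP5": 1.2, "CDKN2A": 0.9},
]

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)

# Under a Cox model, exp(coefficient) is the hazard ratio for a one-unit
# increase in that covariate with the other covariates held fixed.
hazard_ratio_per_unit = {gene: math.exp(beta) for gene, beta in COEFS.items()}

for score, group in zip(scores, groups):
    print(f"score={score:.2f}  group={group}")
print({gene: round(hr, 2) for gene, hr in hazard_ratio_per_unit.items()})
```

Read this way, the positive coefficients match the statement in ② that higher expression of GGH, IGFBP5, and CDKN2A goes with a worse prognosis, since each hazard ratio is greater than 1.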

Conclusion

The prognostic prediction model for EC patients based on the differentially expressed genes GGH, IGFBP5, and CDKN2A can provide data support for clinical prediction of the prognosis of EC patients and for the search for corresponding targeted therapies.

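Figure 1 Hierarchical clustering heatmap of EC differentially expressed genes (clustered by differential gene expression)
Table 1 Univariate and multivariate Cox proportional hazards regression results for prognosis-related differentially expressed genes in EC patients
Figure 2 LASSO regression and multivariate Cox proportional hazards regression of EC differentially expressed genes (Figures 2A, 2B: LASSO regression model of the 96 EC differentially expressed genes; Figure 2C: forest plot of the multivariate Cox proportional hazards regression model for GGH, IGFBP5, and CDKN2A)
Figure 3 Predictive accuracy of the OS prediction model for EC patients (Figure 3A: OS curves of the high-risk and low-risk subgroups; Figure 3B: ROC curve of the model for predicting OS of EC patients; Figures 3C-3E: distribution of high-risk and low-risk EC patients in the model)
Figure 4 Nomogram of the OS prediction model for EC patients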