[1] MAO Jun, SHEN Xiu-fen, MA Run, et al. Establishment and Verification of Prognostic Risk Model of Intrahepatic Cholangiocarcinoma Based on TCGA and GEO Database[J]. Journal of Modern Laboratory Medicine, 2023, 38(03): 40-46+64. [doi:10.3969/j.issn.1671-7414.2023.03.008]

Establishment and Verification of Prognostic Risk Model of Intrahepatic Cholangiocarcinoma Based on TCGA and GEO Database
Journal of Modern Laboratory Medicine

Volume: 38
Issue: 2023, No. 03
Pages: 40-46+64
Section: Original Articles
Publication date: 2023-05-15

Article Information

Title: Establishment and Verification of Prognostic Risk Model of Intrahepatic Cholangiocarcinoma Based on TCGA and GEO Database
Article ID: 1671-7414(2023)03-040-08
Authors: MAO Jun, SHEN Xiu-fen, MA Run, HE Wei, QU Qiao-li, HU Ying
(Department of Clinical Laboratory, the Second Affiliated Hospital of Kunming Medical University, Kunming 650101, China)
Keywords: intrahepatic cholangiocarcinoma; bioinformatics; survival analysis; risk score; risk model
CLC number: R735.8; R730.43
DOI: 10.3969/j.issn.1671-7414.2023.03.008
Document code: A
Abstract:
Objective To construct a prognostic risk model for intrahepatic cholangiocarcinoma (ICCA) based on The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, and to screen for genes associated with ICCA prognosis.

Methods Data from 31 ICCA tissues and 9 para-carcinoma tissues in the TCGA database served as the training set, and data from 30 ICCA tissues and 27 para-carcinoma tissues in the GEO database served as the validation set. Differentially expressed genes were filtered with the R package "DESeq2" using the criteria absolute fold change > 2 and adjusted P < 0.05. Univariate Cox regression was used to screen genes whose association with prognosis was statistically significant in both datasets, and LASSO regression was used to construct the prognostic risk model of ICCA. Risk scores were calculated for the training and validation sets, samples were divided into high-risk and low-risk groups at the median, and Kaplan-Meier survival curves and time-dependent receiver operating characteristic (ROC) curves were drawn. The risk score and clinicopathological information were analyzed by univariate and multivariate Cox regression and presented in a nomogram to comprehensively evaluate and validate the performance of the model. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Set Enrichment Analysis (GSEA) and single-sample Gene Set Enrichment Analysis (ssGSEA) were used to explore the reasons for the difference in prognosis between the high-risk and low-risk groups.

Results A total of 2 922 differentially expressed genes were identified in the TCGA data and 3 075 in the GEO data (all P < 0.05). Univariate Cox regression identified 68 genes in TCGA (HR = 0.13 to 7.2, all P < 0.05) and 413 genes in GEO (HR = 0.17 to 215.1, all P < 0.05). Nine genes were significantly associated with prognosis in both datasets: GOLGA7B, MTFR2, TPM2, PIWIL4, EPHX4, PRICKLE1, DIO2, FUT4 and COL4A3 (TCGA HR = 0.506 to 2.760, GEO HR = 0.428 to 1.992, all P < 0.05). LASSO regression yielded a six-gene prognostic risk model: Risk Score = 0.464 × Exp(MTFR2) + 0.550 × Exp(TPM2) - 0.511 × Exp(PIWIL4) - 0.097 × Exp(PRICKLE1) + 0.215 × Exp(DIO2) - 0.313 × Exp(COL4A3), where Exp(X) denotes the expression level of gene X.

In the training set, the median risk score was 1.43. Kaplan-Meier survival analysis showed that overall survival was lower in the high-risk group than in the low-risk group (P < 0.001). The ROC curves gave 1-, 3- and 5-year AUC values of 0.971 (cutoff = 0.22), 0.921 (cutoff = 2.33) and 0.701 (cutoff = 1.52), indicating good predictive ability. The risk score had HR = 5.18 (95% CI: 2.15 to 12.49, P < 0.001) in univariate Cox regression and HR = 72.5 (95% CI: 4.52 to 1 162.9, P = 0.002) in multivariate Cox regression.

In the validation set, the median risk score was 2.48. Kaplan-Meier survival analysis showed that survival was lower in the high-risk group than in the low-risk group (P = 0.004). The ROC curves gave 1-, 3- and 5-year AUC values of 0.908 (cutoff = 3.23), 0.851 (cutoff = 1.02) and 0.752 (cutoff = 2.70). The risk score had HR = 2.76 (95% CI: 1.65 to 4.60, P < 0.001) in univariate Cox regression and HR = 4.68 (95% CI: 2.13 to 10.3, P < 0.001) in multivariate Cox regression, which validated the performance of the risk model. The GO, KEGG, GSEA and ssGSEA results suggested that the difference in prognosis between the high-risk and low-risk groups may be related to suppression of the immune response (all P < 0.05).

Conclusion The prognostic risk model constructed in this study has some value in evaluating the prognosis of patients with ICCA and provides a reference for clinical diagnosis and treatment.
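
The risk score in the abstract is a weighted sum of six gene expression values, and samples are then split into high-risk and low-risk groups at the cohort median. The following is a minimal Python sketch of that arithmetic only, not the authors' R pipeline. The coefficients are those reported in the abstract; the function names, the sample expression values, and the tie rule (a score equal to the median is labeled "low") are assumptions made for illustration, and the abstract does not state how expression values were scaled.

```python
import statistics

# Coefficients of the six-gene model as reported in the abstract.
COEFFICIENTS = {
    "MTFR2": 0.464,
    "TPM2": 0.550,
    "PIWIL4": -0.511,
    "PRICKLE1": -0.097,
    "DIO2": 0.215,
    "COL4A3": -0.313,
}


def risk_score(expression):
    """Return the weighted sum of expression values over the six model genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Return the cohort median and a 'high'/'low' label per score.

    Assumption: a score exactly equal to the median is labeled 'low'.
    """
    median = statistics.median(scores)
    return median, ["high" if s > median else "low" for s in scores]


# Hypothetical expression values for three samples (not data from the study).
samples = [
    {"MTFR2": 1.0, "TPM2": 1.0, "PIWIL4": 1.0, "PRICKLE1": 1.0, "DIO2": 1.0, "COL4A3": 1.0},
    {"MTFR2": 4.0, "TPM2": 4.0, "PIWIL4": 0.0, "PRICKLE1": 0.0, "DIO2": 2.0, "COL4A3": 0.0},
    {"MTFR2": 0.0, "TPM2": 0.0, "PIWIL4": 2.0, "PRICKLE1": 2.0, "DIO2": 0.0, "COL4A3": 2.0},
]

scores = [risk_score(s) for s in samples]
median, groups = split_by_median(scores)

for score, group in zip(scores, groups):
    print(f"risk score = {score:.3f}, group = {group}")
```

Genes with positive coefficients (MTFR2, TPM2, DIO2) raise the score as their expression increases, while those with negative coefficients (PIWIL4, PRICKLE1, COL4A3) lower it, which is why the second hypothetical sample lands in the high-risk group.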

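Memo:
Funding: Yunnan Province High-level Health Technical Talents Training Support Program (D-2018041); Kunming Medical University Master's Graduate Innovation Fund (2022S253): study on downregulation of CPT2 expression in cholangiocarcinoma increasing cisplatin resistance and promoting tumor growth through the ROS/NF-κB pathway.
First author: MAO Jun (1996-), male, master's student, laboratory technician; research interest: tumor molecular biology. E-mail: m1604777034@163.com
Corresponding author: HU Ying (1977-), female, PhD, associate chief physician; research interests: clinical molecular biology and microbiology. E-mail: hy2002@126.com
Last Update: 2023-05-15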