
Prediction of Driver Gene Matching in Lung Cancer NOG/PDX Models Based on Artificial Intelligence
Yayi He, Haoyue Guo, Li Diao, Yu Chen, Junjie Zhu, Hiran C. Fernando, Diego Gonzalez Rivas, Hui Qi, Chunlei Dai, Xuzhen Tang, Jun Zhu, Jiawei Dai, Kan He, Dan Chan, Yang Yang
Engineering ›› 2022, Vol. 15 ›› Issue (8): 102-114.
Patient-derived tumor xenografts (PDXs) are a powerful tool for drug discovery and screening in cancer. However, genotype mismatches in PDXs remain poorly understood, and they cause substantial economic losses when PDXs are used. Here, we established PDX models from 53 lung cancer patients, with a genotype matching rate of 79.2% (42/53). We then examined 17 clinicopathological features and used them as inputs to stepwise logistic regression (LR) based on the lowest Akaike information criterion (AIC), least absolute shrinkage and selection operator (LASSO)-LR, support vector machine recursive feature elimination (SVM-RFE), extreme gradient boosting (XGBoost), gradient boosting with categorical features (CatBoost), and the synthetic minority oversampling technique (SMOTE). The performance of all models was evaluated by accuracy, area under the receiver operating characteristic curve (AUC), and F1 score in 100 testing groups. Two multivariable LR models revealed that age, number of driver gene mutations, epidermal growth factor receptor (EGFR) gene mutations, type of prior chemotherapy, prior tyrosine kinase inhibitor (TKI) therapy, and the source of the sample were powerful predictors. Among the algorithms, CatBoost (mean accuracy = 0.960; mean AUC = 0.939; mean F1 score = 0.908) and the eight-feature SVM-RFE (mean accuracy = 0.950; mean AUC = 0.934; mean F1 score = 0.903) showed the best performance. Applying SMOTE improved the predictive capability of most models, with the exception of CatBoost. With SMOTE, the ensemble classifier of single models achieved the highest accuracy (mean = 0.975), AUC (mean = 0.949), and F1 score (mean = 0.938). In conclusion, we established an optimal predictive model to screen lung cancer patients for NOD/Shi-scid, interleukin-2 receptor (IL-2R) γnull (NOG)/PDX models and offer a general approach for building predictive models.
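As a rough illustration of two ingredients named in the abstract, the sketch below shows how SMOTE-style oversampling and the three reported metrics (accuracy, F1 score, AUC) can be computed. It is not the authors' code: the function names `smote`, `accuracy`, `f1_score`, and `auc` are hypothetical, the data are toy values, and the implementation is a simplified pure-Python version that assumes at least two minority samples and numeric features only.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (simplified SMOTE).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes len(minority) >= 2 (illustrative sketch only).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outranks a negative.

    Ties count as half a win (pairwise rank formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy minority class (hypothetical feature vectors, not study data)
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(minority, n_new=3, k=2))

    y_true = [1, 1, 0, 0]
    print(accuracy(y_true, [1, 0, 0, 0]))
    print(f1_score(y_true, [1, 0, 0, 0]))
    print(auc(y_true, [0.9, 0.4, 0.5, 0.1]))
```

In a setup like the one the abstract describes, oversampling would normally be applied to the training split only, so that synthetic samples never leak into the testing groups used to average the metrics.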
Machine learning / Patient-derived tumor xenografts / NOG mice