| BMC Medical Genomics | |
| Cancer gene expression profiles associated with clinical outcomes to chemotherapy treatments | |
| Andrew Garazha [1]; Victor Tkachev [1]; Maxim Sorokin [2]; Nicolas Borisov [3]; Anton Buzdin [4] | |
| [1] Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, 91788, Walnut, CA, USA; [2] Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, 91788, Walnut, CA, USA; I.M. Sechenov First Moscow State Medical University (Sechenov University), 119991, Moscow, Russia; [3] Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, 91788, Walnut, CA, USA; Moscow Institute of Physics and Technology, 141701, Dolgoprudny, Moscow Oblast, Russia; [4] Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, 91788, Walnut, CA, USA; Moscow Institute of Physics and Technology, 141701, Dolgoprudny, Moscow Oblast, Russia; I.M. Sechenov First Moscow State Medical University (Sechenov University), 119991, Moscow, Russia; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, 117997, Moscow, Russia | |
| Keywords: Machine learning; Transcriptomics; Gene expression; RNA sequencing; Microarrays; Molecular diagnostics; Biomarkers detection; Cancer; Clinical oncology; Personalized medicine; Chemotherapy | |
| DOI: 10.1186/s12920-020-00759-0 | |
| Source: Springer | |
【 Abstract 】
Background: Machine learning (ML) methods still have limited applicability in personalized oncology because few clinically annotated molecular profiles are available. This scarcity prevents sufficient training of ML classifiers that could improve molecular diagnostics.

Methods: We reviewed published datasets of high-throughput gene expression profiles from cancer patients with known responses to chemotherapy treatments. We searched the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and Tumor Alterations Relevant for GEnomics-driven Therapy (TARGET) repositories.

Results: We identified data collections suitable for building ML models that predict responses to certain chemotherapeutic regimens. We identified 26 datasets, ranging from 41 to 508 cases per dataset. All identified datasets were checked for ML applicability and robustness using leave-one-out cross-validation. Twenty-three datasets that had balanced numbers of treatment responder and non-responder cases were found suitable for ML.

Conclusions: We collected a database of gene expression profiles associated with clinical responses to chemotherapy for 2786 individual cancer cases. Of these datasets, seven included RNA sequencing data (645 cases); the others contained microarray expression profiles. The cases represented breast cancer, lung cancer, low-grade glioma, endothelial carcinoma, multiple myeloma, adult leukemia, pediatric leukemia and kidney tumors. Chemotherapeutics included taxanes, bortezomib, vincristine, trastuzumab, letrozole, tipifarnib, temozolomide, busulfan and cyclophosphamide.
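
The leave-one-out cross-validation check mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the classifiers used, so a simple nearest-centroid rule stands in, and the "expression profiles" are synthetic Gaussian values rather than data from GEO, TCGA or TARGET. The function names are hypothetical.

```python
import random


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean profile is closest.

    Stand-in classifier (an assumption); distance is squared Euclidean.
    """
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], sample))


def leave_one_out_accuracy(x, y):
    """Hold out each case once, train on the rest, and score the held-out case."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


def make_synthetic_cohort(n_per_class=20, n_genes=5, shift=3.0, seed=0):
    """Build a balanced toy cohort: label 0 = non-responder, 1 = responder.

    Values are synthetic Gaussian draws, not real expression data.
    """
    rng = random.Random(seed)
    x, y = [], []
    for label, offset in ((0, 0.0), (1, shift)):
        for _ in range(n_per_class):
            x.append([rng.gauss(offset, 1.0) for _ in range(n_genes)])
            y.append(label)
    return x, y


x, y = make_synthetic_cohort()
loo_accuracy = leave_one_out_accuracy(x, y)
print(f"Leave-one-out accuracy on the synthetic cohort: {loo_accuracy:.2f}")
```

With N cases, the procedure trains N models, each on N - 1 cases, which is practical for cohorts of the size reported here (41 to 508 cases) and uses nearly all available data for training in every round.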
【 License 】
CC BY
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| RO202104243766921ZK.pdf | 717KB | PDF | |