Journal Article Details
Frontiers in Molecular Biosciences
Uniformly shaped harmonization combines human transcriptomic data from different platforms while retaining their biological properties and differential gene expression patterns
Molecular Biosciences
Ella Kim [1], Betul Karademir-Yilmaz [2], Denis Kuzmin [3], Alexander Simonov [4], Maxim Sorokin [5], Anton Buzdin [6], Nicolas Borisov [7], Victor Tkachev [8]
[1] Clinic for Neurosurgery, Laboratory of Experimental Neurooncology, Johannes Gutenberg University Medical Centre, Mainz, Germany; Department of Biochemistry, School of Medicine/Genetic and Metabolic Diseases Research and Investigation Center (GEMHAM) Marmara University, Istanbul, Türkiye; Moscow Institute of Physics and Technology, Dolgoprudny, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia; Oncobox Ltd., Moscow, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia; Oncobox Ltd., Moscow, Russia; World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, Moscow, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia; World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, Moscow, Russia; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia; PathoBiology Group, European Organization for Research and Treatment of Cancer (EORTC), Brussels, Belgium; Omicsway Corp, Walnut, CA, United States; Moscow Institute of Physics and Technology, Dolgoprudny, Russia; Oncobox Ltd., Moscow, Russia
Keywords: gene expression; transcriptional profiles; RNA sequencing; microarray hybridization; data normalization and harmonization; platform bias; cancer transcriptomics; correlation analysis
DOI: 10.3389/fmolb.2023.1237129
Received 2023-06-08, accepted 2023-08-28, published 2023
Source: Frontiers
【 Abstract 】

Introduction: Co-normalization of RNA profiles obtained using different experimental platforms and protocols opens an avenue for comprehensive comparison of relevant features such as differentially expressed genes associated with disease. Currently, most bioinformatic tools perform normalization in a flexible format that depends on the individual datasets under analysis, so the outputs of such normalizations are poorly compatible with each other. We recently proposed Shambhala, a new approach to gene expression data normalization that returns harmonized data in a uniform shape: every expression profile is transformed into a pre-defined universal format. We previously showed that following shambhalization of human RNA profiles, overall tissue-specific clustering features are strongly retained while platform-specific clustering is dramatically reduced.

Methods: Here, we tested how well Shambhala retains fold-change gene expression features and other functional characteristics of gene clusters, such as pathway activation levels and predicted cancer drug activity scores.

Results: Using 6,793 cancer and 11,135 normal tissue gene expression profiles from the literature and experimental datasets, we applied twelve performance criteria to different versions of Shambhala and to other methods of transcriptomic harmonization with a flexible output data format. These criteria covered biological type classifiers, hierarchical clustering, correlation/regression properties, stability of drug efficiency scores, and data quality for machine learning classifiers.

Discussion: The Shambhala-2 harmonizer demonstrated the best results, with correlation and linear regression coefficients close to 1 for the comparison of training vs. validation datasets, and with more than twofold lower instability in the calculation of drug efficiency scores compared to other methods.

【 License 】

Unknown   
Copyright © 2023 Borisov, Tkachev, Simonov, Sorokin, Kim, Kuzmin, Karademir-Yilmaz and Buzdin.

【 Preview 】
Attachment list
Files Size Format View
RO202310125430256ZK.pdf 3995KB PDF download