Journal article details
PeerJ
Cross-platform normalization of microarray and RNA-seq data for machine learning applications
article
Jeffrey A. Thompson [1], Jie Tan [1], Casey S. Greene [1]
[1] Department of Genetics, Geisel School of Medicine at Dartmouth;Quantitative Biomedical Sciences Program, Geisel School of Medicine at Dartmouth;Molecular and Cellular Biology, Geisel School of Medicine at Dartmouth;Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania;Institute for Translational Medicine and Therapeutics, University of Pennsylvania;Institute for Biomedical Informatics, University of Pennsylvania
Keywords: Gene expression; Normalization; RNA-sequencing; Microarray; Machine learning; Quantile normalization; Cross-platform normalization; Training; Distribution; Nonparanormal transformation
DOI: 10.7717/peerj.1621
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】

Large, publicly available gene expression datasets are often analyzed with the aid of machine learning algorithms. Although RNA-seq is increasingly the technology of choice, a wealth of expression data already exist in the form of microarray data. If machine learning models built from legacy data can be applied to RNA-seq data, larger, more diverse training datasets can be created and validation can be performed on newly generated data. We developed Training Distribution Matching (TDM), which transforms RNA-seq data for use with models constructed from legacy platforms. We evaluated TDM, as well as quantile normalization, nonparanormal transformation, and a simple log2 transformation, on both simulated and biological datasets of gene expression. Our evaluation included both supervised and unsupervised machine learning approaches. We found that TDM exhibited consistently strong performance across settings and that quantile normalization also performed well in many circumstances. We also provide a TDM package for the R programming language.

【 License 】

CC BY   
