BMC Bioinformatics
DBNorm: normalizing high-density oligonucleotide microarray data based on distributions
Daniel Catchpoole [1], David Skillicorn [2], Qinxue Meng [3], Paul J. Kennedy [3]
[1] Children’s Cancer Research Unit, The Children’s Hospital at Westmead; [2] School of Computing, Queen’s University at Kingston; [3] School of Software, Faculty of Engineering and Information Technology and the Centre for Artificial Intelligence, University of Technology Sydney (UTS)
Keywords: Normalization; Distribution; Gene expression data; R
DOI: 10.1186/s12859-017-1912-5
Source: DOAJ
Abstract
Background: Data from patients with rare diseases are often produced using different platforms and probe sets because patients are widely distributed in space and time. Aggregating such data requires a normalization method that makes patient records comparable.
Results: This paper proposes DBNorm, an algorithm implemented as an R package that normalizes arbitrarily distributed data to a common, comparable form. Specifically, DBNorm merges data distributions by fitting a function to each of them and using the probability of each element under its fitted distribution to merge it into a global distribution. DBNorm includes state-of-the-art fitting functions, including polynomial, Fourier and Gaussian functions, and also allows users to define their own fitting functions if required.
Conclusions: The performance of DBNorm is compared with that of z-score, average difference, quantile normalization and ComBat on a number of self-generated and publicly available microarray datasets. The methods are compared using statistics, visualization and, when class labels are known, classification. The experimental results show that DBNorm achieves better normalization results than these conventional methods. Finally, the approach has the potential to be applicable outside bioinformatics analysis.
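The idea of merging distributions through fitted probabilities can be illustrated with a minimal Python sketch. This is not the DBNorm R package or its API: the function names `fit_gaussian` and `normalize_to_target` are hypothetical, only a Gaussian fit is shown, and it assumes that "merging into a global distribution" means passing each value's cumulative probability under its own fitted distribution through the inverse CDF of a distribution fitted to the pooled data.

```python
from statistics import NormalDist, fmean, pstdev


def fit_gaussian(values):
    """Fit a Gaussian by moments (mean and population standard deviation)."""
    return NormalDist(mu=fmean(values), sigma=pstdev(values))


def normalize_to_target(values, target):
    """Map values onto `target` via their probability under a fitted Gaussian.

    Assumption for illustration: each value's cumulative probability under
    its own fitted distribution is sent through the target's inverse CDF.
    """
    source = fit_gaussian(values)
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    normalized = []
    for v in values:
        p = min(max(source.cdf(v), eps), 1.0 - eps)
        normalized.append(target.inv_cdf(p))
    return normalized


# Two hypothetical batches measured on different scales.
batch_a = [2.1, 2.5, 3.0, 3.4, 3.9]
batch_b = [10.2, 11.0, 12.1, 12.9, 14.0]

# The "global" distribution here is a Gaussian fitted to the pooled data.
target = fit_gaussian(batch_a + batch_b)

norm_a = normalize_to_target(batch_a, target)
norm_b = normalize_to_target(batch_b, target)
```

After this mapping both batches share the location and spread of the pooled fit while the rank order within each batch is preserved. With a Gaussian on both sides the mapping reduces to a linear rescaling; a non-Gaussian fitting function, such as the polynomial or Fourier fits the abstract mentions, would produce a nonlinear mapping.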
License: Unknown