BMC Bioinformatics
Missing value imputation for microRNA expression data by using a GO-based similarity measure
Proceedings
Zhuangdi Xu1  Yang Yang2  Dandan Song3
[1] Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dongchuan Rd., 200240, Shanghai, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, 200240, Shanghai, China; School of Computer Science and Technology, Beijing Institute of Technology, 100081, Beijing, China
Keywords: Gene Ontology; Directed Acyclic Graph; Semantic Similarity; Functional Similarity; Imputation Method
DOI: 10.1186/s12859-015-0853-0
Source: Springer
【 Abstract 】
Background

Missing values are common in microarray data profiles. Rather than discarding genes or samples with incomplete expression levels, missing values need to be properly imputed for accurate data analysis. Imputation methods can be roughly categorized as expression level-based and domain knowledge-based. Methods of the first type rely only on expression data, without the help of external data sources, while methods of the second type incorporate available domain knowledge into the expression data to improve imputation accuracy.

In recent years, microRNA (miRNA) microarrays have been widely developed and used to identify miRNA biomarkers in studies of complex human diseases. As with mRNA profiles, miRNA expression profiles with missing values can be treated with the existing imputation methods. However, the domain knowledge-based methods are hard to apply because miRNAs lack direct functional annotation. With the rapid accumulation of miRNA microarray data, there is a growing need for domain knowledge-based imputation algorithms specific to miRNA expression profiles, in order to improve the quality of miRNA data analysis.

Results

We connect miRNAs with the domain knowledge of Gene Ontology (GO) via their target genes, and define miRNA functional similarity based on the semantic similarity of GO terms in GO graphs. A new measure combining miRNA functional similarity and expression similarity is used in the imputation of missing values. The new measure is tested on two miRNA microarray datasets from breast cancer research and achieves improved performance compared with the expression-based method on both datasets.

Conclusions

The experimental results demonstrate that biological domain knowledge can benefit the estimation of missing values in miRNA profiles as well as in mRNA profiles. In particular, functional similarity defined by the GO terms annotated for the target genes of miRNAs can provide useful complementary information to the expression-based method and improve the imputation accuracy of miRNA array data. Our method and data are available to the public upon request.
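The idea of combining functional and expression similarity during imputation can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption for illustration: the convex weighting `alpha`, the distance-to-similarity transform, the weighted k-nearest-neighbour averaging, and the names `expression_similarity` and `impute_knn_combined`. The matrix `func_sim` stands in for a precomputed GO-based miRNA functional similarity, which this sketch does not derive.

```python
import numpy as np

def expression_similarity(x, y):
    """Similarity in (0, 1] from the RMS distance over samples observed in both rows."""
    mask = ~np.isnan(x) & ~np.isnan(y)
    if not mask.any():
        return 0.0  # no shared observations: treat as dissimilar
    dist = np.sqrt(np.mean((x[mask] - y[mask]) ** 2))
    return 1.0 / (1.0 + dist)

def impute_knn_combined(expr, func_sim, k=2, alpha=0.5):
    """Fill NaNs in expr (miRNAs x samples) with a similarity-weighted neighbour mean.

    func_sim: assumed precomputed miRNA-by-miRNA functional similarity in [0, 1].
    alpha:    assumed weight on functional similarity (0 = expression only).
    """
    expr = np.asarray(expr, dtype=float)
    filled = expr.copy()
    n_rows = expr.shape[0]
    for i, j in zip(*np.where(np.isnan(expr))):
        # Candidate neighbours: other miRNAs with an observed value in sample j.
        cands = [r for r in range(n_rows) if r != i and not np.isnan(expr[r, j])]
        if not cands:
            continue  # nothing to borrow from; leave the value missing
        sims = np.array([
            alpha * func_sim[i, r]
            + (1.0 - alpha) * expression_similarity(expr[i], expr[r])
            for r in cands
        ])
        top = np.argsort(sims)[::-1][:k]  # indices of the k most similar candidates
        weights = sims[top]
        values = np.array([expr[cands[t], j] for t in top])
        if weights.sum() > 0:
            filled[i, j] = np.dot(weights, values) / weights.sum()
        else:
            filled[i, j] = values.mean()
    return filled
```

Setting `alpha=0` reduces the sketch to a purely expression-based weighted KNN imputation, which makes it easy to compare the two settings on the same data.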
【 License 】
CC BY
© Yang et al. 2016