Journal Article

【Abstract】

Background: High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in the quest to improve human health. Well-studied algorithms mostly deal with a single data source and cannot fully utilize the potential of multi-omics data sources. To provide a holistic understanding of human health and disease, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far; however, a comprehensive comparison of data integration algorithms for the classification of binary traits is currently lacking.

Results: In this paper, we focus on two common classes of integration algorithms: graph-based algorithms, which depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based algorithms, which can generate a classifier in feature space. We provide a comprehensive comparison of their performance in terms of various measures of classification accuracy and computation time. Seven integration algorithms (graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM), and Ada-boost relevance vector machine) are compared and evaluated on hypertension and two cancer data sets. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. Graph-based algorithms have the advantage of being computationally faster.

Conclusions: The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

【License】

CC BY
© The Author(s). 2017

【Preview】

Attachments
Files	Size	Format	View
RO202311104847704ZK.pdf	1246KB	PDF	download

BMC Bioinformatics
A comparison of graph- and kernel-based –omics data integration algorithms for classifying complex traits
Research Article
Hongyu Zhao¹ Herbert Pang² Kang K. Yan²
[1] Department of Biostatistics, Yale University, New Haven, CT, USA; [2] School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
Keywords: Bayesian network; Relevance vector machine; Graph-based semi-supervised learning; Semi-definite programming (SDP)-support vector machine; Multiple data sources; Classification
DOI : 10.1186/s12859-017-1982-4
Received 2017-06-05; accepted 2017-11-26; published 2017
Source: Springer