Journal Article Details
BMC Bioinformatics
A comparison of graph- and kernel-based –omics data integration algorithms for classifying complex traits
Research Article
Hongyu Zhao [1], Herbert Pang [2], Kang K. Yan [2]
[1] Department of Biostatistics, Yale University, New Haven, CT, USA; [2] School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
Keywords: Bayesian network; Relevance vector machine; Graph-based semi-supervised learning; Semi-definite programming (SDP)-support vector machine; Multiple data sources; Classification
DOI: 10.1186/s12859-017-1982-4
Received 2017-06-05, accepted 2017-11-26, published 2017
Source: Springer
【 Abstract 】

Background: High-throughput sequencing data are widely collected and analyzed in the study of complex diseases, with the aim of improving human health. Well-studied algorithms mostly deal with a single data source and cannot fully utilize the potential of multi-omics data sources. To provide a holistic understanding of human health and disease, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far; however, a comprehensive comparison of data integration algorithms for the classification of binary traits is currently lacking.

Results: In this paper, we focus on two common classes of integration algorithms: graph-based algorithms, which represent subjects as nodes and the relationships between them as edges, and kernel-based algorithms, which generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measures of classification accuracy and computation time. Seven integration algorithms are compared and evaluated on hypertension and two cancer data sets: graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM), and Ada-boost relevance vector machine. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. Graph-based algorithms have the advantage of being computationally faster.

Conclusions: The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.
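
As a rough illustration of the graph-based idea described in the abstract (subjects as nodes, relationships as edges, several data sources combined), the sketch below averages two toy similarity graphs and spreads known labels to unlabelled subjects by simple iterative label propagation. It is a generic sketch under stated assumptions, not the implementation evaluated in the paper: the function names `combine_graphs` and `propagate_labels`, the equal source weights, the damping factor `alpha`, and the toy similarity values are all hypothetical.

```python
# Toy graph-based integration sketch (illustrative only, not the paper's code).

def combine_graphs(graphs, weights):
    """Weighted sum of per-source similarity matrices (lists of lists)."""
    n = len(graphs[0])
    return [
        [sum(w * g[i][j] for g, w in zip(graphs, weights)) for j in range(n)]
        for i in range(n)
    ]

def propagate_labels(W, y, alpha=0.8, n_iter=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y, where S is row-normalised W.

    y holds +1 / -1 for labelled subjects and 0 for unlabelled ones.
    """
    n = len(W)
    S = []
    for row in W:
        total = sum(row)
        S.append([v / total if total > 0 else 0.0 for v in row])
    f = [float(v) for v in y]
    for _ in range(n_iter):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f

# Two hypothetical data sources describing the same four subjects.
source_a = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
source_b = [
    [0.0, 0.8, 0.0, 0.1],
    [0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.9],
    [0.1, 0.0, 0.9, 0.0],
]

combined = combine_graphs([source_a, source_b], weights=[0.5, 0.5])
labels = [1, 0, -1, 0]  # subjects 1 and 3 are unlabelled
scores = propagate_labels(combined, labels)
predicted = [1 if s > 0 else -1 for s in scores]
print(predicted)
```

In this toy setting each unlabelled subject takes the sign of the labelled subject it is most strongly connected to in the combined graph. The methods compared in the paper choose the source weights and the propagation scheme in more principled ways.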

【 License 】

CC BY   
© The Author(s). 2017
