Journal Article Details
BMC Bioinformatics
Two-way learning with one-way supervision for gene expression data
Methodology Article
David M. Mutch [1], Monica H. T. Wong [2], Paul D. McNicholas [2]
[1] Department of Human Health and Nutritional Sciences, University of Guelph, N1G 2W1, Guelph, ON, Canada; [2] Department of Mathematics and Statistics, McMaster University, L8S 4L8, Hamilton, ON, Canada
Keywords: Biclustering; Biomarker discovery; Finite mixture models; Microarray gene expression; Surrogate tissue
DOI: 10.1186/s12859-017-1564-5
Received 2016-08-05, accepted 2017-02-24, published 2017
Source: Springer
【 Abstract 】

Background: A family of parsimonious Gaussian mixture models for the biclustering of gene expression data is introduced. Biclustering is accommodated by adopting a mixture of factor analyzers model with a binary, row-stochastic factor loadings matrix. This form of factor loadings matrix results in a block-diagonal covariance matrix, which is a useful property in gene expression analyses, specifically in biomarker discovery scenarios where blood can potentially act as a surrogate tissue for other, less accessible tissues. Prior knowledge of the factor loadings matrix is useful in this application and is reflected in the one-way supervised nature of the algorithm. Additionally, the factor loadings matrix can be assumed to be constant across all components because of the relationship desired between the various types of tissue samples. Parameter estimates are obtained through a variant of the expectation-maximization algorithm, and the best-fitting model is selected using the Bayesian information criterion. The family of models is demonstrated using simulated data and two real microarray data sets. The first real data set is from a rat study that investigated the influence of diabetes on gene expression in different tissues. The second real data set is from a human transcriptomics study that focused on blood and immune tissues. The microarray data sets illustrate the biclustering family's performance in biomarker discovery involving peripheral blood as surrogate biopsy material.

Results: The simulation studies indicate that the algorithm identifies the correct biclusters, and that it performs best when the number of observation clusters is known. Moreover, the biclustering algorithm identified biclusters composed of biologically meaningful data related to insulin resistance and immune function in the rat and human real data sets, respectively.

Conclusions: Initial results using real data show that this biclustering technique provides a novel approach for biomarker discovery by enabling blood to be used as a surrogate for hard-to-obtain tissues.
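The link between a binary, row-stochastic loadings matrix and a block-diagonal covariance can be checked numerically. The sketch below is illustrative only and is not the authors' implementation: it assumes the standard factor-analysis covariance form ΛΛ' + Ψ with identity latent covariance (the paper's exact parameterization may differ), and the group labels, noise variances, and the helper name `loadings_from_labels` are all hypothetical.

```python
import numpy as np

def loadings_from_labels(labels, q):
    """Build a binary, row-stochastic p x q matrix (hypothetical helper).

    Row j has a single 1 in column labels[j], so every row sums to 1.
    """
    labels = np.asarray(labels)
    lam = np.zeros((labels.size, q))
    lam[np.arange(labels.size), labels] = 1.0
    return lam

# Hypothetical group membership for 6 variables in 3 groups.
labels = np.array([0, 0, 1, 1, 1, 2])
lam = loadings_from_labels(labels, q=3)

# Hypothetical diagonal noise variances (assumed values, not from the paper).
psi = np.full(labels.size, 0.5)

# Assumed factor-analysis covariance: Lambda Lambda' + Psi.
sigma = lam @ lam.T + np.diag(psi)

# Variables in different groups should have zero covariance.
same_group = labels[:, None] == labels[None, :]
print(np.allclose(sigma[~same_group], 0.0))
print(sigma)
```

Because each variable loads on exactly one factor, the product ΛΛ' is nonzero only for pairs of variables that share a group, which is what produces the blocks. Here the labels are sorted, so the blocks sit contiguously on the diagonal; with unsorted labels the same structure holds up to a permutation of the variables.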

【 License 】

CC BY   
© The Author(s) 2017

