| BMC Bioinformatics | |
| Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology | |
| Methodology Article | |
| Paul T Spellman1  Joe W Gray1  Richard M Neve2  Safiyyah Ziyad3  Wen-Lin Kuo3  Nora Bayani3  Steven M Hill4  Sach Mukherjee4  | |
| [1] Center for Spatial Systems Biomedicine, Oregon Health & Science University, 97239, Portland, OR, USA;Genentech Inc, 94080, San Francisco, CA, USA;Life Sciences Division, Lawrence Berkeley National Laboratory, 94720, Berkeley, CA, Alameda;The Netherlands Cancer Institute, 1066 CX, Amsterdam, The Netherlands;Centre for Complexity Science, University of Warwick, CV4 7AL, Coventry, UK;Department of Statistics, University of Warwick, CV4 7AL, Coventry, UK; | |
| Keywords: Lasso; Markov Random Field; Marginal Likelihood; Inclusion Probability; Bayesian Variable Selection | |
| DOI : 10.1186/1471-2105-13-94 | |
| Received 2011-06-13, accepted 2012-04-19, published 2012 | |
| Source: Springer | |
【 Abstract 】
Background: An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biological information concerning the variables of interest. Pathway and network maps are one example of a source of such information. However, although ancillary information is increasingly available, it is not always clear how it should be used, nor how it should be weighted in relation to primary data.

Results: We put forward an approach in which biological knowledge is incorporated using informative prior distributions over variable subsets, with prior information selected and weighted in an automated, objective manner using an empirical Bayes formulation. We employ continuous, linear models with interaction terms and exploit biochemically motivated sparsity constraints to permit exact inference. We show an example of priors for pathway- and network-based information and illustrate our proposed method both on synthetic response data and in an application to cancer drug response data. Comparisons are also made to alternative Bayesian and frequentist penalised-likelihood methods for incorporating network-based information.

Conclusions: The empirical Bayes method proposed here can aid prior elicitation for Bayesian variable selection studies and help to guard against mis-specification of priors. Empirical Bayes, together with the proposed pathway-based priors, results in an approach with competitive variable selection performance. In addition, the overall procedure is fast, deterministic, and has very few user-set parameters, yet is capable of capturing interplay between molecular players. The approach presented is general and readily applicable in any setting with multiple sources of biological prior knowledge.
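The workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the prior over subsets is taken to be proportional to exp(lam * f(subset)), where f counts selected variables that belong to a hypothetical pathway; the marginal likelihood uses Zellner's g-prior with g = n as a stand-in; interaction terms are omitted; the prior weight lam is chosen on a small grid; and the data are simulated. The sparsity constraint (subsets of at most `d_max` variables) is what makes exact enumeration feasible.

```python
import itertools
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

def log_marginal(cols, y, g):
    """Log marginal likelihood under a g-prior (assumed stand-in), up to a
    constant shared by all subsets. Columns and response must be centered."""
    n, k = len(y), len(cols)
    if k == 0:
        return 0.0
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    r2 = sum(b * v for b, v in zip(beta, xty)) / sum(v * v for v in y)
    return -0.5 * k * math.log(1 + g) - 0.5 * (n - 1) * math.log(1 - g / (1 + g) * r2)

def empirical_bayes_selection(X_cols, y, pathway, d_max=2,
                              lam_grid=(0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)):
    n, p = len(y), len(X_cols)
    # Sparsity constraint: enumerate every subset with at most d_max variables.
    subsets = [s for d in range(d_max + 1)
               for s in itertools.combinations(range(p), d)]
    loglik = {s: log_marginal([X_cols[j] for j in s], y, g=n) for s in subsets}
    # Hypothetical prior feature: number of selected variables in the pathway.
    f = {s: sum(j in pathway for j in s) for s in subsets}

    def log_evidence(lam):
        # log sum_s p(y | s) P(s | lam), with P(s | lam) normalised over subsets.
        log_z = logsumexp([lam * f[s] for s in subsets])
        return logsumexp([loglik[s] + lam * f[s] for s in subsets]) - log_z

    # Empirical Bayes step: pick the prior weight that maximises the evidence.
    lam_hat = max(lam_grid, key=log_evidence)
    log_post = {s: loglik[s] + lam_hat * f[s] for s in subsets}
    norm = logsumexp(list(log_post.values()))
    post = {s: math.exp(v - norm) for s, v in log_post.items()}
    # Posterior inclusion probability of each variable.
    incl = [sum(pr for s, pr in post.items() if j in s) for j in range(p)]
    return lam_hat, post, incl

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

# Simulated data (illustrative only): the response depends on variables 0 and 1.
random.seed(0)
n, p = 60, 5
X_cols = [center([random.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
y = center([2 * X_cols[0][i] - 2 * X_cols[1][i] + random.gauss(0, 0.5)
            for i in range(n)])
lam_hat, post, incl = empirical_bayes_selection(X_cols, y, pathway={0, 1, 2})
print("selected prior weight:", lam_hat)
print("inclusion probabilities:", [round(q, 3) for q in incl])
```

Because every subset up to size `d_max` is scored, the procedure is deterministic and needs no sampling. The grid over `lam` is bounded on purpose: with a single pathway feature, the evidence can keep increasing in `lam` when the data strongly favour pathway members, so an unbounded search would not be meaningful in this toy setting.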
【 License 】
CC BY
© Hill et al.; licensee BioMed Central Ltd. 2012