All resources

1 Estimating the individualized HIV-1 genetic barrier to resistance using a nelfinavir fitness landscape [Journal article]

BMC Bioinformatics, 2010

Eric Van Wijngaerden, Yves Moreau, Ricardo J Camacho, Soo-Yon Rhee, Robert W Shafer, Koen Deforche, Gertjan Beheydt, Philippe Lemey, Kristof Theys, Kristel van Laethem, Anne-Mieke Vandamme

License type: CC BY

Abstract

Background

Failure on Highly Active Anti-Retroviral Treatment is often accompanied by the development of antiviral resistance to one or more drugs included in the treatment. In general, the virus is more likely to develop resistance to drugs with a lower genetic barrier. Previously, we developed a method to reverse engineer, from clinical sequence data, a fitness landscape experienced by HIV-1 under nelfinavir (NFV) treatment. By simulating evolution over this landscape, the individualized genetic barrier to NFV resistance may be estimated for an isolate.

Results

We investigated the association of the estimated genetic barrier with the risk of developing NFV resistance at virological failure in 201 patients who were predicted to be fully susceptible to NFV at baseline. A higher estimated genetic barrier was indeed associated with lower odds of developing resistance at failure (OR 0.62 (0.45 - 0.94) per additional mutation needed, p = .02).

Conclusions

Thus, variation in the individualized genetic barrier to NFV resistance may affect the effective treatment options available after treatment failure. If similar results apply to other drugs, the estimated genetic barrier may become a new clinical tool for choosing a treatment regimen, one that allows consideration of the treatment options available after virological failure.

2 Candidate gene prioritization by network analysis of differential expression using machine learning approaches [Journal article]

BMC Bioinformatics, 2010

Daniela Nitsch, Bart de Moor, Yves Moreau, Fabian Ojeda, Joana P Gonçalves

License type: CC BY

Abstract

Background

Discovering novel disease genes is still challenging for diseases for which no prior knowledge, such as known disease genes or disease-related pathways, is available. Genetic studies frequently produce large lists of candidate genes, of which only a few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data on differential gene expression between affected and healthy individuals.

To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network.

Results

We propose three strategies for scoring disease candidate genes that rely on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison, a local measure based on the expression of the direct neighbors is also computed. We benchmarked these strategies on 40 publicly available knockout experiments in mice and assessed performance against a standard procedure in genetics that ranks candidate genes solely by their differential expression levels (Simple Expression Ranking). All four strategies (the three network-based approaches and the local measure) could outperform this standard procedure. The best results were obtained with Heat Kernel Diffusion Ranking, which achieved an average ranking position of 8 out of 100 genes, an AUC value of 92.3%, and an error reduction of 52.8% relative to the standard procedure, which ranked the knockout gene on average at position 17 with an AUC value of 83.7%.

Conclusion

In this study, we could identify promising candidate genes using network-based machine learning approaches even when no knowledge is available about the disease or phenotype.
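To make the diffusion idea behind the Heat Kernel Diffusion Ranking named in this abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy network, the diffusion parameter `beta`, the use of absolute differential expression as the signal, and all function names are assumptions chosen for illustration. The sketch builds the heat kernel exp(-beta * L) of a graph Laplacian L and scores each candidate by the differential expression diffused to it from its network neighborhood.

```python
import numpy as np

def heat_kernel(adjacency, beta=0.5):
    """Return exp(-beta * L) for the graph Laplacian L = D - A (A symmetric)."""
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # L is symmetric, so its matrix exponential follows from an eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T

def diffusion_scores(adjacency, diff_expr, beta=0.5):
    """Diffuse absolute differential expression over the network (assumed signal)."""
    signal = np.abs(np.asarray(diff_expr, dtype=float))
    return heat_kernel(adjacency, beta) @ signal

def rank_candidates(adjacency, diff_expr, candidates, beta=0.5):
    """Order candidate gene indices from highest to lowest diffused score."""
    scores = diffusion_scores(adjacency, diff_expr, beta)
    return sorted(candidates, key=lambda g: -scores[g])

# Hypothetical toy network: a path 0 - 1 - 2 - 3 - 4.
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
# Gene 1 is barely differentially expressed itself, but both of its neighbors are.
diff_expr = [3.0, 0.1, 3.0, 0.0, 0.5]
candidates = [1, 4]

# Ranking by each gene's own differential expression only.
simple_ranking = sorted(candidates, key=lambda g: -abs(diff_expr[g]))
# Ranking by differential expression diffused from the neighborhood.
diffusion_ranking = rank_candidates(adjacency, diff_expr, candidates)
print(simple_ranking, diffusion_ranking)
```

In this toy case the expression-only ranking prefers gene 4, while the diffusion-based ranking prefers gene 1 because its neighbors are strongly differentially expressed, which is the intuition the abstract describes.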

3 Gene prioritization and clustering by multi-view text mining [Journal article]

BMC Bioinformatics, 2010

Leon-Charles Tranchevent, Bart De Moor, Shi Yu, Yves Moreau

License type: CC BY

Abstract

Background

Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effectiveness of disease-gene identification varies across text mining models. Incorporating several text mining models may therefore help to obtain more refined and accurate knowledge. However, how to combine these models effectively remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model.

Results

We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered a view, and the multiple views obtained are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than the other methods compared.

Conclusions

In practical research, the relevance of a specific vocabulary to the task is usually unknown. In such cases, multi-view text mining is a superior and promising strategy for text-based disease gene identification.

4 Candidate gene prioritization by network analysis of differential expression using machine learning approaches [Journal article]

BMC Bioinformatics, 2010

Yves Moreau, Bart de Moor, Fabian Ojeda, Joana P Gonçalves, Daniela Nitsch

Language: English

Abstract

Background

Discovering novel disease genes is still challenging for diseases for which no prior knowledge, such as known disease genes or disease-related pathways, is available. Genetic studies frequently produce large lists of candidate genes, of which only a few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data on differential gene expression between affected and healthy individuals.

To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network.

Results

We propose three strategies for scoring disease candidate genes that rely on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison, a local measure based on the expression of the direct neighbors is also computed. We benchmarked these strategies on 40 publicly available knockout experiments in mice and assessed performance against a standard procedure in genetics that ranks candidate genes solely by their differential expression levels (Simple Expression Ranking). All four strategies (the three network-based approaches and the local measure) could outperform this standard procedure. The best results were obtained with Heat Kernel Diffusion Ranking, which achieved an average ranking position of 8 out of 100 genes, an AUC value of 92.3%, and an error reduction of 52.8% relative to the standard procedure, which ranked the knockout gene on average at position 17 with an AUC value of 83.7%.

Conclusion

In this study, we could identify promising candidate genes using network-based machine learning approaches even when no knowledge is available about the disease or phenotype.

5 L2-norm multiple kernel learning and its application to biomedical data fusion [Journal article]

BMC Bioinformatics, 2010

Yves Moreau, Bart De Moor, Johan AK Suykens, Leon-Charles Tranchevent, Anneleen Daemen, Tillmann Falck, Shi Yu

Language: English

Abstract

Background

This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields different extensions of multiple kernel learning (MKL), such as L∞, L₁, and L₂ MKL. In particular, L₂ MKL is a novel method that leads to non-sparse optimal kernel coefficients, in contrast to the sparse kernel coefficients optimized by the existing L∞ MKL method. In real biomedical applications, L₂ MKL may have advantages over sparse integration methods for thoroughly combining complementary information from heterogeneous data sources.

Results

We provide a theoretical analysis of the relationship between the L₂ optimization of kernels in the dual problem and the L₂ coefficient regularization in the primal problem. Understanding the dual L₂ problem grants a unified view of MKL and enables us to extend the L₂ method to a wide range of machine learning problems. We implement L₂ MKL for ranking and classification problems and compare its performance with the sparse L∞ and the averaging L₁ MKL methods. The experiments are carried out on six real biomedical data sets and two large-scale UCI data sets. L₂ MKL yields better performance on most of the benchmark data sets. In particular, we propose a novel L₂ MKL least squares support vector machine (LSSVM) algorithm, which is shown to be an efficient and promising classifier for processing large-scale data sets.
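The contrast between sparse, non-sparse, and averaging kernel coefficients described above can be illustrated with a small Python sketch. It is not the authors' MATLAB code and not a full MKL solver: it assumes the dual variables are already fixed, so that each kernel K_j is summarized by one non-negative quadratic term s_j (for example alpha^T K_j alpha), and it only shows which coefficient vector attains each norm of s. The function names and the toy numbers are hypothetical, and at least one s_j is assumed to be positive.

```python
import numpy as np

def kernel_weights(quad_terms, norm):
    """Kernel coefficients theta for fixed, non-negative per-kernel terms s_j.

    "Linf": all weight on the largest term (sparse, winner-takes-all).
    "L2":   theta = s / ||s||_2 (non-sparse, proportional to each term).
    "L1":   uniform weights, i.e. plain averaging of the kernels.
    """
    s = np.asarray(quad_terms, dtype=float)
    if norm == "Linf":
        theta = np.zeros_like(s)
        theta[np.argmax(s)] = 1.0
    elif norm == "L2":
        theta = s / np.linalg.norm(s)
    elif norm == "L1":
        theta = np.full_like(s, 1.0 / s.size)
    else:
        raise ValueError(f"unknown norm: {norm}")
    return theta

def combine_kernels(kernels, theta):
    """Weighted sum of kernel matrices, sum_j theta_j * K_j."""
    return sum(t * np.asarray(k, dtype=float) for t, k in zip(theta, kernels))

# Hypothetical per-kernel terms for two data sources.
quad_terms = [4.0, 3.0]
theta_inf = kernel_weights(quad_terms, "Linf")  # one source gets all the weight
theta_l2 = kernel_weights(quad_terms, "L2")     # both sources keep nonzero weight
theta_l1 = kernel_weights(quad_terms, "L1")     # both sources weighted equally

# Toy 2x2 kernels combined with the non-sparse L2 weights.
kernels = [np.eye(2), np.ones((2, 2))]
combined = combine_kernels(kernels, theta_l2)
print(theta_inf, theta_l2, theta_l1)
```

With the L∞ choice the second data source is discarded entirely, whereas the L₂ choice keeps every source with a weight proportional to its term, which matches the "winner-takes-all" versus non-sparse distinction drawn in the abstract.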

Conclusions

This paper extends the statistical framework of genomic data fusion based on MKL. Allowing non-sparse weights on the data sources is an attractive option in settings where we believe most data sources to be relevant to the problem at hand and want to avoid the "winner-takes-all" effect seen in L∞ MKL, which can be detrimental to performance in prospective studies. The notion of optimizing L₂ kernels can be straightforwardly extended to ranking, classification, regression, and clustering algorithms. To tackle the computational burden of MKL, this paper proposes several novel LSSVM-based MKL algorithms. Systematic comparison on real data sets shows that LSSVM MKL has performance comparable to that of the conventional SVM MKL algorithms. Moreover, large-scale numerical experiments indicate that, when cast as semi-infinite programming, LSSVM MKL can be solved more efficiently than SVM MKL.

Availability

The MATLAB code of the algorithms implemented in this paper can be downloaded from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/l2lssvm.html.

6 Estimating the individualized HIV-1 genetic barrier to resistance using a nelfinavir fitness landscape [Journal article]

BMC Bioinformatics, 2010

Anne-Mieke Vandamme, Eric Van Wijngaerden, Robert W Shafer, Soo-Yon Rhee, Ricardo J Camacho, Philippe Lemey, Kristel van Laethem, Yves Moreau, Gertjan Beheydt, Koen Deforche, Kristof Theys

Language: English

Abstract

Background

Failure on Highly Active Anti-Retroviral Treatment is often accompanied by the development of antiviral resistance to one or more drugs included in the treatment. In general, the virus is more likely to develop resistance to drugs with a lower genetic barrier. Previously, we developed a method to reverse engineer, from clinical sequence data, a fitness landscape experienced by HIV-1 under nelfinavir (NFV) treatment. By simulating evolution over this landscape, the individualized genetic barrier to NFV resistance may be estimated for an isolate.

Results

We investigated the association of the estimated genetic barrier with the risk of developing NFV resistance at virological failure in 201 patients who were predicted to be fully susceptible to NFV at baseline. A higher estimated genetic barrier was indeed associated with lower odds of developing resistance at failure (OR 0.62 (0.45 - 0.94) per additional mutation needed, p = .02).

Conclusions

Thus, variation in the individualized genetic barrier to NFV resistance may affect the effective treatment options available after treatment failure. If similar results apply to other drugs, the estimated genetic barrier may become a new clinical tool for choosing a treatment regimen, one that allows consideration of the treatment options available after virological failure.
