Machine Learning in Systems Biology: MLSB 2007
Proceedings: Towards structured output prediction of enzyme function
Biological sciences; Medicine and health
Katja Astikainen; Liisa Holm; Esa Pitkänen; Sandor Szedmak; Juho Rousu
Others: http://www.biomedcentral.com/content/pdf/1753-6561-2-S4-S2.pdf
PID: 49554
Source: CEUR
【 Abstract 】
Background: In this paper we describe work in progress in developing kernel methods for enzyme function prediction. Our focus is on developing so-called structured output prediction methods, where the enzymatic reaction is the combinatorial target object for prediction. We compared two structured output prediction methods, the Hierarchical Max-Margin Markov algorithm (HM3) and the Maximum Margin Regression algorithm (MMR), in hierarchical classification of enzyme function. As sequence features we use various string kernels and the GTG feature set derived from the global alignment trace graph of protein sequences.

Results: In our experiments, in predicting enzyme EC classification we obtain over 85% accuracy (predicting the four-digit EC code) and over 91% microlabel F1 score (predicting individual EC digits). In predicting the Gold Standard enzyme families, we obtain over 79% accuracy (predicting the family correctly) and over 89% microlabel F1 score (predicting superfamilies and families). In the latter case, structured output methods are significantly more accurate than a nearest neighbor classifier. A polynomial kernel over the GTG feature set turned out to be a prerequisite for accurate function prediction. Combining GTG with string kernels boosted accuracy slightly in the case of EC class prediction.

Conclusion: Structured output prediction with GTG features is shown to be computationally feasible and to have accuracy on par with state-of-the-art approaches in enzyme function prediction.
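The two evaluation measures named in the results differ in granularity, which a small sketch may make concrete. The Python below is an illustration only, not the authors' evaluation code: it assumes that each EC code is expanded into its hierarchy nodes (one per digit prefix), that each node is treated as one microlabel, and that F1 is pooled over all microlabels. The function names `ec_prefixes`, `exact_accuracy`, and `microlabel_f1` and the toy EC codes are hypothetical.

```python
def ec_prefixes(ec):
    """Expand '1.1.1.1' into its hierarchy nodes: '1', '1.1', '1.1.1', '1.1.1.1'."""
    digits = ec.split(".")
    return {".".join(digits[:i]) for i in range(1, len(digits) + 1)}


def exact_accuracy(y_true, y_pred):
    """Fraction of examples whose full four-digit code is predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def microlabel_f1(y_true, y_pred):
    """F1 pooled over hierarchy nodes (assumed reading of 'microlabel F1')."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        t_nodes, p_nodes = ec_prefixes(t), ec_prefixes(p)
        tp += len(t_nodes & p_nodes)   # nodes predicted and correct
        fp += len(p_nodes - t_nodes)   # nodes predicted but wrong
        fn += len(t_nodes - p_nodes)   # true nodes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up for illustration): the second prediction is wrong
# only in the last EC digit.
y_true = ["1.1.1.1", "2.7.1.1"]
y_pred = ["1.1.1.1", "2.7.1.2"]
print(exact_accuracy(y_true, y_pred))
print(microlabel_f1(y_true, y_pred))
```

Under these assumptions, a prediction that misses only the last digit counts as a full error for accuracy but still earns credit for the three correct upper-level nodes, which is why a microlabel F1 score can sit above the exact-match accuracy on the same predictions.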