BMC Bioinformatics
Active learning for human protein-protein interaction prediction
Research
Madhavi K Ganapathiraju [1], Thahir P Mohamed [1], Jaime G Carbonell [2]
[1] Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA; Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA; Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
Keywords: Feature Vector; Active Learning; Random Forest; Label Data; Protein Pair
DOI: 10.1186/1471-2105-11-S1-S57
Source: Springer
【 Abstract 】

Background: Biological processes in cells are carried out by means of protein-protein interactions. Determining whether a pair of proteins interacts by wet-lab experiments is resource-intensive; only about 38,000 interactions, out of a few hundred thousand expected interactions, are known today. Active machine learning can guide the selection of pairs of proteins for future experimental characterization in order to accelerate accurate prediction of the human protein interactome.

Results: Random forest (RF) has previously been shown to be effective for predicting protein-protein interactions. Here, four different active learning algorithms have been devised for selecting the protein pairs used to train the RF. With labels for as few as 500 protein pairs selected using any of the four active learning methods described here, the classifier achieved a higher F-score (harmonic mean of precision and recall) than with 3000 randomly chosen protein pairs. The F-score of predicted interactions is shown to increase by about 15% with active learning compared with random selection of data.

Conclusion: Active learning algorithms enable learning more accurate classifiers with far less labelled data and prove useful in applications where manual annotation of data is formidable. The active learning techniques demonstrated here can also be applied to other proteomics applications such as protein structure prediction and classification.
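
As an illustration of two ideas mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. `f_score` computes the harmonic mean of precision and recall as defined above. `select_most_uncertain` shows generic uncertainty sampling, a common active learning strategy used here only as a stand-in, since the abstract does not describe the four selection algorithms. Both function names, the 0/1 label encoding, and the example probabilities are assumptions made for this sketch.

```python
def f_score(true_labels, predicted_labels):
    """F-score for the positive class (label 1):
    harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive prediction: the F-score is taken as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_most_uncertain(pool_probabilities, batch_size):
    """Hypothetical helper: return the indices of the `batch_size`
    unlabelled protein pairs whose predicted interaction probability
    is closest to 0.5 (generic uncertainty sampling, an assumption;
    not one of the four algorithms from the paper)."""
    ranked = sorted(
        range(len(pool_probabilities)),
        key=lambda i: abs(pool_probabilities[i] - 0.5),
    )
    return ranked[:batch_size]


if __name__ == "__main__":
    # Toy labels: 1 = interacting pair, 0 = non-interacting pair.
    print(f_score([1, 1, 0, 0], [1, 0, 1, 0]))
    # Toy classifier probabilities for four unlabelled pairs.
    print(select_most_uncertain([0.9, 0.55, 0.1, 0.48], 2))
```

In a pool-based setting, the selected indices would be sent for labelling (here, experimental characterization), added to the training set, and the classifier retrained before the next round of selection.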

【 License 】

© Mohamed et al; licensee BioMed Central Ltd. 2010. This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
