Journal Article Details
BMC Bioinformatics
BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection
Research Article
Ganesan Pugalenthi [1], Kai-Uwe Kalies [2], Mehrnaz Khodam Hazrati [3], Thomas Martinetz [4], Krishna Kumar Kandaswamy [5]
[1] Bioinformatics Group, Bioscience Core Lab, King Abdullah University of Science and Technology (KAUST), Kingdom of Saudi Arabia;Centre for Structural and Cell Biology in Medicine, Institute of Biology, University of Lübeck, Germany;Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, 23538, Lübeck, Germany;Institute for Signal Processing, University of Lübeck, 23538, Lübeck, Germany;Institute for Neuro- and Bioinformatics, University of Lübeck, 23538, Lübeck, Germany;Institute for Neuro- and Bioinformatics, University of Lübeck, 23538, Lübeck, Germany;Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, 23538, Lübeck, Germany;
Keywords: Support Vector Machine; Support Vector Machine Classifier; Feature Subset; Training Error; Luciferin
DOI: 10.1186/1471-2105-12-345
Received 2010-11-22, accepted 2011-08-17, published 2011
Source: Springer
【 Abstract 】

Background: Bioluminescence is the emission of light by a living organism. Most light-emitting organisms are marine, but some insects, plants, and fungi also emit light. Biotechnological applications of bioluminescence have become routine and are considered essential for many medical and general technological advances. Identifying bioluminescent proteins is challenging because they share little sequence similarity. So far, no method has been reported that specifically identifies bioluminescent proteins from primary sequence.

Results: In this paper, we propose BLProt, a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained on a dataset of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated on an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches: ReliefF, InfoGain, and mRMR. We selected five feature subsets of decreasing size and evaluated the performance of each subset.

Conclusion: BLProt achieves 80% accuracy in training (5-fold cross-validation) and 80.06% accuracy in testing. The performance of BLProt was compared with BLAST and HMM. The high prediction accuracy and the successful prediction of hypothetical proteins suggest that BLProt can be a useful approach for identifying bioluminescent proteins from sequence information, irrespective of their sequence similarity. The BLProt software is available at http://www.inb.uni-luebeck.de/tools-demos/bioluminescent%20protein/BLProt
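To make the filter-based feature selection step more concrete, the following is a minimal sketch of the basic Relief weighting scheme, the simpler two-class algorithm that ReliefF extends (ReliefF additionally averages over several nearest neighbours and handles noisy or multi-class data). It is not the BLProt implementation: the function names `relief_weights` and `top_features` and the two-feature toy data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic Relief for two classes: one nearest hit and one nearest miss per sample.

    Simplified illustration only; ReliefF uses k neighbours per class.
    """
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a: Sequence[float], b: Sequence[float], j: int) -> float:
        return abs(a[j] - b[j]) / spans[j]

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        # Reward features that differ from the nearest miss,
        # penalise features that differ from the nearest hit.
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


def top_features(weights: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-weighted features."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:k]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
toy_X = [[0.1, 0.9], [0.2, 0.1], [0.15, 0.5], [0.9, 0.8], [0.8, 0.2], [0.85, 0.45]]
toy_y = [1, 1, 1, 0, 0, 0]

toy_weights = relief_weights(toy_X, toy_y)
print(toy_weights, top_features(toy_weights, 1))
```

In a pipeline like the one the abstract describes, feature subsets of decreasing size would be taken from such a ranking, and a classifier would be trained and cross-validated on each subset.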

【 License 】

CC BY   
© Kandaswamy et al; licensee BioMed Central Ltd. 2011
