Journal article details
BMC Bioinformatics
PGA: an R/Bioconductor package for identification of novel peptides using a customized database derived from RNA-Seq
Software
Xin Liu [1], Shaohang Xu [1], Ruo Zhou [1], Bo Wen [1], Xun Xu [1], Siqi Liu [2], Bing Zhang [3], Xiaojing Wang [3]
[1] BGI-Shenzhen, 518083, Shenzhen, China; [2] Beijing Institute of Genomics, Chinese Academy of Sciences, 100101, Beijing, China; [3] Department of Biomedical Informatics, Vanderbilt University School of Medicine, 37232, Nashville, TN, USA
Keywords: Proteomics; RNA-Seq; MS/MS; Peptide identification; Proteogenomics
DOI: 10.1186/s12859-016-1133-3
Received 2015-06-27; accepted 2016-06-09; published 2016
Source: Springer
【 Abstract 】

Background: Peptide identification based on mass spectrometry (MS) is generally achieved by comparing experimental mass spectra with theoretically digested peptides derived from a reference protein database. This strategy cannot identify peptide and protein sequences that are absent from the reference database. A customized protein database built from RNA-Seq data has therefore been proposed to improve the identification of novel peptides. A comprehensive pipeline that provides an end-to-end solution for novel peptide detection with such a customized database is accordingly needed.

Results: A pipeline implemented as an R package, named PGA, was developed. It automates the processing of tandem mass spectrometry (MS/MS) data acquired on different MS platforms and the construction of customized protein databases from RNA-Seq data, with or without a reference genome as a guide. PGA identifies novel peptides and generates an HTML-based report with a visual interface. Applied to a published dataset, PGA identified 636 novel peptides: 510 single amino acid polymorphism (SAP) peptides, 2 INDEL peptides, 49 splice junction peptides, and 75 novel transcript-derived peptides. The software is freely available from http://bioconductor.org/packages/PGA/, and example reports are available at http://wenbostar.github.io/PGA/.

Conclusions: PGA, designed to be platform-independent and easy to use, was shown to be capable of identifying novel peptides by searching a customized protein database derived from RNA-Seq data.
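
The core idea in the abstract, that a peptide is "novel" when it appears in the digest of the customized database but not in the digest of the reference database, can be illustrated with a small Python sketch. This is not PGA's implementation (PGA is an R package and its internals are not described here). The function names, the simplified trypsin rule (cleave after K or R unless the next residue is P), the minimum peptide length, and both toy protein sequences are assumptions made only for illustration.

```python
def tryptic_digest(protein, min_len=6):
    """In silico digest with a simplified trypsin rule (an assumption):
    cleave after K or R unless the next residue is P; no missed cleavages."""
    peptides, start = set(), 0
    for i, aa in enumerate(protein):
        last = i == len(protein) - 1
        if last or (aa in "KR" and protein[i + 1] != "P"):
            pep = protein[start:i + 1]
            if len(pep) >= min_len:
                peptides.add(pep)
            start = i + 1
    return peptides


def novel_peptides(reference_proteins, customized_proteins, min_len=6):
    """Peptides present in the customized digest but absent from the reference digest."""
    ref = set().union(*(tryptic_digest(p, min_len) for p in reference_proteins))
    custom = set().union(*(tryptic_digest(p, min_len) for p in customized_proteins))
    return custom - ref


# Toy sequences, invented for this example (not real proteins).
REFERENCE = "MSTAGLVEKPLDQWTRAAGHNELSVK"
# Hypothetical RNA-Seq-derived variant with one amino acid substitution (E -> D).
VARIANT = "MSTAGLVEKPLDQWTRAAGHNDLSVK"

if __name__ == "__main__":
    print(sorted(tryptic_digest(REFERENCE)))
    print(sorted(novel_peptides([REFERENCE], [REFERENCE, VARIANT])))
```

In this toy case the substitution changes only the second tryptic peptide, so that peptide is the single candidate a reference-only search could never match. A real workflow would additionally match candidates against MS/MS spectra and control the false discovery rate, which this sketch does not attempt.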

【 License 】

CC BY   
© The Author(s). 2016
