Journal Article Details
BMC Genomics
Whole human genome proteogenomic mapping for ENCODE cell line data: identifying protein-coding regions
Research Article
Jainab Khatun [1], Wendy J Spitzer [1], Brian A Risk [1], Ashley Secrest [1], Morgan C Giddings [2], Li Wang [3], John A Wrobel [3], Ling Xie [3], Yanbao Yu [3], Xian Chen [4], Harsha P Gunawardena [4]
[1] College of Arts and Sciences, Boise State University, Boise, ID, USA;College of Arts and Sciences, Boise State University, Boise, ID, USA;Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA;Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA;Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA;Program in Molecular Biology & Biotechnology, UNC School of Medicine, Chapel Hill, NC, USA;
Keywords: Proteogenomic mapping; MS/MS spectra; Genome annotation; Proteomics; Genomics
DOI  :  10.1186/1471-2164-14-141
Received 2012-08-10, accepted 2013-02-22, published 2013
Source: Springer
【 Abstract 】

Background: Proteogenomic mapping is an approach that uses mass spectrometry data from proteins to directly map protein-coding genes and could aid in locating translational regions in the human genome. In concert with the ENcyclopedia of DNA Elements (ENCODE) project, we applied proteogenomic mapping to produce proteogenomic tracks for the UCSC Genome Browser, to explore which putative translational regions may be missing from the human genome.

Results: We generated ~1 million high-resolution tandem mass (MS/MS) spectra for Tier 1 ENCODE cell lines K562 and GM12878 and mapped them against the UCSC hg19 human genome, and the GENCODE V7 annotated protein and transcript sets. We then compared the results from the three searches to identify the best-matching peptide for each MS/MS spectrum, thereby increasing the confidence of the putative new protein-coding regions found via the whole genome search. At a 1% false discovery rate, we identified 26,472, 24,406, and 13,128 peptides from the protein, transcript, and whole genome searches, respectively; of these, 481 were found solely via the whole genome search. The proteogenomic mapping data are available on the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUncBsuProt.

Conclusions: The whole genome search revealed that ~4% of the uniquely mapping identified peptides were located outside GENCODE V7 annotated exons. The comparison of the results from the disparate searches also identified 15% more spectra than would have been found solely from a protein database search. Therefore, whole genome proteogenomic mapping is a complementary method for genome annotation when performed in conjunction with other searches.
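The comparison step in the abstract (keeping the best-matching peptide per MS/MS spectrum across the protein, transcript, and whole genome searches) can be illustrated with a minimal sketch. The abstract does not specify the scoring scheme, the tie-breaking rule, or any data format, so the `Match` record, the "higher score is better" convention, and the toy peptides below are all hypothetical and are not the authors' pipeline.

```python
from typing import Dict, List, NamedTuple, Set


class Match(NamedTuple):
    """One peptide-spectrum match (hypothetical record layout)."""
    spectrum_id: str
    peptide: str
    score: float  # assumption: a higher score means a better match
    search: str   # "protein", "transcript", or "genome"


def best_match_per_spectrum(matches: List[Match]) -> Dict[str, Match]:
    """Keep the highest-scoring match for each spectrum across all searches.

    On a tie, the match seen first is kept (an arbitrary choice for this sketch).
    """
    best: Dict[str, Match] = {}
    for m in matches:
        current = best.get(m.spectrum_id)
        if current is None or m.score > current.score:
            best[m.spectrum_id] = m
    return best


def genome_only_peptides(matches: List[Match], best: Dict[str, Match]) -> Set[str]:
    """Peptides that win a spectrum via the genome search and never
    appear in the protein or transcript search results."""
    seen_elsewhere = {m.peptide for m in matches if m.search != "genome"}
    return {
        m.peptide
        for m in best.values()
        if m.search == "genome" and m.peptide not in seen_elsewhere
    }


# Toy data, invented purely for illustration.
matches = [
    Match("s1", "PEPTIDEA", 50.0, "protein"),
    Match("s1", "PEPTIDEB", 42.0, "genome"),
    Match("s2", "PEPTIDEC", 61.0, "genome"),
    Match("s3", "PEPTIDEA", 55.0, "transcript"),
    Match("s3", "PEPTIDEA", 55.0, "protein"),
]

best = best_match_per_spectrum(matches)
genome_only = genome_only_peptides(matches, best)
```

In this toy input, spectrum `s2` is explained only by the genome search, so its peptide is the kind of candidate the abstract describes as found solely via the whole genome search. A real analysis would also apply a false discovery rate threshold, which this sketch omits.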

【 License 】

CC BY   
© Khatun et al.; licensee BioMed Central Ltd. 2013
