Technical Report Details
Reducing the Cost of Protein Identifications from Mass Spectrometry Databases
Logan, B. ; Kontothanassis, L. ; Goddeau, D. ; Moreno, P.J. ; Hookway, R. ; Sarracino, D.
HP Development Company
Keywords: mass spectrometry; machine learning; workflow management; noise filtering
RP-ID: HPL-2004-139
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

We present two techniques to improve the computational efficiency of protein discovery from mass spectrometry databases: noise filtering and hierarchical searching. Our approaches are orthogonal to existing algorithms and are based on the observation that typical mass spectrometry data contain a large amount of noise that can lead to wasteful computation. Our first improvement uses standard machine learning techniques with novel feature vectors derived from the mass spectra to identify and filter out the noisy spectra. We demonstrate that this approach yields computational gains of around 38% with less than 10% loss of peptides. Additionally, we present a hierarchical searching scheme in which most samples are matched against a small database at low computational cost, leaving only a small number of samples to be searched against larger databases. Combining this scheme with the machine learning filters leads to a further performance improvement of 3%.

Notes: Copyright IEEE. To be published in and presented at the IEEE Engineering in Medicine and Biology Society Conference (EMBS), 1-5 September 2004, San Francisco, CA. 6 pages.
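The two ideas in the abstract, filtering noisy spectra before searching and trying a small database before a large one, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the report: the two features in `toy_features`, the fixed threshold in `looks_noisy` (a stand-in for a trained classifier), the fragment-matching score in `best_match`, and the peptide names and values in the toy databases.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Peak = Tuple[float, float]          # (m/z, intensity)
Spectrum = Sequence[Peak]
Database = Dict[str, List[float]]   # peptide name -> expected fragment m/z values


def toy_features(spectrum: Spectrum) -> List[float]:
    """Hypothetical features: peak count and share of intensity in the top 3 peaks."""
    intensities = sorted((i for _, i in spectrum), reverse=True)
    total = sum(intensities)
    top_share = sum(intensities[:3]) / total if total > 0 else 0.0
    return [float(len(intensities)), top_share]


def looks_noisy(features: List[float]) -> bool:
    """Stand-in for a trained classifier: flag spectra with no dominant peaks."""
    _, top_share = features
    return top_share < 0.5


def best_match(spectrum: Spectrum, db: Database,
               tol: float = 0.5) -> Tuple[Optional[str], float]:
    """Toy scorer: fraction of a peptide's fragments seen in the spectrum."""
    mzs = [mz for mz, _ in spectrum]
    best_name, best_score = None, 0.0
    for name, fragments in db.items():
        hits = sum(1 for f in fragments if any(abs(f - mz) <= tol for mz in mzs))
        score = hits / len(fragments) if fragments else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def identify(spectra: Sequence[Spectrum], databases: Sequence[Database],
             is_noise: Callable[[List[float]], bool] = looks_noisy,
             min_score: float = 0.8) -> List[Optional[str]]:
    """Filter noisy spectra, then search databases in order (smallest first)."""
    results: List[Optional[str]] = []
    for spectrum in spectra:
        if is_noise(toy_features(spectrum)):
            results.append(None)    # filtered out: no search cost incurred
            continue
        found = None
        for db in databases:
            name, score = best_match(spectrum, db)
            if name is not None and score >= min_score:
                found = name        # confident match: skip the larger databases
                break
        results.append(found)
    return results


# Toy data (hypothetical values chosen only to exercise each path).
small_db: Database = {"PEPTIDE_A": [100.0, 200.0, 300.0]}
large_db: Database = {"PEPTIDE_A": [100.0, 200.0, 300.0],
                      "PEPTIDE_B": [150.0, 250.0, 350.0]}
clean_a: Spectrum = [(100.1, 50.0), (200.0, 40.0), (299.8, 30.0), (400.0, 1.0)]
clean_b: Spectrum = [(150.0, 60.0), (250.2, 30.0), (350.0, 20.0)]
noisy: Spectrum = [(float(mz), 1.0) for mz in range(100, 120)]

if __name__ == "__main__":
    print(identify([clean_a, clean_b, noisy], [small_db, large_db]))
```

In this sketch the first spectrum is resolved by the small database alone, the second falls through to the larger database, and the third is discarded by the filter before any search is run. The savings reported in the abstract come from the authors' own classifiers and search engine, which this toy example does not reproduce.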
