| BMC Bioinformatics | |
| A cross-validation scheme for machine learning algorithms in shotgun proteomics | |
| Research | |
| William Stafford Noble1  Viktor Granholm2  Lukas Käll3  | |
| [1] Department of Genome Sciences, University of Washington, Seattle, USA;Department of Computer Science and Engineering, University of Washington, Seattle, USA;Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden;Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, Solna, Sweden; | |
| Keywords: Machine Learning Algorithm; Shotgun Proteomics; Decoy Database; Estimate Error Rate; Incorrect Match; | |
| DOI : 10.1186/1471-2105-13-S16-S3 | |
| Source: Springer | |
【 Abstract 】
Peptides are routinely identified from mass spectrometry-based proteomics experiments by matching observed spectra to peptides derived from protein databases. The error rates of these identifications can be estimated by target-decoy analysis, which involves matching spectra to shuffled or reversed peptides. Besides estimating error rates, decoy searches can be used by semi-supervised machine learning algorithms to increase the number of confidently identified peptides. As for all machine learning algorithms, however, the results must be validated to avoid issues such as overfitting or biased learning, which would produce unreliable peptide identifications. Here, we discuss how the target-decoy method is employed in machine learning for shotgun proteomics, focusing on how the results can be validated by cross-validation, a frequently used validation scheme in machine learning. We also use simulated data to demonstrate the proposed cross-validation scheme's ability to detect overfitting.
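The target-decoy idea summarized above can be illustrated with a minimal sketch. It assumes separate target and decoy searches against databases of equal size, a score where higher means a better match, and the simple estimate (decoy matches above threshold) / (target matches above threshold). The function names `reverse_peptide` and `estimate_fdr` are hypothetical and are not taken from the article or from any particular software package.

```python
def reverse_peptide(sequence):
    """Build a decoy by reversing a target peptide sequence (one common choice)."""
    return sequence[::-1]


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate the false discovery rate among target matches scoring >= threshold.

    Assumption: decoy matches model incorrect target matches, so the number of
    decoys passing the threshold approximates the number of incorrect targets.
    """
    n_targets = sum(1 for s in target_scores if s >= threshold)
    n_decoys = sum(1 for s in decoy_scores if s >= threshold)
    if n_targets == 0:
        return 0.0  # nothing accepted, so there is nothing to be wrong about
    return min(1.0, n_decoys / n_targets)


# Toy scores, invented purely for illustration.
targets = [5.0, 4.0, 3.0, 2.0, 1.0]
decoys = [3.0, 1.0, 0.5, 0.2, 0.1]
fdr = estimate_fdr(targets, decoys, threshold=3.0)
```

In a semi-supervised setting such as the one the abstract describes, the same decoy matches would also serve as negative training examples, which is why an estimate like this is best computed on matches that were held out from training.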
【 License 】
Unknown
© Granholm et al.; licensee BioMed Central Ltd. 2012. This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.