Tandem mass spectrometry (MS/MS) involves the enzymatic digestion of proteins, usually with trypsin, followed by further fragmentation of the resulting peptides into ions. The mass spectrum, which relates peak intensity (or abundance) to the mass-to-charge ratio (m/z) of the ions, is then used to deduce the amino acid sequence of each ion and its corresponding peptide. Attempts to identify the factors that influence ion peak intensity have been hampered by the high-dimensional and multifactorial nature of MS/MS data. The objective of this study was to identify and characterize the variables associated with ion intensity and to validate the findings on separate data sets using ten-fold cross-validation. Ion intensity measurements from 6,548,340 ion fragments formed from 61,543 peptides corresponding to 7,761 proteins, obtained from the National Institute of Standards and Technology, were analyzed. The ion data set was divided into 10 subsets, and a 10-fold cross-validation analysis was undertaken. The explanatory variables in each of the 10 training data sets were identified and characterized within a linear fixed-effects model framework. A stepwise variable selection approach was used to identify the explanatory variables associated with ion intensity. Results from the stepwise analysis were used in a final mixed-effects model that included protein as a random effect to account for the covariation between ion fragments from the same protein. Several factors had a significant (p-value < 0.00005) association with ion intensity across all 10 data sets. The charge states of the precursor and of the resulting fragment ion were associated with ion intensity. The highest intensities were observed in low-charge peptides that produce low-charge ions. The numbers of basic amino acids in the peptide and in the resulting ion were associated with ion intensity. 
Peptides with no basic amino acids were associated with the highest intensities; however, ions with fewer basic amino acids had lower intensity than ions with one or more basic amino acids. The numbers of proline (P) residues in the peptide and in the resulting ion were also associated with ion intensity. Peptides and ions with fewer P residues were associated with the highest ion intensities. Several residues and groups of amino acids were consistently associated with ion intensity. Residue charge was the property most frequently found to have a highly significant association with intensity at various locations relative to the N or C terminus. Neutral residues proximal to the N terminus were associated with higher intensities. The results from this study expand the understanding of peptide fragmentation patterns and could be used to improve algorithms for peptide identification and simulation in MS/MS experiments.
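As an illustration of the validation design described above, the following minimal Python sketch partitions record indices into 10 folds and then keeps only the variables selected in every fold, mirroring the requirement that an association hold across all 10 data sets. The function names `ten_fold_splits` and `consistent_variables`, the random shuffling, and the example variable labels are assumptions made for illustration; the abstract does not state how the subsets were formed or how the selection was implemented.

```python
import random


def ten_fold_splits(n_records, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Assumption: records are shuffled at random before being dealt into
    folds; the study's actual partitioning scheme is not specified.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != held_out
            for idx in fold
        ]
        yield train_idx, test_idx


def consistent_variables(selected_per_fold):
    """Return the variables that were selected in every fold."""
    sets = [set(selected) for selected in selected_per_fold]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    for train_idx, test_idx in ten_fold_splits(25):
        print(len(train_idx), len(test_idx))
    # Hypothetical per-fold selections, used only to show the intersection.
    print(consistent_variables([
        ["precursor_charge", "n_proline"],
        ["precursor_charge"],
    ]))
```

In a full analysis, each training split would be passed to the stepwise model-fitting step and each held-out split would be used to check the selected variables.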
Multifactorial modeling of ion abundance in tandem mass spectrometry experiments