18 (1), 162-168

Quantifying Homologous Proteins and Proteoforms

Dmitry Malioutov et al. Mol Cell Proteomics.

Abstract

Many proteoforms, arising from alternative splicing, post-translational modifications (PTM), or paralogous genes, have distinct biological functions; histone PTM proteoforms are one example. However, their quantification by existing bottom-up mass-spectrometry (MS) methods is undermined by peptide-specific biases. To avoid these biases, we developed and implemented a first-principles model (HIquant) for quantifying proteoform stoichiometries. We characterized when MS data allow inferring proteoform stoichiometries by HIquant and derived an algorithm for optimal inference. We applied this algorithm to infer proteoform stoichiometries in two experimental systems that supported rigorous benchmarking: alkylated proteoforms spiked in at known ratios and endogenous histone 3 PTM proteoforms quantified relative to internal heavy standards. When compared with the benchmarks, the proteoform stoichiometries inferred by HIquant without using external standards had relative errors of 5-15% for simple proteoforms and 20-30% for complex proteoforms. A HIquant server is implemented at: https://web.northeastern.edu/slavov/2014HIquant/.

Keywords: Algorithms; Bioinformatics; Bioinformatics Software; Mass Spectrometry; Mathematical Modeling.

Figures

Fig. 1.
Model for inferring stoichiometries among proteoforms and paralogous proteins independently of peptide-specific biases. A, One shared (X2) and three unique (X1, X3, and X4) peptides of H3 proteoforms illustrate a very simple case of HIquant. HIquant models the peptide levels measured across conditions (x) as a superposition of the protein levels (p), scaled by unknown peptide-specific biases/nuisances (z). These coupled equations can be written in a matrix form whose solution infers the methylation stoichiometry independently of the nuisances (z). B, The general form of the model for K proteoforms (or homologous proteins) with M peptides quantified across N conditions can be formulated and solved. In many, albeit not all, cases an optimal and unique solution can be found, even in the absence of unique peptides; see supplemental Fig. S1 and Supplemental Information.
Fig. 2.
HIquant accurately quantifies ratios across alkylated proteoforms of a spiked-in standard. A, Schematic diagram of a validation experiment. We prepared a gold standard of proteoforms from the dynamic universal proteomics standard (UPS2) whose cysteines were covalently modified either with iodoacetamide or with vinylpyridine. Upon digestion, these modified UPS2 proteins generate many shared peptides (peptides not containing cysteine) and a few unique peptides (peptides containing cysteine). The modified UPS2 proteins were mixed with one another at known ratios (n), mixed with yeast lysate, digested, and quantified by MS. The proteoform ratios that HIquant inferred from the MS data (n̂) were compared with the mixing ratios. B, The ratios across the alkylated isoforms of UPS2 inferred by HIquant (n̂, y axis) accurately reflect the mixing ratios (n, x axis). C, The mixing and inferred ratios in panel B span two orders of magnitude, which is much larger than the dynamic range of the relative error. To zoom in on the relative errors, we plotted a distribution of log2(n/n̂) for 1,500 HIquant problems generated by sampling peptides with replacement from all UPS2 proteins. For HIquant, this distribution indicates small error, with median error below 11%. However, the ratios estimated just from the precursor intensities of the unique peptides for each proteoform show significantly higher relative error, most likely because of peptide-specific variability in digestion and ionization.
Fig. 3.
HIquant accurately infers stoichiometries and confidence intervals across PTM site occupancies of histone 3. A, Histone 3 peptides were quantified by SRM across 7 perturbations, and the fractional site occupancies for K4 methylation were estimated by two methods: estimates inferred by HIquant without using external standards are plotted against the corresponding estimates based on MasterMix external standards with known concentrations (29). Each marker shape corresponds to the PTM site(s) shown in the legend; methylation is denoted with “me” and acetylation with “ac” followed by the number of methyl/acetyl groups. B, The validation method from panel A was extended to another set of more complex fractional site occupancies on K9 methylation and K14 acetylation.
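
The model described in the Fig. 1 caption (peptide levels x as a superposition of protein levels p, scaled by unknown peptide-specific biases z) can be illustrated with a minimal sketch. This is not the HIquant algorithm itself, only a simplified instance of the same idea for two proteoforms with one unique peptide each and one shared peptide. It assumes that each bias z is constant across conditions and that the data are noise-free; the function name `infer_ratio`, the variable names, and the synthetic numbers are all hypothetical. Because the shared peptide satisfies x_shared = a * x_a + b * x_b with a = z_shared / z_a and b = z_shared / z_b, a least-squares fit across conditions recovers a and b, and the ratio (a * x_a) / (b * x_b) equals p_a / p_b without knowledge of any z.

```python
import numpy as np


def infer_ratio(x_a, x_b, x_shared):
    """Infer the proteoform ratio p_a / p_b in each condition.

    x_a, x_b: intensities of peptides unique to proteoform A and B.
    x_shared: intensities of a peptide shared by both proteoforms.
    All inputs are length-N arrays measured across N >= 2 conditions.
    Assumption: each peptide's bias z is constant across conditions.
    """
    # x_shared = a * x_a + b * x_b, with a = z_shared / z_a, b = z_shared / z_b
    design = np.column_stack([x_a, x_b])
    coef, *_ = np.linalg.lstsq(design, x_shared, rcond=None)
    a, b = coef
    # a * x_a = z_shared * p_a and b * x_b = z_shared * p_b, so z cancels
    return (a * x_a) / (b * x_b)


# Synthetic, noise-free example (hypothetical numbers)
p_a = np.array([1.0, 2.0, 4.0, 3.0])   # proteoform A level per condition
p_b = np.array([2.0, 1.0, 1.0, 6.0])   # proteoform B level per condition
z_a, z_b, z_shared = 0.3, 5.0, 1.7     # unknown peptide-specific biases

x_a = z_a * p_a
x_b = z_b * p_b
x_shared = z_shared * (p_a + p_b)

ratio = infer_ratio(x_a, x_b, x_shared)
print(ratio)
```

In this sketch the proteoform levels must vary differently across conditions; if p_a and p_b were proportional in every condition, the two columns of the design matrix would be collinear and the fit would not have a unique solution. This is consistent with the caption's remark that a unique solution exists in many, but not all, cases.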
