Nat Methods. 2016 Sep;13(9):777-83.
doi: 10.1038/nmeth.3954. Epub 2016 Aug 1.

TRIC: an automated alignment strategy for reproducible protein quantification in targeted proteomics


Hannes L Röst et al. Nat Methods. 2016 Sep.

Abstract

Next-generation mass spectrometric (MS) techniques such as SWATH-MS have substantially increased the throughput and reproducibility of proteomic analysis, but ensuring consistent quantification of thousands of peptide analytes across multiple liquid chromatography-tandem MS (LC-MS/MS) runs remains a challenging and laborious manual process. To produce highly consistent and quantitatively accurate proteomics data matrices in an automated fashion, we developed TRIC (http://proteomics.ethz.ch/tric/), a software tool that utilizes fragment-ion data to perform cross-run alignment, consistent peak-picking and quantification for high-throughput targeted proteomics. TRIC reduced the identification error compared to a state-of-the-art SWATH-MS analysis without alignment by more than threefold at constant recall while correcting for highly nonlinear chromatographic effects. On a pulsed-SILAC experiment performed on human induced pluripotent stem cells, TRIC was able to automatically align and quantify thousands of light and heavy isotopic peak groups. Thus, TRIC fills a gap in the pipeline for automated analysis of massively parallel targeted proteomics data sets.


Conflict of interest statement

The authors declare that they have no competing financial interests.

Figures

Figure 1. TRIC: Alignment algorithm for targeted proteomics data.
(a) In a targeted proteomics experiment, each run is typically analyzed individually, giving rise to multiple putative peak groups per run that may not be directly mappable due to chromatographic shifts. (b) The TRIC algorithm selects a set of high-confidence “anchor points” (peptides) for pairwise non-linear alignment and chromatographic distance estimation. (c) Based on the chromatographic distance, an optimal guidance tree (I) is computed (nodes are runs, edges are pairwise alignments). Next (II), the algorithm uses a starting point (1) to transfer identification confidence to nearby runs (iterations 2 and 3) using the guidance tree (III). In an optional last step (IV), runs without suitable peakgroups are revisited to perform noise re-quantification (all fragment-ion signal at the aligned position is integrated; orange circles). (d) The confidence transfer step uses a starting peakgroup (top run) to select a narrow region in a neighboring run (gray region in second run) from which a peak is selected. This procedure is repeated across all runs to identify the correct peak or to establish peak boundaries in runs without any analyte signal (bottom run). In a real application, the alignment order may not be linear but instead follows the guidance tree.
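The guidance-tree and confidence-transfer steps in panel (c) can be sketched in a few lines. This is a minimal illustration, not the TRIC implementation: it assumes the "optimal" tree is a minimum spanning tree over the pairwise chromatographic distances (the caption does not state the construction), and the function names `guidance_tree` and `transfer_order` as well as the distance values are hypothetical.

```python
from collections import deque

def guidance_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns the tree as a list of (run_i, run_j) edges. Using a minimum
    spanning tree here is an assumption made for illustration.
    """
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                if best is None or dist[i][j] < dist[best[0]][best[1]]:
                    best = (i, j)
        edges.append(best)
        in_tree.add(best[1])
    return edges

def transfer_order(edges, start):
    """Breadth-first walk over the tree from the starting run.

    Returns (source_run, target_run) pairs in the order in which
    identification confidence would be passed along the tree.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    order, seen, queue = [], {start}, deque([start])
    while queue:
        run = queue.popleft()
        for nxt in neighbors.get(run, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append((run, nxt))
                queue.append(nxt)
    return order

# Hypothetical pairwise chromatographic distances between four runs.
dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 2.0, 6.0],
    [4.0, 2.0, 0.0, 3.0],
    [5.0, 6.0, 3.0, 0.0],
]
tree = guidance_tree(dist)
steps = transfer_order(tree, start=2)
```

With these made-up distances the tree links each run to its chromatographically closest neighbor, so a peakgroup found in run 2 is propagated to runs 1 and 3 first and reaches run 0 only through run 1.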
Figure 2. Identification and alignment accuracy of TRIC on manually annotated data.
We used a set of over 7,000 manually validated peakgroups to validate the TRIC algorithm. (a) FDR-recall plot displaying recall versus false discovery rate, which allows the performance of TRIC to be compared with the naïve approach of applying a fixed q-value cutoff to each run individually. As misclassified peaks cannot be recovered even at high score cutoffs, a recall of 100% cannot be reached. (b) Error rates at a reported FDR cutoff of 1% for the naïve approach and for TRIC without RT alignment (None), with linear alignment (Linear) and with non-linear k-nearest neighbor alignment (LLD). (c) The errors of reported retention times are plotted without (top) and with (bottom) non-linear alignment on a sample run. (d) The cumulative fraction of peaks having less than a given error in retention time is plotted. TRIC with k-nearest neighbor smoothing (LLD) achieves high peak counts at low RT errors and outperforms linear or no alignment.
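The quantities on the two axes of panel (a) can be computed from manually annotated peakgroups roughly as follows. This is a sketch under stated assumptions: the function name `fdr_and_recall`, the score scale (higher is better) and the example data are hypothetical and are not taken from the paper.

```python
def fdr_and_recall(peaks, total_true, cutoff):
    """Return (FDR, recall) for all peakgroups scoring at or above cutoff.

    peaks      -- list of (score, is_correct) pairs; is_correct comes
                  from the manual annotation
    total_true -- number of annotated true peakgroups, including those
                  for which the wrong peak was picked
    """
    accepted = [is_correct for score, is_correct in peaks if score >= cutoff]
    true_pos = sum(accepted)
    false_pos = len(accepted) - true_pos
    fdr = false_pos / len(accepted) if accepted else 0.0
    recall = true_pos / total_true
    return fdr, recall

# Hypothetical data: four analytes are truly present, but for one of
# them the picked peak is wrong, so only three correct picks exist.
peaks = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.2, False)]
strict = fdr_and_recall(peaks, total_true=4, cutoff=0.5)
lenient = fdr_and_recall(peaks, total_true=4, cutoff=0.0)
```

In this toy example, lowering the cutoff from 0.5 to 0.0 raises the FDR but leaves recall at 0.75, which mirrors the caption's point that misclassified peaks cap the achievable recall below 100%.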
Figure 3. Analysis of a microbial dataset investigating S. pyogenes virulence.
A dataset of 12 runs of S. pyogenes exposed to human plasma was analyzed with TRIC. (a) The data matrix occupancy is higher after alignment with TRIC (fewer missing values are observed). (b) The computed guidance tree captures information orthogonal to injection order (root mean square deviation between runs is indicated for each edge). Control samples are in blue and plasma-exposed samples are in red (note that the tree is substantially different from injection order as samples were shot in three batches: R1, R2 and R3). (c) Number of precursors appearing in a specific number of runs before (left; N=95 685) and after (right; N=120 348) running TRIC; fully aligned precursors increased by 39% while precursors found in only a single run decreased by 93.7%. (d) The cumulative number of peptides quantified using a fixed 0.01 q-value cutoff without alignment (left) and after applying TRIC with a minimal q-value cutoff of 0.0015 (right). TRIC decreased the variance of the number of identifications across runs, and the cumulative number of peptides also saturates more quickly, indicating less accumulation of false positive identifications.
Figure 4. Pulsed-SILAC experiment performed on human iPSCs.
A human iPS cell line was exposed to a pulse of heavy amino acids and sampled at four time points in duplicate (see Panel b). (a) The RT difference between the light and heavy signal as a function of the intensity. Aligned values reported by TRIC (in red) have lower intensity and higher RT error (the distribution on top only displays values below 10^4 in intensity). (c) Standard deviation of the RT difference between heavy and light pairs with and without TRIC alignment. For the analysis without alignment, a simple FDR cutoff was applied (naïve approach). Alignment increases the number of quantified SILAC pairs at the cost of slightly higher variance. Pairs from both replicates are aggregated. No heavy-light pairs are expected at t=0 as heavy amino acids were added afterwards. (d) The number of isotopic SILAC pairs quantified per sample increases through the TRIC alignment, especially for the earlier time points with little heavy isotope signal. For each time point, average values across two replicates are shown with standard deviation.
Figure 5. Protein turnover rates in human iPSCs.
Targeted proteomics analysis of protein turnover in human iPSCs. (a) Relative isotopic abundance (RIA) is plotted for an example protein, Importin Alpha, with 5 peptides (dashed lines). The median of all decay curves fitted through 1.0 at time point zero for all peptides estimates the protein-level k_loss. (b) Global protein turnover rates are estimated after correction for protein dilution. (c) Proteins in the GO category “cell adhesion proteins” show significantly higher turnover than expected (p < 10^−7). All peptides of the respective proteins exhibit substantially higher degradation rates than the base distribution (shown on the very right). Only proteins with two or more peptides are shown (box indicates first and third quartile with median shown in black; whiskers extend to the most extreme data point that is no more than 1.5 times the length of the box away from the box).
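The k_loss estimate described in panel (a) can be illustrated with a small calculation. The sketch below assumes single-exponential decay, RIA(t) = exp(−k·t), which passes through 1.0 at time zero, fits k per peptide by least squares on ln(RIA), and takes the median of the per-peptide rates as the protein-level value. The exact fitting procedure of the paper is not given in the caption, so the model, the function names and the data are all assumptions.

```python
import math
from statistics import median

def fit_k(times, ria):
    """Least-squares slope of ln(RIA) versus t, forced through the origin.

    Forcing the line through the origin in log space corresponds to a
    decay curve with RIA = 1.0 at t = 0. Returns the rate constant k.
    """
    numerator = sum(t * math.log(r) for t, r in zip(times, ria))
    denominator = sum(t * t for t in times)
    return -numerator / denominator

def protein_k_loss(times, peptide_curves):
    """Median of the per-peptide rate constants (an assumed summary)."""
    return median(fit_k(times, curve) for curve in peptide_curves)

# Hypothetical time points (arbitrary units) and three synthetic
# peptides decaying at rates 0.1, 0.2 and 0.3.
times = [1.0, 4.0, 8.0]
curves = [[math.exp(-k * t) for t in times] for k in (0.1, 0.2, 0.3)]
k_loss = protein_k_loss(times, curves)
```

Taking the median rather than the mean keeps a single poorly quantified peptide from dominating the protein-level rate, which is one plausible reason for the choice described in the caption.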


