, 6 (28), 24797-822

Dissecting tRNA-derived Fragment Complexities Using Personalized Transcriptomes Reveals Novel Fragment Classes and Unexpected Dependencies



Aristeidis G Telonis et al. Oncotarget.


We analyzed transcriptomic data from 452 healthy men and women representing five different human populations and two races, and 311 breast cancer samples from The Cancer Genome Atlas. Our studies revealed numerous constitutive, distinct fragments with overlapping sequences and quantized lengths that persist across dozens of individuals and arise from the genomic loci of all nuclear and mitochondrial human transfer RNAs (tRNAs). Surprisingly, we discovered that the tRNA fragments' length, starting and ending points, and relative abundance depend on gender, population, race and also on amino acid identity, anticodon, genomic locus, tissue, disease, and disease subtype. Moreover, the length distribution of mitochondrially-encoded tRNAs differs from that of nuclearly-encoded tRNAs, and the specifics of these distributions depend on tissue. Notably, tRNA fragments from the same anticodon do not have correlated abundances. We also report on a novel category of tRNA fragments that significantly contribute to the differences we observe across tissues, genders, populations, and races: these fragments, referred to as i-tRFs, are abundant in human tissues, wholly internal to the respective mature tRNA, and can straddle the anticodon. HITS-CLIP data analysis revealed that tRNA fragments are loaded on Argonaute in a cell-dependent manner, suggesting cell-dependent functional roles through the RNA interference pathway. We experimentally validated two i-tRF molecules: the first was found in 21 of 22 tested breast tumor and adjacent normal samples and was differentially abundant between health and disease, whereas the second was found in all eight tested breast cancer cell lines.

Keywords: Argonaute; human genome; mitochondrial tRNA; nuclear tRNA; tRNA fragment.

Conflict of interest statement


The authors declare they have no known conflicts of interest in this work.


Figure 1. Atypical tRNA fragment lengths
Fragment lengths in the 452 individuals of the LCL dataset (A-D) and the 311 breast samples (E-H). A and E: the length distribution for “internal” fragments only. B and F: the length distribution for 5′-tRFs only. C and G: the length distribution for 3′-tRFs. D and H: the length distribution for all fragments combined. See also text for a detailed explanation of these three shown regions. Error bars are present but barely visible in this Figure and capture standard error across the 452 individuals (A-D) and across the 311 breast samples (E-H). Note the rightmost label of the X-axis in E-H: we opted for this label in order to indicate that the observed 30-mers are likely truncated versions of longer-length fragments.
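As a rough illustration of how length distributions with standard-error bars such as those in Figure 1 could be computed, the Python sketch below converts per-sample read counts into relative abundances per fragment length and then takes the mean and standard error across samples. The function name, the input layout, and the toy counts are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
import math
from collections import defaultdict


def length_distribution(sample_counts):
    """Return {length: (mean, standard_error)} of relative abundance.

    sample_counts: list of dicts, one per sample, mapping
    fragment length -> read count (hypothetical input layout).
    Requires at least two samples, each with a non-zero total.
    """
    lengths = sorted({length for counts in sample_counts for length in counts})
    per_length = defaultdict(list)
    for counts in sample_counts:
        total = sum(counts.values())
        for length in lengths:
            # Relative abundance of this length within the sample.
            per_length[length].append(counts.get(length, 0) / total)

    n = len(sample_counts)
    summary = {}
    for length in lengths:
        values = per_length[length]
        mean = sum(values) / n
        # Sample variance (n - 1 denominator); standard error = sd / sqrt(n).
        variance = sum((v - mean) ** 2 for v in values) / (n - 1)
        summary[length] = (mean, math.sqrt(variance / n))
    return summary


# Hypothetical toy data: three samples, read counts per fragment length.
toy = [
    {18: 50, 22: 30, 30: 20},
    {18: 40, 22: 40, 30: 20},
    {18: 60, 22: 20, 30: 20},
]
dist = length_distribution(toy)
```

Normalizing within each sample before averaging keeps deeply sequenced samples from dominating the distribution.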
Figure 2. Distribution of starting position and lengths for i-tRFs
3D graphs showing the starting positions of the internal tRNA fragments, their span and lengths in the LCL (A) and BRCA (B) datasets. The positions are numbered with reference to the +1 position of the mature tRNA. The representative positions for the D- and T-loops as well as for the anticodon loop are highlighted with green boxes. The coloring of each bar is proportional to the relative abundance of each length of the fragments starting at that specific position as indicated by the respective color-key below each graph. The thickness of the projections on the right wall of the graph is proportional to the number of fragments spanning the specific position. For the LCL dataset, only the top 50% most expressed internal fragments are shown.
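The three fragment categories used throughout these captions (5′-tRFs, 3′-tRFs, and i-tRFs) can be told apart from a fragment's starting position and length alone. The sketch below is one simplified way to do so, assuming positions are numbered from the +1 position of the mature tRNA as in Figure 2, and assuming `mature_length` is the length of the mature tRNA including any 3′ addition. The function name and the exact boundary rules are illustrative assumptions, not the authors' definitions.

```python
def classify_fragment(start, length, mature_length):
    """Classify a fragment by where it sits on the mature tRNA.

    start: 1-based starting position relative to the mature tRNA's +1.
    length: fragment length in nucleotides.
    mature_length: assumed length of the mature tRNA (hypothetical parameter).
    """
    end = start + length - 1
    if start < 1 or length < 1 or end > mature_length:
        raise ValueError("fragment does not lie within the mature tRNA")
    if start == 1:
        return "5'-tRF"   # begins at the 5' end of the mature tRNA
    if end == mature_length:
        return "3'-tRF"   # ends at the 3' end of the mature tRNA
    return "i-tRF"        # wholly internal: touches neither end


# Hypothetical examples on a 75-nt mature tRNA.
labels = [
    classify_fragment(1, 20, 75),
    classify_fragment(56, 20, 75),
    classify_fragment(30, 20, 75),
]
```

Under these simplified rules, a fragment spanning the whole mature tRNA would be labelled a 5′-tRF; a real pipeline would need an explicit rule for that case.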
Figure 3. Uncorrelated abundances
Heatmap of the Pearson correlation coefficient for statistically significant fragments. A: Case of tRNA fragments that arise from the nuclear AspGTC (trna10 on chromosome 12) anticodon in the LCL dataset. B: Case of tRNA fragments, mostly i-tRFs, which arise from the mitochondrial GluTTC anticodon in the BRCA dataset. Several mini-clusters are evident in each heatmap: however, there is no correlation across the mini-clusters of the same tRNA (see text for a detailed explanation). Orange-colored labels mark the i-tRFs.
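The values behind a heatmap like the one in Figure 3 are pairwise Pearson correlation coefficients between the abundance profiles of fragments across samples. The following Python sketch computes such a matrix with the standard library only. The fragment names and abundance values are invented for illustration, and the function assumes every profile has non-zero variance.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(abundances):
    """Map fragment name -> {fragment name -> Pearson r} for all pairs."""
    names = list(abundances)
    return {
        a: {b: pearson(abundances[a], abundances[b]) for b in names}
        for a in names
    }


# Hypothetical abundances of four fragments across four samples.
toy = {
    "frag_A": [1.0, 2.0, 3.0, 4.0],
    "frag_B": [2.0, 4.0, 6.0, 8.0],  # rises with frag_A
    "frag_C": [4.0, 3.0, 2.0, 1.0],  # falls as frag_A rises
    "frag_D": [3.0, 1.0, 1.0, 3.0],  # unrelated to frag_A
}
matrix = correlation_matrix(toy)
```

In this toy example, `frag_A` and `frag_B` would fall in the same mini-cluster, while `frag_D` is uncorrelated with `frag_A` even though both could, in principle, come from the same tRNA.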
Figure 4. Fragment lengths in the breast datasets
Atypical tRF lengths in normal and tumor breast datasets. A: length distributions for the i-tRFs. B: length distributions for 5′-tRFs only. C: length distribution for the 3′-tRFs. D: length distribution for all the fragments combined. Green curve: normal dataset fragments. Red curve: tumor dataset fragments. For the 19-mer and 30-mer 5′-tRFs as well as for the 20-mer i-tRFs the statistical significance (Mann-Whitney U-test; p-val < 10^-3) between the normal and the tumor datasets is indicated. Error bars capture standard error across the analyzed groups of datasets.
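For readers unfamiliar with the Mann-Whitney U-test used in these comparisons, the sketch below shows one way to compute the U statistic and a two-sided p-value in plain Python. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption of this sketch; the caption does not say which variant the authors used, and the abundance values are invented.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test via the normal approximation.

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the values, remembering the group (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        avg_rank = (i + 1 + j) / 2     # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j

    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


# Hypothetical relative abundances of one fragment length in two groups.
normal = [0.10, 0.12, 0.11, 0.13, 0.09]
tumor = [0.20, 0.22, 0.19, 0.21, 0.23]
u_stat, p_value = mann_whitney_u(normal, tumor)
```

Because the test works on ranks rather than raw values, it makes no assumption that abundances are normally distributed, which suits skewed sequencing data. For small groups, an exact test would be preferable to the approximation used here.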
Figure 5. Dependence on tissue and tissue-state
Looking at tRFs that are present in two tissues, we find that they have tissue- and tissue-state-specific abundances. A: PCA (unsupervised) of the abundance levels of the 200 tRNA fragments that are common to female LCL datasets and to the normal breast datasets can distinguish between the two tissues. B: PLS-DA (supervised) of the abundance levels of the 437 tRFs found in the BRCA dataset can distinguish between the two groups. See also text.
Figure 6. Dependence on race
Race-dependent abundance profiles for statistically significant tRNA fragments. A: Principal components analysis of fragment expression in LCLs. The CEU population (white) is represented by the yellow points whereas the YRI population (black) is represented by the magenta points. Both men and women from the two populations were included in this analysis. The number next to the label of each axis indicates the amount of variance that the corresponding principal component explains. B: Partial Least Squares – Discriminant Analysis on the tRNA fragments in the 78 triple-negative-breast-cancer datasets. The yellow points represent white patients whereas the magenta points represent black patients. See also text for details. C: Relative abundances of 36-mer i-tRFs for the FIN and YRI populations (left panel) and 18-mer 3′-tRFs (middle panel) and 33-mer 3′-tRFs (right panel) for the CEU and YRI datasets. The differences for all three comparisons are statistically significant as indicated by the respective p-value on each graph (Mann-Whitney U-test). Error bars capture the standard error of the relative abundance of each type of fragments for n = 93 (CEU) and n = 95 (YRI) datasets. D: Map of the nucleotides of differentially expressed fragments between the YRI and the CEU populations as projected on the respective mature tRNA. Each base is colored based on the number of distinct fragments containing it. As reference for the LysCTT, the trna10 of this anticodon on chromosome 16 was used. The full list of significantly differentiated fragments between the two populations is included in Supplementary Table S8.
Figure 7. Dependence on gender
Differences in the abundance of tRNA fragments between men and women. A: Detail from the length distributions for YRI men and women for internal fragments. B: Detail from the length distributions for TSI men and women for CCA-ending fragments. The difference in abundance is statistically significant in both comparisons (Mann-Whitney U-test). Error bars in (A) and (B) capture standard error across the analyzed groups of datasets. C: PLS-DA graph of TSI men and TSI women showing a trend for gender-specific tRNA profiles. The important fragments for the projection (VIP score > 1.5) are provided in Supplementary Table S9.
Figure 8. Dependence on disease state
Differences in the tRNA profiles between normal and disease states (in white individuals only). A: PLS-DA graph for the discrimination of normal and triple positive datasets. B: PLS-DA graph for the discrimination of normal and triple negative datasets. C: PLS-DA can also discriminate between the two subtypes. D: The fragments that are important for each separation were used to identify disease subtype-specific abundance changes. The number of fragments with higher or lower abundance is indicated next to each arrow; the number of i-tRFs in each case is shown parenthesized. Each arrow represents a comparison between two groups: the start of the arrow indicates the “control” group compared to which the fragments in the “target” group (end of arrow) have altered abundance. A detailed list of the fragments is given in Supplementary Table S10.
Figure 9. Ago-loading dependence on cell sub-type
Cell-line-specific Ago-loaded tRF profiles. Unsupervised PCA (A) and Hierarchical Clustering (B) discriminated the replicates of two model cell lines for triple negative (MDA-MB-231) and triple positive (BT-474) breast tumors. Kendall's tau coefficient was used as distance metric for the dendrogram in (B).
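To make the role of Kendall's tau as a distance metric more concrete, the sketch below computes tau for pairs of abundance profiles and converts it into a distance of the form 1 - tau, the kind of value a dendrogram like the one in panel B could be built from. Both the 1 - tau transformation and the use of the tau-a variant (which ignores tie corrections) are assumptions of this sketch, and the replicate profiles are invented.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


def tau_distance(x, y):
    """Distance in [0, 2]: 0 for identical rankings, 2 for reversed ones."""
    return 1 - kendall_tau(x, y)


# Hypothetical Ago-loaded abundances of five fragments per replicate.
profiles = {
    "MDA-MB-231_rep1": [5, 3, 1, 4, 2],
    "MDA-MB-231_rep2": [5, 2, 1, 4, 3],
    "BT-474_rep1": [1, 4, 5, 2, 3],
    "BT-474_rep2": [2, 4, 5, 1, 3],
}
distances = {
    (a, b): tau_distance(profiles[a], profiles[b])
    for a, b in combinations(profiles, 2)
}
# The first merge of an agglomerative clustering joins the closest pair.
closest_pair = min(distances, key=distances.get)
```

A rank-based distance compares the ordering of fragment abundances rather than their magnitudes, so replicates that differ mainly in sequencing depth still land close together.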
Figure 10. Internal fragments in breast samples and breast cell lines
Experimental validation of two internal fragments. A: Quantification of the i-tRF from the nuclear AspGTC anticodon in 11 breast tumor and 11 adjacent normal breast samples. N.D.: not determined; in this case, the fragment's expression was too low to be detected. Stars indicate statistically significant changes in abundance (p-val < 0.01; Student's t-test) between the tumor and adjacent normal tissue of the same subject. In all cases there were n = 3 repetitions of the experiments. Error bars show the standard deviation. B: Quantification of the i-tRF from the nuclear GlyTCC anticodon in eight different normal and breast cancer cell lines. Column height represents the average expression value and error bars the standard deviation of at least 10 independent measurements in each sample. On the right-hand side of (A) and (B), the examined fragment is highlighted in red. The anticodon triplet is highlighted by the black box. The genomic coordinates of the depicted AspGTC tRNA are from 125424264 to 125424193, inclusive, on chromosome 12, while those of the depicted GlyTCC tRNA are from 8124866 to 8124937, inclusive, on chromosome 17. ER: Estrogen Receptor status, PR: Progesterone Receptor status, HER2: Human Epidermal Growth Factor Receptor 2 status.



