J Proteome Res. 2015 Apr 3;14(4):1880-7.
doi: 10.1021/pr501286b. Epub 2015 Mar 11.

Most highly expressed protein-coding genes have a single dominant isoform

Iakes Ezkurdia et al. J Proteome Res.

Abstract

Although eukaryotic cells express a wide range of alternatively spliced transcripts, it is not clear whether genes tend to express a range of transcripts simultaneously across cells, or produce dominant isoforms in a manner that is either tissue-specific or regardless of tissue. To date, large-scale investigations into the pattern of transcript expression across distinct tissues have produced contradictory results. Here, we attempt to determine whether genes express a dominant splice variant at the protein level. We interrogate peptides from eight large-scale human proteomics experiments and databases and find that there is a single dominant protein isoform, irrespective of tissue or cell type, for the vast majority of the protein-coding genes in these experiments, in partial agreement with the conclusions from the most recent large-scale RNAseq study. Remarkably, the dominant isoforms from the experimental proteomics analyses coincided overwhelmingly with the reference isoforms selected by two completely orthogonal sources, the consensus coding sequence variants, which are agreed upon by separate manual genome curation teams, and the principal isoforms from the APPRIS database, predicted automatically from the conservation of protein sequence, structure, and function.

Keywords: Alternative splicing; Dominant isoforms; Large-scale proteomics; Protein function; Protein structure; RNAseq.

Conflict of interest statement

Notes

The authors declare no competing financial interest.

Figures

Figure 1
The main proteomics, CCDS, and APPRIS isoforms differ because of the gene model. (A) The 3′ exons from two KIAA1468 transcripts. The arrows highlight the differences between the two transcripts (KIAA1468-001 and KIAA1468-002): an inserted exon (exon 18) in KIAA1468-002 and a pair of mutually exclusively spliced exons (exons 21a and 21b). We find peptides for both mutually exclusive exons but not for exon 18. Both CCDS and APPRIS select KIAA1468-001 because it does not have exon 18, but there are more peptides for mutually exclusive exon 21b from KIAA1468-002 than for exon 21a. (B) The orthologues found by the APPRIS database that align without gaps to the sequence of KIAA1468-001. (C) The structure of a protein similar to that encoded by KIAA1468, 1B3U. The region coded by the mutually exclusive homologous exons is shown in orange; the region where exon 18 from KIAA1468-002 would produce an insertion is shown in purple.
Figure 2
The main proteomics isoform, the longest isoform, and the 5-fold dominant variants for CRIP2 and PSMD13. (A) Transcripts from the GENCODE gene model of CRIP2. The transcripts selected by the proteomics experiment (CRIP2-001 in orange), the RNAseq experiment (CRIP2-008 in blue), and the longest variant (CRIP2-002 in green) are compared, and the number of peptides detected for each isoform is shown on the right. The exons in yellow are those for which a 3D structure has been solved. (B) The orthologues found by the APPRIS database that align without gaps to the sequence of CRIP2-001. (C) The structure of the first domain of CRIP2, 2CU8, highly similar to domain 2 of CRIP2. (D) Transcripts from the GENCODE gene model of PSMD13. The transcripts selected by the proteomics experiment (PSMD13-001 in orange), the RNAseq experiment (PSMD13-002 in blue), and the longest variant (PSMD13-003 in green) are compared, and the number of peptides detected for each isoform is shown on the right. The nonfilled exons are not translated. (E) Model organism orthologues that align without gaps to the sequence of PSMD13-001 only. (F) The structure of a protein similar to PSMD13-001, 4CR4. The residues that would be coded by the nonsense-mediated decay variant PSMD13-002, if translated, are shown in blue.