The clinical importance of tandem exon duplication-derived substitutions

Nucleic Acids Res. 2021 Aug 20;49(14):8232-8246. doi: 10.1093/nar/gkab623.


Most coding genes in the human genome are annotated with multiple alternative transcripts. However, clear evidence for the functional relevance of the protein isoforms produced by these alternative transcripts is often hard to find. Alternative isoforms generated from tandem exon duplication-derived substitutions are an exception. These splice events are rare but have important functional consequences. Here, we have catalogued the 236 tandem exon duplication-derived substitutions annotated in the GENCODE human reference set. We find that more than 90% of the events have a last common ancestor in teleost fish, and so are at least 425 million years old, and that twenty-one can be traced back to the Bilateria clade. Alternative isoforms generated from tandem exon duplication-derived substitutions also have significantly more clinical impact than other alternative isoforms: they have more than 25 times as many pathogenic and likely pathogenic mutations as other alternative events. Tandem exon duplication-derived substitutions appear to have vital functional roles in the cell and may have played a prominent part in metazoan evolution.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alternative Splicing / genetics
  • Animals
  • Evolution, Molecular*
  • Exons / genetics
  • Fishes / genetics*
  • Gene Duplication / genetics
  • Genome, Human / genetics*
  • Humans
  • Molecular Sequence Annotation
  • Protein Isoforms / genetics*
  • Sequence Alignment


Substances

  • Protein Isoforms