Comparative Annotation Toolkit (CAT): simultaneous clade and personal genome annotation

Genome Res. 2018 Jul;28(7):1029-1038. doi: 10.1101/gr.233460.117. Epub 2018 Jun 8.


The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Genome, Human / genetics*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Molecular Sequence Annotation / methods
  • RNA / genetics
  • Rats


Substances

  • RNA