APPRIS 2017: principal isoforms for multiple gene sets

Nucleic Acids Res. 2018 Jan 4;46(D1):D213-D217. doi: 10.1093/nar/gkx997.


The APPRIS database uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the 'principal' isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein-coding genes, and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species, and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Amino Acid Sequence
  • Animals
  • Databases, Genetic*
  • Humans
  • Models, Molecular
  • Molecular Sequence Annotation
  • Protein Conformation
  • Protein Isoforms / chemistry
  • Protein Isoforms / genetics*
  • Proteome / genetics
  • Reproducibility of Results
  • Sequence Alignment


Substances

  • Protein Isoforms
  • Proteome