SUPERFAMILY 1.75 including a domain-centric gene ontology method

Nucleic Acids Res. 2011 Jan;39(Database issue):D427-34. doi: 10.1093/nar/gkq1130. Epub 2010 Nov 9.

Abstract

The SUPERFAMILY resource provides protein domain assignments at the Structural Classification of Proteins (SCOP) superfamily level for over 1400 completely sequenced genomes, over 120 metagenomes and other gene collections such as UniProt. All models and assignments are available to browse and download at http://supfam.org. A new hidden Markov model library based on SCOP 1.75 has been created, and a previously ignored class of SCOP, coiled coils, is now included. Our scoring component now uses HMMER3, which is orders of magnitude faster and produces superior results. A cloud-based pipeline was implemented and is publicly available on the Amazon Web Services Elastic Compute Cloud. The SUPERFAMILY reference tree of life has been improved, allowing the user to highlight a chosen superfamily, family or domain architecture on the tree. The most significant advance in SUPERFAMILY is that it now contains a domain-based gene ontology (GO) at the superfamily and family levels. A new methodology was developed to ensure high-quality GO annotation. This methodology is general purpose and has also been used to produce domain-based phenotypic ontologies in addition to GO.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein*
  • Genes
  • Phenotype
  • Phylogeny
  • Protein Structure, Tertiary*
  • Proteins / chemistry
  • Proteins / classification*
  • Proteins / genetics
  • Sequence Analysis, Protein
  • Software

Substances

  • Proteins