Gene3D: merging structure and function for a Thousand genomes

Nucleic Acids Res. 2010 Jan;38(Database issue):D296-300. doi: 10.1093/nar/gkp987. Epub 2009 Nov 11.

Abstract

Over the last 2 years the Gene3D resource has been significantly improved: it is now more accurate and offers a much richer interactive display via the Gene3D website (http://gene3d.biochem.ucl.ac.uk/). Gene3D provides accurate structural domain family assignments for over 1100 genomes and nearly 10,000,000 proteins. A hidden Markov model library, constructed from the manually curated CATH structural domain hierarchy, is used to search UniProt, RefSeq and Ensembl protein sequences. The resulting matches are refined into simple multi-domain architectures using a recently developed in-house algorithm, DomainFinder 3 (available at: ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/DomainFinder3/). The domain assignments are integrated with multiple external protein function descriptions (e.g. Gene Ontology and KEGG), structural annotations (e.g. coiled coils, disordered regions and sequence polymorphisms) and family resources (e.g. Pfam and eggNOG) and displayed on the Gene3D website. The website allows users to view descriptions both for single proteins and genes and for large protein sets, such as superfamilies or genomes. Subsets can then be selected for detailed investigation, or associated functions and interactions can be used to expand explorations to new proteins. Gene3D also provides a set of services, including an interactive genome coverage graph visualizer, DAS annotation resources, sequence search facilities and SOAP services.
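The abstract does not describe how DomainFinder 3 refines raw HMM matches into a multi-domain architecture. As a rough illustration of the general problem only, the sketch below uses a generic technique (weighted interval scheduling) to pick the highest-scoring set of non-overlapping matches on one protein. It is not the DomainFinder 3 algorithm; the `Hit` record, the `best_architecture` function, the family labels and the scores are all hypothetical, and real resolvers may tolerate partial overlaps or discontinuous domains.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One HMM match on a protein (hypothetical record layout)."""
    family: str   # illustrative family label, not a real CATH identifier
    start: int    # 1-based inclusive start residue
    end: int      # 1-based inclusive end residue
    score: float  # higher is better, e.g. a bit score


def best_architecture(hits: List[Hit]) -> List[Hit]:
    """Return the non-overlapping subset of hits with the highest total score.

    Generic weighted interval scheduling; an assumption for illustration,
    not the method used by DomainFinder 3.
    """
    ordered = sorted(hits, key=lambda h: h.end)
    ends = [h.end for h in ordered]
    # best[i] holds (total score, chosen hits) using only the first i hits.
    best = [(0.0, [])]
    for i, hit in enumerate(ordered):
        # Count the earlier hits that end strictly before this hit starts.
        j = bisect_right(ends, hit.start - 1, 0, i)
        take_score = best[j][0] + hit.score
        if take_score > best[i][0]:
            best.append((take_score, best[j][1] + [hit]))
        else:
            best.append(best[i])
    return best[-1][1]


if __name__ == "__main__":
    example = [
        Hit("fam_A", 1, 100, 50.0),
        Hit("fam_B", 90, 200, 60.0),   # overlaps fam_A and fam_C
        Hit("fam_C", 101, 180, 40.0),
        Hit("fam_D", 201, 250, 10.0),
    ]
    for h in best_architecture(example):
        print(h.family, h.start, h.end)
```

In this toy input the single strongest match (fam_B) is dropped because the two matches it overlaps score more in total, which is the kind of trade-off any architecture resolver has to make.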

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology / methods*
  • Computational Biology / trends
  • Databases, Genetic*
  • Databases, Nucleic Acid*
  • Databases, Protein
  • Genome, Archaeal
  • Genome, Bacterial
  • Genome, Viral
  • Humans
  • Information Storage and Retrieval / methods
  • Internet
  • Markov Chains
  • Protein Structure, Tertiary
  • Sequence Analysis, DNA
  • Software