Survey of current protein family databases and their application in comparative, structural and functional genomics

J Chromatogr B Analyt Technol Biomed Life Sci. 2005 Feb 5;815(1-2):97-107. doi: 10.1016/j.jchromb.2004.11.010.

Abstract

The last two decades have witnessed significant expansions in the databases storing information on the sequences and structures of proteins. This has led to the creation of many excellent protein family resources, which classify proteins according to their evolutionary relationships. These resources have yielded extensive insights into evolution, particularly into how protein function evolves over time. Such analyses have greatly assisted the transfer of functional annotations from experimentally characterised genes to uncharacterised ones. Moreover, the development of bioinformatics tools complements the new technologies emerging in biology, such as transcriptomics and proteomics. These technologies enable researchers to analyse gene expression profiles and interactions on a genome-wide scale, generating vast protein datasets, many of which include experimentally uncharacterised proteins. Protein family/function databases can be used to help interpret these data and allow us to benefit more fully from these technologies. This review aims to summarise the most popular sequence- and structure-based protein family databases. We also cover their application to comparative genomics and the functional annotation of genomes.

Publication types

  • Review

MeSH terms

  • Biological Evolution
  • Databases, Protein*
  • Genomics / methods*
  • Protein Structure, Tertiary*
  • Proteins* / classification
  • Proteins* / genetics

Substances

  • Proteins