Automated protein function prediction--the genomic challenge

Brief Bioinform. 2006 Sep;7(3):225-42. doi: 10.1093/bib/bbl004. Epub 2006 May 23.


Overwhelmed with genomic data, biologists are facing the first big post-genomic question: what do all the genes do? First, not only is the volume of pure sequence and structure data growing, but its diversity is growing as well, leading to a disproportionate growth in the number of uncharacterized gene products. Consequently, established methods of gene and protein annotation, such as homology-based transfer, annotate a shrinking fraction of the data and in many cases amplify existing erroneous annotations. Second, functional annotation needs to be standardized and machine readable so that function prediction programs can be incorporated into larger workflows. This is difficult because the definition of protein function is subjective and context dependent. Third, there is a need to assess the quality of function predictors. Again, the subjectivity of the term 'function' and the many aspects of biological function make this a challenging effort. This article briefly outlines the history of automated protein function prediction and surveys the latest innovations in all three areas.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology* / methods
  • Databases, Protein*
  • Electronic Data Processing* / methods
  • Genomics* / methods
  • Humans
  • Pattern Recognition, Automated* / methods
  • Proteins / genetics
  • Proteins / metabolism*


Substances

  • Proteins