Intrinsic errors in genome annotation

Trends Genet. 2001 Aug;17(8):429-31. doi: 10.1016/s0168-9525(01)02348-4.


Genome sequencing is usually followed by routine annotation of protein function based on the assumption that similar sequences will have similar functions. Here, we introduce a simple calculation to estimate the magnitude of any possible annotation errors. We counted the number of discrepancies in the annotation of well-established sets of similar proteins and extrapolated these values to the pairs of similar sequences used for the annotation of different microbial genomes. We conclude that the number of potential errors in the prediction of detailed functions is higher than is usually believed.
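
The extrapolation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The abstract does not give the details of the calculation, so the grouping of protein pairs into sequence-similarity bins, the bin labels, the function names, and every number below are hypothetical placeholders chosen only to show the arithmetic. The idea is to measure a discrepancy rate per bin on a reference set of well-annotated similar proteins, then weight those rates by how many annotation pairs in a genome fall into each bin.

```python
from typing import Dict, Tuple


def discrepancy_rates(reference: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Per-bin fraction of reference pairs whose annotations disagree.

    `reference` maps a (hypothetical) similarity bin label to a tuple of
    (discrepant pairs, total pairs) counted in a well-annotated set.
    """
    rates = {}
    for bin_label, (n_discrepant, n_pairs) in reference.items():
        if n_pairs <= 0:
            raise ValueError(f"bin {bin_label!r} has no reference pairs")
        rates[bin_label] = n_discrepant / n_pairs
    return rates


def expected_errors(rates: Dict[str, float], genome_pairs: Dict[str, int]) -> float:
    """Extrapolate the reference rates to the pairs used to annotate a genome."""
    return sum(rates[bin_label] * n for bin_label, n in genome_pairs.items())


# Placeholder counts, NOT taken from the paper.
reference_counts = {"low": (50, 100), "medium": (25, 100), "high": (5, 40)}
genome_pair_counts = {"low": 200, "medium": 300, "high": 500}

rates = discrepancy_rates(reference_counts)
total_expected = expected_errors(rates, genome_pair_counts)
error_fraction = total_expected / sum(genome_pair_counts.values())
print(f"expected discrepancies: {total_expected:.1f} ({error_fraction:.1%} of pairs)")
```

Under this assumed scheme, a genome annotated mostly from low-similarity matches inherits a higher expected error count than one annotated mostly from close homologs, even when both use the same reference rates.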

MeSH terms

  • Binding Sites
  • Databases, Factual*
  • Genome*
  • Haemophilus influenzae / genetics
  • Methanococcus / genetics
  • Mycoplasma / genetics
  • Reproducibility of Results
  • Software