Computational prediction of eukaryotic protein-coding genes

Nat Rev Genet. 2002 Sep;3(9):698-709. doi: 10.1038/nrg890.

Abstract

The human genome sequence is the book of our life. Buried in this large volume are our genes, which are scattered as small DNA fragments throughout the genome and comprise a small percentage of the total text. Finding these indistinct 'needles' in a vast genomic 'haystack' can be extremely challenging. In response to this challenge, computational approaches that predict the location and structure of genes have proliferated in recent years. Here, I discuss these approaches and explain why they have become essential for the analysis of newly sequenced genomes.

Publication types

  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Exons / genetics
  • Gene Expression
  • Genes*
  • Genetic Code
  • Mammals
  • Markov Chains
  • Models, Genetic
  • Proteins / genetics*
  • Sequence Homology, Amino Acid
  • Software

Substances

  • Proteins