Characteristics of the tomato nuclear genome as determined by sequencing undermethylated EcoRI digested fragments

Theor Appl Genet. 2005 Dec;112(1):72-84. doi: 10.1007/s00122-005-0107-z. Epub 2005 Oct 6.


A collection of 9,990 single-pass nuclear genomic sequences, corresponding to 5 Mb of tomato DNA, was obtained using a methylation filtration (MF) strategy and reduced to 7,053 unique undermethylated genomic islands (UGIs) distributed as follows: (1) 59% non-coding sequences, (2) 28% coding sequences, (3) 12% transposons (96% of which are class I retroelements), and (4) 1% organellar sequences integrated into the nuclear genome over the past approximately 100 million years. A more detailed analysis of coding UGIs indicates that the unmethylated portion of tomato genes extends as far as 676 bp upstream and 766 bp downstream of coding regions, with averages of 174 and 171 bp, respectively. Based on the analysis of the UGI copy distribution, the undermethylated portion of the tomato genome is determined to account for the majority of the unmethylated genes in the genome and is estimated to constitute 61 ± 15 Mb of DNA (approximately 5% of the entire genome), which is significantly less than the 220 Mb estimated for the gene-rich euchromatic arms of the tomato genome. This result indicates that, while most genes reside in the euchromatin, a significant portion of euchromatin is methylated in the intergenic spacer regions. Implications of the results for sequencing the genome of tomato and other solanaceous species are discussed.
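The figures quoted in the abstract can be cross-checked with some back-of-envelope arithmetic. The sketch below is illustrative only and is not the authors' analysis: it uses only numbers stated in the abstract, assumes 1 Mb = 10^6 bp, and the names (`summarize`, `category_pct`, and so on) are hypothetical. The category counts are approximations obtained by applying the rounded percentages to the 7,053 UGIs.

```python
# Back-of-envelope figures derived only from numbers quoted in the abstract.
# Assumption: 1 Mb is taken as 10^6 bp. All names here are illustrative.
MB = 1_000_000

READS = 9_990                 # single-pass nuclear genomic sequences
TOTAL_BP = 5 * MB             # "corresponding to 5 Mb of tomato DNA"
UGIS = 7_053                  # unique undermethylated genomic islands
UNDERMETHYLATED_MB = 61       # estimated undermethylated portion
UNDERMETHYLATED_ERR_MB = 15   # the +/- on that estimate
EUCHROMATIN_MB = 220          # estimate for gene-rich euchromatic arms

# Rounded percentages as reported; they sum to 100.
category_pct = {"non-coding": 59, "coding": 28, "transposon": 12, "organellar": 1}


def summarize():
    """Return simple derived quantities implied by the abstract's numbers."""
    low = UNDERMETHYLATED_MB - UNDERMETHYLATED_ERR_MB
    high = UNDERMETHYLATED_MB + UNDERMETHYLATED_ERR_MB
    return {
        # Average length of one single-pass read, in bp.
        "mean_read_bp": TOTAL_BP / READS,
        # Average number of reads collapsed into each UGI.
        "reads_per_ugi": READS / UGIS,
        # Approximate UGI counts per category (from rounded percentages).
        "approx_counts": {k: round(UGIS * v / 100) for k, v in category_pct.items()},
        # Undermethylated DNA as a share of the euchromatin estimate,
        # with the range implied by the +/- 15 Mb uncertainty.
        "euchromatin_share": UNDERMETHYLATED_MB / EUCHROMATIN_MB,
        "euchromatin_share_range": (low / EUCHROMATIN_MB, high / EUCHROMATIN_MB),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

Under these assumptions the reads average roughly 500 bp, and the undermethylated fraction comes to a little over a quarter of the 220 Mb euchromatin estimate, which is consistent with the abstract's conclusion that a substantial part of the euchromatin is methylated.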

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cell Nucleus / metabolism*
  • DNA Methylation
  • Deoxyribonuclease EcoRI / metabolism
  • Genome, Plant*
  • Genomic Islands
  • Molecular Sequence Data
  • Organelles / genetics
  • Sequence Analysis, DNA*
  • Solanum lycopersicum / genetics*


Substances

  • Deoxyribonuclease EcoRI