Nature, 485 (7400), 635-41

The Tomato Genome Sequence Provides Insights Into Fleshy Fruit Evolution


Tomato Genome Consortium. Nature.


Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.


Figure 1
A. Multi-dimensional topography of tomato chromosome 1 (chromosomes 2-12 are shown in Supplementary Figure 1).
(a) Left: contrast-reversed, DAPI-stained pachytene chromosome; centre and right: FISH signals for repeat sequences on diagrammatic pachytene chromosomes: TGR1 purple, TGR4 blue, telomere repeat red, Cot 100 DNA (including most repeats) green.
(b) Frequency distribution of recombination nodules representing crossovers on 249 chromosomes. Red stars mark 5 cM intervals starting from the end of the short arm (top). Scale is in micrometers.
(c) FISH-based locations of selected BACs (horizontal blue lines on left).
(d) Kazusa F2-2000 linkage map. Blue lines to the left connect linkage map markers on the (c) BAC-FISH map, (e) heat maps and (f) DNA pseudomolecule.
(e) From left to right: linkage map distance (cM/Mb, turquoise); repeated sequences (% nucleotides/500 kb, purple); genes (% nucleotides/500 kb, blue); chloroplast insertions; RNA-Seq reads from leaves and breaker fruits of S. lycopersicum and S. pimpinellifolium (number of reads/500 kb, green and red, respectively); microRNA genes (transcripts per million/500 kb, black); small RNAs (thin horizontal black lines, sum of hits-normalized abundances). Horizontal grey lines represent gaps in the pseudomolecule (f).
(f) DNA pseudomolecule consisting of nine scaffolds. Unsequenced gaps (approximately 9.8 Mb, Supplementary Table 13) are indicated by white horizontal lines. Tomato genes identified by map-based cloning (Supplementary Table 14) are indicated on the right. For more details, see legend to Supplementary Figure 1.

B. Syntenic relationships in the Solanaceae. COSII-based comparative maps of potato, eggplant, pepper and Nicotiana with respect to the tomato genome (Supplementary section 4.5, Supplementary Fig. 14). Each tomato chromosome is assigned a different colour, and orthologous chromosome segment(s) in other species are shown in the same colour. White dots indicate approximate centromere locations. Each black arrow indicates an inversion relative to tomato, and “+1” indicates a minimum of one inversion. Each black bar beside a chromosome indicates translocation breakpoints relative to tomato. Chromosome lengths are not to scale, but segments within chromosomes are.

C. Tomato-potato syntenic relationships. Dot plot of tomato and potato genomic sequences based on collinear blocks (Supplementary Section 4.1). Red and blue dots represent gene pairs with statistically significantly high and low ω (Ka/Ks), respectively, in collinear blocks that average Ks ≤ 0.5. Green and magenta dots represent genes in collinear blocks that average 0.5 < Ks ≤ 1.5 and Ks > 1.5, respectively. Yellow dots represent all other gene pairs. Blocks circled in red are examples of pan-eudicot triplication. Inserts represent schematic drawings of BAC-FISH patterns of cytologically demonstrated chromosome inversions (also in Supplementary Fig. 15).
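The colour scheme of the dot plot in panel C amounts to a small decision rule, which can be sketched as follows. This is only an illustrative reading of the legend, not the consortium's plotting code: the function name `dot_colour`, its parameters, and the use of `None` for gene pairs outside any collinear block are all assumptions made for the example.

```python
from typing import Optional


def dot_colour(block_mean_ks: Optional[float],
               omega_call: Optional[str] = None) -> str:
    """Return the dot colour for one tomato-potato gene pair.

    block_mean_ks: average Ks of the collinear block containing the pair,
        or None if the pair is not in a collinear block (assumption).
    omega_call: "high" or "low" if omega (Ka/Ks) is statistically
        significantly high or low, otherwise None (hypothetical encoding).
    """
    if block_mean_ks is None:
        return "yellow"  # "all other gene pairs"
    if block_mean_ks <= 0.5:
        # Only in these low-Ks blocks does the legend colour by omega.
        if omega_call == "high":
            return "red"
        if omega_call == "low":
            return "blue"
        return "yellow"
    if block_mean_ks <= 1.5:
        return "green"
    return "magenta"
```

For instance, a pair with significantly high ω in a block averaging Ks = 0.3 would be drawn in red, while any pair in a block averaging Ks = 2.0 would be drawn in magenta.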
Figure 2. The Solanum whole genome triplication
A. Based on alignments of multiple tomato genome segments to single grape genome segments, the tomato genome is partitioned into three non-overlapping ‘subgenomes’ (T1, T2, T3), each represented by one axis in the 3D plot. The ancestral gene order of each subgenome is inferred according to orthologous grape regions, with tomato chromosomal affinities shown by red-shaded (inner) bars. Segments tracing to pan-eudicot triplication (γ) are shown by green-shaded (outer) bars with colours representing the seven putative pre-γ eudicot ancestral chromosomes, also coded a-g.

B. Speciation and polyploidisation in eudicot lineages. Confirmed whole-genome duplications and triplications are shown with annotated circles, including “T” (this paper) and previously discovered events α, β, γ. Dashed circles represent one or more suspected polyploidies reported in previous publications that need further support from genome assemblies. Grey branches indicate unpublished genomes. Black and red error bars bracket, respectively, the likely timings of divergence of major asterid lineages and of “T”. The post-“T” subgenomes, designated T1, T2, and T3, are further detailed in Supplementary Fig. 10.
Figure 3. Whole genome triplications set the stage for fruit-specific gene neofunctionalisation
The genes shown represent a fruit ripening control network regulated by transcription factors (MADS-RIN, CNR) necessary for production of the ripening hormone ethylene, the production of which is regulated by ACC synthase (ACS). Ethylene interacts with ethylene receptors (ETRs) to drive expression changes in output genes, including phytoene synthase (PSY), which catalyses the rate-limiting step in carotenoid biosynthesis. Light, acting through phytochromes, controls fruit pigmentation through an ethylene-independent pathway. Paralogous gene pairs with different physiological roles (MADS1/RIN, PHYB1/PHYB2, ACS2/ACS6, ETR3/ETR4, PSY1/PSY2) were generated during the eudicot (γ, black circle) or the more recent Solanum (T, red circle) triplications. Complete dendrograms of the respective protein families are shown in Supplementary Figures 16 and 17.
Figure 4. The tomato genome allows systems approaches to fruit biology
A. Xyloglucan transglucosylase-hydrolases (XTHs) differentially expressed between mature green and ripe fruits (Supplementary Section 5.7). These XTH genes and many others are expressed in ripening fruits and are linked with the Solanum triplication, marked with a red circle on the phylogenetic tree. Red lines on the tree denote paralogs derived from the Solanum triplication, and blue lines are tandem duplications.

B. Developmentally regulated accumulation of sRNAs mapping to the promoter region of a fruit-regulated cell wall gene (Pectin acetylesterase, Solyc08g005800). Variation of abundance of sRNAs (left) and mRNA expression levels from the corresponding gene (right) over a tomato fruit developmental series (T1 – bud, T2 – flower, T3 – fruit 1-3 mm, T4 – fruit 5-7 mm, T5 – fruit 11-13 mm, T6 – fruit mature green, T7 – breaker, T8 – breaker + 3 days, T9 – breaker + 7 days). The promoter regions are grouped in 100 nt windows. For each window the size class distribution of sRNAs is shown (21 – red, 22 – green, 23 – orange, 24 – blue). The height of the box corresponding to the first time point shows the cumulative sRNA abundance in log scale. The height of the following boxes is proportional to the log offset fold change (offset = 20) relative to the first time point. The expression profile of the mRNA is shown in log2 scale.
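The "log offset fold change" used for the box heights in panel B can be illustrated with a short sketch. The legend gives only the offset (20). The exact formula below, the choice of base 2 for the logarithm, and the function names are assumptions made for illustration rather than the authors' stated method. Under this reading, the offset is added to both abundances before taking the ratio, which damps fold changes between very low read counts.

```python
import math

OFFSET = 20  # offset value stated in the figure legend


def log_offset_fold_change(abundance: float, reference: float,
                           offset: float = OFFSET) -> float:
    """Assumed form: log2((abundance + offset) / (reference + offset))."""
    return math.log2((abundance + offset) / (reference + offset))


def fold_change_profile(abundances: list) -> list:
    """Fold change of each later time point relative to the first one."""
    first = abundances[0]
    return [log_offset_fold_change(a, first) for a in abundances[1:]]
```

With hypothetical sRNA abundances of 20, 60 and 0 across three time points, `fold_change_profile([20, 60, 0])` gives `[1.0, -1.0]`: the ratios are (60 + 20)/(20 + 20) = 2 and (0 + 20)/(20 + 20) = 0.5. Without the offset, the last value would be undefined.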



