Nature. 2012 Nov 29;491(7426):705-10.
doi: 10.1038/nature11650.

Analysis of the Bread Wheat Genome Using Whole-Genome Shotgun Sequencing

Free PMC article

Rachel Brenchley et al. Nature.


Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20 per cent of the calories consumed by humans. Major efforts are underway worldwide to increase wheat production by extending genetic diversity and analysing key traits, and genomic resources can accelerate progress. But so far the very large size and polyploid complexity of the bread wheat genome have been substantial barriers to genome analysis. Here we report the sequencing of its large, 17-gigabase-pair, hexaploid genome using 454 pyrosequencing, and comparison of this with the sequences of diploid ancestral and progenitor genomes. We identified between 94,000 and 96,000 genes, and assigned two-thirds to the three component genomes (A, B and D) of hexaploid wheat. High-resolution synteny maps identified many small disruptions to conserved gene order. We show that the hexaploid genome is highly dynamic, with significant loss of gene family members on polyploidization and domestication, and an abundance of gene fragments. Several classes of genes involved in energy harvesting, metabolism and growth are among expanded gene families that could be associated with crop productivity. Our analyses, coupled with the identification of extensive genetic variation, provide a resource for accelerating gene discovery and improving this major crop.


Figure 1. Coverage of OG Representatives by wheat 454 sequence reads and simulated 454 reads from rice and maize
a. Coverage of OG Representatives by repeat-masked wheat 454 sequence reads (black line), wheat LCG (black dashed line), the OA (blue line), together with rice genes (red line) and maize simulated reads (green line). b. Median coverage depth over protein coding regions of OG Representatives (N terminus = 0; C terminus = 100). The colour coding is the same as in panel a, except that simulated hexaploid reads from rice (red line) were used. c. The distribution of wheat gene copy numbers from the OA.
Figure 2. Alignment of wheat 454 reads, SNPs and genetic maps to the Brachypodium distachyon genome
The inner circle represents gene order on the 5 Brachypodium chromosomes. Track 1 illustrates conservation between wheat 454 reads and Brachypodium genes, shown as a window of genes present in wheat. Tracks 2-4 show SNP density (the mean number of SNPs per gene in a window of 20 genes) in the A (track 2), B (track 3) and D (track 4) genomes of wheat. Tracks 5-7 display wheat synteny with Brachypodium for the A (track 5), B (track 6) and D (track 7) genomes. Genetic markers (shown in darker colours) were colour-coded by wheat chromosome. Gaps between markers were filled in to show synteny (lighter colours).
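As an illustration of the SNP density measure used in tracks 2-4 (mean number of SNPs per gene in a window of 20 genes), the calculation might be sketched as follows. This is not the authors' code: the function name `windowed_snp_density`, the input layout (one SNP count per gene, in Brachypodium gene order) and the choice of consecutive, non-overlapping windows are all assumptions made for the example. The caption does not state whether the windows overlap.

```python
from typing import List


def windowed_snp_density(snps_per_gene: List[int], window: int = 20) -> List[float]:
    """Mean SNPs per gene over consecutive, non-overlapping windows of genes.

    Hypothetical helper: `snps_per_gene` holds one SNP count per gene,
    ordered along a chromosome. A final window shorter than `window`
    is averaged over the genes it actually contains.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    densities = []
    for start in range(0, len(snps_per_gene), window):
        chunk = snps_per_gene[start:start + window]
        densities.append(sum(chunk) / len(chunk))
    return densities


# Toy input: 20 genes with 1 SNP each, then 20 genes with 3 SNPs each.
toy_counts = [1] * 20 + [3] * 20
print(windowed_snp_density(toy_counts))  # [1.0, 3.0]
```

A sliding window would give a smoother track at the cost of correlated neighbouring values; either choice is consistent with the caption as written.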
Figure 3. Gene family sizes in orthologous assemblies of hexaploid wheat, Ae. tauschii, simulated maize and hexaploid rice
The boxes and whiskers contain 50% and 90% of the OA genes respectively. The box colours indicate the number of genes in diploid gene families of different sizes (x axis). The black lines represent expected gene family sizes; the red line is the gene family size determined from the OA, derived by polynomial regression fit. Only gene families with up to ten members are shown. a. Maize gene family sizes predicted from orthologous assembly of simulated 454 reads. b. Rice gene family sizes predicted from orthologous assembly of simulated 454 reads derived from triplicated rice genes. c. Ae. tauschii gene family sizes obtained from orthologous assembly of repeat-masked 454 reads. Expanded gene families are shown as green dots. d. Wheat gene family sizes in the OA. e. Amalgamation of wheat and Ae. tauschii gene copy numbers. The black line shows the expected gene copy numbers for wheat and Ae. tauschii respectively. The red line shows the regression fit for wheat, and the blue line for Ae. tauschii. The grey zone between these lines estimates the extent of gene loss in hexaploid wheat.
Figure 4. Pseudogene identification and analysis
a. Visualization of an OG Representative and associated wheat sequences. The top track shows the hit count profile of mapped 454 reads. The lower tracks show sub-assemblies of three wheat genes and a stacked region of gene fragments. Read depth is represented by the heat map. b. The distribution shows coverage of the OG Representative by Pfam-containing gene fragments and pseudogenes. The blue and red lines represent stacks with and without protein domains, respectively. c. The distribution shows protein identity between sub-assemblies forming stacks of gene fragments. The blue and red lines represent stacks with and without protein domains, respectively, and the black line represents sub-assemblies forming genes.



