The draft genome of a diploid cotton Gossypium raimondii

Nat Genet. 2012 Oct;44(10):1098-103. doi: 10.1038/ng.2371. Epub 2012 Aug 26.


We have sequenced and assembled a draft genome of G. raimondii, whose progenitor is the putative contributor of the D subgenome to the economically important fiber-producing cotton species Gossypium hirsutum and Gossypium barbadense. Over 73% of the assembled sequences were anchored on 13 G. raimondii chromosomes. The genome contains 40,976 protein-coding genes, with 92.2% of these further confirmed by transcriptome data. Evidence of the hexaploidization event shared by the eudicots as well as of a cotton-specific whole-genome duplication approximately 13-20 million years ago was observed. We identified 2,355 syntenic blocks in the G. raimondii genome, and we found that approximately 40% of the paralogous genes were present in more than 1 block, which suggests that this genome has undergone substantial chromosome rearrangement during its evolution. Cotton, and probably Theobroma cacao, are the only sequenced plant species that possess an authentic CDN1 gene family for gossypol biosynthesis, as revealed by phylogenetic analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Biosynthetic Pathways / genetics
  • Chromosomes, Plant
  • DNA Transposable Elements
  • Diploidy*
  • Evolution, Molecular
  • Genes, Plant*
  • Genome, Plant
  • Gossypium / enzymology
  • Gossypium / genetics*
  • High-Throughput Nucleotide Sequencing
  • Microsatellite Repeats
  • Molecular Sequence Annotation
  • Phylogeny
  • Sequence Analysis, DNA
  • Synteny
  • Terminal Repeat Sequences
  • Transcriptome

