Nat Genet, 46 (6), 567-72

Genome Sequence of the Cultivated Cotton Gossypium Arboreum


Fuguang Li et al. Nat Genet.


The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.
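The sequencing figures in the abstract are related by the usual definition of fold coverage (total sequenced bases divided by genome size). The sketch below works backwards from the two reported numbers to the genome size they imply. The function and variable names are illustrative and not from the paper, and the result is only back-of-the-envelope arithmetic, not the authors' own genome size estimate.

```python
def implied_genome_size_gb(total_bases_gb: float, fold_coverage: float) -> float:
    """Return the genome size (Gb) implied by coverage = total bases / genome size."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases_gb / fold_coverage


# Values reported in the abstract.
clean_sequence_gb = 193.6   # total clean paired-end sequence, in Gb
fold_coverage = 112.6       # reported fold coverage of the genome

size_gb = implied_genome_size_gb(clean_sequence_gb, fold_coverage)
print(f"Implied genome size: {size_gb:.2f} Gb")
```

Under these assumptions the implied size is roughly 1.7 Gb, which is the scale against which the 90.4% anchoring and 68.5% repeat content figures should be read.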

