Serial gene losses and foreign DNA underlie size and sequence variation in the plastid genomes of diatoms

Genome Biol Evol. 2014 Mar;6(3):644-54. doi: 10.1093/gbe/evu039.


Photosynthesis by diatoms accounts for roughly one-fifth of global primary production, yet relatively little is known about their plastid genomes. We report the completely sequenced plastid genomes for eight phylogenetically diverse diatoms and show them to be variable in size, gene and foreign sequence content, and gene order. The genomes contain a core set of 122 protein-coding genes, with 15 additional genes exhibiting complex patterns of 1) gene losses at varying phylogenetic scales, 2) functional transfers to the nucleus, 3) gene duplication, divergence, and differential retention of paralogs, and 4) acquisitions of putatively functional recombinase genes from resident plasmids. The newly sequenced genomes also contain several previously unreported genes, highlighting how poorly characterized diatom plastid genomes are overall. Genome size variation reflects major expansions of the inverted repeat region in some cases but, more commonly, large-scale expansions of intergenic regions, many of which contain unique open reading frames of likely foreign origin. Although many gene clusters are conserved across species, rearrangements appear to be frequent in most lineages.

Keywords: chloroplast; diatoms; genomes; horizontal gene transfer; plastid.

MeSH terms

  • Chromosome Mapping
  • DNA / genetics
  • DNA / isolation & purification*
  • DNA, Intergenic
  • Diatoms / classification
  • Diatoms / genetics*
  • Evolution, Molecular
  • Gene Deletion*
  • Gene Duplication
  • Gene Order
  • Gene Rearrangement
  • Genome, Plastid*
  • Open Reading Frames
  • Phylogeny
  • Sequence Analysis, DNA

