Although the plastid genomes of diatoms maintain a conserved architecture and core gene set, considerable variation about this core theme exists and can be traced to several different processes. Gene duplication, pseudogenization, and loss, as well as intracellular transfer of genes to the nuclear genome, have all contributed to variation in gene content among diatom species. In addition, some noncoding sequences have highly restricted phylogenetic distributions that suggest a recent foreign origin. We sequenced the plastid genome of the marine diatom Toxarium undulatum and found that the genome contains three genes (chlB, chlL, and chlN) involved in light-independent chlorophyll a biosynthesis that were not previously known from diatoms. Phylogenetic and syntenic data suggest that these genes were differentially retained in this one lineage as they were repeatedly lost from most other diatoms. Unique among diatoms and other heterokont algae sequenced so far, the genome also contains a large group II intron within an otherwise intact psaA gene. Although the intron is most similar to one in the plastid-encoded psaA gene of some green algae, high sequence divergence between the diatom and green algal introns rules out recent shared ancestry. We conclude that the psaA intron was likely introduced into the plastid genome of T. undulatum, or some earlier ancestor, by horizontal transfer from an unknown donor. This genome further highlights the myriad processes driving variation in gene and intron content in the plastid genomes of diatoms, one of the world's foremost primary producers.
Keywords: Chlorophyll a; Diatoms; Intron; Plastid; Toxarium; psaA.