In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae

PLoS One. 2015 Nov 25;10(11):e0143424. doi: 10.1371/journal.pone.0143424. eCollection 2015.


The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet few studies have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by a large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Combining low-pass genome sequencing with novel bioinformatic approaches allowed the identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the observed genome size variation. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study provides a proof of concept for an approach that combines recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNA in a large number of non-model species without the need to assemble their genomes.
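
To make the statement that repeats "account for" a share of genome size differences more concrete, the following minimal Python sketch shows one simple way such a share could be computed. It is an illustration under stated assumptions, not necessarily the procedure used in the study: it converts genomic proportions of repeats into absolute amounts and takes the least-squares slope (through the origin) of pairwise differences in repeat amount against pairwise differences in genome size. The function name, the species labels and all numbers are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def repeat_share_of_size_differences(genome_sizes_mbp, repeat_fractions):
    """Share of pairwise genome size differences matched by differences in
    absolute repeat content (least-squares slope through the origin)."""
    # Convert genomic proportions into absolute amounts of repeats (Mbp).
    repeat_mbp = {
        species: genome_sizes_mbp[species] * repeat_fractions[species]
        for species in genome_sizes_mbp
    }

    numerator = 0.0
    denominator = 0.0
    # Compare every pair of species once.
    for a, b in combinations(sorted(genome_sizes_mbp), 2):
        d_size = genome_sizes_mbp[b] - genome_sizes_mbp[a]
        d_repeat = repeat_mbp[b] - repeat_mbp[a]
        numerator += d_size * d_repeat
        denominator += d_size * d_size

    if denominator == 0:
        raise ValueError("all genomes have the same size; the share is undefined")
    return numerator / denominator


# Hypothetical toy values, NOT data from the study.
toy_sizes = {"species_A": 2000.0, "species_B": 5000.0, "species_C": 10000.0}
toy_fractions = {"species_A": 0.55, "species_B": 0.70, "species_C": 0.80}
toy_share = repeat_share_of_size_differences(toy_sizes, toy_fractions)
print(f"{toy_share:.2f}")
```

With these toy inputs the estimated share is about 0.87. Passing the absolute amounts of a single repeat lineage instead of total repeats would give that lineage's share in the same way; note that if every genome had the same repeat proportion, the share would simply equal that proportion.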

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Evolution, Molecular
  • Fabaceae / classification
  • Fabaceae / genetics*
  • Genetic Variation*
  • Genome Size*
  • Genome, Plant*
  • Genomics* / methods
  • Phylogeny
  • Repetitive Sequences, Nucleic Acid*
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Terminal Repeat Sequences

Grant support

This work was supported by grants from the Czech Science Foundation [GBP501/12/G090] and the Czech Academy of Sciences [RVO:60077344] to JM and from the National Program of Sustainability I. [LO1204] to JD. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.