New reference genome sequences of hot pepper reveal the massive evolution of plant disease-resistance genes by retroduplication

Genome Biol. 2017 Nov 1;18(1):210. doi: 10.1186/s13059-017-1341-9.


Background: Transposable elements are major evolutionary forces that can give rise to new genome structures and drive species diversification. The role of transposable elements in the expansion of nucleotide-binding and leucine-rich-repeat proteins (NLRs), the major disease-resistance gene families, remains unexplored in plants.

Results: We report two high-quality de novo genomes (Capsicum baccatum and C. chinense) and an improved reference genome (C. annuum) for peppers. Dynamic genome rearrangements involving translocations among chromosomes 3, 5, and 9 were detected when C. baccatum was compared with the two other peppers. The amplification of athila LTR-retrotransposons, members of the gypsy superfamily, led to genome expansion in C. baccatum. In-depth genome-wide comparison of genes and repeats revealed that the copy numbers of NLRs were greatly increased by LTR-retrotransposon-mediated retroduplication. Moreover, retroduplicated NLRs are abundant across the angiosperms and, in most cases, are lineage-specific.

Conclusions: Our study reveals that retroduplication has played a key role in the massive emergence of NLR genes, including functional disease-resistance genes, in pepper plants.

Keywords: Disease-resistance gene; Genome evolution; LTR-retrotransposon; NLR; Retroduplication.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Capsicum / genetics*
  • Chromosomes, Plant / genetics
  • Disease Resistance / genetics*
  • Evolution, Molecular*
  • Gene Duplication*
  • Genes, Plant*
  • Genetic Speciation
  • Molecular Sequence Annotation
  • Multigene Family
  • NLR Proteins / genetics
  • Open Reading Frames / genetics
  • Phylogeny
  • Plant Diseases / genetics*
  • Plant Diseases / immunology*
  • Reference Standards
  • Retroelements / genetics*
  • Sequence Analysis, RNA
  • Species Specificity
  • Terminal Repeat Sequences / genetics

Substances

  • NLR Proteins
  • Retroelements