ExpansionHunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data

Genome Biol. 2020 Apr 28;21(1):102. doi: 10.1186/s13059-020-02017-z.


Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats. To address this need, we introduce ExpansionHunter Denovo, an efficient catalog-free method for genome-wide repeat expansion detection. Analysis of real and simulated data shows that our method can identify large expansions of 41 out of 44 pathogenic repeats, including nine recently reported non-reference repeat expansions not discoverable via existing methods.

Keywords: Fragile X syndrome; Friedreich ataxia; Genome-wide analysis; Huntington disease; Myotonic dystrophy type 1; Repeat expansions; Short tandem repeats; Whole-genome sequencing data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Case-Control Studies
  • DNA Repeat Expansion*
  • Fragile X Syndrome / genetics
  • Friedreich Ataxia / genetics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Huntington Disease / genetics
  • Microsatellite Repeats
  • Myotonic Dystrophy / genetics
  • Software*
  • Whole Genome Sequencing
