Detecting cryptic clinically relevant structural variation in exome-sequencing data increases diagnostic yield for developmental disorders

Am J Hum Genet. 2021 Nov 4;108(11):2186-2194. doi: 10.1016/j.ajhg.2021.09.010. Epub 2021 Oct 8.

Abstract

Structural variation (SV) describes a broad class of genetic variation greater than 50 bp in size. SVs can cause a wide range of genetic diseases and are prevalent in rare developmental disorders (DDs). Individuals presenting with DDs are often referred for diagnostic testing with chromosomal microarrays (CMAs) to identify large copy-number variants (CNVs) and/or with single-gene, gene-panel, or exome sequencing (ES) to identify single-nucleotide variants, small insertions/deletions, and CNVs. However, individuals with pathogenic SVs undetectable by conventional analysis often remain undiagnosed. Consequently, we have developed the tool InDelible, which interrogates short-read sequencing data for split-read clusters characteristic of SV breakpoints. We applied InDelible to 13,438 probands with severe DDs recruited as part of the Deciphering Developmental Disorders (DDD) study and discovered 63 rare, damaging variants in genes previously associated with DDs missed by standard SNV, indel, or CNV discovery approaches. Clinical review of these 63 variants determined that about half (30/63) were plausibly pathogenic. InDelible was particularly effective at ascertaining variants between 21 and 500 bp in size and increased the total number of potentially pathogenic variants identified by DDD in this size range by 42.9%. Of particular interest were seven confirmed de novo variants in MECP2, which represent 35.0% of all de novo protein-truncating variants in MECP2 among DDD study participants. InDelible provides a framework for the discovery of pathogenic SVs that are most likely missed by standard analytical workflows and has the potential to improve the diagnostic yield of ES across a broad range of genetic diseases.
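To illustrate what "split-read clusters characteristic of SV breakpoints" can look like in practice, the following is a minimal sketch and not InDelible's actual implementation. It assumes reads are supplied as (chromosome, 1-based alignment start, CIGAR string) tuples, treats a soft-clipped read end as a candidate breakpoint, and reports positions supported by several clipped reads. The function names and the thresholds `min_clip` and `min_support` are hypothetical choices made for this example.

```python
import re
from collections import Counter

# CIGAR operations as defined in the SAM format: a length followed by an op code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def clip_breakpoints(pos, cigar, min_clip=5):
    """Return reference positions at which a read is soft-clipped.

    pos is the 1-based leftmost aligned position. min_clip is an
    illustrative threshold (an assumption), used to ignore very short clips.
    """
    # Drop hard clips so a soft clip is always the first or last operation.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    if not ops:
        return []
    breakpoints = []
    # Leading soft clip: the aligned segment begins at the breakpoint.
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        breakpoints.append(pos)
    # Trailing soft clip: report the first reference base past the aligned segment.
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        breakpoints.append(pos + ref_len)
    return breakpoints


def candidate_breakpoints(reads, min_support=2):
    """Count clipped reads per position and keep positions with enough support."""
    counts = Counter()
    for chrom, pos, cigar in reads:
        for bp in clip_breakpoints(pos, cigar):
            counts[(chrom, bp)] += 1
    return {site: n for site, n in counts.items() if n >= min_support}


# Toy input: three reads clipped at the same position and one fully aligned read.
reads = [
    ("chrX", 100, "60M40S"),
    ("chrX", 120, "40M60S"),
    ("chrX", 160, "30S70M"),
    ("chrX", 300, "100M"),
]
result = candidate_breakpoints(reads)
print(result)
```

In this toy input, the three clipped reads all point to the same position on chrX, while the fully aligned read contributes nothing. A real pipeline would additionally need to handle mapping quality, strand, the clipped sequence itself, and population frequency, none of which are modeled here.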

Keywords: bioinformatics; developmental disorders; diagnostics; insertions/deletions; structural variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child
  • Developmental Disabilities / diagnosis*
  • Developmental Disabilities / genetics*
  • Female
  • Humans
  • Male
  • Methyl-CpG-Binding Protein 2 / genetics
  • Whole Exome Sequencing / methods*

Substances

  • MECP2 protein, human
  • Methyl-CpG-Binding Protein 2