Challenges in structural variant calling in low-complexity regions

Gigascience. 2025 Jan 6:14:giaf154. doi: 10.1093/gigascience/giaf154.

Abstract

Background: Structural variants (SVs) are genomic differences ≥50 bp in length. They remain challenging to detect, even with long sequencing reads, and the sources of these difficulties are not well quantified.

Results: We identified 35.4 Mb of low-complexity regions (LCRs) in GRCh38. Although these regions cover only 1.2% of the genome, they contain 69.1% of confident SVs in sample HG002. Across long-read SV callers, 77.3-91.3% of erroneous SV calls occur within LCRs, with error rates increasing with LCR length.
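
As a rough illustration of what these proportions imply, the sketch below turns the two rounded percentages reported above (1.2% of the genome, 69.1% of confident SVs) into enrichment figures. The function names are hypothetical and do not come from the paper, and the calculation assumes SVs are simply counted, with no weighting by length or by caller.

```python
def fold_enrichment(frac_events_in_region: float, frac_genome_in_region: float) -> float:
    """Observed share of events in a region divided by the share expected
    if events were spread uniformly over the genome."""
    return frac_events_in_region / frac_genome_in_region


def density_ratio(frac_events_in_region: float, frac_genome_in_region: float) -> float:
    """Per-base event density inside the region relative to the density outside it."""
    inside = frac_events_in_region / frac_genome_in_region
    outside = (1.0 - frac_events_in_region) / (1.0 - frac_genome_in_region)
    return inside / outside


# Rounded values taken from the Results paragraph (assumed exact here).
LCR_GENOME_FRACTION = 0.012  # LCRs cover 1.2% of GRCh38
LCR_SV_FRACTION = 0.691      # 69.1% of confident HG002 SVs fall in LCRs

if __name__ == "__main__":
    print(f"fold enrichment over uniform: {fold_enrichment(LCR_SV_FRACTION, LCR_GENOME_FRACTION):.1f}")
    print(f"density inside vs outside LCRs: {density_ratio(LCR_SV_FRACTION, LCR_GENOME_FRACTION):.1f}")
```

Because the inputs are rounded, the outputs should be read as order-of-magnitude estimates rather than values reported by the study.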

Conclusion: SVs are enriched in LCRs and are difficult to call there. Special care is needed when calling and analyzing these variants.

Keywords: evaluation; low-complexity regions; structural variant.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Genome, Human*
  • Genomic Structural Variation*
  • Genomics* / methods
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Sequence Analysis, DNA / methods