Background: The insertion element IS6110 is one of the main sources of genomic variability in Mycobacterium tuberculosis, the etiological agent of human tuberculosis. Although IS6110 has been used extensively as an epidemiological marker, identification of its precise chromosomal insertion sites has been limited by technical challenges. Here, we present IS-seq, a novel method that combines high-throughput Illumina sequencing with efficient combinatorial sample multiplexing to probe 519 clinical isolates simultaneously, identifying almost all the flanking regions of the element in a single experiment.
Results: We identified a total of 6,976 IS6110 flanking regions across the isolates. When validated using reference strains, the method had 100% specificity and 98% positive predictive value. The insertions mapped to both coding and non-coding regions, and in some cases interrupted genes thought to be essential for virulence or in vitro growth. Strains were classified into families based on their insertion sites, and the classification agreed closely with previous studies.
Conclusions: This high-throughput IS-seq method, which can also be used to map insertions in other organisms, extends previous surveys of in vivo interrupted loci and provides a baseline for probing the consequences of disruptions in M. tuberculosis strains.
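
The two validation metrics quoted in the Results can be made concrete with a short sketch. The function names and the counts below are hypothetical and are not taken from the study; they only illustrate how specificity and positive predictive value are defined, and why the two can differ when true negatives (candidate sites with no insertion) greatly outnumber true positives.

```python
def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly absent insertion sites that were correctly not called."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative sites to evaluate")
    return true_neg / total


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of called insertion sites that are real."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no positive calls to evaluate")
    return true_pos / total


# Hypothetical counts for illustration only (NOT data from the study).
tp, fp, tn = 98, 2, 99_998

spec = specificity(tn, fp)
ppv = positive_predictive_value(tp, fp)
print(f"specificity = {spec:.5f}")
print(f"PPV         = {ppv:.2f}")
```

With these made-up counts the same two false positives barely move specificity, because the denominator is dominated by true negatives, while they lower the positive predictive value by two percentage points.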