Patterns of transmission of drug-resistant tuberculosis (TB) remain poorly understood, despite over half a million incident cases worldwide in 2017. Modeling TB transmission networks can provide insight into drivers of transmission, but incomplete sampling of TB cases can pose challenges for inference from individual epidemiologic and molecular data. We assessed the effect of missing cases on a transmission network inferred from Mycobacterium tuberculosis sequencing data on extensively drug-resistant TB cases in KwaZulu-Natal, South Africa, diagnosed in 2011-2014. We tested scenarios in which cases were missing at random, missing differentially by clinical characteristics, or missing differentially by transmission (i.e., cases with many links were under- or oversampled). Under the assumption that cases were missing at random, the mean number of transmissions per case in the complete network needed to be larger than 20, far higher than expected, to reproduce the observed network. Instead, the most likely scenario involved undersampling of high-transmitting cases, and models provided evidence for super-spreading. To our knowledge, this is the first analysis to have assessed support for different mechanisms of missingness in a TB transmission study, but our results are subject to the distributional assumptions of the network models we used. Transmission studies should consider the potential biases introduced by incomplete sampling and should seek to identify host, pathogen, or environmental factors driving super-spreading.
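The missingness scenarios described above can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the network models used in the study: the toy transmission tree, the gamma-distributed infectiousness weights, and every function and parameter name (`simulate_tree`, `sample_cases`, `bias`, and so on) are hypothetical choices made for illustration. A `bias` of 0 corresponds to cases missing at random, a negative `bias` to undersampling of high-transmitting cases, and a positive `bias` to oversampling of them.

```python
import random
from collections import Counter


def simulate_tree(n_cases, shape, rng):
    """Toy transmission tree (an assumption, not the study's model).

    Each new case picks one earlier case as its infector, with probability
    proportional to a gamma-distributed infectiousness weight. A small
    `shape` concentrates transmission in a few cases (super-spreading).
    """
    # Tiny offset keeps the total weight strictly positive.
    weights = [rng.gammavariate(shape, 1.0 / shape) + 1e-9 for _ in range(n_cases)]
    links = []
    for case in range(1, n_cases):
        infector = rng.choices(range(case), weights=weights[:case])[0]
        links.append((infector, case))
    return links


def out_degree(links, n_cases):
    """Number of transmissions attributed to each case in the full tree."""
    counts = Counter(infector for infector, _ in links)
    return [counts.get(i, 0) for i in range(n_cases)]


def sample_cases(n_cases, transmissions, fraction, bias, rng):
    """Keep each case independently with a probability that depends on how
    many transmissions it caused. bias = 0 gives missing at random."""
    raw = [(1 + t) ** bias for t in transmissions]
    mean_raw = sum(raw) / n_cases
    probs = [min(1.0, fraction * r / mean_raw) for r in raw]
    return {i for i in range(n_cases) if rng.random() < probs[i]}


def observed_links(links, sampled):
    """A link is observed only if both of its cases were sampled."""
    return [(a, b) for a, b in links if a in sampled and b in sampled]


def mean_transmissions(links, cases):
    """Mean number of observed transmissions per sampled case."""
    return len(links) / len(cases) if cases else 0.0


if __name__ == "__main__":
    rng = random.Random(2020)
    n = 500
    links = simulate_tree(n, shape=0.3, rng=rng)
    transmissions = out_degree(links, n)
    for label, bias in [("at random", 0.0), ("undersample high", -1.0), ("oversample high", 1.0)]:
        sampled = sample_cases(n, transmissions, fraction=0.5, bias=bias, rng=rng)
        obs = observed_links(links, sampled)
        print(label, len(sampled), round(mean_transmissions(obs, sampled), 3))
```

Comparing the mean observed transmissions per sampled case across the three settings gives a rough feel for how differential sampling can distort an inferred network relative to the complete one. The specific numbers depend entirely on the assumed tree and sampling function.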
Keywords: bias analysis; drug-resistant tuberculosis; missing data; network modeling; tuberculosis; tuberculosis transmission; whole genome sequencing.
Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health 2020. This work is written by (a) US Government employee(s) and is in the public domain in the US.