Benchmarking the Effectiveness and Accuracy of Multiple Mitochondrial DNA Variant Callers: Practical Implications for Clinical Application

Eddie K K Ip; Michael Troup; Colin Xu; David S Winlaw; Sally L Dunwoodie; Eleni Giannoulatou

doi:10.3389/fgene.2022.692257

Front Genet. 2022 Mar 8:13:692257. doi: 10.3389/fgene.2022.692257. eCollection 2022.

Authors

Eddie K K Ip¹,², Michael Troup¹, Colin Xu³, David S Winlaw⁴, Sally L Dunwoodie¹,², Eleni Giannoulatou¹,²

Affiliations

¹ Victor Chang Cardiac Research Institute, Sydney, NSW, Australia.
² St. Vincent's Clinical School, Sydney, NSW, Australia.
³ School of Computer Science and Engineering, Sydney, NSW, Australia.
⁴ Cardiothoracic Surgery, Cincinnati Children's Hospital Medical Centre, Heart Institute, Cincinnati, OH, United States.

Abstract

Mitochondrial DNA (mtDNA) mutations contribute to human disease across a range of severity, from rare, highly penetrant mutations that cause monogenic disorders to mutations with milder contributions to phenotypes. mtDNA variation can exist in all copies of mtDNA (homoplasmy) or in only a percentage of copies (heteroplasmy), and can be detected at levels as low as 1%. The large number of mtDNA copies and the possibility of multiple alternative alleles at the same nucleotide position make identifying allelic variation in mtDNA very challenging. In recent years, specialized variant calling algorithms have been developed to identify mtDNA variation from whole-genome sequencing (WGS) data. However, very few studies have systematically evaluated and compared these methods for the detection of both homoplasmy and heteroplasmy. A publicly available synthetic gold standard dataset was used to assess four mtDNA variant callers (Mutserve, mitoCaller, MitoSeek, and MToolBox) and the commonly used Genome Analysis Toolkit "best practices" pipeline, which is included in most current WGS pipelines. We also used WGS data from 126 trios and calculated the percentage of maternally inherited variants as a metric of calling accuracy, especially for homoplasmic variants. We additionally compared multiple pathogenicity prediction resources for mtDNA variants. Although most callers detected homoplasmic variants accurately and with high concordance across callers, we found a very low concordance rate between mtDNA variant callers for heteroplasmic variants, ranging from 2.8% to 3.6% at heteroplasmy thresholds of 5% and 1%. Overall, Mutserve performed best on the synthetic benchmark dataset. The analysis of mtDNA pathogenicity resources also showed low concordance in prediction results.
We have shown that while homoplasmic variant calling is consistent between callers, there remains a significant discrepancy in heteroplasmic variant calling. Resources such as population frequency databases and pathogenicity predictors are now available for variant annotation but still need refinement and improvement. Given its peculiarities, the mitochondrial genome requires special consideration, and we advocate caution when analyzing mtDNA from WGS data.
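
The two accuracy metrics named in the abstract, concordance across callers and the percentage of maternally inherited variants, can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the study, so the formulas here are assumptions: concordance is taken as the share of variants reported by every caller out of those reported by any caller, and maternal inheritance as the share of a child's variants also found in the mother. The caller names and variant calls are hypothetical illustration data, not results from the paper.

```python
from typing import Dict, Set, Tuple

# A variant is identified by (mtDNA position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def concordance_rate(calls: Dict[str, Set[Variant]]) -> float:
    """Share of variants called by ALL callers out of those called by ANY caller.

    Assumed definition (intersection over union); the study may define it differently.
    """
    call_sets = list(calls.values())
    if not call_sets:
        return 0.0
    union = set().union(*call_sets)
    if not union:
        return 0.0
    shared = set.intersection(*call_sets)
    return len(shared) / len(union)


def maternal_inheritance_rate(child: Set[Variant], mother: Set[Variant]) -> float:
    """Share of the child's variants that are also present in the mother."""
    if not child:
        return 0.0
    return len(child & mother) / len(child)


# Hypothetical call sets from three unnamed callers on one sample.
calls = {
    "caller_a": {(73, "A", "G"), (263, "A", "G"), (3243, "A", "G")},
    "caller_b": {(73, "A", "G"), (263, "A", "G")},
    "caller_c": {(73, "A", "G"), (263, "A", "G"), (16519, "T", "C")},
}
concordance = concordance_rate(calls)

# Hypothetical child and mother call sets from one trio.
child_calls = {(73, "A", "G"), (263, "A", "G"), (3243, "A", "G")}
mother_calls = {(73, "A", "G"), (263, "A", "G")}
maternal = maternal_inheritance_rate(child_calls, mother_calls)

print(f"concordance: {concordance:.2f}")
print(f"maternally inherited: {maternal:.2f}")
```

Because mtDNA is maternally inherited, a homoplasmic variant called in a child but absent from the mother is more likely a calling error than a true de novo event, which is why this rate can serve as an accuracy proxy when no truth set is available.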

Keywords: benchmarking; heteroplasmic; homoplasmic; mitochondrial DNA; variant‐caller; whole-genome sequencing.