Mol Biol Evol. 2018 Jul 1;35(7):1678-1689.
doi: 10.1093/molbev/msy059.

De Novo Mutations Resolve Disease Transmission Pathways in Clonal Malaria

Free PMC article

Seth N Redmond et al. Mol Biol Evol.

Detecting de novo mutations in viral and bacterial pathogens enables researchers to reconstruct detailed networks of disease transmission and is a key technique in genomic epidemiology. However, these techniques have not yet been applied to the malaria parasite, Plasmodium falciparum, in which a larger genome, slower generation times, and a complex life cycle make them difficult to implement. Here, we demonstrate the viability of de novo mutation studies in P. falciparum for the first time. Using a combination of sequencing, library preparation, and genotyping methods that have been optimized for accuracy in low-complexity genomic regions, we have detected de novo mutations that distinguish nominally identical parasites from clonal lineages. Despite its slower evolutionary rate compared with bacterial or viral species, de novo mutation can be detected in P. falciparum across timescales of just 1-2 years and evolutionary rates in low-complexity regions of the genome can be up to twice that detected in the rest of the genome. The increased mutation rate allows the identification of separate clade expansions that cannot be found using previous genomic epidemiology approaches and could be a crucial tool for mapping residual transmission patterns in disease elimination campaigns and reintroduction scenarios.


Fig. 1.
Genome accessibility was assessed based on in silico comparisons of two P. falciparum genome assemblies. Variants were called from reads simulated from the Pf_Dd2 reference and aligned to the Pf3D7_v3 reference, then assessed as correct or incorrect based on a 200 bp flanking region on either side. Blocks of 1 kb across the genome were defined as accessible only if their read coverage was within two standard deviations of the median and all of their variants were called accurately. Comparisons were made between DISCOVAR and HaplotypeCaller with both 250 bp reads (HC250) and 100 bp reads (HC100). Subplots show (a) the genome-wide distribution of accessible blocks, (b) the proportions accessible to combinations of callers, and (c) the overall genome area accessible to each caller. The strong overlap of HC250 and DISCOVAR calls suggests that the majority of this improvement derives from the use of longer reads; accessibility increased from 64.0% with HC100 to 78.4% with DISCOVAR and 79.3% with HC250. DustMasker identified 10.8 Mb of low-complexity sequence, of which 80.8%, 81.4%, and 65.4% were found to be accessible by DISCOVAR, HC250, and HC100, respectively.
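The accessibility rule described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the input layout (one coverage value and one correctness flag per 1 kb block), and the choice of the sample standard deviation are all hypothetical, and the numbers are invented toy values.

```python
from statistics import median, stdev

def accessible_blocks(coverage, variants_correct, n_sd=2.0):
    """Flag each 1 kb block as accessible (hypothetical helper).

    A block counts as accessible only if its read coverage lies within
    n_sd standard deviations of the median coverage AND every variant
    in the block was called correctly.
    """
    med = median(coverage)
    sd = stdev(coverage)  # assumption: sample standard deviation
    return [
        abs(c - med) <= n_sd * sd and ok
        for c, ok in zip(coverage, variants_correct)
    ]

# Toy input: eight blocks, one coverage outlier and one miscalled block.
coverage = [30, 31, 29, 30, 32, 28, 30, 200]
variants_correct = [True, False, True, True, True, True, True, True]

flags = accessible_blocks(coverage, variants_correct)
fraction_accessible = sum(flags) / len(flags)
print(flags, fraction_accessible)
```

In this toy input, the block with coverage 200 fails the coverage test and the second block fails the variant-accuracy test, so six of eight blocks are flagged as accessible.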
Fig. 2.
Phylogenetic resolution of parasite samples using standing variation versus de novo variants. (a) Maximum parsimony tree of standing variation (24-SNP molecular barcode; Daniels et al. 2015). The last two numerals in sample names indicate the year of collection. Clades are resolved, but samples within clades are indistinguishable. (b) More than 3,000 de novo variants were called via DISCOVAR that segregated the individuals into three IBD clades. Samples within clades 24 (red), 26 (blue), and 29 (green) were separated by 622, 828, and 1,858 variants, respectively, with the closest individuals (Th106.09/Th074.13) distinguished by 88 de novo variants. Phylogenetic trees were calculated from 15,276 SNPs and 11,420 INDELs using maximum parsimony. Numbers on nodes indicate bootstrap support. Our ability to discern phylogeny using only de novo variants was high, with bootstrap values above 0.5 for all nodes subtending samples collected >1 year apart.
Fig. 3.
Root-to-tip distance in the phylogeny of clade 29 correlates with sampling time. The observed correlation of genetic distance and time (Pearson's product-moment correlation) indicates that many of our variants are de novo and that mutation occurs at a sufficiently high rate to resolve patterns of malaria transmission. Regression on sampling times indicates that a common ancestor for the clade may have existed in approximately the year 2000.
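A root-to-tip regression of this kind can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, ordinary least squares is assumed, and the sampling years and distances are invented toy values, not the data behind this figure. The date of the common ancestor is estimated as the point where the fitted line crosses zero distance.

```python
from math import sqrt

def root_to_tip_regression(years, distances):
    """Least-squares fit of root-to-tip distance against sampling year.

    Returns (slope, intercept, pearson_r, estimated_root_year), where the
    root year is the x-intercept of the fitted line (distance = 0).
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    slope = sxy / sxx                    # variants accumulated per year
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / sqrt(sxx * syy)
    root_year = -intercept / slope       # where the line reaches zero distance
    return slope, intercept, pearson_r, root_year

# Invented toy values for illustration only.
toy_years = [2006, 2008, 2010, 2012]
toy_distances = [10, 20, 30, 40]

slope, intercept, r, root_year = root_to_tip_regression(toy_years, toy_distances)
print(slope, r, root_year)
```

The slope is the apparent rate of variant accumulation per year, so a clearly positive slope with a high correlation is what supports treating the variants as measurably evolving.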
Fig. 4.
Transmission networks were estimated for clade 29 from combined SNP and INDEL distances using a minimum distance tree approach. Edges are labeled with pairwise genetic distance, as well as bootstrap values (superscript in parentheses) and reversion rate (subscript). Bootstrap values were calculated for all edges by sampling with replacement for 100 iterations and indicate strong support for many of these distance-derived relationships. Potential alternative edges derived from the bootstrap results are shown in gray. Nodes in later years are shown in darker colors. Concordance between the parsimony tree and distance-based methods is notable; in both the phylogeny and the transmission network, sample Th106.09 is basal to subclade 29.1 and Th106.11 to subclade 29.2. Reversion rates are significantly higher for the edge joining Th106.09 to Th106.11, supporting the independence of subclade 29.2.
Fig. 5.
Lower mutation rates and the restricted size of the core genome would generate a single new mutation on average every month. The increase in accessibility offered by 250-bp reads and DISCOVAR increases both the accessible genome size and the measurable evolutionary rate, resulting in a new variant every 2.4 weeks in the low-complexity genome and every 1.5 weeks overall. This makes P. falciparum comparable to other infectious agents such as influenza or HIV, where transmission networks may be informed by genome sequencing. Mutation rates derived from Biek et al. (2015).
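The relationship this caption relies on, that the expected waiting time between new mutations shrinks as either the per-site rate or the number of accessible sites grows, can be written as a short calculation. The sketch below is an assumption-laden illustration: the function name is hypothetical and the rate and genome size are placeholder values chosen for easy arithmetic, not the figures used in the paper.

```python
WEEKS_PER_YEAR = 365.25 / 7

def weeks_per_new_mutation(rate_per_site_per_year, accessible_sites):
    """Expected weeks between new mutations across the accessible genome.

    Expected mutations per year = per-site rate * number of accessible
    sites; the waiting time is the reciprocal, converted to weeks.
    """
    expected_per_year = rate_per_site_per_year * accessible_sites
    return WEEKS_PER_YEAR / expected_per_year

# Placeholder values (NOT from the paper): one expected mutation per year.
placeholder_rate = 1e-6
placeholder_sites = 1_000_000

baseline = weeks_per_new_mutation(placeholder_rate, placeholder_sites)
# Doubling the accessible sites halves the expected wait.
doubled = weeks_per_new_mutation(placeholder_rate, 2 * placeholder_sites)
print(baseline, doubled)
```

With these placeholders the baseline wait is about one year, and doubling the accessible genome halves it, which is the qualitative effect the caption attributes to longer reads and DISCOVAR.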

    1. Azarian T, Maraqa NF, Cook RL, Johnson JA, Bailey C, Wheeler S, Nolan D, Rathore MH, Morris JG, Salemi M. 2016. Genomic epidemiology of methicillin-resistant Staphylococcus aureus in a neonatal intensive care unit. PLoS One 11:e0164397.
    2. Biek R, Pybus OG, Lloyd-Smith JO, Didelot X. 2015. Measurably evolving pathogens in the genomic era. Trends Ecol Evol. 30(6):306–313.
    3. Blum MGB, François O. 2005. On statistical tests of phylogenetic tree imbalance: the Sackin and other indices revisited. Math Biosci. 195(2):141–153.
    4. Bopp SER, Manary MJ, Bright AT, Johnston GL, Dharia NV, Luna FL, McCormack S, Plouffe D, McNamara CW, Walker JR, et al. 2013. Mitotic evolution of Plasmodium falciparum shows a stable core genome but recombination in antigen families. PLoS Genet. 9(2):e1003293.
    5. Chenet SM, Taylor JE, Blair S, Zuluaga L, Escalante AA. 2015. Longitudinal analysis of Plasmodium falciparum genetic variation in Turbo, Colombia: implications for malaria control and elimination. Malar J. 14:363.