Understanding the relatedness of individuals within or between populations is a common goal in biology. Increasingly, relatedness features in genetic epidemiology studies of pathogens. These studies are relatively new compared to those in humans and other organisms, but are important for designing interventions and understanding pathogen transmission. Only recently have researchers begun to routinely apply relatedness to apicomplexan eukaryotic malaria parasites, and to date have used a range of different approaches on an ad hoc basis. Therefore, it remains unclear how to compare different studies and which measures to use. Here, we systematically compare measures based on identity-by-state (IBS) and identity-by-descent (IBD) using a globally diverse data set of malaria parasites, Plasmodium falciparum and P. vivax, and provide marker requirements for estimates based on IBD. We formally show that the informativeness of polyallelic markers for relatedness inference is maximized when alleles are equifrequent. Estimates based on IBS are sensitive to allele frequencies, which vary across populations and by experimental design. For portability across studies, we thus recommend estimates based on IBD. To generate estimates with errors below an arbitrary threshold of 0.1, we recommend ∼100 polyallelic or 200 biallelic markers. Marker requirements are immediately applicable to haploid malaria parasites and other haploid eukaryotes. C.I.s facilitate comparison when different marker sets are used. This is the first attempt to provide rigorous analysis of the reliability of, and requirements for, relatedness inference in malaria genetic epidemiology. We hope it will provide a basis for statistically informed prospective study design and surveillance strategies.
Keywords: Plasmodium falciparum; Plasmodium vivax; genetic epidemiology; hidden Markov model; identity-by-descent; identity-by-state; independence model; malaria; relatedness.
Copyright © 2019 Taylor et al.