Pf7: an open dataset of Plasmodium falciparum genome variation in 20,000 worldwide samples

Wellcome Open Res. 2023 Jan 16:8:22. doi: 10.12688/wellcomeopenres.18681.1. eCollection 2023.


We describe the MalariaGEN Pf7 data resource, the seventh release of Plasmodium falciparum genome variation data from the MalariaGEN network. It comprises over 20,000 samples from 82 partner studies in 33 countries, including several malaria-endemic regions that were previously underrepresented. For the first time we include dried blood spot samples that were sequenced after selective whole-genome amplification, necessitating new methods to genotype copy number variations. We identify a large number of newly emerging crt mutations in parts of Southeast Asia, and show examples of heterogeneity in patterns of drug resistance within Africa and within the Indian subcontinent. We describe the profile of variation in the C-terminal region of the csp gene and relate this to the sequence used in the RTS,S and R21 malaria vaccines. Pf7 provides high-quality genotype calls for 6 million SNPs and short indels, analysis of large deletions that cause failure of rapid diagnostic tests, and systematic characterisation of six major drug resistance loci, all of which can be freely downloaded from the MalariaGEN website.

Keywords: data resource; genomic epidemiology; genomics; malaria; Plasmodium falciparum.

Associated data

  • figshare/10.6084/m9.figshare.21674321