Background: In the malaria vector Anopheles gambiae, understanding diversity in natural populations and the genetic components of important phenotypes, such as resistance to malaria infection, is crucial for developing new malaria transmission-blocking strategies. The design and interpretation of many such studies depend critically on linkage disequilibrium (LD). For example, in association studies, LD determines the density of single nucleotide polymorphisms (SNPs) that must be genotyped to capture most of the genomic information. Here, we aim to determine LD in wild An. gambiae s.l. populations in four genes potentially involved in mosquito immune responses against pathogens (Gambicin, NOS, REL2 and FBN9), using previously published and newly generated sequences.
Results: The level of LD between SNP pairs in cloned sequences of each gene was determined for seven species (or incipient species) of the An. gambiae complex. In all tested genes and species, LD between SNPs was low: even at short distances (< 200 bp), most SNP pairs gave an r² < 0.3. Mean r² ranged from 0.073 to 0.766. In most genes and species, LD decayed very rapidly with increasing inter-marker distance.
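As an illustration of the pairwise statistic reported above, the following is a minimal sketch of how r² between two biallelic SNPs can be computed from phased sequences such as cloned haplotypes, using the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB. The abstract does not state which software was used, so the function names `r_squared` and `pairwise_ld`, the string representation of haplotypes, and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def r_squared(haplotypes, i, j):
    """Return r^2 between biallelic sites i and j of phased haplotypes.

    `haplotypes` is a list of equal-length strings, one per cloned
    sequence (hypothetical input format, chosen for illustration).
    """
    n = len(haplotypes)
    # Use the allele carried by the first haplotype as the reference
    # allele at each site; r^2 does not depend on this choice.
    a = haplotypes[0][i]
    b = haplotypes[0][j]
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    return d * d / denom


def pairwise_ld(haplotypes, positions):
    """Return (inter-marker distance in bp, r^2) for every SNP pair.

    `positions` gives the base-pair coordinate of each column in
    `haplotypes` (again an assumed layout).
    """
    pairs = []
    for i, j in combinations(range(len(positions)), 2):
        distance = abs(positions[j] - positions[i])
        pairs.append((distance, r_squared(haplotypes, i, j)))
    return pairs


# Toy data: three SNP columns at assumed coordinates 10, 60 and 250 bp.
toy_haplotypes = ["ACA", "ACG", "GTA", "GTG"]
toy_positions = [10, 60, 250]
ld_pairs = pairwise_ld(toy_haplotypes, toy_positions)
```

Plotting the resulting r² values against distance is one simple way to inspect how quickly LD decays with inter-marker distance.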
Conclusions: These results are of great interest for the development of large-scale polymorphism studies, as LD generally falls below any useful limit. They indicate that very fine-scale SNP detection will be required to give an overall view of genome-wide polymorphism. A more feasible approach to genome-wide association studies may be to use candidate gene selection to detect association with phenotypes of interest.