Nucleic Acids Res. 2007;35(17):e113.
doi: 10.1093/nar/gkm621. Epub 2007 Aug 28.

Efficacy Assessment of SNP Sets for Genome-Wide Disease Association Studies

Free PMC article

Andreas Wollstein et al. Nucleic Acids Res.

Abstract

The power of a genome-wide disease association study depends critically upon the properties of the marker set used, particularly the number and physical spacing of markers, and the level of inter-marker association due to linkage disequilibrium. Extending our previously devised theoretical framework for the entropy-based selection of genetic markers, we have developed a local measure of the efficacy of a marker set, relative to including a maximally polymorphic single nucleotide polymorphism (SNP) at the map position of interest. Using this quantitative criterion, we evaluated five currently available SNP sets, namely Affymetrix 100K and 500K, and Illumina 100K, 300K and 550K in the CEU, YRI and JPT + CHB HapMap populations. At 50% relative efficacy, the commercial marker sets cover between 19 and 68% of the human genome, depending upon the population under study. An optimal technology-independent 500K marker set constructed from HapMap for Caucasians, in contrast, would achieve 73% coverage at the same relative efficacy.
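
The reference point in the abstract is a "maximally polymorphic" SNP. As a minimal sketch of why that is the natural benchmark in an entropy-based framework, the snippet below computes the textbook Shannon entropy of a biallelic marker as a function of its allele frequency. It is not the authors' efficacy measure; the function name `allele_entropy` and the example frequencies are assumptions made for illustration.

```python
import math


def allele_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a biallelic marker with allele frequency p.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if p in (0.0, 1.0):
        # A monomorphic marker carries no information.
        return 0.0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


# Illustrative allele frequencies only; entropy peaks at p = 0.5.
for p in (0.05, 0.2, 0.5):
    print(f"p = {p:.2f}: H = {allele_entropy(p):.3f} bits")
```

The entropy is symmetric in the two alleles and reaches its maximum of 1 bit at an allele frequency of 0.5, which is what "maximally polymorphic" means for a biallelic SNP.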

Figures

Figure 1.
Distribution of local LD and SNP set efficacy on chromosome 19 in the CEU population. Panel A: swept radius 1/ε as estimated around each marker from the HapMap genotype data (median 1/ε: 191 kb, interquartile range: 166–230 kb). Panel B: relative efficacy τ of the HapMap set, calculated at 10 kb intervals, excluding gaps, centromeres, telomeres and heterochromatin. Panel C: as Panel B, but for the Affymetrix 100K set. Note: physical positions (in megabases, Mb) are given according to NCBI build 35.
Figure 2.
Distribution of local LD and SNP set efficacy on chromosome 12 in the CEU population. Panel A: swept radius 1/ε as estimated around each marker from the HapMap genotype data (median 1/ε: 181 kb, interquartile range: 159–214 kb). Panels B and C: see legend to Figure 1.
Figure 3.
Relative efficacy of SNP sets on chromosome 12 in the CEU population. For each marker set, the blue histogram depicts the distribution of relative efficacy τ in the full genomic sequence and the coding regions, respectively (for definition, see main text). Frequencies have been normalized such that the modal frequency equals unity. The distribution of τ as obtained for a similarly sized, hypothetical marker set, constructed from HapMap by entropy-based marker selection, is included for each marker set (open histograms).
Figure 4.
Relative efficacy of SNP sets on chromosome 19 in the CEU population. For details, see legend to Figure 3.
Figure 5.
Chromosome-specific estimates of relative SNP set efficacy in full genomic (Panel A) and coding (Panel B) sequences. Chromosome-wide median τ values and interquartile ranges obtained for the CEU population are plotted in chromosomal order.
Figure 6.
SNP set coverage of full genomic (Panel A) and coding (Panel B) sequences at 50% relative efficacy. The chromosome-wide coverage C0.5 is plotted in chromosomal order. HYP 500K: hypothetical, optimal marker set constructed from HapMap so as to include the same number of SNPs per chromosome as the Affymetrix 500K set.
Figure 7.
SNP set coverage of full genomic (Panel A) and coding (Panel B) sequences at 80% relative efficacy. The chromosome-wide coverage C0.8 is plotted in chromosomal order. HYP 500K: hypothetical, optimal marker set constructed from HapMap so as to include the same number of SNPs per chromosome as the Affymetrix 500K set.
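
The coverage values C0.5 and C0.8 in Figures 6 and 7 can be read as the share of a chromosome on which the relative efficacy τ reaches the stated level. The sketch below assumes exactly that reading: coverage is taken as the fraction of evaluated map positions (for example, the 10 kb grid mentioned in the legend to Figure 1) with τ at or above the threshold. The function name `coverage`, the use of "greater than or equal", and the τ values are assumptions for illustration, not data or definitions from the paper.

```python
from typing import Sequence


def coverage(tau: Sequence[float], threshold: float) -> float:
    """Fraction of evaluated positions whose relative efficacy is >= threshold.

    Assumed definition for illustration; the paper's exact definition of
    coverage is given in its main text, not in this excerpt.
    """
    if not tau:
        raise ValueError("need at least one tau value")
    return sum(t >= threshold for t in tau) / len(tau)


# Made-up tau values at equally spaced map positions (illustration only).
example_tau = [0.15, 0.42, 0.55, 0.61, 0.78, 0.83, 0.91, 0.30]

c_half = coverage(example_tau, 0.5)  # analogue of C0.5
c_high = coverage(example_tau, 0.8)  # analogue of C0.8
print(f"C0.5 = {c_half:.3f}, C0.8 = {c_high:.3f}")
```

Because raising the threshold can only exclude positions, coverage at 80% relative efficacy can never exceed coverage at 50% for the same marker set.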
