Statistical considerations for genome-wide scans: design and application of a novel software package POLYMORPHISM

Hum Hered. 2001;52(2):102-9. doi: 10.1159/000053361.


Objective: Given the cost and complexity of genome-wide scans, optimizing study design is critically important. Available algorithms only partially satisfy this need. We designed a software package called 'POLYMORPHISM' to address it.

Methods: The program calculates linkage parameters for both 'single-point' and 'two-point' settings; these parameters also apply to incompletely informative microsatellite markers. In single-point analysis, it provides the heterozygosity, polymorphism information content (PIC) and linkage information content (LIC) statistics based on marker allele frequencies. In two-point analysis, it calculates joint PIC values for two markers, the conditional probability of detecting linkage phase, the frequency of double heterozygotes and the expected number of informative meioses.
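As an illustration of the single-point statistics named above, the following minimal sketch computes heterozygosity and PIC from allele frequencies using the standard textbook definitions (H = 1 - sum of p_i^2, and the Botstein et al. form of PIC). It is not the POLYMORPHISM implementation, and the function names are hypothetical. The double-heterozygote frequency shown here assumes Hardy-Weinberg proportions and linkage equilibrium between the two markers, which the package may or may not assume. LIC is deliberately omitted because its formula is not given in this abstract.

```python
from itertools import combinations


def _check_frequencies(freqs):
    """Reject allele frequency lists that are negative or do not sum to 1."""
    if any(p < 0 for p in freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(freqs) - 1.0) > 1e-8:
        raise ValueError("allele frequencies must sum to 1")


def heterozygosity(freqs):
    """Expected heterozygosity under Hardy-Weinberg: H = 1 - sum(p_i^2)."""
    _check_frequencies(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (standard Botstein et al. form):
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    _check_frequencies(freqs)
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


def double_het_frequency(freqs_a, freqs_b):
    """Frequency of individuals heterozygous at both markers.

    Assumption (not from the abstract): the markers are in linkage
    equilibrium, so the joint frequency is the product H_a * H_b.
    """
    return heterozygosity(freqs_a) * heterozygosity(freqs_b)


if __name__ == "__main__":
    snp = [0.5, 0.5]                      # biallelic marker, equal frequencies
    microsatellite = [0.25, 0.25, 0.25, 0.25]  # four equifrequent alleles
    print(heterozygosity(snp), pic(snp))
    print(heterozygosity(microsatellite), pic(microsatellite))
    print(double_het_frequency(snp, microsatellite))
```

For a biallelic marker with equal allele frequencies this gives H = 0.5 and PIC = 0.375, which shows why a single SNP carries less information than a multiallelic microsatellite.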

Results: We applied this program, together with S.A.G.E./DESPAIR (Design of Linkage Studies Based on Pairs of Relatives), to a genotyping data set derived from Centre d'Etude du Polymorphisme pedigrees in order to estimate critical parameters used in a two-stage genome scan. A single nucleotide polymorphism (SNP)-based one-stage genomic screen strategy is also considered.

Conclusions: LIC values are crucial for obtaining accurate estimates of the parameters that are important for a two-stage genome screening study. The cost-effectiveness of an SNP-based genomic screen strategy can be optimized by modeling the balance between marker information content and marker density.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Alleles
  • Gene Frequency
  • Genetic Linkage
  • Humans
  • Microsatellite Repeats*
  • Polymorphism, Genetic*
  • Polymorphism, Single Nucleotide
  • Software*