Case-control association testing in the presence of unknown relationships

Genet Epidemiol. 2009 Dec;33(8):668-78. doi: 10.1002/gepi.20418.


Genome-wide association studies produce inflated false-positive rates when unrecognized cryptic relatedness exists. A number of methods have been proposed for testing association between markers and disease with a correction for known pedigree-based relationships. However, in most case-control studies relationships are unknown, yet the design is predicated on the assumption of at least ancestral relatedness among cases. Here, we focus on adjusting for cryptic relatedness when the genealogy of the sample is unknown, particularly in samples from isolated populations, where cryptic relatedness may be problematic. We estimate cryptic relatedness using maximum-likelihood methods and use a corrected chi-square (χ²) test with the estimated kinship coefficients to test for association. Estimated kinship coefficients precisely characterize the relatedness between truly related people, but are biased for unrelated pairs. The proposed test substantially reduces spurious positive results, producing a uniform null distribution of P-values. Especially when pedigree information is missing, estimated kinship coefficients can still be used to correct for non-independence among individuals. The corrected test was applied to real data sets from genetic isolates and produced a distribution of P-values that was close to uniform. Thus, the proposed test corrects the non-uniform distribution of P-values obtained with the uncorrected test, and the real-data application illustrates the advantage of the approach.
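To make the idea of a kinship-corrected test concrete, here is a minimal sketch of an allelic chi-square statistic whose variance accounts for relatedness. It is an illustration under stated assumptions, not the exact statistic used in the paper. The function name `corrected_allelic_test` and the matrix convention are hypothetical: `phi[i][j]` is assumed to hold twice the kinship coefficient for a pair i ≠ j, and one plus the inbreeding coefficient on the diagonal. The allele frequency is estimated naively from the pooled sample, and the maximum-likelihood estimation of kinship described in the abstract is not shown.

```python
import math


def corrected_allelic_test(genotypes, is_case, phi):
    """Allelic case-control test with a relatedness-adjusted variance.

    genotypes: list of allele counts (0, 1 or 2), one per individual.
    is_case:   list of booleans, True for cases.
    phi:       n x n matrix (list of lists); ASSUMED convention:
               phi[i][j] = 2 * kinship(i, j) for i != j,
               phi[i][i] = 1 + inbreeding coefficient of i.
    Returns (statistic, p_value) for a 1-df chi-square test.
    """
    n = len(genotypes)
    n_case = sum(1 for c in is_case if c)
    n_ctrl = n - n_case
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")

    # Contrast weights: case mean minus control mean.
    w = [1.0 / n_case if c else -1.0 / n_ctrl for c in is_case]

    # Naive pooled allele frequency (ignores relatedness).
    p = sum(genotypes) / (2.0 * n)
    if p <= 0.0 or p >= 1.0:
        raise ValueError("marker is monomorphic in this sample")

    # Difference in allele frequency between cases and controls.
    diff = sum(wi * g / 2.0 for wi, g in zip(w, genotypes))

    # Null variance of diff: (p(1-p)/2) * w' phi w.
    # With phi = identity this reduces to the usual allelic-test variance.
    quad = sum(w[i] * phi[i][j] * w[j] for i in range(n) for j in range(n))
    var = 0.5 * p * (1.0 - p) * quad

    stat = diff * diff / var
    # Upper tail of chi-square with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy data (made up for illustration): 4 cases followed by 4 controls.
genotypes = [2, 1, 1, 0, 1, 0, 0, 0]
is_case = [True, True, True, True, False, False, False, False]

# Everyone unrelated and outbred: phi is the identity matrix.
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]

# Same sample, but the first two cases are treated as full sibs
# (kinship 0.25, so the off-diagonal entry is 0.5).
with_sibs = [row[:] for row in identity]
with_sibs[0][1] = with_sibs[1][0] = 0.5

stat_unrelated, p_unrelated = corrected_allelic_test(genotypes, is_case, identity)
stat_related, p_related = corrected_allelic_test(genotypes, is_case, with_sibs)
```

In this toy example, positive kinship between two cases increases the null variance of the allele-frequency difference, so the corrected statistic is smaller than the uncorrected one. This is the mechanism by which such a correction reduces spurious positive results.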

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alleles
  • Case-Control Studies
  • Computational Biology / methods
  • Genetic Markers
  • Genetics, Population
  • Genome, Human
  • Genome-Wide Association Study*
  • Humans
  • Likelihood Functions
  • Models, Genetic
  • Models, Statistical
  • Pedigree
  • Polymorphism, Single Nucleotide
  • Risk

