Familial identification: population structure and relationship distinguishability

PLoS Genet. 2012 Feb;8(2):e1002469. doi: 10.1371/journal.pgen.1002469. Epub 2012 Feb 9.


With the expansion of offender/arrestee DNA profile databases, genetic forensic identification has become commonplace in the United States criminal justice system. Familial searching has been proposed as a way to extend forensic identification to family members of individuals with profiles in offender/arrestee DNA databases. In familial searching, a partial genetic profile match between a database entrant and a crime scene sample is used to implicate genetic relatives of the database entrant as potential sources of the crime scene sample. In addition to concerns regarding civil liberties, familial searching poses unanswered statistical questions. In this study, we define confidence intervals on estimated likelihood ratios for familial identification. Using these confidence intervals, we consider familial searching in a structured population. We show that relatives and unrelated individuals from population samples with lower gene diversity over the loci considered are less distinguishable. We also consider cases where the most appropriate population sample for the individuals considered is unknown. We find that when a less appropriate population sample, and thus a less appropriate allele frequency distribution, is assumed, relatives and unrelated individuals become more difficult to distinguish. In addition, we show that relationship distinguishability increases with the number of markers considered, but decreases for more distant genetic familial relationships. All of these results indicate that caution is warranted in the application of familial searching in structured populations, such as that of the United States.
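
The two quantities the abstract relies on, gene diversity and the likelihood ratio comparing a proposed relationship against unrelatedness, can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses the standard textbook formulation based on identity-by-descent (IBD) coefficients and assumes Hardy-Weinberg proportions, independent loci, no mutation, and no correction for population structure. The function names, the allele labels, and the allele frequencies are hypothetical and chosen only for illustration.

```python
from math import prod

# Standard IBD coefficients (k0, k1, k2): probability that two individuals
# share 0, 1, or 2 alleles identical by descent at a locus.
PARENT_CHILD = (0.0, 1.0, 0.0)
FULL_SIBS = (0.25, 0.5, 0.25)
UNRELATED = (1.0, 0.0, 0.0)

def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def hw_prob(genotype, freqs):
    """Genotype probability under Hardy-Weinberg proportions (an assumption here)."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]

def ibd1_prob(g1, g2, freqs):
    """P(g2 | g1, exactly one allele shared IBD).

    Each allele of g1 is the shared one with probability 1/2; the other
    allele of g2 is drawn from the population frequencies.
    """
    c, d = g2
    total = 0.0
    for shared in g1:
        if c == d:
            total += 0.5 * (freqs[c] if shared == c else 0.0)
        else:
            total += 0.5 * ((freqs[d] if shared == c else 0.0)
                            + (freqs[c] if shared == d else 0.0))
    return total

def locus_lr(g1, g2, freqs, k):
    """Single-locus likelihood ratio: proposed relationship vs. unrelated."""
    k0, k1, k2 = k
    p_g2 = hw_prob(g2, freqs)
    ibd2 = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * ibd1_prob(g1, g2, freqs) / p_g2 + k2 * ibd2 / p_g2

def profile_lr(profile1, profile2, freqs_by_locus, k):
    """Multi-locus likelihood ratio, assuming the loci are independent."""
    return prod(
        locus_lr(profile1[locus], profile2[locus], freqs_by_locus[locus], k)
        for locus in freqs_by_locus
    )

# Hypothetical allele frequencies for a single illustrative locus.
example_freqs = {"A": 0.1, "B": 0.2, "C": 0.7}
sib_lr = locus_lr(("A", "B"), ("A", "B"), example_freqs, FULL_SIBS)
```

Because every term of the ratio depends on the assumed allele frequencies, substituting frequencies from a less appropriate population sample changes the estimated likelihood ratio, which is the sensitivity the abstract describes. Multiplying single-locus ratios in `profile_lr` also shows why adding markers can sharpen the distinction between relatives and unrelated individuals.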

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alleles
  • Biometric Identification / methods*
  • Confidence Intervals
  • Crime
  • Criminals
  • DNA Fingerprinting / methods*
  • Data Interpretation, Statistical
  • Databases, Nucleic Acid
  • Family
  • Forensic Genetics*
  • Gene Frequency / genetics
  • Humans
  • Likelihood Functions
  • Population / genetics*
  • United States