CEESIt: A computational tool for the interpretation of STR mixtures

Forensic Sci Int Genet. 2016 May;22:149-160. doi: 10.1016/j.fsigen.2016.02.005. Epub 2016 Feb 23.


In forensic DNA interpretation, the likelihood ratio (LR) is often used to convey the strength of a match. Fully continuous methods to calculate the LR have been created to expand on binary and semi-continuous methods, which do not use all of the quantitative data contained in an electropherogram. These fully continuous methods utilize all of the information captured in the electropherogram, including the peak heights. Recently, methods that calculate the distribution of the LR using semi-continuous methods have also been developed. The LR distribution has been proposed as a way of studying the robustness of the LR, which varies depending on the probabilistic model used for its calculation. For example, the LR distribution can be used to calculate the p-value, which is the probability that a randomly chosen individual results in an LR greater than the LR obtained from the person-of-interest (POI). Hence, the p-value is a statistic that is different from, but related to, the LR, and it may be interpreted as the false positive rate resulting from a binary hypothesis test between the prosecution and defense hypotheses. Here, we present CEESIt, a method that combines the twin features of a fully continuous model to calculate the LR and its distribution, conditioned on the defense hypothesis, along with an associated p-value. CEESIt incorporates dropout, noise and stutter (reverse and forward) in its calculation. As calibration data, CEESIt uses single source samples with known genotypes and calculates an LR for a specified POI on a question sample, along with the LR distribution and a p-value. The method was tested on 303 files representing 1-, 2- and 3-person samples containing between 0.016 and 1 ng of template DNA, injected using three injection times. Our data allow us to evaluate changes in the LR and p-value with respect to the complexity of the sample and to facilitate discussions regarding complex DNA mixture interpretation.
We observed that the amount of template DNA from the contributor impacted the LR: small LRs resulted from contributors with low template masses. Moreover, as expected, we observed a decrease of p-values as the LR increased. A p-value of 10^-9 or lower was achieved in all the cases where the LR was greater than 10^8. We tested the repeatability of CEESIt by running all samples in duplicate and found the results to be repeatable.
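
The p-value defined in the abstract, the probability that a randomly chosen individual yields an LR greater than the POI's LR, can be illustrated with a simple counting estimate. The following is a minimal sketch and not CEESIt's implementation: the function name `empirical_p_value`, the simulated log10(LR) values and the POI value are all hypothetical placeholders, and the abstract does not describe how CEESIt generates or evaluates random genotypes.

```python
import random


def empirical_p_value(random_lrs, poi_lr):
    """Fraction of LRs from randomly chosen individuals that exceed the POI's LR.

    Both arguments must be on the same scale (raw LR or log10 LR); the
    comparison is unaffected because log10 is monotonic.
    """
    if not random_lrs:
        raise ValueError("need at least one simulated LR")
    exceed = sum(1 for lr in random_lrs if lr > poi_lr)
    return exceed / len(random_lrs)


# Hypothetical stand-in for the LR distribution under the defense hypothesis:
# in practice each value would come from evaluating the LR for a randomly
# drawn genotype; here we just draw illustrative log10(LR) values.
rng = random.Random(0)
simulated_log10_lrs = [rng.gauss(-5.0, 3.0) for _ in range(100_000)]

poi_log10_lr = 4.0  # hypothetical log10(LR) for the person of interest
p = empirical_p_value(simulated_log10_lrs, poi_log10_lr)
print(f"estimated p-value: {p}")
```

A plain count over N simulated individuals cannot resolve non-zero p-values below 1/N, so estimating values as small as those reported above would need either a very large number of samples or a more refined estimator.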

Keywords: DNA analysis; LR distribution; Likelihood ratio; Mixture interpretation; p-Value.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Complex Mixtures / analysis*
  • Complex Mixtures / genetics*
  • DNA / analysis*
  • DNA / genetics*
  • DNA Fingerprinting / methods*
  • Genotype
  • Humans
  • Likelihood Functions
  • Microsatellite Repeats*
  • Models, Genetic
  • Models, Statistical


Substances

  • Complex Mixtures
  • DNA