Modeling the minus two base pair stutter ratio of the D1S1656 locus: A sequence-based mixture distribution model

Forensic Sci Int Genet. 2021 Mar:51:102450. doi: 10.1016/j.fsigen.2020.102450. Epub 2020 Dec 24.

Abstract

In this study, we propose a model of the stutter ratio for minus two base pair stutter (-2bpSR) at the D1S1656 locus in capillary electrophoresis (CE)-based short tandem repeat (STR) typing. DNA from 108 Japanese individuals was analyzed by massively parallel sequencing to determine the length of the longest uninterrupted stretch of the two-base repeat motif (2bpLUS value) within repetitive structures involving the flanking region. In addition, -2bpSR data were collected using the GlobalFiler Kit on a 3500xL Genetic Analyzer. Sequencing analysis showed that all alleles could be classified into two types by their 2bpLUS values, and the -2bpSR differed significantly between the two types. We therefore modeled the -2bpSR with a mixture log-normal distribution based on this 2bpLUS classification of alleles. Furthermore, the probabilities of the sequence type within each repeat number in the mixture log-normal distribution model were estimated using logistic regression for each of the five major detected populations. This model is expected to enable interpretation of STR typing that accounts for minus two base pair stutter at the D1S1656 locus.
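
As an illustration only, the general shape of such a model can be sketched as a two-component log-normal mixture whose mixing weight comes from a logistic function of the repeat number. The abstract does not give the fitted parameters or the exact parameterization, so every name and number below (`EXAMPLE_PARAMS`, `type_probability`, `mixture_pdf`, the intercept, slope, and log-normal parameters) is a hypothetical placeholder, and the single linear predictor in the repeat number is an assumed form rather than the authors' specification.

```python
import math

# Hypothetical placeholder values, NOT estimates from the study.
EXAMPLE_PARAMS = {
    "intercept": -6.0,        # logistic regression intercept (assumed)
    "slope": 0.4,             # logistic regression slope per repeat (assumed)
    "mu_a": math.log(0.03),   # log-scale mean, sequence type A (assumed)
    "sigma_a": 0.3,           # log-scale SD, sequence type A (assumed)
    "mu_b": math.log(0.01),   # log-scale mean, sequence type B (assumed)
    "sigma_b": 0.4,           # log-scale SD, sequence type B (assumed)
}


def lognormal_pdf(x, mu, sigma):
    """Density of a log-normal distribution at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def type_probability(repeat_number, intercept, slope):
    """Logistic probability that an allele with this repeat number is type A."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * repeat_number)))


def mixture_pdf(stutter_ratio, repeat_number, params):
    """Mixture log-normal density of the stutter ratio for a repeat number."""
    p_a = type_probability(repeat_number, params["intercept"], params["slope"])
    dens_a = lognormal_pdf(stutter_ratio, params["mu_a"], params["sigma_a"])
    dens_b = lognormal_pdf(stutter_ratio, params["mu_b"], params["sigma_b"])
    # Weight each sequence-type component by its probability.
    return p_a * dens_a + (1.0 - p_a) * dens_b


if __name__ == "__main__":
    # Density of a stutter ratio of 0.02 for a 15-repeat allele.
    print(mixture_pdf(0.02, 15, EXAMPLE_PARAMS))
```

In CE-based typing the sequence type of an allele is not observed directly, which is why a sketch like this weights the two components by a probability instead of selecting one of them.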

Keywords: D1S1656; STR typing; Statistical modeling; Stutter.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles*
  • Asian People / genetics
  • Base Pairing*
  • DNA Fingerprinting
  • Electrophoresis, Capillary
  • Genetic Loci*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Japan
  • Microsatellite Repeats
  • Models, Statistical
  • Sequence Analysis, DNA*