A forensic Y-STR database generated in the US was compiled from profiles containing partial or complete typing of 16 STR markers: DYS19, DYS385, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS456, DYS458, DYS635, DYS448, and Y GATA H4. This version of the database contained 17,447 samples, of which 77% and 20% were collected in North America and Asia, respectively. The database was separated into six general populations: African American, Asian, Caucasian, Hispanic, Indian, and Native American. Each population was further classified into subgroups according to geographic region. Some subgroups were tested, found to be homogeneous, and merged. Allele and haplotype frequencies, as well as sample sizes, were summarized. Of the full haplotypes (i.e., 16 STRs without missing data), 93.7% in the total population were distinct, 92.9% were population specific, and 89.3% were observed only once. The majority of shared haplotypes were found among North American populations, as a result of admixture over the past few hundred years. The power of discrimination (PD), coancestry coefficient (F(st)), and coefficient of gene differentiation (G(st)) were also calculated at the locus and haplotype levels. The most polymorphic marker was DYS385; this marker contains a tandem duplication and is actually composed of two loci. Both G(st) and F(st) estimates were very small for haplotypes composed of a high number of STRs (e.g., 10-16 markers), although G(st) is slightly more conservative for these extended haplotypes. With the Native American population removed from the total population data set, the G(st) and F(st) estimates decrease further. PD was 0.9998 for the total population data set for all 16 Y-STR markers. Three measures of Y-STR profile frequency were calculated: (1) unconditional haplotype frequency, (2) population substructure adjusted frequency, and (3) binomial upper bound of the haplotype frequency.
The binomial upper bound is the most conservative estimate for most forensic applications. The weight of a Y-STR haplotype can be estimated using population specific or total population databases.
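The three frequency measures can be illustrated with a minimal Python sketch. The abstract does not state the formulas used, so the forms below are assumptions chosen because they are widely used in forensic practice: a counting estimate (matches divided by database size), a substructure adjustment of the form theta + (1 - theta) * p with theta standing in for a coancestry estimate such as F(st), and a one-sided 95% Clopper-Pearson upper bound for the binomial measure. All function names are hypothetical.

```python
import math


def counting_frequency(matches: int, n: int) -> float:
    """Unconditional estimate: observed matches divided by database size."""
    return matches / n


def theta_adjusted_frequency(p: float, theta: float) -> float:
    """Substructure-adjusted frequency, theta + (1 - theta) * p.

    Assumed form; theta stands in for a coancestry estimate such as F(st).
    """
    return theta + (1.0 - theta) * p


def _binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Binomial(n, p), summed in log space for large n."""
    total = 0.0
    for k in range(x + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return total


def binomial_upper_bound(matches: int, n: int, alpha: float = 0.05) -> float:
    """One-sided (1 - alpha) Clopper-Pearson upper bound on the frequency."""
    if matches >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection; the CDF decreases as p grows
        mid = (lo + hi) / 2.0
        if _binomial_cdf(matches, n, mid) > alpha:
            lo = mid  # bound still too small
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    n = 17447      # database size quoted in the abstract
    matches = 1    # hypothetical: haplotype observed once
    theta = 0.001  # hypothetical coancestry value, not taken from the paper
    p = counting_frequency(matches, n)
    print("counting estimate:   ", p)
    print("theta-adjusted:      ", theta_adjusted_frequency(p, theta))
    print("binomial upper bound:", binomial_upper_bound(matches, n))
```

For a haplotype with zero matches, the upper bound reduces to the closed form 1 - alpha^(1/n), which gives a quick check on the bisection.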
Published by Elsevier Ireland Ltd.