A common epigenetic clock from childhood to old age

Forensic Sci Int Genet. 2022 Sep:60:102743. doi: 10.1016/j.fsigen.2022.102743. Epub 2022 Jun 25.

Abstract

Forensic age estimation is a DNA intelligence tool that forms an important part of Forensic DNA Phenotyping. Criminal cases with no suspects or with unsuccessful DNA database searches, human identification in mass disasters, anthropological studies and legal disputes all benefit from age estimation to gain investigative leads. Several age prediction models based on DNA methylation have been developed to date. Although different DNA methylation technologies and diverse statistical methods have been proposed, most models are based on blood samples and are mainly restricted to adult age ranges. In the current study, we present an extended age prediction model based on 895 blood DNA samples from Spanish individuals, evenly distributed between 2 and 104 years of age. DNA methylation levels were detected using Agena Bioscience EpiTYPER® technology for a total of seven CpG sites located at seven genomic regions: ELOVL2, ASPA, PDE4C, FHL2, CCDC102B, MIR29B2CHG and chr16:85395429 (GRCh38). The accuracy of the age prediction system was tested by comparing three statistical methods: quantile regression (QR), quantile regression neural network (QRNN) and quantile regression support vector machine (QRSVM). The most accurate predictions were obtained with QRNN and QRSVM (mean absolute prediction error, MAE, of ± 3.36 and ± 3.41 years, respectively). Validation of the models with an independent Spanish testing set (N = 152) provided similar accuracies for both methods (MAE: ± 3.32 and ± 3.45 years, respectively). The main advantage of quantile regression tools is that they yield age-dependent prediction intervals, fitting the error to the estimated age. An additional dimensionality reduction analysis shows that the error increases and the number of correct classifications decreases as the training sample size is reduced. Results indicated that a minimum of six samples per year-of-age covered by the training set is recommended to efficiently capture most of the inter-individual variability.
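
The quantities named in the abstract (MAE, quantile-based prediction intervals, and the six-samples-per-year recommendation) can be illustrated with a minimal sketch. It is not the authors' QR, QRNN or QRSVM pipeline: it only shows the pinball (quantile) loss that quantile regression methods generally minimize, applied to constant predictions on simulated data. The noise model, the seed, and all function names (`pinball_loss`, `best_constant`, `simulated_errors`, `interval_width`, `min_training_size`) are assumptions made for illustration, as is counting the ages 2 to 104 inclusively.

```python
import random


def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for a quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Mean absolute prediction error (MAE), in the units of y (here: years)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def best_constant(values, tau):
    """Sample value that minimizes the pinball loss as a constant prediction.

    For a constant predictor, this minimizer is an empirical tau-quantile.
    """
    n = len(values)
    return min(sorted(set(values)),
               key=lambda c: pinball_loss(values, [c] * n, tau))


def min_training_size(min_age, max_age, per_year=6):
    """Samples needed for `per_year` samples at every year of age (inclusive)."""
    return per_year * (max_age - min_age + 1)


# Hypothetical noise model (an assumption, not data from the study):
# the spread of the prediction error grows with chronological age.
rng = random.Random(0)


def simulated_errors(age, n=200):
    sd = 1.0 + 0.05 * age
    return [rng.gauss(0.0, sd) for _ in range(n)]


def interval_width(errors, lower=0.05, upper=0.95):
    """Width of the central 90% interval estimated via the pinball loss."""
    return best_constant(errors, upper) - best_constant(errors, lower)


width_young = interval_width(simulated_errors(10))
width_old = interval_width(simulated_errors(90))

print("90% interval width at age 10:", round(width_young, 2))
print("90% interval width at age 90:", round(width_old, 2))
print("Minimum training size for ages 2-104:", min_training_size(2, 104))
```

Under this assumed noise model, the interval estimated for the older group comes out wider than for the younger group, which is the sense in which quantile-based intervals fit the error to the estimated age. With inclusive counting, six samples per year over ages 2 to 104 gives 6 × 103 = 618 samples, below the 895 used for training in the study.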

Keywords: DNA methylation; EpiTYPER®; Forensic age estimation; Machine learning; Quantile regression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Aging* / genetics
  • Child
  • Child, Preschool
  • CpG Islands / genetics
  • DNA
  • DNA Methylation
  • Epigenesis, Genetic
  • Forensic Genetics* / methods
  • Humans
  • Middle Aged
  • Young Adult

Substances

  • DNA