Age prediction has been in the spotlight recently because it can provide important information about the contributors of biological evidence left at crime scenes. Specifically, many researchers have proposed age-prediction models based on DNA methylation at several CpG sites and have tested the candidate sites using platforms such as the HumanMethylation450 array and pyrosequencing. With the DNA methylation data obtained from each platform, age-prediction models were constructed using various statistical methods, typically multivariate linear regression. However, because each model is built on single-platform data, its prediction accuracy drops when it is applied to DNA methylation data obtained from other platforms. In this study, bisulfite sequencing data for 95 saliva samples were generated using massively parallel sequencing (MPS) and compared with methylation SNaPshot data from the same 95 individuals. When MPS data were applied to an age-prediction model built for methylation SNaPshot data, the predicted age differed greatly from the chronological age because of platform differences. Therefore, novel variables indicating the platform type were introduced, and platform-independent age-prediction models were constructed using a neural network and multivariate linear regression. The final neural network model had a mean absolute deviation (MAD) of 3.19 years between the predicted and chronological age and a mean absolute percentage error (MAPE) of 8.89% in the test set. Similarly, the linear regression model had a MAD of 3.69 years and a MAPE of 10.44% in the same test set. Because the platform is encoded as a set of variables, the platform-independent age-prediction model can be extended to additional platforms, and the idea of platform variables can be applied to age-prediction models for other body fluids.
Keywords: Age prediction; DNA methylation; MPS; Methylation SNaPshot; Neural network.
Copyright © 2018 Elsevier B.V. All rights reserved.
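The platform variables and the two error metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the platform variables are encoded, so the 0/1 indicator (dummy) coding, the platform labels, and all function names below are assumptions made for illustration, and the methylation values in the usage note are made up.

```python
from typing import List, Sequence

# Hypothetical platform labels; the study compares methylation SNaPshot and MPS.
PLATFORMS = ("snapshot", "mps")


def mean_absolute_deviation(predicted: Sequence[float],
                            chronological: Sequence[float]) -> float:
    """MAD in years: mean of |predicted age - chronological age|."""
    errors = [abs(p - c) for p, c in zip(predicted, chronological)]
    return sum(errors) / len(errors)


def mean_absolute_percentage_error(predicted: Sequence[float],
                                   chronological: Sequence[float]) -> float:
    """MAPE in percent: mean of |predicted - chronological| / chronological."""
    ratios = [abs(p - c) / c for p, c in zip(predicted, chronological)]
    return 100.0 * sum(ratios) / len(ratios)


def add_platform_variables(methylation: Sequence[float],
                           platform: str,
                           reference: str = "snapshot") -> List[float]:
    """Append one 0/1 indicator per non-reference platform (assumed coding).

    Supporting a further platform only requires adding its label to
    PLATFORMS, which appends one more indicator column.
    """
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    indicators = [1.0 if platform == name else 0.0
                  for name in PLATFORMS if name != reference]
    return list(methylation) + indicators
```

For example, `add_platform_variables([0.42, 0.77], "mps")` yields a feature vector whose last entry flags the sample as MPS-derived, so a single regression or neural network can learn a platform-specific offset while sharing the CpG coefficients across platforms.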