The potential influence of underlying differences in relative leukocyte distributions in studies involving blood-based profiling of DNA methylation is well recognized and has prompted the development of a set of statistical methods for inferring changes in the distribution of white blood cells from DNA methylation signatures. However, the extent to which this methodology can accurately predict cell-type proportions from blood-derived DNA methylation data in a large-scale epigenome-wide association study (EWAS) has yet to be examined. We used publicly available data deposited in the Gene Expression Omnibus (GEO) database (accession number GSE37008), comprising both blood-derived epigenome-wide DNA methylation data assayed using the Illumina Infinium HumanMethylation27 BeadArray and complete blood cell (CBC) counts for a community cohort of 94 non-diseased individuals. Constrained projection (CP) was used to predict the proportions of lymphocytes, monocytes and granulocytes for each study sample from its DNA methylation signature. Our findings demonstrated high consistency between the average CBC-derived and CP-predicted percentages of monocytes and lymphocytes (17.9% and 17.6% for monocytes and 82.1% and 81.4% for lymphocytes), with root mean squared errors (rMSE) of 5% and 6% for monocytes and lymphocytes, respectively. Similarly, there was moderate-to-high correlation between the CP-predicted and CBC-derived percentages of monocytes and lymphocytes (0.60 and 0.61, respectively), and these results were robust to the number of leukocyte differentially methylated regions (L-DMRs) used for CP prediction. These results serve as further validation of the CP approach and highlight the promise of this technique for EWAS in which DNA methylation is profiled using whole-blood genomic DNA.
Keywords: DNA methylation; cell mixture analysis; leukocytes; mixture deconvolution; whole-blood.
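As an illustration of the kind of computation summarized above, the following minimal sketch fits a constrained projection to synthetic data and then computes the two agreement measures reported (rMSE and correlation). It is not the code used in the study. It assumes that CP can be written as a least-squares fit of a sample's methylation profile onto reference cell-type profiles at the L-DMRs, with proportions constrained to be non-negative and to sum to at most one. The function names, the reference matrix, the noise level and the sample count are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def constrained_projection(y, M):
    """Estimate cell-type proportions for one sample (illustrative only).

    y : (n_cpg,) methylation values of the mixed sample at the L-DMRs
    M : (n_cpg, n_types) reference methylation per cell type
    Assumed formulation: minimize ||y - M w||^2 with w >= 0, sum(w) <= 1.
    """
    k = M.shape[1]

    def objective(w):
        return np.sum((y - M @ w) ** 2)

    def gradient(w):
        return -2.0 * M.T @ (y - M @ w)

    result = minimize(
        objective,
        x0=np.full(k, 1.0 / k),
        jac=gradient,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "ineq", "fun": lambda w: 1.0 - np.sum(w)}],
    )
    # Clip tiny numerical excursions outside [0, 1].
    return np.clip(result.x, 0.0, 1.0)


def rmse(predicted, observed):
    """Root mean squared error between two arrays of proportions."""
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


# Synthetic stand-ins (hypothetical): 100 L-DMRs, 3 cell types, 20 samples.
rng = np.random.default_rng(0)
M = rng.uniform(0.0, 1.0, size=(100, 3))
true_w = rng.dirichlet(np.ones(3), size=20)
Y = true_w @ M.T + rng.normal(0.0, 0.02, size=(20, 100))

est = np.array([constrained_projection(y, M) for y in Y])

# Agreement measures per cell type, analogous to those in the abstract.
for j in range(3):
    r = np.corrcoef(est[:, j], true_w[:, j])[0, 1]
    print(f"type {j}: rMSE={rmse(est[:, j], true_w[:, j]):.3f}, r={r:.2f}")
```

With real data, `M` would come from reference methylation profiles of purified leukocyte subtypes and `true_w` would be replaced by the CBC-derived proportions. The synthetic mixtures here are far cleaner than whole-blood measurements, so the agreement printed by this sketch says nothing about the values reported above.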