Buccal epithelial cells are among the most clinically accessible tissues and are increasingly used to identify epigenetic disease patterns. However, substantial variation in buccal DNA methylation patterns indicates heterogeneity of cell types within and between samples, raising questions of data quality. We systematically estimated cell-type composition for a large collection of buccal and saliva samples from 11 published studies of DNA methylation. Among these, we identified numerous buccal samples of questionable purity; purity may be affected by sampling from individuals with neurodevelopmental disorders and by the brushes used for sample collection. Comparisons with tissues such as saliva, in which the buccal component varies widely, pose further challenges. We propose a reference-based method of correcting for buccal purity that reduces unwanted variation while preserving cross-tissue differences. Our work demonstrates the wide variation in buccal sample quality across epigenetic studies and suggests a possible approach to overcoming this issue.
Keywords: DNA methylation; epigenetics; buccal epithelial cells; tissue heterogeneity.