Meffil: efficient normalization and analysis of very large DNA methylation datasets

Bioinformatics. 2018 Dec 1;34(23):3983-3989. doi: 10.1093/bioinformatics/bty476.

Abstract

Motivation: DNA methylation datasets are growing ever larger in both sample size and genome coverage. Novel computational solutions are required to handle these data efficiently.

Results: We have developed meffil, an R package designed for efficient quality control, normalization and epigenome-wide association studies of large samples of Illumina Methylation BeadChip microarrays. A complete re-implementation of functional normalization minimizes computational memory without increasing running time. Incorporating fixed and random effects within functional normalization and automating the estimation of functional normalization parameters reduce technical variation in DNA methylation levels, thus reducing false positive rates and improving power. Support for normalization of datasets distributed across physically different locations, without the need to share biologically based individual-level data, means that meffil can be used to reduce heterogeneity in meta-analyses of epigenome-wide association studies.
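
As a rough illustration of the idea behind functional normalization, the sketch below regresses each quantile of the per-sample intensity distributions on a technical covariate and subtracts the fitted technical component. It is not meffil's code (meffil is an R package), and it rests on simplifying assumptions: a single covariate stands in for the control-probe summaries, only fixed effects are modeled, and the function name `remove_technical_variation` and its arguments are hypothetical.

```python
from statistics import mean


def remove_technical_variation(quantiles, control):
    """Subtract the part of each quantile that is linearly explained by a
    technical covariate.

    quantiles[i][q] is the q-th quantile of sample i's intensity distribution.
    control[i] is a single technical covariate for sample i (an assumed
    stand-in for a control-probe summary).
    """
    n_samples = len(quantiles)
    control_mean = mean(control)
    centered = [c - control_mean for c in control]
    sum_sq = sum(c * c for c in centered)

    adjusted = [row[:] for row in quantiles]
    for q in range(len(quantiles[0])):
        column = [quantiles[i][q] for i in range(n_samples)]
        column_mean = mean(column)
        # Ordinary least-squares slope of this quantile on the covariate.
        if sum_sq:
            slope = sum(
                centered[i] * (column[i] - column_mean) for i in range(n_samples)
            ) / sum_sq
        else:
            slope = 0.0
        # Remove only the covariate-driven component; the mean is preserved.
        for i in range(n_samples):
            adjusted[i][q] = column[i] - slope * centered[i]
    return adjusted


# Three samples whose quantiles differ only through the covariate.
example = remove_technical_variation(
    [[10, 20, 30], [12, 22, 32], [14, 24, 34]],
    [0, 1, 2],
)
```

In this toy input every difference between samples is driven by the covariate, so the adjusted quantiles coincide across samples. The sketch works on per-sample quantile summaries rather than on individual methylation values.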

Availability and implementation: https://github.com/perishky/meffil/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Methylation*
  • Datasets as Topic
  • Epigenomics*
  • Oligonucleotide Array Sequence Analysis