Genome Biol. 2014 Feb 24;15(2):R38.
doi: 10.1186/gb-2014-15-2-r38.

MOABS: model based analysis of bisulfite sequencing data

Deqiang Sun et al. Genome Biol. 2014.

Abstract

Bisulfite sequencing (BS-seq) is the gold standard for studying genome-wide DNA methylation. We developed MOABS to increase the speed, accuracy, statistical power and biological relevance of BS-seq data analysis. MOABS detects differential methylation with 10-fold coverage at single-CpG resolution based on a Beta-Binomial hierarchical model and is capable of processing two billion reads in 24 CPU hours. Here, using simulated and real BS-seq data, we demonstrate that MOABS outperforms other leading algorithms, such as Fisher's exact test and BSmooth. Furthermore, MOABS analysis can be easily extended to differential 5hmC analysis using RRBS and oxBS-seq. MOABS is available at http://code.google.com/p/moabs/.

Figures

Figure 1
Overview of the MOABS algorithm. (a) Posterior distribution of the methylation ratio inferred from biological replicates. Each curve represents the inferred Beta distribution of the methylation ratio of one CpG. The symbols at the bottom indicate the observed methylation ratios of all replicates. The values in the top right corner indicate the number of methylated reads over the total number of reads in each replicate. (b) An example of Credible Methylation Difference (CDIF). Dashed curves indicate the Beta distributions of the methylation ratio inferred at low (Sample #1) or high (Sample #2) sequencing depth. The black curve is the exact distribution of the methylation difference between the two samples. The CDIF is shown as the lower bound of the 95% confidence interval. (c) Ranking of three CpG examples by CDIF, Fisher's exact test p-value (FETP), and nominal difference, i.e. direct subtraction of the two methylation ratios. The three curves are the exact distributions of the methylation differences. The corresponding CDIF values are shown as vertical dashed lines.
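The idea in panels (a) and (b) can be illustrated with a minimal Python sketch. It is not the MOABS implementation: it assumes a flat Beta(1, 1) prior for each sample instead of a prior estimated from the data, and it approximates the distribution of the difference by Monte Carlo sampling instead of computing it exactly. The function names, and the rule of returning 0 when the interval spans zero and the upper bound when the difference is negative, are assumptions made for illustration.

```python
import random


def posterior_params(methylated, total, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters of a methylation ratio given read counts.

    ASSUMPTION: a flat Beta(1, 1) prior; MOABS's own prior is not reproduced.
    """
    return prior_a + methylated, prior_b + (total - methylated)


def credible_difference(counts1, counts2, level=0.95, draws=20000, seed=0):
    """Monte Carlo stand-in for a credible methylation difference.

    counts1 and counts2 are (methylated_reads, total_reads) for each sample.
    Returns the bound of the central interval of (ratio2 - ratio1) that lies
    closest to zero, or 0.0 when the interval contains zero (ASSUMPTION).
    """
    rng = random.Random(seed)
    a1, b1 = posterior_params(*counts1)
    a2, b2 = posterior_params(*counts2)
    diffs = sorted(
        rng.betavariate(a2, b2) - rng.betavariate(a1, b1) for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * draws)]
    upper = diffs[int((1.0 - tail) * draws) - 1]
    if lower > 0:
        return lower
    if upper < 0:
        return upper
    return 0.0


if __name__ == "__main__":
    # Same nominal difference (0.8) at low and high sequencing depth.
    print(credible_difference((2, 20), (18, 20)))
    print(credible_difference((20, 200), (180, 200)))
```

In this sketch the two calls share a nominal difference of 0.8, but the deeper sample pair yields the larger value because its posterior distributions are narrower, which is the behaviour the ranking in panel (c) relies on.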
Figure 2
Overview of the MOABS software pipeline. (a) Comprehensive workflow of the MOABS pipeline. (b) An example of a hypomethylated region. (c) A descriptive figure of the global methylation distribution of a mouse methylome. The left Y-axis is the percentage of CpGs and the right Y-axis is the average local CpG density at each specified methylation ratio.
Figure 3
Comparison between MOABS and FETP in detecting DMCs. We simulated 1,000,000 CpGs in two samples with predefined true positive or true negative states. The 900,000 true negative CpGs were initially assigned the same methylation ratio in both samples. The density of these methylation ratios fits a bimodal distribution (Additional file 3: Figure S1) frequently observed in real BS-seq data. The remaining 100,000 true positive CpGs were randomly assigned low ratios [0, 0.25] in one sample and high ratios [0.75, 1] in the other. Each methylation ratio was then given a ±0.05 fluctuation to simulate BS-seq errors. Sequencing depth was randomly sampled from 5-fold to 50-fold. The Y-axis shows the percentage of true DMCs predicted at 5% FDR.
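The simulation design described in this caption can be sketched in Python as follows. This is a scaled-down illustration under stated assumptions, not the authors' simulation code: the bimodal distribution is approximated by a mixture of two Beta distributions with made-up parameters (Figure S1 is not reproduced here), the ±0.05 fluctuation is drawn uniformly, sequencing depth is a uniform integer between 5 and 50, and read counts are binomial. The function name `simulate_cpgs` is hypothetical.

```python
import random


def simulate_cpgs(n=10000, frac_dmc=0.1, seed=0):
    """Simulate read counts for n CpGs in two samples.

    Returns a list of (is_dmc, (meth1, depth1), (meth2, depth2)) tuples.
    """
    rng = random.Random(seed)
    n_dmc = int(n * frac_dmc)

    def noisy(ratio):
        # ASSUMPTION: uniform fluctuation in [-0.05, 0.05], clipped to [0, 1].
        return min(1.0, max(0.0, ratio + rng.uniform(-0.05, 0.05)))

    def reads(ratio):
        # ASSUMPTION: integer depth uniform on 5..50, binomial methylated reads.
        depth = rng.randint(5, 50)
        methylated = sum(rng.random() < ratio for _ in range(depth))
        return methylated, depth

    rows = []
    for i in range(n):
        is_dmc = i < n_dmc
        if is_dmc:
            # True positive: low ratio in one sample, high ratio in the other.
            low, high = rng.uniform(0.0, 0.25), rng.uniform(0.75, 1.0)
            r1, r2 = (low, high) if rng.random() < 0.5 else (high, low)
        else:
            # True negative: identical ratio in both samples.
            # ASSUMPTION: hypothetical two-component Beta mixture as the
            # bimodal shape; the mixture weights and parameters are invented.
            if rng.random() < 0.3:
                r1 = r2 = rng.betavariate(0.5, 5.0)
            else:
                r1 = r2 = rng.betavariate(5.0, 0.5)
        rows.append((is_dmc, reads(noisy(r1)), reads(noisy(r2))))
    return rows


if __name__ == "__main__":
    for row in simulate_cpgs(n=20)[:5]:
        print(row)
```

Each simulated row can then be scored by any differential methylation statistic and the calls compared against the known `is_dmc` label.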
Figure 4
MOABS improves the detection of allele-specific DNA methylation. (a) The Y-axis shows the number of known DMRs recovered by three different methods. (b) Sensitivity (Y-axis) at 5% FDR at different sequencing depths (X-axis).
Figure 5
MOABS reveals differential methylation underlying TFBSs. (a) UCSC genome browser illustration of one TF binding site. The tracks from top to bottom are genomic positions, RefSeq Gene, HSC Methylation, ESC Methylation, and TFBS. For each CpG, an upward bar denotes the methylation ratio. (b) Distribution of the number of DMCs underlying TFBSs. The inset boxplot indicates the length distribution of TFBSs with 1–3 DMCs. (c) Number of differentially methylated TFBSs predicted by different methods at 5% FDR. (d) Running enrichment scores for TFBSs. All CpGs are ranked by each method. The score increases if the CpG is in a TFBS and decreases if not. Only 10,000 CpGs are sampled to make this plot, as indicated by the X-axis. P-values of the maximum enrichment score, determined from 10,000 random shuffles of TFBSs, were 1.4E-3, 1.6E-3, and 4E-3 for MOABS, FETP, and BSmooth, respectively. (e) and (f) Same as (c) and (d), at 4X sequencing depth obtained by random sampling. P-values of the maximum enrichment score, determined from 10,000 random shuffles of TFBSs, were 2.9E-2, 5.1E-2, and 9.2E-2 for MOABS, FETP, and BSmooth, respectively.
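The running enrichment score in panels (d) and (f) can be sketched in Python along the following lines. The caption only states that the score goes up for a CpG inside a TFBS and down otherwise, so the step sizes used here (chosen so that the score returns to zero at the end of the ranking) are an assumption, as are the function names. The p-value is estimated by shuffling the in-TFBS labels along the ranking, which is a simplification of shuffling the TFBSs themselves.

```python
import random


def running_scores(in_tfbs):
    """Running enrichment score along a ranked list of CpGs.

    in_tfbs[i] is True when the CpG at rank i lies in a TFBS.
    ASSUMPTION: steps of +1/n_hit and -1/n_miss, so the final score is zero.
    """
    n_hit = sum(in_tfbs)
    n_miss = len(in_tfbs) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("need at least one CpG inside and one outside a TFBS")
    score, scores = 0.0, []
    for hit in in_tfbs:
        score += 1.0 / n_hit if hit else -1.0 / n_miss
        scores.append(score)
    return scores


def max_score_pvalue(in_tfbs, shuffles=1000, seed=0):
    """Permutation p-value of the maximum running score.

    ASSUMPTION: labels are shuffled along the ranking as a stand-in for
    shuffling TFBS positions on the genome.
    """
    rng = random.Random(seed)
    observed = max(running_scores(in_tfbs))
    labels = list(in_tfbs)
    at_least = 0
    for _ in range(shuffles):
        rng.shuffle(labels)
        if max(running_scores(labels)) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least + 1) / (shuffles + 1)


if __name__ == "__main__":
    ranked = [True] * 20 + [False] * 80  # all TFBS CpGs ranked first
    print(max(running_scores(ranked)), max_score_pvalue(ranked))
```

A ranking that places TFBS CpGs near the top produces a high early peak in the running score and therefore a small permutation p-value.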
Figure 6
MOABS detects differential 5hmC using RRBS and oxBS-seq. (a) Simulation study of 5hmC detection from oxBS-seq and RRBS. Each point on the curves represents the smallest number of reads (X-axis) needed to detect a 5hmC ratio (Y-axis) at the specified 5mC ratio (indicated by color). The thin and thick curves represent FETP and MOABS, respectively. (b) Beta distribution of the 5hmC ratio in each sample.
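The kind of read-depth question posed in panel (a) can be explored with a rough Python sketch. It relies on the standard interpretation that bisulfite data (here RRBS) report 5mC plus 5hmC while oxBS-seq reports 5mC only, so the 5hmC ratio is the difference between the two. Everything else is an assumption made for illustration and not the authors' procedure: a flat Beta(1, 1) prior, Monte Carlo sampling, expected (not randomly drawn) read counts, equal depth in both libraries, and "detected" meaning that the lower bound of the central 95% interval of the difference is above zero. The function names are hypothetical.

```python
import random


def hmc_lower_bound(bs_counts, oxbs_counts, level=0.95, draws=5000, seed=0):
    """Lower bound of the central interval of (BS ratio - oxBS ratio).

    Each argument is (methylated_reads, total_reads).
    ASSUMPTION: flat Beta(1, 1) priors and Monte Carlo sampling.
    """
    rng = random.Random(seed)
    bs_m, bs_n = bs_counts
    ox_m, ox_n = oxbs_counts
    diffs = sorted(
        rng.betavariate(1 + bs_m, 1 + bs_n - bs_m)
        - rng.betavariate(1 + ox_m, 1 + ox_n - ox_m)
        for _ in range(draws)
    )
    return diffs[int((1.0 - level) / 2.0 * draws)]


def min_reads_to_detect(mc_ratio, hmc_ratio, max_reads=2000, step=5):
    """Smallest per-library depth at which the 5hmC signal is detected.

    ASSUMPTION: read counts are set to their expected values at each depth.
    Returns None if no depth up to max_reads is sufficient.
    """
    for n in range(step, max_reads + 1, step):
        bs = (round(n * (mc_ratio + hmc_ratio)), n)  # BS reads 5mC + 5hmC
        ox = (round(n * mc_ratio), n)                # oxBS reads 5mC only
        if hmc_lower_bound(bs, ox) > 0:
            return n
    return None


if __name__ == "__main__":
    print(min_reads_to_detect(0.2, 0.4))
    print(min_reads_to_detect(0.2, 0.1))
```

Under these assumptions a smaller 5hmC ratio requires more reads before the interval excludes zero, which matches the general shape one would expect from curves of this kind.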

