Nat Genet. 2017 Apr;49(4):635-642.
doi: 10.1038/ng.3805. Epub 2017 Mar 6.

Identification of Methylation Haplotype Blocks Aids in Deconvolution of Heterogeneous Tissue Samples and Tumor Tissue-Of-Origin Mapping From Plasma DNA

Free PMC article


Shicheng Guo et al. Nat Genet. 2017.

Abstract

Adjacent CpG sites in mammalian genomes can be co-methylated owing to the processivity of methyltransferases or demethylases, yet discordant methylation patterns have also been observed, which are related to stochastic or uncoordinated molecular processes. We focused on a systematic search and investigation of regions in the full human genome that show highly coordinated methylation. We defined 147,888 blocks of tightly coupled CpG sites, called methylation haplotype blocks, after analysis of 61 whole-genome bisulfite sequencing data sets and validation with 101 reduced-representation bisulfite sequencing data sets and 637 methylation array data sets. Using a metric called methylation haplotype load, we performed tissue-specific methylation analysis at the block level. Subsets of informative blocks were further identified for deconvolution of heterogeneous samples. Finally, using methylation haplotypes we demonstrated quantitative estimation of tumor load and tissue-of-origin mapping in the circulating cell-free DNA of 59 patients with lung or colorectal cancer.

Conflict of interest statement

Competing Financial interests

S. Guo, D. Diep and Ku. Zhang are listed as inventors in patent applications related to the methods disclosed in this manuscript. Ku. Zhang is a co-founder and scientific advisor of Singlera Genomics Inc.

Figures

Figure 1
Identification and characterization of human methylation haplotype blocks (MHBs). (a) Schematic overview of data generation and analysis. (b) An example of an MHB at the promoter of the gene APC. (c) Smooth scatterplots of methylation linkage disequilibrium within MHBs. Red indicates relatively high density and blue indicates relatively low density. The yellow dotted lines and percentages highlight the reduction of high linkage disequilibrium (r² > 0.9). (d) Co-localization of MHBs with known genomic features. (e) Enrichment of MHBs in known genomic features.
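The linkage disequilibrium in panel (c) can be illustrated with the standard r² statistic applied to methylation calls instead of alleles. The following is a minimal sketch, assuming each sequencing read is encoded as a string of "1" (methylated) and "0" (unmethylated) over the same ordered CpG sites; the function name and the encoding are illustrative and are not taken from the article.

```python
def methylation_r2(reads, i, j):
    """Return r^2 between CpG positions i and j, or None if undefined.

    reads: list of equal-length strings of '1'/'0' (hypothetical encoding).
    """
    n = len(reads)
    if n == 0:
        return None
    # Marginal methylation frequencies at each site.
    p_i = sum(r[i] == "1" for r in reads) / n
    p_j = sum(r[j] == "1" for r in reads) / n
    # Frequency of reads methylated at both sites.
    p_ij = sum(r[i] == "1" and r[j] == "1" for r in reads) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        # One of the sites is invariant, so r^2 is not defined.
        return None
    d = p_ij - p_i * p_j  # disequilibrium coefficient D
    return d * d / denom


coupled = ["1111", "0000", "1111", "0000"]
independent = ["10", "01", "11", "00"]
print(methylation_r2(coupled, 0, 3))      # perfectly coupled sites
print(methylation_r2(independent, 0, 1))  # statistically independent sites
```

Under this encoding, perfectly co-methylated sites give r² = 1 and independent sites give r² = 0, so a threshold such as r² > 0.9 selects tightly coupled CpG pairs.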
Figure 2
Comparison of methylation haplotype load (MHL) with four other metrics used in the literature. Five patterns of methylation haplotype combinations are used to illustrate the difference between methylation frequency, methylation entropy, epi-polymorphism and methylation haplotype load. MHL is the only metric that can discriminate all five patterns.
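To make the comparison concrete, the metrics can be sketched in a few lines of Python. This page does not give the formulas, so the definitions below are assumptions based on how these metrics are commonly described: methylation frequency is the fraction of methylated calls, entropy is the Shannon entropy of the haplotype distribution divided by the number of CpG sites, epi-polymorphism is one minus the sum of squared haplotype frequencies, and MHL is a length-weighted average (weight equal to the substring length) of the fraction of fully methylated substrings at each length. Reads are assumed to be equal-length strings of "1" and "0"; all function names are hypothetical.

```python
import math
from collections import Counter


def methylation_frequency(reads):
    # Fraction of methylated calls over all CpG sites and reads.
    calls = "".join(reads)
    return calls.count("1") / len(calls)


def haplotype_probs(reads):
    # Relative frequency of each distinct full-length haplotype.
    counts = Counter(reads)
    return [c / len(reads) for c in counts.values()]


def methylation_entropy(reads):
    # Shannon entropy of haplotypes, normalized by CpG count (assumed form).
    n_cpg = len(reads[0])
    return -sum(p * math.log2(p) for p in haplotype_probs(reads)) / n_cpg


def epipolymorphism(reads):
    # Probability that two randomly drawn reads differ in haplotype.
    return 1.0 - sum(p * p for p in haplotype_probs(reads))


def methylation_haplotype_load(reads):
    # Length-weighted fraction of fully methylated substrings (assumed form).
    n_cpg = len(reads[0])
    weighted = 0.0
    for length in range(1, n_cpg + 1):
        target = "1" * length
        total = 0
        full = 0
        for r in reads:
            for start in range(n_cpg - length + 1):
                total += 1
                full += r[start:start + length] == target
        weighted += length * full / total
    return weighted / sum(range(1, n_cpg + 1))


block_mix = ["1111", "1111", "0000", "0000"]     # coordinated methylation
checkerboard = ["1010", "1010", "0101", "0101"]  # discordant methylation
for name, reads in [("block_mix", block_mix), ("checkerboard", checkerboard)]:
    print(name,
          methylation_frequency(reads),
          methylation_entropy(reads),
          epipolymorphism(reads),
          methylation_haplotype_load(reads))
```

With these assumed definitions, the two toy patterns share the same methylation frequency (0.5), entropy (0.25) and epi-polymorphism (0.5), but their MHL differs (0.5 versus 0.05), because only MHL rewards long runs of consecutively methylated CpGs.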
Figure 3
Tissue clustering based on methylation haplotype load. (a) MHL-based unsupervised clustering of human tissues using the 15% most variable regions. (b) Supervised clustering of germ-layer-specific MHBs. (c) MHL exhibits a better signal-to-noise ratio than AMF and IMF for sample clustering.
Figure 4
Quantitative estimation of cancer DNA proportion in cell-free DNA based on MHL of informative MHBs. (a) Colorectal cancer. (b) Lung cancer. Informative MHBs were selected based on the presence of high MHL in cancer solid tissues (CT) and the absence of MHL in whole blood (WB). Group II regions have high MHL in cancer tissues (MHL > 0.5) and cancer plasma but low MHL in WB and normal tissues (MHL < 0.1), and hence were selected for further analysis. Bar plots show average MHL in different groups of samples. MHL in cancer plasma (CCP and LCP) and normal plasma (NP) was compared with a two-tailed t-test. NCT denotes normal colon tissues, NLT denotes normal lung tissues, and ONT denotes other normal tissues. (c) MHL has a higher signal-to-noise ratio (mean/SD) than individual 5mC levels as tumor fraction decreases. The x-axis is the tumor fraction in synthetic mixtures. (d) Estimation of the cancer DNA proportions in plasma samples. CCP denotes colorectal cancer plasma, LCP denotes lung cancer plasma, and NP denotes normal plasma.
Figure 5
MHL-based prediction of cancer tissue-of-origin from plasma DNA. (a) Detection of tissue-specific MHL in the plasma of cancer patients, but not in normal plasma or whole blood. Tissue-specific MHL was visible in the corresponding tissue and in cancer plasma, indicating the feasibility of tissue-of-origin mapping. (b) Identification of informative MHBs for tissue prediction, using training data that included WGBS and RRBS data sets from 10 normal human tissues. (c) Application of the prediction model to plasma samples from cancer patients and normal individuals.

