High-dimensional mediation analysis in survival models

PLoS Comput Biol. 2020 Apr 17;16(4):e1007768. doi: 10.1371/journal.pcbi.1007768. eCollection 2020 Apr.


Mediation analysis with high-dimensional DNA methylation markers is important for identifying epigenetic pathways between environmental exposures and health outcomes. Several methods have been developed for mediation analysis with high-dimensional mediators. However, high-dimensional mediation analysis methods for time-to-event outcome data have yet to be developed. To address this gap, we propose a new high-dimensional mediation analysis procedure for survival models that incorporates sure independence screening and minimax concave penalty techniques for variable selection, with the Sobel and joint methods for significance testing of the indirect effect. Simulation studies show that the proposed procedure performs well in identifying the correct biomarkers and controlling the false discovery rate, with minimal estimation bias. We also apply this approach to study the causal pathway from smoking to overall survival among lung cancer patients, potentially mediated by 365,307 DNA methylation sites, in the TCGA lung cancer cohort. Mediation analysis using a Cox proportional hazards model estimates that a serious smoking history increases the hazard of death among lung cancer patients through methylation markers including cg21926276, cg27042065, and cg26387355, with significant hazard ratios of 1.2497 (95% CI: 1.1121, 1.4045), 1.0920 (95% CI: 1.0170, 1.1726), and 1.1489 (95% CI: 1.0518, 1.2550), respectively. The three methylation sites are located in three genes that have been shown to be associated with lung cancer events or overall survival. However, the three CpG sites themselves (cg21926276, cg27042065, and cg26387355) have not previously been reported and are newly identified here as potential epigenetic markers linking smoking and survival of lung cancer patients. Collectively, the proposed high-dimensional mediation analysis procedure performs well in mediator selection and indirect effect estimation.
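The two significance tests named in the abstract can be illustrated for a single candidate mediator. The following is a minimal sketch, not the authors' implementation: it assumes `alpha` is the estimated exposure-to-mediator coefficient and `beta` is the estimated mediator-to-log-hazard coefficient from a Cox model, each with a standard error, and it uses the standard first-order Sobel standard error and the max-p joint significance rule. All function names and the numeric inputs are hypothetical, and the screening, penalized selection, and multiple-testing steps are omitted.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def two_sided_p(estimate, se):
    """Two-sided Wald p-value for a single coefficient."""
    return 2.0 * normal_sf(abs(estimate / se))


def sobel_test(alpha, se_alpha, beta, se_beta):
    """Sobel test for the indirect effect alpha * beta.

    Uses the first-order delta-method standard error
    sqrt(alpha^2 * se_beta^2 + beta^2 * se_alpha^2).
    """
    indirect = alpha * beta
    se_indirect = math.sqrt(alpha ** 2 * se_beta ** 2 + beta ** 2 * se_alpha ** 2)
    p_value = 2.0 * normal_sf(abs(indirect / se_indirect))
    return indirect, se_indirect, p_value


def joint_test(alpha, se_alpha, beta, se_beta):
    """Joint significance test: the larger of the two path p-values."""
    return max(two_sided_p(alpha, se_alpha), two_sided_p(beta, se_beta))


# Hypothetical estimates for one mediator (illustrative values only).
alpha_hat, se_alpha_hat = 0.5, 0.1   # exposure -> mediator
beta_hat, se_beta_hat = 0.4, 0.1     # mediator -> log hazard

indirect, se_indirect, p_sobel = sobel_test(alpha_hat, se_alpha_hat, beta_hat, se_beta_hat)
p_joint = joint_test(alpha_hat, se_alpha_hat, beta_hat, se_beta_hat)

# On the log-hazard scale the indirect effect exponentiates to a hazard ratio.
hazard_ratio = math.exp(indirect)

print("indirect effect:", indirect)
print("Sobel p-value:", p_sobel)
print("joint p-value:", p_joint)
print("indirect hazard ratio:", hazard_ratio)
```

In a high-dimensional setting these p-values would be computed only for the mediators retained after screening and penalized selection, and then adjusted for multiple testing.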

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Computational Biology / methods*
  • DNA Methylation / genetics
  • Epigenomics
  • Humans
  • Lung Neoplasms / genetics
  • Lung Neoplasms / mortality
  • Middle Aged
  • Models, Statistical*
  • Smoking / genetics
  • Smoking / mortality
  • Survival Analysis*

Grant support

ZY received the following three grants: 2016YFC0902403 (Yu) from the Chinese Ministry of Science and Technology (http://www.most.gov.cn/eng/eng/index.htm), 11671256 (Yu) from the National Natural Science Foundation of China (http://www.nsfc.gov.cn/english/site_1/), and Yu (2017) from the University of Michigan and Shanghai Jiao Tong University Collaboration Grant (https://kejichu.sjtu.edu.cn/, no funding ID). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.