A new efficient method to detect genetic interactions for lung cancer GWAS

BMC Med Genomics. 2020 Oct 30;13(1):162. doi: 10.1186/s12920-020-00807-9.

Abstract

Background: Genome-wide association studies (GWAS) have proven successful in predicting genetic risk of disease using single-locus models; however, identifying single nucleotide polymorphism (SNP) interactions at the genome-wide scale remains limited by computational and statistical challenges. We addressed the computational burden encountered when detecting SNP interactions in survival analysis, where the outcome is a time-to-event measure such as age of disease onset. To this end, we developed a novel algorithm, the Efficient Survival Multifactor Dimensionality Reduction (ES-MDR) method, which uses martingale residuals as the outcome variable to represent survival outcomes and applies the Quantitative Multifactor Dimensionality Reduction method to identify significant interactions associated with age of disease onset.
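The two-stage idea described above can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a null Cox model with no covariates, so each martingale residual is the event indicator minus the Nelson-Aalen cumulative hazard at that subject's time. It also assumes the usual quantitative-MDR scoring: genotype cells whose mean residual exceeds the overall mean are labelled high risk, and the two groups are compared with a t statistic (Welch form here, as an assumption). The function names and toy data are hypothetical.

```python
import numpy as np

def martingale_residuals(time, event):
    """Null-model martingale residuals: event_i - H0(time_i).

    H0 is the Nelson-Aalen cumulative hazard (assumption: no covariates).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    uniq = np.unique(time[event == 1])                 # sorted distinct event times
    d = np.array([event[time == t].sum() for t in uniq], dtype=float)  # events at t
    n = np.array([(time >= t).sum() for t in uniq], dtype=float)       # at risk at t
    cumhaz = np.concatenate([[0.0], np.cumsum(d / n)])
    # Number of event times <= time_i indexes the cumulative hazard.
    idx = np.searchsorted(uniq, time, side="right")
    return event - cumhaz[idx]

def qmdr_score(snp_a, snp_b, resid):
    """Label two-locus genotype cells high/low risk and return (t statistic, labels)."""
    snp_a, snp_b = np.asarray(snp_a), np.asarray(snp_b)
    resid = np.asarray(resid, dtype=float)
    cell = 3 * snp_a + snp_b                           # genotypes coded 0/1/2 -> 9 cells
    overall = resid.mean()
    high = np.zeros(resid.shape[0], dtype=bool)
    for c in np.unique(cell):
        mask = cell == c
        if resid[mask].mean() > overall:               # cell mean above overall mean
            high[mask] = True
    hi, lo = resid[high], resid[~high]
    if len(hi) < 2 or len(lo) < 2:
        return 0.0, high                               # degenerate split, no score
    se = np.sqrt(hi.var(ddof=1) / len(hi) + lo.var(ddof=1) / len(lo))
    return (hi.mean() - lo.mean()) / se, high

# Toy data, made up purely for illustration (age at onset or censoring).
age = [5, 3, 8, 2, 9, 4, 7, 6]
onset = [1, 1, 0, 1, 0, 1, 1, 0]
snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]

resid = martingale_residuals(age, onset)
t_stat, high = qmdr_score(snp_a, snp_b, resid)
print(resid.round(3), round(float(t_stat), 3), high)
```

Because the residuals are computed once and then reused for every SNP combination, no survival model has to be refitted inside the search loop, which is consistent with the reduced workload the abstract reports.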

Methods: To demonstrate efficacy, we evaluated the method on two simulated data sets to estimate its type I error rate and power. Simulations showed that ES-MDR identified interactions with a lower computational workload and allowed adjustment for covariates. We applied ES-MDR to the OncoArray-TRICL Consortium data, comprising 14,935 lung cancer cases and 12,787 controls (SNPs = 108,254), searching over all two-way interactions to identify genetic interactions associated with lung cancer age of onset. We tested the best model in an independent data set from the OncoArray-TRICL data.
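To give a sense of scale for the exhaustive two-way search, the number of unordered SNP pairs follows directly from the SNP count stated above. This short calculation is illustrative arithmetic only; the variable names are hypothetical.

```python
from math import comb

n_snps = 108_254                  # SNP count reported in the abstract
n_pairs = comb(n_snps, 2)         # unordered two-way combinations
print(f"{n_pairs:,}")             # 5,859,410,131
```

Roughly 5.9 billion candidate pairs must each be scored, which is why the per-pair cost dominates the running time of a search like this.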

Results: Our analysis of the OncoArray-TRICL data identified many one-way and two-way models, with a single-base deletion in the noncoding region of BRCA1 (HR 1.24, P = 3.15 × 10⁻¹⁵) as the top marker for predicting age of lung cancer onset.

Conclusions: Based on extensive simulations and the analysis of a large GWAS, we demonstrated that our method is an efficient algorithm that identifies genetic interactions for inclusion in models that predict survival outcomes.

Keywords: Genetic interactions; Genome-wide association study; Lung cancer; Machine learning.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Biomarkers, Tumor / genetics*
  • Biomarkers, Tumor / metabolism
  • Case-Control Studies
  • Computational Biology / methods*
  • Female
  • Gene Expression Regulation, Neoplastic*
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study*
  • Genotype
  • Humans
  • Lung Neoplasms / genetics
  • Lung Neoplasms / metabolism
  • Lung Neoplasms / mortality*
  • Lung Neoplasms / pathology
  • Male
  • Middle Aged
  • Multifactor Dimensionality Reduction
  • Polymorphism, Single Nucleotide*
  • Prognosis
  • Survival Rate
  • Young Adult

Substances

  • Biomarkers, Tumor