A Novel Approach to Detecting Epistasis using Random Sampling Regularisation

IEEE/ACM Trans Comput Biol Bioinform. 2020 Sep-Oct;17(5):1535-1545. doi: 10.1109/TCBB.2019.2948330. Epub 2019 Oct 21.

Abstract

Epistasis analysis is a progressive approach that complements the 'common disease, common variant' hypothesis by highlighting the potential for connected networks of genetic variants to collaborate in producing a phenotypic expression. It is commonly performed either pairwise (variant vs variant) or at unrestricted arity, where variant networks are treated as high-order interactions. This type of analysis increases the number of tests beyond those performed in a standard approach such as a Genome-Wide Association Study (GWAS), in which the False Discovery Rate (FDR) is already an issue; multiplying the number of tests, up to a factorial rate, therefore worsens the FDR problem. Furthermore, epistasis analysis introduces its own limitations of computational complexity and intensity, which depend on the analysis performed; in the most intensive case, a multivariate analysis has a time complexity of O(n!). This paper proposes a novel methodology for the detection of epistasis that uses interpretable methods and best practice to outline interactions through filtering processes. The methodology uses Random Sampling Regularisation, which randomly splits the data to produce sample sets and applies a voting system across them to regularise the significance and reliability of biological markers (SNPs). Preliminary results are promising, outlining a concise detection of interactions. In the classification of breast cancer patients, the detection of epistasis indicated eight risk candidate interactions from five variants and a single candidate variant with a high protective association.
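The resample-and-vote idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the statistical test, the sampling fraction, or the vote threshold, so everything here is an assumption made for illustration. The names `chi2_2x2` and `vote_interactions`, the binary carrier coding of SNPs, the use of a Pearson chi-square test on the joint-carrier flag, and the parameters `n_rounds`, `frac`, and `min_votes` are all hypothetical.

```python
import random
from itertools import combinations, product

CHI2_CRIT_1DF = 3.841  # chi-square critical value for p < 0.05, 1 degree of freedom


def chi2_2x2(flags, labels):
    """Pearson chi-square statistic for two binary sequences (2x2 table)."""
    a = sum(1 for f, y in zip(flags, labels) if f and y)
    b = sum(1 for f, y in zip(flags, labels) if f and not y)
    c = sum(1 for f, y in zip(flags, labels) if not f and y)
    d = sum(1 for f, y in zip(flags, labels) if not f and not y)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # degenerate table: no evidence either way
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def vote_interactions(genotypes, labels, n_rounds=50, frac=0.5,
                      min_votes=0.8, seed=0):
    """Return SNP pairs flagged as significant in at least `min_votes`
    (a fraction) of random subsamples, mapped to their vote fraction.

    Assumed inputs: genotypes[i][j] is 1 if sample i carries the variant
    at SNP j, else 0; labels[i] is 1 for a case and 0 for a control.
    """
    rng = random.Random(seed)
    n_samples = len(labels)
    pairs = list(combinations(range(len(genotypes[0])), 2))
    votes = {pair: 0 for pair in pairs}
    for _ in range(n_rounds):
        # Draw a random subsample without replacement.
        idx = rng.sample(range(n_samples), int(frac * n_samples))
        sub_labels = [labels[i] for i in idx]
        for j, k in pairs:
            # Joint-carrier flag stands in for the interaction term.
            flags = [bool(genotypes[i][j] and genotypes[i][k]) for i in idx]
            if chi2_2x2(flags, sub_labels) > CHI2_CRIT_1DF:
                votes[(j, k)] += 1
    fractions = {pair: v / n_rounds for pair, v in votes.items()}
    return {pair: f for pair, f in fractions.items() if f >= min_votes}


# Synthetic data: every combination of 4 binary SNPs, repeated 10 times;
# a sample is a case only when it carries both SNP 0 and SNP 1.
genotypes = [list(g) for g in product([0, 1], repeat=4)] * 10
labels = [g[0] & g[1] for g in genotypes]
result = vote_interactions(genotypes, labels)
print(sorted(result))
```

On this synthetic data the pair (0, 1) is retained, while the unrelated pair (2, 3) is not. Pairs that share one of the causal SNPs can still collect votes, because the joint-carrier flag does not separate main effects from interaction; a model with an explicit interaction term (for example logistic regression, which appears among the MeSH terms below) would be a more suitable test in practice.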

MeSH terms

  • Artificial Intelligence
  • Breast Neoplasms / genetics
  • Epistasis, Genetic / genetics*
  • Female
  • Genetic Markers / genetics
  • Genome-Wide Association Study
  • Genomics / methods*
  • Humans
  • Logistic Models
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics*

Substances

  • Genetic Markers