Complex-Trait Prediction in the Era of Big Data

Trends Genet. 2018 Oct;34(10):746-754. doi: 10.1016/j.tig.2018.07.004. Epub 2018 Aug 20.


Accurate prediction of complex traits requires a large number of DNA variants. Advances in statistical and machine learning methodology enable the identification of complex patterns in high-dimensional settings, but training these highly parameterized methods requires very large data sets. Until recently, such data sets were not available. The situation is changing rapidly as very large biomedical data sets comprising individual-level genotype-phenotype data for hundreds of thousands of individuals become available in the public and private domains. We argue that the convergence of methodological advances and the advent of Big Genomic Data will enable unprecedented improvements in complex-trait prediction. We review theory and evidence supporting this claim and discuss the challenges and opportunities that Big Data will bring to complex-trait prediction.
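
The dependence of prediction accuracy on training-set size can be illustrated with a small simulation. The sketch below is not taken from the review. It assumes independent SNPs, purely additive effects, a heritability of 0.5, and a ridge-regression predictor; the function name simulate_prediction_r2 and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_prediction_r2(n_train, n_test=1000, n_snps=200, h2=0.5, seed=0):
    """Return the squared correlation between ridge predictions and
    phenotypes in held-out samples, for one simulated data set."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test

    # Assumption: independent SNPs with allele frequencies in [0.05, 0.5].
    freq = rng.uniform(0.05, 0.5, size=n_snps)
    genotypes = rng.binomial(2, freq, size=(n, n_snps)).astype(float)
    # Standardize each SNP (done on all samples for simplicity).
    X = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)

    # Assumption: every SNP has a small additive effect, so that the
    # genetic variance is approximately h2 and the total variance about 1.
    beta = rng.normal(0.0, np.sqrt(h2 / n_snps), size=n_snps)
    y = X @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)

    X_train, y_train = X[:n_train], y[:n_train]
    X_test, y_test = X[n_train:], y[n_train:]

    # Ridge regression; the penalty is the ratio of residual variance
    # to per-SNP effect variance used in the simulation.
    penalty = n_snps * (1.0 - h2) / h2
    lhs = X_train.T @ X_train + penalty * np.eye(n_snps)
    rhs = X_train.T @ (y_train - y_train.mean())
    beta_hat = np.linalg.solve(lhs, rhs)

    r = np.corrcoef(X_test @ beta_hat, y_test)[0, 1]
    return r ** 2


if __name__ == "__main__":
    for n_train in (100, 500, 2000):
        print(n_train, round(simulate_prediction_r2(n_train), 3))
```

Under these assumptions, the held-out squared correlation is expected to rise with the training-set size while staying, on average, below the simulated heritability. Real genomic data involve linkage disequilibrium, far more variants, and non-additive effects, none of which this sketch attempts to model.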

Keywords: Big Data; GWAS; SNP; complex traits; disease risk; prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Big Data*
  • Genome-Wide Association Study / trends*
  • Genomics
  • Genotype
  • Humans
  • Models, Genetic
  • Multifactorial Inheritance / genetics*
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci / genetics*