Risk estimation and risk prediction using machine-learning methods

Hum Genet. 2012 Oct;131(10):1639-54. doi: 10.1007/s00439-012-1194-y. Epub 2012 Jul 3.


Once an association between genetic variants and a phenotype has been established, further study goals include classifying patients according to disease risk or estimating disease probability. These goals call for different statistical methods, and machine-learning approaches in particular may offer advantages over classical techniques. In this paper, we describe methods for constructing and evaluating classification and probability estimation rules. We review the use of machine-learning approaches in this context and explain some of the algorithms in detail. Finally, we illustrate the methodology by applying it to a genome-wide association analysis of rheumatoid arthritis.
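
The distinction between a classification rule and a probability estimation rule can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are invented, the 0.5 threshold is arbitrary, and the abstract does not name specific evaluation measures, so the misclassification rate and the Brier score are used only as common choices.

```python
# Hypothetical example data (not from the paper): estimated disease
# probabilities for six individuals and their observed status (1 = affected).
estimated_prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1]
observed = [1, 1, 0, 0, 0, 1]


def classify(probabilities, threshold=0.5):
    """Turn probability estimates into class labels (classification rule)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def misclassification_rate(predicted, observed):
    """Fraction of individuals whose predicted class differs from the observed one."""
    errors = sum(1 for p, y in zip(predicted, observed) if p != y)
    return errors / len(observed)


def brier_score(probabilities, observed):
    """Mean squared difference between estimated probability and observed status."""
    squared = [(p - y) ** 2 for p, y in zip(probabilities, observed)]
    return sum(squared) / len(observed)


# Classification evaluates only the predicted labels ...
predicted_class = classify(estimated_prob)
error_rate = misclassification_rate(predicted_class, observed)

# ... whereas probability estimation evaluates the estimates themselves.
brier = brier_score(estimated_prob, observed)

print(predicted_class, error_rate, brier)
```

The classification rule discards how confident each estimate was, whereas the Brier score penalizes an estimate of 0.1 for an affected individual far more heavily than an estimate of 0.4 for an unaffected one. This is one reason the two tasks are evaluated differently.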

Publication types

  • Review

MeSH terms

  • Algorithms
  • Arthritis, Rheumatoid / genetics
  • Artificial Intelligence*
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Probability
  • Reproducibility of Results
  • Risk Assessment