Bayesian additive regression trees and the General BART model

Stat Med. 2019 Nov 10;38(25):5048-5069. doi: 10.1002/sim.8347. Epub 2019 Aug 28.

Abstract

Bayesian additive regression trees (BART) is a flexible prediction model/machine learning approach that has gained widespread popularity in recent years. As BART becomes more mainstream, there is an increased need for a paper that walks readers through the details of BART, from what it is to why it works. This tutorial is aimed at providing such a resource. In addition to explaining the different components of BART using simple examples, we also discuss a framework, the General BART model that unifies some of the recent BART extensions, including semiparametric models, correlated outcomes, and statistical matching problems in surveys, and models with weaker distributional assumptions. By showing how these models fit into a single framework, we hope to demonstrate a simple way of applying BART to research problems that go beyond the original independent continuous or binary outcomes framework.

Keywords: Bayesian nonparametrics; Dirichlet process mixtures; machine learning; semiparametric models; spatial.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem*
  • Humans
  • Machine Learning*
  • Regression Analysis*