Adapting Random Forests to Predict Obesity-Associated Gene Expression

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul:2022:4407-4410. doi: 10.1109/EMBC48229.2022.9871234.

Abstract

Random forests (RFs) are effective at predicting gene expression from genotype data. However, the comparison of RF regressors and classifiers, including the effects of feature selection and encoding, remains under-explored in the context of gene expression prediction. Specifically, we examine the role of ordinal versus one-hot encoding and of data balancing via oversampling in the prediction of obesity-associated gene expression. Our work shows that RFs compete with PrediXcan in predicting obesity-associated gene expression in subcutaneous adipose tissue, a tissue highly relevant to obesity. Additionally, RFs generate predictions for obesity-associated genes for which PrediXcan fails to do so.
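
The two preprocessing choices named above, genotype encoding and class balancing by oversampling, can be illustrated with a minimal sketch. The abstract does not specify the exact scheme, so the following are assumptions made for illustration only: genotypes are coded as minor-allele counts (0, 1, 2), balancing is done by plain random oversampling of the minority class, and the function names, toy data, and "low"/"high" expression labels are hypothetical.

```python
import random
from collections import Counter


def ordinal_encode(genotypes):
    """Keep each SNP as a single column holding the allele count (0, 1, 2)."""
    return [list(row) for row in genotypes]


def one_hot_encode(genotypes, levels=(0, 1, 2)):
    """Expand each SNP into one indicator column per genotype level."""
    return [[int(g == level) for g in row for level in levels]
            for row in genotypes]


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(idx)
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


# Toy data (hypothetical): 5 individuals x 2 SNPs, with binned expression labels.
genotypes = [[0, 2], [1, 1], [2, 0], [0, 0], [1, 2]]
labels = ["low", "low", "low", "low", "high"]

X_ordinal = ordinal_encode(genotypes)   # 2 columns per individual
X_onehot = one_hot_encode(genotypes)    # 6 columns per individual
X_bal, y_bal = random_oversample(X_onehot, labels)
```

Either encoded matrix could then be passed to an RF implementation, with the balanced version being relevant mainly to the classifier setting. Ordinal encoding keeps the ordering of genotype levels in a single feature, whereas one-hot encoding lets the trees split on each genotype level separately at the cost of three times as many features.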

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Gene Expression
  • Humans
  • Obesity* / genetics