Phenotyping through Semi-Supervised Tensor Factorization (PSST)

AMIA Annu Symp Proc. 2018 Dec 5;2018:564-573. eCollection 2018.


A computational phenotype is a set of clinically relevant and interesting characteristics that describe patients with a given condition. Various machine learning methods have been proposed to derive phenotypes in an automatic, high-throughput manner. Among these methods, computational phenotyping through tensor factorization has been shown to produce clinically interesting phenotypes. However, few of these methods incorporate auxiliary patient information into the phenotype derivation process. In this work, we introduce Phenotyping through Semi-Supervised Tensor Factorization (PSST), a method that leverages disease status knowledge about subsets of patients to generate computational phenotypes from tensors constructed from the electronic health records of patients. We demonstrate the potential of PSST to uncover predictive and clinically interesting computational phenotypes through case studies focusing on type-2 diabetes and resistant hypertension. PSST yields more discriminative phenotypes compared to the unsupervised methods and more meaningful phenotypes compared to a supervised method.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Computational Biology*
  • Diabetes Mellitus, Type 2 / diagnosis
  • Electronic Health Records
  • Humans
  • Hypertension / diagnosis
  • Machine Learning
  • Phenotype*