Unsupervised Bayesian Inference to Fuse Biosignal Sensory Estimates for Personalizing Care

IEEE J Biomed Health Inform. 2019 Jan;23(1):47-58. doi: 10.1109/JBHI.2018.2820054. Epub 2018 Jun 5.

Abstract

The role of sensing technologies, such as wearables, in delivering precision care is becoming widely accepted. Given the very large quantities of sensor data that rapidly accumulate, there is a need to employ automated algorithms to label biosignal sensor data. In many real-life clinical applications, no expert labels are available, and algorithms for processing sensor data must be relied upon without access to the "ground truth." It is therefore extremely difficult to choose which algorithms to trust or discard at any point in time, because different algorithms may be optimal for different patients, or even at different points in time for the same patient. We propose two fully Bayesian approaches for fusing labels from independent and potentially correlated annotators (i.e., algorithms or, where available, experts). These are generative models that aggregate labels (i.e., the outputs of the algorithms, such as identified ECG morphology) in an unsupervised manner, jointly estimating the assumed bias and precision of each algorithm without access to the ground truth. The fused estimate may then be used to infer the underlying ground truth. For the first time in the biomedical context, we show that modeling correlations between annotators, and fusing information concerning task difficulty (such as the estimated quality of the sensor data), improve these estimates relative to strategies commonly employed in the literature. We also adopt a strongly Bayesian approach to inference, using Gibbs sampling, to improve estimates over the existing state of the art. We present results from applying the proposed pair of models to simulated data and to two publicly available biomedical datasets, to demonstrate proof-of-principle. We show that our proposed models outperform all existing approaches recreated from the literature. We also show that the proposed methods are robust when dealing with missing values (as often occur in real-life biomedical applications), and that they are suitably efficient for use in real-time applications, thereby providing the basis for the reliable use of sensors for personalizing the care of the individual.
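
The general idea of unsupervised label fusion with per-annotator bias and precision can be illustrated with a small Gibbs sampler. The sketch below is not the model from the paper: it assumes a simplified toy model with independent annotators, continuous labels, Gaussian noise, and no correlation or task-difficulty terms. The function name `gibbs_fuse`, the hyperparameters (`mu0`, `tau0`, `tau_b`, `a`, `b`), and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def gibbs_fuse(y, n_iter=1500, burn_in=500, mu0=0.0, tau0=1.0,
               tau_b=1.0, a=1.0, b=1.0, seed=0):
    """Fuse continuous labels y (items x annotators, NaN = missing).

    Assumed toy model (an illustration, not the paper's model):
        y[i, j] ~ Normal(z[i] + bias[j], 1 / prec[j])
        z[i]    ~ Normal(mu0, 1 / tau0)
        bias[j] ~ Normal(0, 1 / tau_b)
        prec[j] ~ Gamma(shape=a, rate=b)
    Returns posterior means of z, bias and prec.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    w = (~np.isnan(y)).astype(float)      # 1.0 where a label is present
    y0 = np.where(w > 0, y, 0.0)          # missing entries zeroed, always masked by w
    n_items, n_ann = y.shape
    n_obs = w.sum(axis=0)                 # number of labels per annotator

    bias = np.zeros(n_ann)
    prec = np.ones(n_ann)
    z_sum = np.zeros(n_items)
    bias_sum = np.zeros(n_ann)
    prec_sum = np.zeros(n_ann)

    for it in range(n_iter):
        # z | bias, prec: precision-weighted average of de-biased labels
        z_prec = tau0 + w @ prec
        z_mean = (tau0 * mu0 + (w * (y0 - bias)) @ prec) / z_prec
        z = z_mean + rng.standard_normal(n_items) / np.sqrt(z_prec)

        # bias | z, prec: shrunken mean residual of each annotator
        resid_sum = (w * (y0 - z[:, None])).sum(axis=0)
        b_prec = tau_b + prec * n_obs
        bias = prec * resid_sum / b_prec \
            + rng.standard_normal(n_ann) / np.sqrt(b_prec)

        # prec | z, bias: conjugate Gamma update (NumPy uses scale = 1 / rate)
        sq = (w * (y0 - z[:, None] - bias) ** 2).sum(axis=0)
        prec = rng.gamma(shape=a + 0.5 * n_obs, scale=1.0 / (b + 0.5 * sq))

        if it >= burn_in:                 # accumulate post burn-in samples
            z_sum += z
            bias_sum += bias
            prec_sum += prec

    kept = n_iter - burn_in
    return z_sum / kept, bias_sum / kept, prec_sum / kept


# Simulated example: three annotators of decreasing quality, 20% missing labels.
sim = np.random.default_rng(1)
truth = sim.normal(0.0, 1.0, size=200)
true_bias = np.array([0.3, -0.2, 0.5])
true_sd = np.array([0.2, 0.5, 2.0])
y = truth[:, None] + true_bias + sim.normal(size=(200, 3)) * true_sd
y[sim.random(y.shape) < 0.2] = np.nan

z_hat, bias_hat, prec_hat = gibbs_fuse(y)
print("estimated bias:", np.round(bias_hat, 2))
print("estimated precision:", np.round(prec_hat, 2))
```

In this toy setting the fused estimate `z_hat` weights each annotator by its inferred precision, so a noisy annotator contributes little. Note that a common offset shared by all annotators cannot be separated from the latent signal by the data alone; here it is pinned down only by the assumed priors on `z` and `bias`.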

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Algorithms
  • Bayes Theorem*
  • Child
  • Child, Preschool
  • Databases, Factual
  • Female
  • Humans
  • Infant
  • Male
  • Medical Informatics / methods*
  • Middle Aged
  • Models, Statistical
  • Precision Medicine / methods*
  • Unsupervised Machine Learning*
  • Young Adult