A hidden Markov model approach to analyze longitudinal ternary outcomes when some observed states are possibly misclassified

Stat Med. 2016 Apr 30;35(9):1549-57. doi: 10.1002/sim.6861. Epub 2016 Jan 18.


Understanding the dynamics of a disease process is vital for early detection, diagnosis, and measurement of progression. Continuous-time Markov chain (CTMC) methods have been used to estimate state-change intensities, but challenges arise when stages are potentially misclassified. We present an analytical likelihood approach in which the hidden state is modeled as a three-state CTMC, allowing some observed states to be misclassified. Covariate effects on the hidden process and misclassification probabilities of the hidden state are estimated without a 'gold standard' for comparison. Parameter estimates are obtained using a modified expectation-maximization (EM) algorithm, and identifiability of CTMC estimation is addressed. Simulation studies and an application studying stress levels in caregivers of patients with Alzheimer's disease are presented. The method was highly sensitive in detecting true misclassification and did not falsely identify error in the absence of misclassification. In conclusion, we have developed a robust longitudinal method for analyzing categorical outcome data when classification of disease severity stage is uncertain and the purpose is to study the process's transition behavior without a gold standard.
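To make the model structure concrete, the following is a minimal sketch of how a likelihood for a three-state hidden CTMC with misclassified observations could be evaluated with the standard forward recursion. It is not the authors' implementation and does not include their modified EM algorithm or covariate effects. The generator matrix `Q`, the misclassification matrix `E`, the initial distribution `INIT`, the observation times, and the helper names `mat_exp` and `hmm_likelihood` are all hypothetical, chosen only for illustration. In this example the first two states can be confused with each other while the third is assumed to be observed without error.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, terms=20, squarings=10):
    """Approximate exp(q * t) with a truncated Taylor series plus scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def hmm_likelihood(obs, times, q, e, init):
    """Forward recursion: probability of the observed sequence at the given times.

    q    : CTMC generator (rows sum to zero) for the hidden state
    e    : e[s][o] = P(observe o | hidden state s), rows sum to one
    init : distribution of the hidden state at the first visit
    """
    n = len(q)
    alpha = [init[s] * e[s][obs[0]] for s in range(n)]
    for k in range(1, len(obs)):
        # Transition probabilities over the (possibly irregular) visit gap.
        p = mat_exp(q, times[k] - times[k - 1])
        alpha = [sum(alpha[r] * p[r][s] for r in range(n)) * e[s][obs[k]]
                 for s in range(n)]
    return sum(alpha)


# Hypothetical parameter values (assumptions, not estimates from the paper).
Q = [[-0.30, 0.25, 0.05],
     [0.10, -0.30, 0.20],
     [0.00, 0.15, -0.15]]
E = [[0.90, 0.10, 0.00],   # hidden state 0 is sometimes recorded as state 1
     [0.15, 0.85, 0.00],   # hidden state 1 is sometimes recorded as state 0
     [0.00, 0.00, 1.00]]   # hidden state 2 is assumed to be observed exactly
INIT = [0.5, 0.3, 0.2]

likelihood = hmm_likelihood([0, 1, 1, 2], [0.0, 1.0, 2.5, 4.0], Q, E, INIT)
print(likelihood)
```

In an estimation setting, a function of this kind would be evaluated per subject and the product (or sum of logs) maximized over the entries of `Q` and `E`; the abstract reports that the authors do this with a modified EM algorithm rather than by direct evaluation as sketched here.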

Keywords: disease progression; hidden Markov model; longitudinal data analysis; misclassification.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Data Accuracy
  • Data Interpretation, Statistical
  • Early Diagnosis
  • Humans
  • Longitudinal Studies*
  • Markov Chains*
  • Models, Statistical
  • Time Factors
  • Treatment Outcome