Objective: To determine which data sources and 'look back' intervals should be used to define comorbidities.
Data sources: Hospital discharge abstracts database (DAD), physician claims, population registry and death registry from April 1, 1994 to March 31, 2010 in Alberta, Canada.
Study design: Newly diagnosed hypertension cases from fiscal years 1997 to 2008 were identified and followed for up to 12 years. We defined comorbidities using different data sources and durations of retrospective observation (6 months, 1 year, 2 years, and 3 years). The C-statistic for logistic regression models and the concordance index (CI) for Cox models of mortality and cardiovascular disease hospitalization were used to evaluate the discrimination performance of each approach to defining comorbidities.
Principal findings: Comorbidity prevalence increased with longer durations of retrospective observation. Using DAD alone underestimated prevalence by about 75% compared with using both DAD and physician claims. The C-statistic and CI were highest when both DAD and physician claims were used, and model performance improved when the observation duration increased from 6 months to one year or longer.
Conclusion: Comorbidity prevalence is greatly affected by the data source and the duration of retrospective observation. Combining DAD and physician claims with at least one year of retrospective observation improves the performance of models predicting cardiovascular disease hospitalization and one-year mortality.
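
Illustration: The two ideas above, flagging a comorbidity within a look-back window from one or more data sources and measuring discrimination with a C-statistic, can be sketched as follows. This is a minimal sketch, not the study's actual code. The record layout, the source labels "DAD" and "claims", and the function names has_comorbidity and c_statistic are all hypothetical. The C-statistic here is the plain pairwise definition for a binary outcome and does not handle the censoring that a Cox-model concordance index must account for.

```python
from datetime import date, timedelta

# Hypothetical record layout (an assumption, not the study's schema):
# (patient_id, source, service_date, condition), with source "DAD" or "claims".


def has_comorbidity(records, patient_id, condition, index_date,
                    lookback_days, sources):
    """Return True if `condition` was recorded for the patient in any of
    `sources` during the `lookback_days` before `index_date`."""
    window_start = index_date - timedelta(days=lookback_days)
    return any(
        pid == patient_id
        and cond == condition
        and src in sources
        and window_start <= svc_date < index_date
        for pid, src, svc_date, cond in records
    )


def c_statistic(scores, outcomes):
    """Pairwise C-statistic for a binary outcome: the share of
    (event, non-event) pairs in which the event has the higher
    predicted score, counting ties as one half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

Calling has_comorbidity with sources={"DAD"} and then with sources={"DAD", "claims"}, or with different lookback_days values, reproduces the kind of comparison described in the study design: the same patient can be classified differently depending on the source and window chosen.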