Use of survey and clinical data for screening and diagnosis

Stat Med. 2007 Jul 30;26(17):3213-28. doi: 10.1002/sim.2792.

Abstract

Background: In 2005, an estimated 20.8 million people in the U.S. had diabetes, 6.2 million of them undiagnosed (Diabetes Statistics, 2006). The recognizable preclinical stage lasts between 7 and 12 years. Efficient screening and early diagnosis can help (1) avoid or delay the development of diabetes and (2) treat it early and avoid co-morbidities.

Design: Retrospective cross-sectional study of 153 113 adults aged 24 to 83 from the Behavioral Risk Factor Surveillance System (BRFSS) 2003, of whom 4379 had diabetes at their current age or during the previous year, and 2190 adults aged 40 to 74 from the National Health and Nutrition Examination Survey III, 211 of whom had a glucose tolerance test result ≥ 200 mg/dL.

Objectives: To develop statistical models for screening and diagnosis.

Methods: Logistic, generalized linear, and generalized additive regression models; Akaike information criterion; area under the receiver operating characteristic (ROC) curve; Bayes' rule.
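
As an illustration of the area under the ROC curve named in the Methods, the following minimal sketch computes AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half. The function name `auc` and the toy scores are assumptions made for illustration; they are not taken from the study.

```python
def auc(scores_cases, scores_noncases):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Illustrative helper, not the study's code. Returns the fraction of
    (case, non-case) pairs in which the case has the higher score,
    counting ties as half a win.
    """
    wins = 0.0
    for c in scores_cases:
        for n in scores_noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))


# Hypothetical risk scores, for illustration only.
cases = [2.0, 3.0]
noncases = [1.0, 2.5]
print(auc(cases, noncases))
```

An AUC of 0.5 corresponds to a score that does not discriminate at all, and 1.0 to perfect separation of cases from non-cases.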

Results: Area under the 'productivity' curve using BRFSS data is 0.65, indicating an average yield of 17.3 per cent. Survey data are also useful for diagnosis. Area under the ROC curve (AUC) using only survey data is 0.68. AUC for fasting plasma glucose (FPG) alone is 0.91. A stepwise drop-one method of selecting covariates for diagnosis included the pre-test probability from BRFSS. When both FPG and pre-test information are included, AUC increases to 0.93. The reduction in residual deviance and the 0.02 increase in AUC are statistically significant (p = 0.0012). The clinical significance of prior odds is shown by an increase in accuracy and in the weighted average of sensitivity and specificity. Empirically estimated regression weights for pre-test and test information vary with age and race and are not equal, whereas Bayes' theorem requires them to be equal. The overfitting index was less than 1 per cent.
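
The remark about regression weights can be made concrete with the odds form of Bayes' rule, in which the posterior log-odds equal the prior log-odds plus the log likelihood ratio, so both terms carry a weight of exactly 1. The sketch below is a hedged illustration of that identity, not the authors' fitted model: the function name, the parameters `w_prior` and `w_test`, and all numeric inputs are hypothetical.

```python
import math


def posterior_probability(prior, likelihood_ratio, w_prior=1.0, w_test=1.0):
    """Combine pre-test and test information on the log-odds scale.

    With w_prior == w_test == 1 this is Bayes' rule in odds form:
        posterior odds = prior odds * likelihood ratio.
    Other weights mimic an empirically fitted logistic model in which
    the two sources of information are weighted unequally (weights here
    are hypothetical, not estimates from the study).
    """
    prior_log_odds = math.log(prior / (1.0 - prior))
    log_lr = math.log(likelihood_ratio)
    log_odds = w_prior * prior_log_odds + w_test * log_lr
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical inputs: 10 per cent pre-test probability, test likelihood ratio of 9.
bayes = posterior_probability(0.10, 9.0)                  # equal weights (Bayes' rule)
unequal = posterior_probability(0.10, 9.0, w_prior=0.5)   # down-weighted prior
print(bayes, unequal)
```

With equal weights, prior odds of 1:9 and a likelihood ratio of 9 give posterior odds of 1:1. Halving the weight on the prior pushes the posterior above that value, which shows how unequal empirical weights depart from the Bayes' rule combination.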

Conclusions: Cross-sectional surveys are useful for increasing screening efficiency and diagnostic accuracy.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Bayes Theorem
  • Cross-Sectional Studies
  • Diabetes Mellitus / diagnosis*
  • Diabetes Mellitus / epidemiology
  • Humans
  • Mass Screening*
  • Middle Aged
  • Models, Statistical*
  • ROC Curve
  • Retrospective Studies
  • United States / epidemiology