Impact of different definitions on estimates of accuracy of the diagnosis data in a clinical database

J Clin Epidemiol. 2001 Aug;54(8):782-8. doi: 10.1016/s0895-4356(01)00339-0.


Computerized medical databases are increasingly used for research. This study assessed how different definitions of a match affect the estimated accuracy of diagnosis data in a database of visits to a public pediatric clinic. The definitions differed in 1) the unit of analysis, 2) the number of diagnoses per visit required to match, and/or 3) the direction of comparison, that is, whether database contents must match the medical record or medical record contents must be matched in the database. Overall, 90% of diagnoses in the database (391/435) were accurately coded relative to the medical record. Alternatively, 77% of diagnoses listed in the medical record (391/506) were accurately coded in the database. When individual visits were used as the unit of analysis, estimates of accuracy under six definitions ranged from 65% to 92%. The most appropriate definition for estimating the accuracy of diagnosis data likely depends on the purpose of the study. Reporting two or more such definitions may give a fuller picture of the accuracy of diagnosis data.
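
The arithmetic behind the two diagnosis-level estimates, and the way a visit-level estimate shifts with the matching rule, can be sketched in Python. The first part uses only the counts reported in the abstract. The second part is an illustration on made-up toy visits: the function name `visit_accuracy`, its parameters, and the four rules it produces are assumptions for this sketch, not the six definitions used in the study, which the abstract does not enumerate.

```python
# Part 1: diagnosis-level estimates from the counts reported in the abstract.
matched = 391       # diagnoses that agree between database and medical record
in_database = 435   # diagnoses listed in the database
in_record = 506     # diagnoses listed in the medical record


def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


db_vs_record = percent(matched, in_database)  # database contents confirmed by the record
record_vs_db = percent(matched, in_record)    # record contents captured in the database

print(f"Database diagnoses matching the record:   {db_vs_record:.1f}%")
print(f"Record diagnoses matched in the database: {record_vs_db:.1f}%")

# Part 2: visit-level estimates on TOY data (hypothetical, not from the study).
# Each visit is a pair: (diagnoses in the database, diagnoses in the medical record).
toy_visits = [
    ({"otitis media"}, {"otitis media"}),
    ({"otitis media"}, {"otitis media", "fever"}),
    ({"otitis media", "rash"}, {"otitis media"}),
    ({"asthma"}, {"bronchiolitis"}),
]


def visit_accuracy(visits, require_all: bool, database_to_record: bool) -> float:
    """Percentage of visits counted as accurate under one hypothetical rule.

    require_all: every source diagnosis must match (True) or at least one (False).
    database_to_record: check database diagnoses against the record (True),
        or record diagnoses against the database (False).
    Assumes every visit has at least one diagnosis on each side.
    """
    accurate = 0
    for database_dx, record_dx in visits:
        if database_to_record:
            source, target = database_dx, record_dx
        else:
            source, target = record_dx, database_dx
        matches = [dx in target for dx in source]
        if all(matches) if require_all else any(matches):
            accurate += 1
    return percent(accurate, len(visits))


for require_all in (True, False):
    for database_to_record in (True, False):
        estimate = visit_accuracy(toy_visits, require_all, database_to_record)
        rule = "all" if require_all else "at least one"
        direction = "database -> record" if database_to_record else "record -> database"
        print(f"{rule:>12} | {direction}: {estimate:.0f}%")
```

The same 391 matched diagnoses give a different estimate depending on which total serves as the denominator, and in the toy data, requiring every diagnosis in a visit to match gives a lower estimate than requiring only one.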

MeSH terms

  • Algorithms
  • Data Interpretation, Statistical*
  • Databases, Factual*
  • Humans
  • Infant
  • Medical Records Systems, Computerized*
  • Otitis Media / diagnosis*
  • Quality Assurance, Health Care*
  • Reproducibility of Results