Comparison of epidemiologic data from multiple sources

J Chronic Dis. 1986;39(11):889-96. doi: 10.1016/0021-9681(86)90037-8.


We compared epidemiologic data collected from medical records and by interview for 462 subjects who were part of a case-control study of a chronic disease (cancer of the breast). The collected data included such clinical and pharmaceutical features as history of lactation, hysterectomy, diabetes mellitus, type of menopause, and use of exogenous estrogens. We found that agreement between medical record and interview data is variable and depends on the type of data examined and on the strategy for handling incomplete or ambiguous (indeterminate) responses. For variables that represent inherent features of the patients' clinical condition, such as gynecologic surgical procedures and a family history of breast cancer, we found excellent agreement between the medical record and the interview. For pharmaceutical features, however, we found considerable variability between the two data sources. We also detected substantial problems with a common tactic in which information from individual data sources is pooled to form a new "combined data source". In this analysis, combining data sources produced estimates of the proportion exposed that differed from the estimates in either of the original information sources.
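
The two methodological points in the abstract (agreement depends on how indeterminate responses are handled, and pooling sources shifts the estimated proportion exposed) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the ten toy responses are invented and are not data from the study, the function names are hypothetical, and the pooling rule shown (count a subject as exposed if either source says yes) is only one plausible reading of a "combined data source", not necessarily the rule the authors examined.

```python
from typing import List, Optional

# True = exposed, False = unexposed, None = indeterminate (missing or ambiguous)
Response = Optional[bool]


def percent_agreement(a: List[Response], b: List[Response],
                      indeterminate: str = "exclude") -> float:
    """Share of subjects on whom two sources agree.

    indeterminate="exclude": drop any pair with an indeterminate response.
    indeterminate="as_no":   recode indeterminate responses as unexposed.
    """
    matches = []
    for x, y in zip(a, b):
        if indeterminate == "as_no":
            x, y = bool(x), bool(y)
        elif x is None or y is None:
            continue
        matches.append(x == y)
    return sum(matches) / len(matches)


def proportion_exposed(values: List[Response]) -> float:
    """Proportion exposed among subjects with a determinate response."""
    known = [v for v in values if v is not None]
    return sum(known) / len(known)


def combine_either(a: List[Response], b: List[Response]) -> List[Response]:
    """Assumed pooling rule: exposed if either source says yes;
    indeterminate only if both sources are indeterminate."""
    combined: List[Response] = []
    for x, y in zip(a, b):
        if x or y:
            combined.append(True)
        elif x is None and y is None:
            combined.append(None)
        else:
            combined.append(False)
    return combined


# Invented toy responses for ten hypothetical subjects (NOT study data).
record = [True, True, False, False, False, None, None, False, True, False]
interview = [True, False, True, False, False, True, False, None, True, False]
combined = combine_either(record, interview)

agree_exclude = percent_agreement(record, interview, "exclude")
agree_as_no = percent_agreement(record, interview, "as_no")

print(f"agreement, indeterminate excluded: {agree_exclude:.3f}")
print(f"agreement, indeterminate as 'no':  {agree_as_no:.3f}")
print(f"exposed, medical record: {proportion_exposed(record):.3f}")
print(f"exposed, interview:      {proportion_exposed(interview):.3f}")
print(f"exposed, combined:       {proportion_exposed(combined):.3f}")
```

Under this "either source" rule, a positive report from one source overrides a negative or indeterminate report from the other, so the pooled proportion exposed can only match or exceed each single-source count of exposed subjects. In the toy data it is higher than both single-source estimates, which mirrors the kind of discrepancy the abstract describes.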

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / etiology
  • Data Collection / methods*
  • Epidemiologic Methods*
  • Female
  • Humans
  • Medical Records
  • Middle Aged
  • Risk
  • Statistics as Topic