Variability in radiologists' interpretations of mammograms

N Engl J Med. 1994 Dec 1;331(22):1493-9. doi: 10.1056/NEJM199412013312206.


Background: Although mammography is of proven value in screening for breast cancer, its efficacy depends on radiologists' interpretations. The variability in such interpretations is not well understood.

Methods: Using a technique of stratified random sampling, we selected 150 mammograms obtained in 1987: 27 from women with histopathologically confirmed breast cancer and 123 from women with no evidence of breast cancer after three years of follow-up examinations. Ten radiologists, who were unaware of the diagnoses and research hypothesis, each interpreted the 150 mammograms. Disagreement was analyzed within pairs of the 10 radiologists, as well as for the group of 150 women as a whole.
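The scale of the pairwise analysis follows from the design described in the Methods. The following is a minimal sketch of that arithmetic, not the authors' code: the variable names are hypothetical, and the abstract does not state whether every reader pair contributed to every comparison, so the total is an upper bound under that assumption.

```python
from itertools import combinations

# Figures taken from the Methods paragraph.
N_RADIOLOGISTS = 10
N_CANCER = 27        # films from women with confirmed breast cancer
N_NO_CANCER = 123    # films from women with no evidence of cancer
N_FILMS = N_CANCER + N_NO_CANCER

# Every unordered pair of readers (hypothetical reader indices 0..9).
reader_pairs = list(combinations(range(N_RADIOLOGISTS), 2))
n_pairs = len(reader_pairs)

# Assumption: each pair is compared on every film.
max_pairwise_comparisons = n_pairs * N_FILMS

# Share of cancer cases in the stratified sample.
cancer_fraction = N_CANCER / N_FILMS

print(n_pairs, max_pairwise_comparisons, round(cancer_fraction, 2))
```

Ten readers give 45 distinct pairs, so at most 6,750 film-level pairwise comparisons are possible across the 150 mammograms.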

Results: Diagnostic consistency between pairs of radiologists was moderate, with a median weighted percentage of agreement of 78 percent (weighted kappa, 0.47). The frequency of the radiologists' recommendations for an immediate workup ranged from 74 to 96 percent for mammograms from the women with cancer and from 11 to 65 percent for mammograms from the women without cancer. A substantial disagreement in management recommendations (one radiologist recommending routine follow-up and another recommending a biopsy for the same patient) occurred in 3 percent of the pairwise comparisons but in 25 percent of the comparisons for the group of women as a whole. When two or more radiologists recommended a biopsy for the same patient, they disagreed on the stated location (right or left breast) in 2 percent of the pairwise comparisons but in 9 percent of the comparisons for the group of women as a whole. Because 10 radiologists read each film, some disagreement among them was likely for any given woman; the pairwise comparison is therefore the more conservative estimate of disagreement.
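The gap between the pairwise and whole-group figures in the Results can be illustrated with a small sketch. The two rate definitions below are assumptions made for illustration (the abstract does not give the exact computation), the function names are hypothetical, and the toy readings are invented, not study data.

```python
from itertools import combinations

def has_substantial_disagreement(a, b):
    """True when one reading is routine follow-up and the other is biopsy."""
    return {a, b} == {"routine", "biopsy"}

def pairwise_rate(readings):
    """Fraction of (film, reader-pair) comparisons with substantial disagreement."""
    disagreeing = 0
    total = 0
    for film in readings:
        for a, b in combinations(film, 2):
            total += 1
            disagreeing += has_substantial_disagreement(a, b)
    return disagreeing / total

def group_rate(readings):
    """Fraction of films on which at least one reader said routine and one said biopsy."""
    flagged = sum(1 for film in readings if "routine" in film and "biopsy" in film)
    return flagged / len(readings)

# Invented toy data: 4 films, each read by 4 hypothetical readers.
toy_readings = [
    ["routine", "routine", "routine", "biopsy"],
    ["routine", "routine", "routine", "routine"],
    ["biopsy", "biopsy", "biopsy", "biopsy"],
    ["routine", "workup", "workup", "workup"],
]

print(pairwise_rate(toy_readings))  # 0.125
print(group_rate(toy_readings))     # 0.25
```

A single dissenting reader flags the whole film at the group level but affects only the pairs that include that reader, which is why the group rate rises as more readers are added while the pairwise rate does not.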

Conclusions: Although mammography is of value in screening women for breast cancer, radiologists can differ, sometimes substantially, in their interpretations of mammograms and in their recommendations for management. Efforts to improve accuracy and reduce variability in interpretation may increase the effectiveness of mammography in detecting early breast cancers.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Diseases / diagnostic imaging*
  • Breast Neoplasms / diagnostic imaging
  • Clinical Competence
  • Diagnosis, Differential
  • Female
  • Humans
  • Mammography / statistics & numerical data*
  • Observer Variation
  • Patient Selection
  • Radiology / standards
  • Radiology / statistics & numerical data*
  • Sensitivity and Specificity