Categorizing breast mammographic density: intra- and interobserver reproducibility of BI-RADS density categories

Breast. 2005 Aug;14(4):269-75. doi: 10.1016/j.breast.2004.12.004.


The inter- and intraobserver agreement (kappa statistic) in reporting according to Breast Imaging Reporting and Data System (BI-RADS®) breast density categories was tested in 12 dedicated breast radiologists reading a digitized set of 100 two-view mammograms. Average intraobserver agreement was substantial (kappa=0.71, range 0.32-0.88) on a four-grade scale (D1/D2/D3/D4) and almost perfect (kappa=0.81, range 0.62-1.00) on a two-grade scale (D1-2/D3-4). Average interobserver agreement was moderate (kappa=0.54, range 0.02-0.77) on a four-grade scale and substantial (kappa=0.71, range 0.31-0.88) on a two-grade scale. Major disagreement was found for intermediate categories (D2=0.25, D3=0.28). Categorization of breast density according to BI-RADS is feasible and consistency is good within readers and reasonable between readers. Interobserver inconsistency does occur, and checking the adoption of proper criteria through a proficiency test and appropriate training might be useful. As inconsistency is probably due to erroneous perception of classification criteria, standard sets of reference images should be made available for training.
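
For readers unfamiliar with the kappa statistic, the following is a minimal sketch of how agreement between two readers could be computed on the four-grade scale and again after collapsing to the two-grade scale. It assumes unweighted Cohen's kappa for a single reader pair; the abstract does not state which kappa variant was used or how values were averaged across readers. The function names and the ten ratings per reader are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases given the same category by both readers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over categories of the product of marginal proportions.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both readers used one identical category throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def collapse(ratings):
    """Map the four-grade scale (D1..D4) to the two-grade scale (D1-2 / D3-4)."""
    return ["D1-2" if r in ("D1", "D2") else "D3-4" for r in ratings]


# Hypothetical ratings for illustration only (not data from the study).
reader_1 = ["D1", "D2", "D2", "D3", "D3", "D4", "D2", "D3", "D1", "D4"]
reader_2 = ["D1", "D2", "D3", "D3", "D2", "D4", "D2", "D3", "D2", "D4"]

kappa_four = cohen_kappa(reader_1, reader_2)
kappa_two = cohen_kappa(collapse(reader_1), collapse(reader_2))
print(f"four-grade kappa: {kappa_four:.2f}")
print(f"two-grade kappa:  {kappa_two:.2f}")
```

In this toy example, the D1/D2 disagreement disappears once the scale is collapsed, while the D2/D3 disagreements remain because they cross the two-grade boundary.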

Publication types

  • Evaluation Study

MeSH terms

  • Breast Neoplasms / diagnostic imaging*
  • Female
  • Humans
  • Mammography / standards*
  • Observer Variation
  • Reproducibility of Results