BI-RADS lexicon for US and mammography: interobserver variability and positive predictive value

Radiology. 2006 May;239(2):385-91. doi: 10.1148/radiol.2392042127. Epub 2006 Mar 28.


Purpose: To retrospectively evaluate interobserver variability between breast radiologists by using terminology of the fourth edition of the Breast Imaging Reporting and Data System (BI-RADS) to categorize lesions on mammograms and sonograms and to retrospectively determine the positive predictive value (PPV) of BI-RADS categories 4a, 4b, and 4c.

Materials and methods: Institutional review board approval was obtained; informed consent was not required. This study was HIPAA compliant. Ninety-four consecutive lesions in 91 women who underwent image-guided biopsy comprised 59 masses, 32 calcifications, and three masses with calcification. Five radiologists retrospectively reviewed these lesions. Each observer described each lesion with BI-RADS terminology and assigned a final BI-RADS category. Interobserver variability was assessed with the Cohen kappa statistic. A pathologic diagnosis was available for all 94 lesions; 30 (32%) were malignant and 64 (68%) were benign. Pathologic analysis of benign lesions was performed on tissue obtained with image-guided core-needle biopsy. In cases referred for excisional biopsy after needle biopsy because of atypia or discordance, final surgical pathologic analysis was used for correlation with imaging findings. PPV for category 4 or 5 lesions was determined for all readers combined.
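
For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's kappa for two observers, computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name `cohen_kappa` and the example ratings are hypothetical and are not taken from the study. The abstract does not say how values were combined across the five radiologists, so only the two-reader case is shown.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labeled the same items.

    Hypothetical helper for illustration; not the study's own code.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over labels.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (observed - expected) / (1.0 - expected)


# Made-up final-category assignments for four lesions (illustration only).
reader_1 = ["4a", "4a", "4b", "4b"]
reader_2 = ["4a", "4b", "4b", "4b"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the readers agree on three of four lesions (observed agreement 0.75) and chance agreement is 0.50, giving kappa = 0.50.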

Results: For ultrasonographic (US) descriptors, substantial agreement was obtained for lesion orientation, shape, and boundary (kappa = 0.61, 0.66, and 0.69, respectively). Moderate agreement was obtained for lesion margin and posterior acoustic features (kappa = 0.40 for both). Fair agreement was obtained for lesion echo pattern (kappa = 0.29). For mammographic descriptors, moderate agreement was obtained for mass shape, mass margin, and calcification distribution (kappa = 0.48, 0.48, and 0.50, respectively). Fair agreement was obtained for calcification description (kappa = 0.32). Slight agreement was obtained for mass density (kappa = 0.18). Fair agreement was obtained for final assessment category (kappa = 0.28). PPVs of BI-RADS category 4 and 5 assignments were as follows: category 4a, six (6%) of 102; category 4b, 17 (15%) of 110; category 4c, 48 (53%) of 91; and category 5, 71 (91%) of 78.
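
The reported PPVs can be checked directly from the counts above, taking PPV as the number of malignant assignments divided by the total number of assignments in a category, pooled across readers. The sketch below only redoes that arithmetic; the names `ppv_counts` and `ppv` are hypothetical.

```python
# (malignant, total) assignment counts per BI-RADS category, pooled across
# readers, copied from the Results paragraph.
ppv_counts = {
    "4a": (6, 102),
    "4b": (17, 110),
    "4c": (48, 91),
    "5": (71, 78),
}


def ppv(malignant, total):
    """Positive predictive value: malignant assignments / all assignments."""
    if total <= 0:
        raise ValueError("total must be positive")
    return malignant / total


for category, (malignant, total) in ppv_counts.items():
    value = ppv(malignant, total)
    print(f"category {category}: {malignant}/{total} = {value:.1%}")
```

Rounded to whole percentages, the results match the figures given in the abstract (6%, 15%, 53%, and 91%).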

Conclusion: Interobserver agreement with the new BI-RADS terminology is good and validates the US lexicon. Subcategories 4a, 4b, and 4c are useful in predicting the likelihood of malignancy.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Female
  • Humans
  • Mammography / statistics & numerical data*
  • Middle Aged
  • Observer Variation
  • Predictive Value of Tests
  • Retrospective Studies
  • Terminology as Topic*
  • Ultrasonography, Mammary / statistics & numerical data*