High accuracy breast cancer classification with BIRADS and coclustering

PLoS One. 2026 Feb 9;21(2):e0340772. doi: 10.1371/journal.pone.0340772. eCollection 2026.

Abstract

Breast cancer is one of the most common diseases in women. Most existing breast cancer classification methods consist of region segmentation, feature extraction, and classification phases. It is hard for doctors to understand conclusions drawn from low-level image features. In addition, cancer hospitals collect more malignant than benign cases, whereas physical examination centers collect more benign cases, which causes a class imbalance problem. To address these two problems, this study designed a novel breast cancer classification method based on high-level Breast Imaging Reporting and Data System (BI-RADS) features. First, an improved Synthetic Minority Oversampling Technique (SMOTE) algorithm is proposed to generate minority-class samples and balance the data. Subsequently, coclustering is adopted to mine diagnostic rules. Finally, the rules are combined with AdaBoost to construct a strong classifier. Comparison experiments on two public datasets show that the accuracy, precision, recall, and F1 score of the proposed method exceed those of the comparison methods by more than 5%. Furthermore, under different imbalance ratios, the accuracy of the proposed method is more than 5% higher than that of the comparison methods.
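
The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the standard SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is plain SMOTE, not the improved variant the study proposes, whose details the abstract does not give. The function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration only.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Plain SMOTE sketch (assumed helper, not the paper's improved algorithm).

    minority: list of numeric feature vectors from the minority class
    n_new:    number of synthetic samples to generate
    k:        number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and the chosen neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): 4 minority samples versus 10 majority samples.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
n_majority = 10
new_samples = smote(minority, n_new=n_majority - len(minority))
balanced_minority = minority + new_samples
```

Because interpolation yields fractional values, categorical or ordinal features such as BI-RADS descriptors would need rounding or a categorical-aware variant before use; how the study handles this is not stated in the abstract.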

MeSH terms

  • Algorithms
  • Breast Neoplasms* / classification
  • Breast Neoplasms* / diagnosis
  • Breast Neoplasms* / diagnostic imaging
  • Breast Neoplasms* / pathology
  • Cluster Analysis
  • Female
  • Humans
  • Mammography / methods