Evaluation of two semi-supervised learning methods and their combination for automatic classification of bone marrow cells

Sci Rep. 2022 Oct 6;12(1):16736. doi: 10.1038/s41598-022-20651-4.

Abstract

Differential bone marrow (BM) cell counting is an important test for the diagnosis of various hematological diseases. However, accurate classification of BM cells is difficult because differential counting is non-uniform and poorly reproducible. Automatic classification systems based on deep learning have therefore been developed. These systems require large, accurately labeled datasets for training. To reduce this labeling burden, we used semi-supervised learning (SSL), in which learning proceeds while labeling. We used three methods, self-training (ST), active learning (AL), and a combination of the two, and attempted to automatically classify 16 types of BM cell images. Our ST includes data verification, as in AL, before data are added to the training dataset (confirmed self-training: CST). After 25 rounds of CST, AL, and CST + AL, the number of training data increased from the initial 425 to 40,518; 3,682; and 47,843, respectively. Accuracies on the test data (50 images per cell type) were 0.944, 0.941, and 0.976, respectively. Data added with CST or AL showed some imbalances between classes, while CST + AL exhibited fewer imbalances. We suggest that CST + AL, which combines the two SSL methods, is efficient in increasing training data for the development of automatic BM cell classification systems.
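As an illustration only, one round of candidate selection for a combined CST + AL scheme might look like the minimal sketch below. The abstract does not state the selection criteria, so the confidence threshold, the least-confidence sampling rule, the labeling budget, and the function name `select_candidates` are all assumptions and not the authors' implementation.

```python
import numpy as np


def select_candidates(probs, cst_threshold=0.92, al_budget=2):
    """Split unlabeled samples into CST and AL candidates for one round.

    probs: (n_samples, n_classes) array of predicted class probabilities.
    cst_threshold and al_budget are hypothetical values chosen for this sketch.
    Returns (cst_idx, cst_labels, al_idx).
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)    # top-class probability per sample
    predicted = probs.argmax(axis=1)  # pseudo-label per sample

    # CST candidates: confident predictions whose pseudo-labels a human
    # would confirm before they are added to the training set.
    cst_idx = np.flatnonzero(confidence >= cst_threshold)

    # AL candidates: the least confident of the remaining samples,
    # which a human would label from scratch.
    rest = np.flatnonzero(confidence < cst_threshold)
    order = np.argsort(confidence[rest], kind="stable")
    al_idx = rest[order[:al_budget]]

    return cst_idx, predicted[cst_idx], al_idx


# Toy example with two classes and five unlabeled samples.
probs = [[0.95, 0.05], [0.5, 0.5], [0.04, 0.96], [0.6, 0.4], [0.7, 0.3]]
cst_idx, cst_labels, al_idx = select_candidates(probs)
```

In such a scheme, the confirmed CST samples and the newly labeled AL samples would both be added to the training set before the model is retrained for the next round.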

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bone Marrow Cells*
  • Reproducibility of Results
  • Supervised Machine Learning*