Proposing a novel multi-instance learning model for tuberculosis recognition from chest X-ray images based on CNNs, complex networks and stacked ensemble

Phys Eng Sci Med. 2021 Mar;44(1):291-311. doi: 10.1007/s13246-021-00980-w. Epub 2021 Feb 22.

Abstract

Tuberculosis (TB) is an infectious bacterial disease caused by Mycobacterium tuberculosis. In 2018, about 10 million people were diagnosed with TB worldwide. Early diagnosis of TB is necessary for effective treatment, a higher survival rate, and prevention of further transmission. The gold standard for TB diagnosis is sputum culture. Nevertheless, posterior-anterior chest radiography (CXR) is an effective and central screening method for TB, with low cost, a relatively low radiation dose and immediate results. TB diagnosis from CXR is a challenging task that requires a high level of expertise because of the diverse presentation of the disease. Significant intra-class variation and inter-class similarity in CXR images make the task even more challenging. The main aim of this study is TB recognition from CXR images in order to reduce the disease burden. For this purpose, a novel multi-instance classification model based on CNNs, complex networks and a stacked ensemble (CCNSE) is proposed. A main advantage of CCNSE is that it does not require accurate lung segmentation to localize suspicious regions. Several overlapping patches are extracted from each CXR image. Features describing each patch are obtained by CNNs, and the resulting feature vectors are clustered. Local complex networks (LCN) and global complex networks (GCN) of the cluster representatives are formed; feature engineering on the LCN generates additional image-level features, and feature engineering on the GCN generates additional patch-level and image-level features. Global clustering on these feature sets is performed for all patches. Each patch is assigned the purity score of its corresponding cluster. Patch-level features and purity scores are aggregated for each image. Finally, the images are classified into normal and TB classes with a proposed stacked ensemble classifier. Two datasets are used in this study: the Montgomery County CXR set (MC) and the Shenzhen dataset (SZ). MC includes 138 CXR images (80 normal, 58 TB) and SZ includes 662 CXR images (326 normal, 336 TB). The experimental results show that the proposed method, with an AUC of 99.00 ± 0.28/98.00 ± 0.16 and an accuracy of 99.26 ± 0.40/99.22 ± 0.32 for MC/SZ under a fivefold cross-validation strategy, is superior to the compared methods for diagnosis of TB from CXR images. The proposed method can be used as a computer-aided diagnosis system to reduce manual time, effort and dependence on the specialist's level of expertise.
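
As an illustration of three steps named in the abstract (overlapping patch extraction, per-cluster purity scoring, and aggregation of patch scores per image), a minimal Python sketch follows. It is not the authors' implementation. The abstract does not define the purity score or the aggregation rule, so the sketch assumes that purity is the fraction of a cluster's patches carrying the cluster's majority label (each patch inheriting the label of its parent image) and that aggregation is a plain mean. All function names, the patch size, the stride and the toy cluster assignments are hypothetical, and the CNN feature extraction, complex networks and stacked ensemble are omitted.

```python
from collections import Counter, defaultdict


def extract_patches(image, size, stride):
    """Return size x size patches from a 2D list image, in row-major order.

    A stride smaller than size makes neighbouring patches overlap.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches


def cluster_purity(cluster_ids, patch_labels):
    """Map each cluster id to the share of its patches with the majority label.

    Assumption: this majority-share definition stands in for the purity
    score, which the abstract does not define.
    """
    by_cluster = defaultdict(list)
    for cid, label in zip(cluster_ids, patch_labels):
        by_cluster[cid].append(label)
    return {
        cid: Counter(labels).most_common(1)[0][1] / len(labels)
        for cid, labels in by_cluster.items()
    }


def image_level_purity(cluster_ids, image_ids, purity):
    """Average the purity scores of each image's patches (assumed aggregation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cid, img in zip(cluster_ids, image_ids):
        totals[img] += purity[cid]
        counts[img] += 1
    return {img: totals[img] / counts[img] for img in totals}


# Toy 4 x 4 "image"; 2 x 2 patches with stride 1 overlap by one pixel.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = extract_patches(image, size=2, stride=1)

# Hypothetical cluster assignments for five patches from two images.
cluster_ids = [0, 0, 0, 1, 1]
patch_labels = ["TB", "TB", "normal", "normal", "normal"]
image_ids = ["a", "a", "b", "b", "b"]

purity = cluster_purity(cluster_ids, patch_labels)
scores = image_level_purity(cluster_ids, image_ids, purity)
```

In a real evaluation the purity scores would need to be computed from training-fold labels only, so that the fivefold cross-validation estimate is not contaminated by test labels.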

Keywords: Computer-aided diagnosis; Deep learning; Feature engineering; Tuberculosis diagnosis; Medical image processing.

MeSH terms

  • Diagnosis, Computer-Assisted
  • Humans
  • Mycobacterium tuberculosis*
  • Thorax
  • Tuberculosis* / diagnostic imaging
  • X-Rays