DeepCut: Object Segmentation From Bounding Box Annotations Using Convolutional Neural Networks

Martin Rajchl et al. IEEE Trans Med Imaging. 2017 Feb;36(2):674-683. doi: 10.1109/TMI.2016.2621185. Epub 2016 Nov 9.

Abstract

In this paper, we propose DeepCut, a method to obtain pixelwise object segmentations given an image dataset labelled with weak annotations, in our case bounding boxes. It extends the approach of the well-known GrabCut [1] method to include machine learning by training a neural network classifier from bounding box annotations. We formulate the problem as energy minimisation over a densely-connected conditional random field and iteratively update the training targets to obtain pixelwise object segmentations. Additionally, we propose variants of the DeepCut method and compare them to a naïve approach to CNN training under weak supervision. We test its applicability on brain and lung segmentation problems in a challenging fetal magnetic resonance dataset and obtain encouraging results in terms of accuracy.
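The iterative scheme the abstract describes (start from the bounding-box labels, train a classifier, regularise the predictions, and feed them back as new training targets) can be sketched in a few lines. This is a toy illustration only: a class-mean intensity classifier stands in for the CNN, and the hard bounding-box constraint stands in for the full dense-CRF energy minimisation; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def deepcut_sketch(image, box_mask, n_iters=3):
    """Toy DeepCut-style loop: targets start as the bounding-box mask
    and are iteratively refined.  A nearest-class-mean intensity rule
    stands in for the CNN classifier, and clipping to the bounding box
    stands in for the dense-CRF regularisation step."""
    targets = box_mask.copy()
    for _ in range(n_iters):
        if not targets.any():
            break  # degenerate case: no foreground left to model
        fg = image[targets]      # pixels currently labelled foreground
        bg = image[~targets]     # pixels currently labelled background
        mu_fg, mu_bg = fg.mean(), bg.mean()
        # "CNN" stand-in: assign each pixel to the nearer class mean.
        scores = np.abs(image - mu_bg) - np.abs(image - mu_fg)
        proposal = scores > 0
        # "CRF" stand-in: restrict foreground to the bounding box,
        # mimicking the hard box constraint of the weak annotation.
        targets = proposal & box_mask
    return targets
```

On a synthetic image with a bright object inside a slightly loose box, the loop tightens the initial box labels onto the object within one or two iterations, which is the qualitative behaviour the paper's iterative target updates aim for.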


Figures

Fig. 1
CNN architecture with convolutional (conv), max-pooling (pool), and fully connected layers for foreground/background classification. Layers whose inputs are subjected to 50% dropout [33] are marked with *.
Fig. 2
Naïve CNN learning versus the proposed DeepCut approach, iteratively updating the learning target classes for input patches.
Fig. 3
Example brain segmentation results for all compared methods. Top row (from left to right): (1) original image, (2) manual segmentation (red), (3) initial bounding box B with halo H, (4) GrabCut [1] (GC, blue). Bottom row: (5) naïve learning approach (CNNnaïve, yellow), (6) DeepCut from bounding boxes (DCBB, purple), (7) DeepCut from pre-segmentation (DCPS, pink), and (8) fully supervised CNN segmentation (CNNFS, cyan).
Fig. 4
Example lung segmentation results for all compared methods. Top row (from left to right): (1) original image, (2) manual segmentation (red), (3) initial bounding box B with halo H, (4) GrabCut [1] (GC, blue). Bottom row: (5) naïve learning approach (CNNnaïve, yellow), (6) DeepCut from bounding boxes (DCBB, purple), (7) DeepCut from pre-segmentation (DCPS, pink), and (8) fully supervised CNN segmentation (CNNFS, cyan).
Fig. 5
Comparative accuracy results for the segmentation of the fetal brain (a) and lungs (b) for all methods: initial bounding boxes (BB), GrabCut [1] (GC), naïve CNN learning approach from bounding boxes (CNNnaïve), DeepCut initialised from bounding boxes (DCBB), DeepCut initialised via pre-segmentation (DCPS), and a fully supervised learning approach from manual segmentations (CNNFS) as an upper bound for this network architecture.
Fig. 6
Accuracy improvement in terms of DSC over DeepCut iterations for fetal brain segmentation: DeepCut initialised with bounding boxes (DCBB, red) versus initialisation with a pre-segmentation (DCPS, blue), shown against the lower (CNNnaïve) and upper (CNNFS) accuracy bounds, depicted with mean (black) and standard deviation (grey).

References

    1. Rother C, Kolmogorov V, Blake A. GrabCut: Interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (TOG). 2004;23(3):309–314.
    2. Papandreou G, Chen L-C, Murphy K, Yuille AL. Weakly- and semi-supervised learning of a DCNN for semantic image segmentation. arXiv preprint arXiv:1502.02734. 2015.
    3. Dai J, He K, Sun J. BoxSup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. arXiv preprint arXiv:1503.01640. 2015.
    4. Schlegl T, Waldstein SM, Vogl W-D, Schmidt-Erfurth U, Langs G. Predicting semantic descriptions from medical images with convolutional neural networks. Information Processing in Medical Imaging; Springer; 2015. pp. 437–448.
    5. Lempitsky V, Kohli P, Rother C, Sharp T. Image segmentation with a bounding box prior. Computer Vision, 2009 IEEE 12th International Conference on; IEEE; 2009. pp. 277–284.