Comparative Study
Phys Med Biol. 2015 Jul 21;60(14):5571-99.
doi: 10.1088/0031-9155/60/14/5571. Epub 2015 Jul 2.

The 2014 Liver Ultrasound Tracking Benchmark

Free PMC article
V De Luca et al. Phys Med Biol. 2015.

Abstract

The Challenge on Liver Ultrasound Tracking (CLUST) was held in conjunction with the MICCAI 2014 conference to enable direct comparison of tracking methods for this application. This paper reports the outcome of the challenge, including its setup, methods, results and experiences. The database included 54 2D and 3D sequences of the liver of healthy volunteers and tumor patients under free breathing. Participants had to provide tracking results for 90% of the data (the test set), either for pre-defined point-landmarks (healthy volunteers) or for tumor segmentations (patient data). In this paper we compare the six best methods that participated in the challenge. The organizers performed the quantitative evaluation with respect to manual annotations. Across methods, the mean tracking error ranged from 1.4 mm to 2.1 mm for 2D points and from 2.6 mm to 4.6 mm for 3D points. Fusing all automatic results by taking the median tracking result improved the mean error to 1.2 mm (2D) and 2.5 mm (3D). For all methods, performance is still not comparable to human inter-rater variability, which corresponds to a mean tracking error of 0.5-0.6 mm (2D) and 1.2-1.8 mm (3D). Only one participant completed the segmentation task, with a Dice coefficient ranging from 76.7% to 92.3%. The CLUST database remains available, and the online leader-board will be updated as an ongoing challenge.
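As a rough illustration of the evaluation described in the abstract, the Python sketch below computes a mean tracking error as the average Euclidean distance between tracked and annotated positions, and fuses several methods by taking a median per frame. The function names, the coordinate-wise form of the median and the toy numbers are assumptions made for illustration only; the exact definitions used by the organizers are given in the full paper.

```python
import math
from statistics import median

def mean_tracking_error(tracked, annotated):
    """Mean Euclidean distance between paired positions (input units, e.g. mm)."""
    distances = [math.dist(p, q) for p, q in zip(tracked, annotated)]
    return sum(distances) / len(distances)

def median_fusion(results):
    """Fuse methods by the coordinate-wise median at every frame.

    results[m][t] is the position reported by method m at frame t.
    Taking the median per coordinate is an assumption of this sketch.
    """
    fused = []
    for positions in zip(*results):  # all methods at one frame
        fused.append(tuple(median(coord) for coord in zip(*positions)))
    return fused

# Hypothetical toy data: 3 methods, 2 frames, 2D positions in mm.
annotated = [(10.0, 20.0), (12.0, 22.0)]
results = [
    [(10.0, 21.0), (12.0, 23.0)],
    [(11.0, 20.0), (12.0, 22.0)],
    [(10.0, 20.0), (15.0, 22.0)],
]

fused = median_fusion(results)
errors = [mean_tracking_error(r, annotated) for r in results]
fused_error = mean_tracking_error(fused, annotated)
print("per-method errors (mm):", errors)
print("fused error (mm):", fused_error)
```

In this toy case each method is wrong on a different frame or coordinate, so the median discards the individual mistakes. That is one plausible reason why a fused result can beat every single method.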

Figures

Figure 1.
Examples of the first frame I(0) of the training data: (top row) 2D sequences (ETH, MED, OX) and (bottom row) 3D sequences (EMC, SMT). Point-landmarks Pj(0) and the contour of the tumor segmentation Sj(0) are depicted in yellow.
Figure 2.
Tracking scheme. First, W(t − 1, I(t)) is registered to W(0, I(0)), providing T_t. If this registration fails, W(t − 1, I(t)) is registered to the corresponding window of the previous frame (T_t*).
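The control flow of this scheme can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the caption does not say how a failed registration is detected, so the sketch assumes a similarity score compared against a hypothetical threshold `min_score`, and `toy_register` is a made-up 1D stand-in for a real image registration.

```python
def track_window(register, first_window, previous_window, current_window, min_score):
    """Register to the first frame; fall back to the previous frame on failure.

    Returns (transform, used_fallback). The failure test (score < min_score)
    is an assumption of this sketch, not taken from the source.
    """
    transform, score = register(current_window, first_window)
    if score >= min_score:
        return transform, False
    transform, _ = register(current_window, previous_window)
    return transform, True

def toy_register(moving, fixed, max_shift=2):
    """Hypothetical 1D registration: best integer shift and its score.

    The score is the negated mean absolute difference, so higher is better.
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        pairs = [(moving[i + shift], fixed[i])
                 for i in range(len(fixed)) if 0 <= i + shift < len(moving)]
        score = -sum(abs(m - f) for m, f in pairs) / len(pairs)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score

# Toy signals: the appearance has changed since the first frame,
# but the current window is a shifted copy of the previous one.
first = [0, 0, 5, 9, 5, 0, 0]
previous = [0, 0, 2, 4, 2, 0, 0]
current = [0, 0, 0, 2, 4, 2, 0]

shift, used_fallback = track_window(toy_register, first, previous, current, min_score=-1.0)
print(shift, used_fallback)
```

Registering to the first frame whenever possible avoids accumulating drift over the sequence, while the fallback keeps tracking alive when the appearance has changed too much.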
Figure 3.
Initialization: (Left) Within radius R0 of a given position, points on a local triangular grid with grid constant R1 are chosen. (Right) Example of point weights (pibrtpidrk, see text) in a first frame: area indicates value and color encodes sign (red: negative, green: positive).
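The point selection in the left panel can be approximated with the following Python sketch, which generates the nodes of a triangular lattice with spacing R1 that lie within radius R0 of a position. Placing one lattice node exactly on the given position, and the function and parameter names, are assumptions of this sketch; the point weights shown in the right panel are not modelled.

```python
import math

def triangular_grid(center, r0, r1):
    """Nodes of a triangular lattice with spacing r1 within distance r0 of center."""
    cx, cy = center
    row_height = r1 * math.sqrt(3) / 2     # vertical distance between rows
    n_rows = int(r0 // row_height)
    n_cols = int(r0 // r1) + 1
    points = []
    for j in range(-n_rows, n_rows + 1):
        x_shift = 0.5 * r1 * (j % 2)       # odd rows are offset by half a spacing
        for i in range(-n_cols - 1, n_cols + 1):
            dx = i * r1 + x_shift
            dy = j * row_height
            # Small tolerance so nodes exactly on the circle are kept.
            if math.hypot(dx, dy) <= r0 + 1e-9:
                points.append((cx + dx, cy + dy))
    return points

# With r0 equal to r1 the result is the center plus its six nearest neighbors.
pts = triangular_grid((0.0, 0.0), r0=1.0, r1=1.0)
print(len(pts))
```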
Figure 4.
Illustration of tracking performance for landmark P1 from sequence MED-07. Tracking errors MTE_{1∈MED-07} were 13.22 mm (KM), 3.84 mm (MEVIS), 7.46 mm (MEVIS + FOKUS), 2.88 mm (MEVIS + MED), 12.72 mm (PhR) and 1.93 mm (TUM). The mean motion for the landmark was 11.23 mm. Frames at ta, tc and te correspond to end-inhalations (with a deep inhale happening at tc), while tb, td and tf correspond to end-exhalations. In ROI(ta) the manual annotation is shown as a yellow circle.
Figure 5.
Box-plot summarizing the 2D tracking error (in mm) with respect to the mean manual annotation of three observers (Obs). Results are ranked (left to right) according to decreasing MTE2D (in green). On each box, the central red line is the median and the edges of the box are given by the q1 = 25th and q3 = 75th percentiles of the error. Outliers are drawn as red crosses if larger than q3 + w(q3 − q1), where w = 1.5, which corresponds to approximately ±2.7 STD for normally distributed data.
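The outlier rule in this caption can be checked with a short Python sketch. The sample errors are hypothetical, and the quartiles here use the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from the percentile definition of the plotting software used for the figure. The last lines confirm the 2.7 STD figure for a standard normal distribution.

```python
from statistics import NormalDist, quantiles

def upper_outlier_threshold(errors, w=1.5):
    """Return q3 + w * (q3 - q1) for a list of error values."""
    q1, _, q3 = quantiles(errors, n=4, method="inclusive")
    return q3 + w * (q3 - q1)

# Hypothetical tracking errors in mm, with one gross failure.
errors_mm = [1, 2, 3, 4, 5, 6, 7, 8, 100]
threshold = upper_outlier_threshold(errors_mm)
outliers = [e for e in errors_mm if e > threshold]
print(threshold, outliers)

# For a standard normal distribution the same rule gives about 2.7 STD.
q3_normal = NormalDist().inv_cdf(0.75)
q1_normal = NormalDist().inv_cdf(0.25)
normal_threshold = q3_normal + 1.5 * (q3_normal - q1_normal)
print(round(normal_threshold, 2))
```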
Figure 6.
Percentage of failure cases: ratio of annotated 2D landmarks whose TE > 3 mm (orange) or TE > 5 mm (red), shown for all methods. Results are shown (left to right) according to decreasing MTE2D (see table 3). TE is evaluated with respect to one observer.
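A failure percentage of this kind could be computed as in the Python sketch below. The error values are hypothetical and the function name is an assumption; only the 3 mm and 5 mm thresholds come from the caption.

```python
def failure_percentage(tracking_errors_mm, threshold_mm):
    """Percentage of annotated landmarks whose tracking error exceeds the threshold."""
    if not tracking_errors_mm:
        raise ValueError("no tracking errors given")
    failures = sum(1 for te in tracking_errors_mm if te > threshold_mm)
    return 100.0 * failures / len(tracking_errors_mm)

# Hypothetical tracking errors (mm), one per annotated landmark position.
errors_mm = [0.8, 1.2, 2.9, 3.4, 4.1, 5.6, 0.5, 7.3]
over_3mm = failure_percentage(errors_mm, 3.0)
over_5mm = failure_percentage(errors_mm, 5.0)
print(over_3mm, over_5mm)
```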
Figure 7.
Box-plot summarizing the 3D tracking error (in mm) with respect to the mean manual annotation of three observers (Obs). Results are ranked (left to right) according to decreasing MTE3D (in green). On each box, the central red line is the median and the edges of the box are given by the q1 = 25th and q3 = 75th percentiles of the error. Outliers are drawn as red crosses if larger than q3 + w(q3 − q1), where w = 1.5, which corresponds to approximately ±2.7 STD for normally distributed data.
Figure 8.
Percentage of failure cases: ratio of annotated 3D landmarks whose TE > 3 mm (orange) or TE > 5 mm (red), shown for all methods. Results are shown (left to right) according to decreasing MTE3D (see table 4). TE is evaluated with respect to one observer.
Figure 9.
Illustration of tracking performance for landmark P1 from sequence EMC-03. Tracking errors MTE_{1∈EMC-03} with respect to the mean of three observers were 7.77 mm (MEVIS + FOKUS), 5.63 mm (MEVIS + MED) and 9.93 mm (PhR). The mean motion for the landmark was 11.71 mm. Inter-observer errors were 4.06 mm (Obs1), 11.59 mm (Obs2) and 9.76 mm (Obs3). The tracking results and annotations are shown at time t* for the same ROI(t*) with planes cut at the corresponding P1(t*) from each method.
Figure 10.
Illustration of tracking performance for S1 from OX-6. The Dice coefficient ranged from 92.5% (at ta) to 53.6% (at tb). ROI(ta) and ROI(tb) show the overlap of the manual (in yellow) and PhR (in light blue) segmentations.
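For reference, the Dice coefficient between two binary segmentations A and B is 2|A ∩ B| / (|A| + |B|). The Python sketch below computes it for masks given as nested lists of 0/1 values. The toy masks are hypothetical, and returning 1.0 for two empty masks is a convention assumed here rather than taken from the source.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two binary masks (nested lists)."""
    a = {(r, c) for r, row in enumerate(mask_a) for c, v in enumerate(row) if v}
    b = {(r, c) for r, row in enumerate(mask_b) for c, v in enumerate(row) if v}
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

# Hypothetical masks: the tracked segmentation is shifted one pixel to the right.
manual = [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
tracked = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 0, 0, 0]]
dice = dice_coefficient(manual, tracked)
print(f"Dice = {100 * dice:.1f}%")
```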

