J Med Artif Intell. 2020 Mar 25;3:4. doi: 10.21037/jmai.2019.10.03.

Improving ultrasound video classification: an evaluation of novel deep learning methods in echocardiography

James P Howard et al. J Med Artif Intell. 2020.

Abstract

Echocardiography is the commonest medical ultrasound examination, but automated interpretation is challenging and hinges on correct recognition of the 'view' (imaging plane and orientation). Current state-of-the-art methods for identifying the view computationally involve 2-dimensional convolutional neural networks (CNNs), but these merely classify individual frames of a video in isolation, and ignore information describing the movement of structures throughout the cardiac cycle. Here we explore the efficacy of novel CNN architectures, including time-distributed networks and two-stream networks, which are inspired by advances in human action recognition. We demonstrate that these new architectures more than halve the error rate of traditional CNNs from 8.1% to 3.9%. These advances in accuracy may be due to these networks' ability to track the movement of specific structures such as heart valves throughout the cardiac cycle. Finally, we show the accuracies of these new state-of-the-art networks are approaching expert agreement (3.6% discordance), with a similar pattern of discordance between views.
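The best-performing "two-stream" architecture combines a spatial stream (still frames) with a temporal stream (optical flow). The abstract does not specify how the two streams are combined, so the sketch below assumes simple late score fusion (averaging the softmax probabilities of the two streams), a common choice for two-stream networks; the logits, class counts, and function names are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def two_stream_fuse(spatial_logits, flow_logits):
    """Late fusion: average the class probabilities of the two streams,
    then pick the most probable view."""
    p = 0.5 * (softmax(spatial_logits) + softmax(flow_logits))
    return int(np.argmax(p))

# Toy logits over 3 hypothetical views: the spatial stream favours view 0,
# but the optical-flow stream is more confident about view 1.
spatial = np.array([2.0, 0.5, 0.1])
flow = np.array([0.2, 3.0, 0.0])
fused_view = two_stream_fuse(spatial, flow)
```

With these toy numbers the flow stream's stronger confidence wins the fusion, mirroring how motion cues can disambiguate views that look similar in a still frame.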

Keywords: Echocardiography; deep learning; medical ultrasound; neural networks.


Conflict of interest statement

Conflicts of Interest: The authors have no conflicts of interest to declare.

Figures

Figure 1
The four different types of neural network architectures used in this study, along with the lowest error rate of each model within each group. The best-performing neural network was a “two-stream” network using both spatial and optical flow inputs, with a corresponding error rate of only 3.9%. Conversely, the 3D CNN architectures failed to classify echocardiograms. Conv, convolutional layer; batch norm, batch normalisation layer; ReLU, rectified linear unit layer; 3D, three-dimensional; CNN, convolutional neural network.
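One of the architecture families in Figure 1, the time-distributed network, applies the same frame model to every frame of the clip and pools the results over time. As a dependency-free illustration (not the authors' implementation), the sketch below stands in a single linear map for the shared 2D CNN and pools per-frame logits by averaging; the array shapes, seed, and function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def frame_logits(frame, W):
    """Stand-in for the shared per-frame 2D CNN (here: one linear map)."""
    return frame.ravel() @ W

def time_distributed_predict(video, W):
    """Apply the same frame model to every frame, then average the
    logits over time before picking a view."""
    logits = np.stack([frame_logits(f, W) for f in video])  # (T, n_views)
    return int(np.argmax(logits.mean(axis=0)))

video = rng.normal(size=(30, 8, 8))  # 30 frames of a toy 8x8 "echo" clip
W = rng.normal(size=(64, 14))        # scores over the 14 echocardiographic views
view = time_distributed_predict(video, W)
```

Because the frame model's weights are shared across time, the network sees every frame of the cardiac cycle at essentially the cost of a single-frame classifier, with only the pooling step adding temporal context.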
Figure 2
The 14 echocardiographic views. A2CH, apical 2 chamber; A3CH, apical 3 chamber; A4CH, apical 4 chamber; A5CH, apical 5 chamber; Ao, aorta; AV, aortic valve; IAS, interatrial septum; LA, left atrium; LV, left ventricle; PLAX, parasternal long axis; PS, parasternal; PV, pulmonary valve; RA, right atrium; RV, right ventricle; TV, tricuspid valve.
Figure 3
Confusion matrices for the best-performing classical CNN model (A), time-distributed model (B), 3D CNN (C) and two-stream network (D). The improvement associated with using the two-stream network versus the classical CNN is shown in (E). The inter-human agreement confusion matrix is shown in (F). CNN, convolutional neural network; 3D, three-dimensional.
Figure 4
A comparison of a ‘PLAX inflow’ view of the tricuspid valve (A) and a ‘parasternal (aortic and) pulmonary valves’ view of the pulmonary valve (B) (also termed ‘RV outflow’). These views are almost indistinguishable in a still image. However, when viewed as a video, the echocardiogram on the left clearly demonstrates the valve opening upwards (inwards; see arrows), allowing blood flow into the heart through the tricuspid valve, whereas the echocardiogram on the right shows the valve leaflets opening downwards (outwards; see arrows), allowing flow out of the heart. Misclassifications of these classes were common using classical 2D CNNs, but are almost eliminated by employing temporal models such as the ‘two-stream’ networks. Saliency mapping can be used to visualise how the features from the pulmonary valve video contribute towards the two-stream network’s decision. (C,D) The important features leading to the classification are highlighted in cyan; (C) shows that the spatial arm of the network appears to use the anatomical borders of the major cardiac structures present (pulmonary artery and left ventricle); (D), however, shows that the decision of the temporal arm of the network is overwhelmingly influenced by the optical flow data of the valve itself. CNN, convolutional neural network.


