A systematic review of automated segmentation of 3D computed-tomography scans for volumetric body composition analysis

J Cachexia Sarcopenia Muscle. 2023 Oct;14(5):1973-1986. doi: 10.1002/jcsm.13310. Epub 2023 Aug 10.

Abstract

Automated computed tomography (CT) scan segmentation (labelling of pixels according to tissue type) is now possible. This technique is being adapted to achieve three-dimensional (3D) segmentation of CT scans, as opposed to segmentation of a single L3 slice alone. This systematic review evaluates the feasibility and accuracy of automated segmentation of 3D CT scans for volumetric body composition (BC) analysis, as well as current limitations and pitfalls that clinicians and researchers should be aware of. OVID Medline, Embase and grey literature databases were searched up to October 2021. Original studies investigating automated segmentation of skeletal muscle, visceral adipose tissue (AT) and subcutaneous AT from CT were included. Seven of the 92 studies met the inclusion criteria. Variation existed in the expertise and number of humans performing the ground-truth segmentations used to train algorithms. There was heterogeneity in the patient characteristics, pathology and CT phases on which segmentation algorithms were developed. Reporting of anatomical CT coverage varied, with confusing terminology. Six studies covered volumetric regional slabs rather than the whole body. One study stated the use of whole-body CT, but it was not clear whether this truly meant head-to-fingertip-to-toe. Two studies used conventional computer algorithms. The remaining five used deep learning (DL), an artificial intelligence technique in which algorithms are organized similarly to brain neuronal pathways. Six of seven reported excellent segmentation performance (Dice similarity coefficients > 0.9 per tissue). Internal testing on unseen scans was performed for only four of seven algorithms, whilst only three were tested externally. Trained DL algorithms achieved full CT segmentation in 12 to 75 s versus 25 min for non-DL techniques. DL enables opportunistic, rapid and automated volumetric BC analysis of CT performed for clinical indications.
However, most CT scans do not cover head-to-fingertip-to-toe; further research must validate the use of common CT regions to estimate true whole-body BC, with direct comparison to a single lumbar slice. Given the successes of DL, we expect a growing number of algorithms to emerge in addition to the seven discussed in this paper. Researchers and clinicians in the field of BC must therefore be aware of pitfalls. High Dice similarity coefficients do not indicate the degree to which BC tissues may be under- or overestimated, nor do they inform on algorithm precision. Consensus is needed to define accuracy and precision standards for ground-truth labelling. Creation of a large international, multicentre common CT dataset with BC ground-truth labels from multiple experts could be a robust solution.
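
The point about Dice similarity coefficients can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the masks are synthetic, and the helper names (`make_mask`, `dice`, `relative_volume_error`) are hypothetical. It shows an under-segmentation and an over-segmentation that obtain the same Dice score against the same ground truth while their volume errors have opposite signs.

```python
def make_mask(n_voxels, n_labelled):
    """Hypothetical helper: flat binary mask with the first n_labelled voxels set to 1."""
    return [1] * n_labelled + [0] * (n_voxels - n_labelled)


def dice(truth, pred):
    """Dice similarity coefficient, 2|A and B| / (|A| + |B|), for two binary masks."""
    if len(truth) != len(pred):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for t, p in zip(truth, pred) if t and p)
    total = sum(1 for t in truth if t) + sum(1 for p in pred if p)
    # Two empty masks are treated as a perfect match (a convention assumed here).
    return 1.0 if total == 0 else 2.0 * overlap / total


def relative_volume_error(truth, pred):
    """Signed volume error: negative = underestimation, positive = overestimation."""
    truth_voxels = sum(1 for t in truth if t)
    pred_voxels = sum(1 for p in pred if p)
    return (pred_voxels - truth_voxels) / truth_voxels


# Synthetic example: 90 ground-truth voxels of one tissue in a 120-voxel volume.
truth = make_mask(120, 90)
under = make_mask(120, 81)   # misses 9 true voxels, adds none
over = make_mask(120, 100)   # captures all 90 true voxels, adds 10 false ones

for name, pred in (("under", under), ("over", over)):
    print(name,
          "Dice = %.3f" % dice(truth, pred),
          "volume error = %+.1f%%" % (100 * relative_volume_error(truth, pred)))
```

Both predictions score a Dice of 18/19 (about 0.947), above the 0.9 threshold quoted above, yet one underestimates tissue volume by 10% and the other overestimates it by about 11%. Reporting a signed volume error alongside Dice is one way to make that direction visible.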

Keywords: AI; Body composition measurement; Computed tomography; Deep learning; Sarcopenia; Segmentation.

Publication types

  • Review