Multicenter Study
Eur J Nucl Med Mol Imaging. 2023 Apr;50(5):1337-1350.
doi: 10.1007/s00259-022-06097-w. Epub 2023 Jan 12.

Low-count whole-body PET/MRI restoration: an evaluation of dose reduction spectrum and five state-of-the-art artificial intelligence models

Yan-Ran Joyce Wang et al. Eur J Nucl Med Mol Imaging. 2023 Apr.

Abstract

Purpose: To provide a holistic and complete comparison of the five most advanced AI models in the augmentation of low-dose 18F-FDG PET data over the entire dose reduction spectrum.

Methods: In this multicenter study, five AI models were investigated for restoring low-count whole-body PET/MRI, covering convolutional benchmarks - U-Net, enhanced deep super-resolution network (EDSR), generative adversarial network (GAN) - and the most cutting-edge image reconstruction transformer models in computer vision to date - Swin transformer image restoration network (SwinIR) and EDSR-ViT (vision transformer). The models were evaluated against six groups of count levels representing the simulated 75%, 50%, 25%, 12.5%, 6.25%, and 1% (extremely ultra-low-count) of the clinical standard 3 MBq/kg 18F-FDG dose. The comparisons were performed upon two independent cohorts - (1) a primary cohort from Stanford University and (2) a cross-continental external validation cohort from Tübingen University - in order to ensure the findings are generalizable. A total of 476 original count and simulated low-count whole-body PET/MRI scans were incorporated into this analysis.
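The six count levels are simple fractions of the 3 MBq/kg standard dose. The following is a minimal arithmetic sketch; the helper name `equivalent_activity_mbq` and the 50 kg body weight are illustrative assumptions, not values taken from the study.

```python
# Clinical standard dose stated in the Methods (MBq per kg of body weight).
STANDARD_DOSE_MBQ_PER_KG = 3.0

# Simulated count levels from the Methods, as fractions of the standard dose.
DOSE_FRACTIONS = [0.75, 0.50, 0.25, 0.125, 0.0625, 0.01]


def equivalent_activity_mbq(weight_kg, fraction):
    """Activity (MBq) that a given fraction of the standard dose represents.

    Hypothetical helper for illustration; weight_kg is an assumed input.
    """
    if weight_kg <= 0 or not 0 < fraction <= 1:
        raise ValueError("weight must be positive and fraction in (0, 1]")
    return STANDARD_DOSE_MBQ_PER_KG * fraction * weight_kg


if __name__ == "__main__":
    # 50 kg is an assumed example weight, not a cohort statistic.
    for f in DOSE_FRACTIONS:
        activity = equivalent_activity_mbq(50.0, f)
        print(f"{f:.2%} of standard dose, 50 kg patient: {activity:.3f} MBq")
```

For the assumed 50 kg patient, the standard dose is 150 MBq, and the 6.25% level corresponds to 9.375 MBq.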

Results: For low-count PET restoration on the primary cohort, the mean structural similarity index (SSIM) scores for dose 6.25% were 0.898 (95% CI, 0.887-0.910) for EDSR, 0.893 (0.881-0.905) for EDSR-ViT, 0.873 (0.859-0.887) for GAN, 0.885 (0.873-0.898) for U-Net, and 0.910 (0.900-0.920) for SwinIR. In addition, the performance of SwinIR and U-Net was evaluated separately at each simulated radiotracer dose level. Using the primary Stanford cohort, the mean diagnostic image quality (DIQ; 5-point Likert scale) scores of SwinIR restoration were 5 (SD, 0) for dose 75%, 4.50 (0.535) for dose 50%, 3.75 (0.463) for dose 25%, 3.25 (0.463) for dose 12.5%, 4 (0.926) for dose 6.25%, and 2.5 (0.534) for dose 1%.
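Means with 95% confidence intervals of this kind are computed from per-scan scores. The sketch below uses a normal approximation (mean ± 1.96 standard errors). The abstract does not state which interval method the authors used, so this is one common choice rather than their procedure, and the sample values are made up for illustration.

```python
import math
import statistics


def mean_with_ci95(values):
    """Return (mean, lower, upper) using a normal-approximation 95% CI.

    Assumption: the abstract does not say how its intervals were derived;
    this is one common choice, not necessarily the authors' method.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = statistics.mean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(len(values))
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Hypothetical per-scan SSIM values, not data from the study.
    ssim_scores = [0.91, 0.89, 0.92, 0.90, 0.93]
    mean, lower, upper = mean_with_ci95(ssim_scores)
    print(f"mean SSIM {mean:.3f} (95% CI, {lower:.3f}-{upper:.3f})")
```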

Conclusion: Unprocessed low-count PET images are near-nondiagnostic or nondiagnostic at the higher dose reduction levels (down to 6.25% of the standard dose). Compared with these images, both SwinIR and U-Net significantly improve the diagnostic quality of PET images. A radiotracer dose reduction to 1% of the current clinical standard dose is beyond the reach of current AI techniques.

Keywords: CNN; Deep learning; PET restoration; Transformer model; Whole-body PET imaging.

Conflict of interest statement

The authors declare no competing financial or non-financial interests.

Figures

Figure 1. Schematic overviews of AI algorithm frameworks for low-count PET reconstruction.
(A) The classic U-Net model. (B) The adapted EDSR (enhanced deep super-resolution network) model. (C) The GAN (generative adversarial network) model. (D) The EDSR-ViT model. EDSR-ViT takes the feature encoder from the adapted EDSR (B) directly and uses a ViT (vision transformer) block to obtain global self-attention within the image. (E) The SwinIR model, consisting of Swin transformer blocks. The main difference between the Swin transformer and ViT is where the self-attention operation is applied. In a Swin transformer block, self-attention is applied within each local window, using regular window partitions (layer l) followed by shifted windows (layer l+1, etc.). In ViT, self-attention is applied across the whole image, which is partitioned into equal fixed-size patches.
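The difference between global (ViT-style) and windowed (Swin-style) self-attention can be made concrete by counting token pairs and checking which window each token falls into. This is a minimal sketch under stated assumptions: the 8×8 token grid, the window size of 4, and all function names are illustrative and are not taken from the paper's models.

```python
def attention_pairs_global(h, w):
    """Query-key pairs when every token attends to every token (ViT-style)."""
    n = h * w
    return n * n


def attention_pairs_windowed(h, w, m):
    """Query-key pairs when attention is limited to m x m windows (Swin-style)."""
    if h % m or w % m:
        raise ValueError("grid size must be divisible by the window size")
    n_windows = (h // m) * (w // m)
    tokens_per_window = m * m
    return n_windows * tokens_per_window ** 2


def window_index(i, j, h, w, m, shift=0):
    """Window (row, col) of token (i, j) after a cyclic shift of the grid.

    shift=0 gives the regular partition; shift=m // 2 gives the shifted one.
    Simplification: tokens that wrap around the border share a window here,
    whereas the Swin transformer masks attention between such tokens.
    """
    return ((i - shift) % h) // m, ((j - shift) % w) // m


if __name__ == "__main__":
    h = w = 8  # assumed token grid
    m = 4      # assumed window size
    print("global pairs:  ", attention_pairs_global(h, w))
    print("windowed pairs:", attention_pairs_windowed(h, w, m))
    # Two horizontally adjacent tokens on either side of a window boundary.
    a, b = (3, 3), (3, 4)
    for shift in (0, m // 2):
        same = window_index(*a, h, w, m, shift) == window_index(*b, h, w, m, shift)
        print(f"shift={shift}: tokens {a} and {b} share a window: {same}")
```

In the regular partition the two neighbouring tokens sit in different windows and cannot attend to each other; in the shifted partition they share a window, which is how alternating layers pass information across window boundaries.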
Figure 2. PET image comparison across five state-of-the-art AI algorithms on 6.25% low-count PET reconstruction.
(A) Representative 18F-FDG PET scan of a 29-year-old female patient with Hodgkin lymphoma (HL). The enlarged patches are shown in the second panel (yellow arrows: basal ganglia). The structural similarity index (SSIM) and visual information fidelity (VIF) metrics are presented under each PET image. (B) Representative 18F-FDG PET scan of a 14-year-old male patient with HL. The SUVmax of the lesion (delineated by the red circle) and of the liver for this patient are shown under each PET image. (C) The same patient as in (B). The small lesion (less than 1.5 cm3; 5 mm < width < 10 mm; height > 10 mm; red arrow) is enhanced by SwinIR, with the lesion-to-liver SUVmax contrast retained. The lesions (black arrow) are also clearly depicted by SwinIR, whereas the other reconstructions blur and merge them. (D) Representative 18F-FDG PET scan of a 17-year-old female patient from the external Tübingen testing cohort. All AI algorithms successfully denoise the 6.25% low-count images and provide diagnostic conspicuity of the lesion (red circle; red arrows) similar to that of the standard-dose PET, demonstrating that all AI algorithms generalize across institutions. SwinIR shows superiority in retaining lesion-to-liver contrast and structural fidelity.
Figure 3. Quantitative metrics over the dose reduction spectrum.
The five AI algorithms were adapted for the low-count PET reconstruction task. The AI models were trained on 75%, 50%, 25%, 12.5%, 6.25%, and 1% of the clinical standard 18F-FDG dose PET/MRI images from the primary Stanford cohort. One round of cross-validation was adopted. The trained models were then evaluated on the corresponding low-count PET/MRI test set. The performance on the Stanford internal test set is shown in the top panel, and the performance on the external Tübingen test cohort is shown in the bottom panel. Measures of performance include structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), and visual information fidelity (VIF). For all three metrics, higher values represent better reconstruction. All comparisons are made against the ground-truth standard-count PET images. The blue line represents the original low-count PET images without AI enhancement and serves as the baseline for direct comparisons.
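Of the three metrics, PSNR has the simplest closed form: PSNR = 10·log10(R² / MSE), where R is the intensity range and MSE is the mean squared error against the standard-count reference. The sketch below is a plain-Python illustration. Flattening the image to a list and passing `data_range` explicitly are assumptions made for brevity, and the pixel values are made up. SSIM and VIF need local statistics and are not sketched here.

```python
import math


def psnr(reference, restored, data_range):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are flat sequences of pixel values; data_range is the span of
    valid intensities (an assumed input, e.g. the reference maximum).
    """
    if not reference or len(reference) != len(restored):
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 4-pixel "images", not data from the study.
    standard_count = [0.0, 1.0, 0.5, 0.25]
    low_count = [0.2, 0.7, 0.6, 0.05]
    ai_restored = [0.05, 0.95, 0.5, 0.3]
    print(f"low-count PSNR: {psnr(standard_count, low_count, 1.0):.2f} dB")
    print(f"restored PSNR:  {psnr(standard_count, ai_restored, 1.0):.2f} dB")
```

A higher PSNR means the image is closer to the standard-count reference, which matches the "higher is better" reading of the figure.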
Figure 4. PET image comparisons across the dose reduction spectrum from 75% to 1% (of the clinical standard 3 MBq/kg 18F-FDG dose).
Representative 18F-FDG PET scan of a 13-year-old male patient with diffuse large B-cell lymphoma (DLBCL). The SUVmax of two tumors and the liver was measured for each PET image. SwinIR and U-Net are our demonstration models of choice, representing the transformer and CNN categories, respectively. (A) The coronal slice of the standard-dose PET, showing the chest region. The SUVmax of two tumors and the liver was measured for direct comparison. (B) The original low-count PET images, with SUVmax measured in the same tumor and liver regions as in (A). (C) U-Net reconstructed low-count PET images. The red arrows point to corrupted reconstruction in the mediastinum and erroneous upstaging in the liver. Red rectangle: enlargement of false upstaging in the liver area. U-Net-75p = U-Net reconstructed 75% low-count PET image. (D) SwinIR reconstructed low-count PET images. The red arrows point to the erroneous upstaging. Red rectangle: enlargement of the degraded reconstruction in the liver. SwinIR-75p = SwinIR reconstructed 75% low-count PET image.
Figure 5. Representative discrepancy in reconstruction quality between different anatomical regions over the course of model training.
SwinIR is the model of choice for this demonstration. The performance is based on the primary Stanford PET/MRI cohort. The line chart shows the SSIM metric of the Stanford validation set over models at different training epochs. PET images illustrate cases from the Stanford testing set. The patches (top panel) are enlarged crops of a, b, and c, respectively. As the training progresses from epoch 4 to epoch 24, the structure of the basal ganglia within the brain becomes better reconstructed, while the small lesion (less than 1 cm3) within the liver gets over-smoothed.

