Purpose: Measurement variance affects the clinical effectiveness of PET-based measurement as a semiquantitative imaging biomarker for cancer response in individual patients and for planning clinical trials. In this study, we measured test-retest reproducibility of SUV measurements under clinical practice conditions and recorded recognized deviations from protocol compliance.
Methods: Instrument performance calibration, display, and analyses conformed to manufacturer recommendations. Baseline clinical (18)F-FDG PET/CT examinations were performed and then repeated 1 to 7 days later. The intended uptake period between injection of 12 mCi of FDG tracer and scan initiation was the same for both examinations. Avidity of uptake was measured in 62 tumors in 21 patients as SUV for the maximum voxel (SUV(max)) and for the mean of sampled tumor voxels (SUV(mean)).
Results: SUV(max) ranged from 1.07 to 21.47 and SUV(mean) from 0.91 to 14.69. Intraclass correlation coefficients between the 2 examinations for log SUV(max) and log SUV(mean) were 0.93 (95% confidence interval [CI], 0.88-0.95) and 0.92 (95% CI, 0.87-0.95), respectively. Correlation analysis failed to show an effect of uptake period variation on SUV measurements between the 2 examinations, suggesting additional sources of noise. At the 95% CI, the threshold for relative difference from baseline was ±49% for SUV(max) and ±44% for SUV(mean).
Conclusions: Variance of SUV for FDG-PET/CT in current clinical practice in a single institution was greater than expected when compared with benchmarks reported under stringent efficacy study settings. Under comparable clinical practice conditions, interpretation of changes in tumor avidity in individuals and assumptions in planning clinical trials may be affected.
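As an illustration of how a relative-difference threshold like those in the Results can be derived from paired test-retest data, the following is a minimal sketch that assumes log-transformed SUV ratios and normal-theory 95% limits. The function name `relative_change_limits` and the sample values are hypothetical and are not taken from the study. The abstract does not state the exact computation the authors used, and limits back-transformed from the log scale are asymmetric, so this sketch should not be expected to reproduce the symmetric ± values reported above.

```python
import math
import statistics


def relative_change_limits(baseline, repeat, z=1.96):
    """Return (lower, upper) percent limits for relative change from baseline.

    Assumption: the log ratio ln(repeat / baseline) is approximately normal,
    so mean +/- z * SD on the log scale gives the limits, which are then
    back-transformed to percent change.
    """
    if len(baseline) != len(repeat) or len(baseline) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    # Per-tumor log ratio between the repeat and baseline examinations
    diffs = [math.log(r / b) for b, r in zip(baseline, repeat)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation
    lower = (math.exp(mean_d - z * sd_d) - 1.0) * 100.0
    upper = (math.exp(mean_d + z * sd_d) - 1.0) * 100.0
    return lower, upper


# Hypothetical SUV(max) values for five tumors; not data from the study
baseline = [2.1, 5.4, 8.0, 12.3, 3.3]
repeat = [2.4, 4.9, 8.8, 11.0, 3.9]
lower, upper = relative_change_limits(baseline, repeat)
print(f"95% limits for relative change: {lower:.1f}% to {upper:.1f}%")
```

Under these assumptions, a change in an individual tumor that falls inside the returned limits cannot be distinguished from test-retest noise.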