Purpose: This study evaluated the variability in quantitation of positron emission tomography (PET) data acquired within the context of a multicenter consortium.
Methods: PET quantitation phantoms designed by American Association of Physicists in Medicine/Society of Nuclear Medicine Task Group 145 were sent to the ten member sites of the Pediatric Brain Tumor Consortium (PBTC), an NIH-funded research consortium investigating the biology of and therapies for brain tumors in children. The phantoms were water-filled cylinders (18.6 cm inside height and 20.4 cm inside diameter) based on the standard ACR phantom, with four small "hot" cylinders of varying diameters (8, 12, 16, and 25 mm, all 38 mm in height) consisting of an equilibrium mixture of 68Ge/68Ga in an epoxy matrix. At each site, the operator added the appropriate amount of 18F to the background water to attain a feature-to-background ratio of roughly 4:1. The phantom was imaged and reconstructed as if it were a brain PET scan for the PBTC. An approximately 12 mm circular region of interest (ROI) was placed over each feature and in a central area of the background. The mean and maximum pixel values for each ROI were requested from the local sites in units of activity concentration (Bq/mL) and as the standardized uptake value (SUV) (g/mL) based on body weight. The activity concentration was normalized by the decay-corrected known activity concentration for the features and reported as the absolute recovery coefficient (RC). In addition, central analyses were performed by two observers.
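The RC and body-weight SUV calculations described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the consortium's analysis code: the function names, argument names, and the numbers in the usage lines are hypothetical, and the 68Ge half-life (about 270.95 days) is the commonly tabulated physical value rather than a figure taken from this study.

```python
import math

# Assumed physical half-life of 68Ge in days (68Ga is in equilibrium with it).
GE68_HALF_LIFE_DAYS = 270.95


def decay_corrected_concentration(a0_bq_per_ml, elapsed_days,
                                  half_life_days=GE68_HALF_LIFE_DAYS):
    """Known activity concentration decayed from calibration time to scan time."""
    decay_constant = math.log(2) / half_life_days
    return a0_bq_per_ml * math.exp(-decay_constant * elapsed_days)


def recovery_coefficient(measured_bq_per_ml, a0_bq_per_ml, elapsed_days):
    """Absolute RC: measured ROI concentration over the decay-corrected known value."""
    known = decay_corrected_concentration(a0_bq_per_ml, elapsed_days)
    return measured_bq_per_ml / known


def suv_bodyweight(concentration_bq_per_ml, administered_activity_bq, body_weight_g):
    """Body-weight SUV in g/mL: concentration divided by activity per gram."""
    return concentration_bq_per_ml / (administered_activity_bq / body_weight_g)


# Hypothetical usage: a feature calibrated at 20 kBq/mL, scanned 100 days later.
rc = recovery_coefficient(measured_bq_per_ml=12000.0,
                          a0_bq_per_ml=20000.0,
                          elapsed_days=100.0)
suv = suv_bodyweight(concentration_bq_per_ml=12000.0,
                     administered_activity_bq=370e6,
                     body_weight_g=70000.0)
```

Keeping the decay correction in its own function makes the reference time explicit, which is one place where unit or timing slips can enter a site-level SUV calculation.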
Results: The ten sites successfully imaged the phantom within 5 months and submitted the quantitative results and the phantom image data to the PBTC Operations and Biostatistics Center. The local site-based and central analyses yielded similar mean values for RC. Local site-based SUV measurements of the hot cylindrical features showed greater variability than the central analysis (coefficient of variation [COV] range of 29.9%-42.8% compared to 7.7%-23.2%). Correcting for miscalculations in the local site-reported SUVs substantially reduced the variation to levels similar to the central analysis (COV range of 8.8%-18.4%) and yielded mean SUV values similar to those from the central analysis. In the central analysis, the use of mean SUV in place of maximum SUV for an ROI of fixed size substantially reduced the variation in the SUV values (COV ranges of 7.7%-11.3% vs. 9.3%-23.2%).
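The COV figures above express the spread of SUV values across sites as a percentage of their mean. A minimal sketch of that calculation follows. The function name is hypothetical, the SUV values are made-up illustrative numbers rather than study data, and the abstract does not state whether a sample or population standard deviation was used, so both are offered.

```python
import statistics


def cov_percent(values, sample=True):
    """Coefficient of variation in percent: standard deviation over mean, times 100.

    sample=True uses the sample (n - 1) standard deviation; sample=False uses
    the population form. Which one the study used is an assumption here.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("COV is undefined when the mean is zero")
    spread = statistics.stdev(values) if sample else statistics.pstdev(values)
    return 100.0 * spread / mean


# Illustrative (not measured) mean-SUV readings of one feature from ten sites.
illustrative_suvs = [3.6, 3.9, 4.1, 3.4, 3.8, 4.0, 3.7, 3.5, 4.2, 3.8]
cov = cov_percent(illustrative_suvs)
```

Applying the same function once to mean SUVs and once to maximum SUVs for the same ROI would give the kind of side-by-side comparison reported for the central analysis.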
Conclusions: Based on this investigation, an SUV variability in the range of 10%-25% due solely to instrument and analysis factors can be expected in the context of a multicenter consortium if a central reading is used and quality assurance and quality control procedures are followed. The overall SUV variability can be expected to be larger than this because of biological and protocol factors.