Purpose: To identify prognostic imaging biomarkers in non-small cell lung cancer (NSCLC) by means of a radiogenomics strategy that integrates gene expression and medical images in patients for whom survival outcomes are not available by leveraging survival data in public gene expression data sets.
Materials and methods: A radiogenomics strategy for associating image features with clusters of coexpressed genes (metagenes) was defined. First, a radiogenomics correlation map is created for a pairwise association between image features and metagenes. Next, predictive models of metagenes are built in terms of image features by using sparse linear regression. Similarly, predictive models of image features are built in terms of metagenes. Finally, the prognostic significance of the predicted image features are evaluated in a public gene expression data set with survival outcomes. This radiogenomics strategy was applied to a cohort of 26 patients with NSCLC for whom gene expression and 180 image features from computed tomography (CT) and positron emission tomography (PET)/CT were available.
Results: There were 243 statistically significant pairwise correlations between image features and metagenes of NSCLC. Metagenes were predicted in terms of image features with an accuracy of 59%-83%. One hundred fourteen of 180 CT image features and the PET standardized uptake value were predicted in terms of metagenes with an accuracy of 65%-86%. When the predicted image features were mapped to a public gene expression data set with survival outcomes, tumor size, edge shape, and sharpness ranked highest for prognostic significance.
Conclusion: This radiogenomics strategy for identifying imaging biomarkers may enable a more rapid evaluation of novel imaging modalities, thereby accelerating their translation to personalized medicine.