Purpose: Reliable and accurate predictive models are necessary to drive the success of radiomics. Our aim was to identify the optimal radiomics-based machine learning method for isocitrate dehydrogenase (IDH) genotype prediction in diffuse gliomas.
Methods: Eight classical machine learning methods were evaluated for stability and performance in pre-operative IDH genotype prediction. A total of 126 patients were enrolled. Overall, 704 radiomic features extracted from pre-operative MR images were analyzed. The patients were randomly assigned to the training set or the validation set at a ratio of 2:1. Feature selection and classification model training were performed on the training set, whereas the predictive performance and stability of the models were independently assessed on the validation set.
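As an illustration of the 2:1 random assignment described in the Methods, a minimal sketch follows. The function name `split_patients`, the integer patient identifiers, and the fixed seed are assumptions made for this example; the abstract does not state how the randomization was implemented.

```python
import random


def split_patients(patient_ids, train_parts=2, val_parts=1, seed=0):
    """Randomly split patient IDs into training and validation sets.

    Hypothetical helper: the ratio defaults to 2:1 as in the abstract,
    but the seed and the shuffling procedure are assumptions.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(ids)
    n_train = len(ids) * train_parts // (train_parts + val_parts)
    return ids[:n_train], ids[n_train:]


# 126 patients, as reported; the IDs 0..125 are placeholders.
train_ids, val_ids = split_patients(range(126))
print(len(train_ids), len(val_ids))  # 84 42
```

With 126 patients, a 2:1 ratio gives 84 training and 42 validation cases. Feature selection and model fitting would then use only `train_ids`, keeping `val_ids` untouched until evaluation.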
Results: Random Forest (RF) showed high predictive performance (accuracy 0.885 ± 0.041, AUC 0.931 ± 0.036), whereas neural network (NN) (accuracy 0.829 ± 0.064, AUC 0.878 ± 0.052) and flexible discriminant analysis (FDA) (accuracy 0.851 ± 0.049, AUC 0.875 ± 0.057) displayed comparatively lower predictive performance. With regard to stability, RF also showed high robustness against data perturbation (relative standard deviation, RSD, 3.87%).
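The RSD figure can be checked with a short calculation, assuming RSD is defined as the standard deviation divided by the mean, expressed as a percentage. Under that assumption, the reported RF AUC (0.931 ± 0.036) appears consistent with the stated 3.87%. The function name `relative_sd_percent` is introduced here for illustration only.

```python
def relative_sd_percent(mean, sd):
    """Relative standard deviation in percent: 100 * sd / |mean|.

    Assumed definition; the abstract reports the RSD value but does
    not spell out the formula or the metric it was computed on.
    """
    if mean == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return 100.0 * sd / abs(mean)


# Reported RF AUC: mean 0.931, standard deviation 0.036.
rf_auc_rsd = relative_sd_percent(0.931, 0.036)
print(round(rf_auc_rsd, 2))  # 3.87
```

Because the inputs are themselves rounded to three decimals, the agreement is indicative rather than exact.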
Conclusions: RF is a promising machine learning method for predicting IDH genotype. An accurate and reliable model can assist in the initial diagnostic evaluation and treatment planning for patients with diffuse glioma.
Keywords: Diffuse glioma; Isocitrate dehydrogenase; Machine learning; Magnetic resonance imaging; Radiomics.