Objective: The optimum selection and sequencing of combination antiretroviral therapy to maintain viral suppression can be challenging. The HIV Resistance Response Database Initiative has pioneered the development of computational models that predict the virological response to drug combinations. Here we describe the development and testing of random forest models to power an online treatment selection tool.
Methods: A total of 5752 treatment change episodes were selected to train a committee of 10 models to predict the probability of virological response to a new regimen. The input variables were antiretroviral treatment history, baseline CD4 cell count, viral load and genotype, drugs in the new regimen, time from treatment change to follow-up and follow-up viral load values. The models were assessed during cross-validation and with an independent set of 50 treatment change episodes by plotting receiver operating characteristic curves, and their performance was compared with that of genotypic sensitivity scores from rules-based genotype interpretation systems.
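As an illustration of how a committee of models can yield a single probability of response, the sketch below averages the outputs of its members and applies a cut-off. The averaging rule, the 0.5 threshold, and the names `committee_probability`, `predict_response` and `toy_committee` are assumptions made for this example; the abstract does not state how the 10 models' outputs were combined, and the stand-in models here are fixed-probability placeholders, not random forests.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical type: a model maps a feature vector to a probability of response.
Model = Callable[[Sequence[float]], float]


def committee_probability(models: Sequence[Model], features: Sequence[float]) -> float:
    """Average the members' probabilities (an assumed combination rule)."""
    if not models:
        raise ValueError("committee must contain at least one model")
    return mean(model(features) for model in models)


def predict_response(models: Sequence[Model], features: Sequence[float],
                     threshold: float = 0.5) -> bool:
    """Call a case a responder when the averaged probability reaches the
    threshold (0.5 is an assumed cut-off, not taken from the source)."""
    return committee_probability(models, features) >= threshold


# Ten placeholder models, each returning a fixed probability regardless of input.
toy_committee = [
    (lambda features, p=p: p)
    for p in (0.2, 0.4, 0.6, 0.8, 0.9, 0.7, 0.5, 0.3, 0.6, 0.5)
]

if __name__ == "__main__":
    print(committee_probability(toy_committee, [0.0]))
    print(predict_response(toy_committee, [0.0]))
```

In a real setting each member would be a trained model, and the feature vector would encode the input variables listed above.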
Results: The models achieved an area under the curve during cross-validation of 0.77-0.87 (mean = 0.82), accuracy of 72-81% (mean = 77%), sensitivity of 62-80% (mean = 67%) and specificity of 75-89% (mean = 81%). When tested with the 50 test cases, the area under the curve was 0.70-0.88, accuracy 64-82%, sensitivity 62-80% and specificity 68-95%. The genotypic sensitivity scores achieved an area under the curve of 0.51-0.52, overall accuracy of 54-56%, sensitivity of 43-64% and specificity of 41-73%.
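For readers less familiar with these measures, the following minimal sketch shows how area under the curve, accuracy, sensitivity and specificity can be computed from predicted probabilities and observed outcomes. The function names, the 0.5 classification threshold and the six-case toy data are assumptions for illustration only and are unrelated to the study data.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (responder, non-responder) pairs in which the
    responder receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) at an assumed threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case of each class")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # responders correctly identified
    specificity = tn / (tn + fp)   # non-responders correctly identified
    return accuracy, sensitivity, specificity


# Toy data: 1 = virological response observed, 0 = no response.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

if __name__ == "__main__":
    print(roc_auc(toy_labels, toy_scores))
    print(confusion_metrics(toy_labels, toy_scores))
```

An area under the curve near 0.5, as reported for the genotypic sensitivity scores, corresponds to ranking responders above non-responders about as often as chance would.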
Conclusion: The models achieved a consistent, high level of accuracy in predicting treatment responses, which was markedly superior to that of genotypic sensitivity scores. The models are being used to power an experimental system now available via the Internet.