Objectives: Accurately predicting disease progression from a set of predictive variables is an important aspect of clinical work. For binary outcomes, the classical approach is to develop prognostic logistic regression (LR) models. As an alternative, machine learning algorithms have been proposed, among which artificial neural networks (ANN) have become popular over recent decades. Although some studies have compared the predictive accuracy of LR and ANN models, concerns have been voiced about their methodological quality. Our comparison has the advantage of being based on two large, independent data sets, which allows for elaborate model development and independent validation.
Methods: A learning data set of 1754 prospectively recruited patients with acute ischemic stroke was drawn from the German Stroke Database. Using LR and ANN, two prognostic models were developed to predict restitution of functional independence and survival after 100 days. The resulting models were then applied to classify 1470 patients with acute ischemic stroke; this test data set was collected independently of the learning data. Error fractions in the test data were determined, and differences in error fractions between the algorithms were calculated with 95% confidence intervals.
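The error-fraction comparison described in the Methods can be illustrated with a minimal sketch. It assumes that both algorithms classify the same test patients and that the confidence interval for the difference is a Wald interval for paired proportions; the abstract does not state which interval method was used, so this choice, along with the function names `error_fraction` and `paired_error_difference`, is an assumption made for illustration only.

```python
import math


def error_fraction(y_true, y_pred):
    """Fraction of test cases whose predicted class differs from the observed class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    return errors / len(y_true)


def paired_error_difference(y_true, pred_a, pred_b, z=1.96):
    """Difference in error fractions (A minus B) with a Wald interval.

    Assumption: both classifiers are evaluated on the same test cases, so
    only the discordant cases (one classifier wrong, the other right)
    contribute to the difference and its variance.
    z=1.96 corresponds to an approximate 95% confidence level.
    """
    n = len(y_true)
    if not (len(pred_a) == n and len(pred_b) == n):
        raise ValueError("all inputs must have the same length")
    # Cases where only classifier A is wrong, and where only B is wrong.
    only_a_wrong = sum(a != t and b == t for t, a, b in zip(y_true, pred_a, pred_b))
    only_b_wrong = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    diff = (only_a_wrong - only_b_wrong) / n
    # Wald standard error for a difference of paired proportions.
    discordant = only_a_wrong + only_b_wrong
    se = math.sqrt(discordant - (only_a_wrong - only_b_wrong) ** 2 / n) / n
    return diff, diff - z * se, diff + z * se
```

With real data, `y_true` would hold the observed binary outcomes of the test patients and `pred_a`, `pred_b` the classifications from the two algorithms. An interval that excludes zero would point to a difference in error fractions between the algorithms.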
Results: For most prognostic models, error fractions in the test data were below 40%. There was no difference between the algorithms except for the model predicting completely restituted versus incompletely restituted or deceased patients (difference in error fractions = 4.01% [95% confidence interval 2.10-5.96%], p = 0.0001).
Conclusions: Conscientiously applied LR remains the gold standard for prognostic modelling; however, ANN can serve as an automated, "quick and easy" alternative for multivariate analysis.