Background: Identifying successful public health ideas and practices is a difficult challenge owing to the presence of complex baseline characteristics that can affect health outcomes. We propose the use of machine learning algorithms to predict life expectancy at birth, and then compare health-related characteristics of the under- and overachievers (i.e., municipalities that have a worse or better outcome than predicted, respectively).
Methods: Our outcome was life expectancy at birth for Brazilian municipalities, and we used as predictors 60 local characteristics that are not directly controlled by public health officials (e.g., socioeconomic factors).
Results: The highest predictive performance was achieved by an ensemble of machine learning algorithms (cross-validated mean squared error of 0.168), which represents a 35% gain in comparison with standard decision trees. Overachievers presented better results regarding primary health care, such as higher coverage of the massive multidisciplinary program Family Health Strategy. On the other hand, underachievers performed more cesarean deliveries and mammograms and had more life-support health equipment.
Conclusions: The findings suggest that analyzing the predicted value of a health outcome may bring insights about good public health practices.
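Illustrative sketch: the comparison of observed and predicted outcomes described above can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's implementation: a one-predictor least-squares line stands in for the ensemble of machine learning algorithms, the toy data are invented for illustration, and the function names (`fit_line`, `cross_val_predict`, `classify`, `mse`) are hypothetical. The one idea it keeps from the text is that each municipality is predicted by a model that was not trained on it, and is then labeled by the sign of observed minus predicted life expectancy.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope).

    Stand-in model only: the study used an ensemble of algorithms.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def cross_val_predict(xs, ys, k=5):
    """Out-of-fold predictions: every unit is predicted by a model fit without it."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = intercept + slope * xs[i]
    return preds


def classify(ys, preds):
    """Label each unit by the sign of its residual (observed minus predicted)."""
    residuals = [y - p for y, p in zip(ys, preds)]
    labels = [
        "overachiever" if r > 0 else "underachiever" if r < 0 else "as predicted"
        for r in residuals
    ]
    return residuals, labels


def mse(ys, preds):
    """Mean squared error between observed and predicted values."""
    return mean((y - p) ** 2 for y, p in zip(ys, preds))


# Hypothetical toy data: one socioeconomic index and life expectancy in years.
index = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
life_expectancy = [70.2, 70.4, 71.3, 71.2, 72.4, 72.3, 73.1, 73.8, 73.9, 74.6]

predicted = cross_val_predict(index, life_expectancy, k=5)
residuals, labels = classify(life_expectancy, predicted)
print("cross-validated MSE:", mse(life_expectancy, predicted))
for x, r, label in zip(index, residuals, labels):
    print(x, round(r, 3), label)
```

With the labels in hand, health-related characteristics (for example, primary care coverage) could then be compared between the two groups, which is the step the Results paragraph reports.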