A Comparative Analysis of Multidimensional COVID-19 Poverty Determinants: An Observational Machine Learning Approach

New Gener Comput. 2023;41(1):155-184. doi: 10.1007/s00354-023-00203-8. Epub 2023 Feb 1.

Abstract

Poverty remains a glaring issue in the twenty-first century, despite concerted efforts by organizations to eliminate it. Predicting poverty using machine learning can offer practical models to support its elimination. This paper uses Multidimensional Poverty Index data from the Oxford Poverty and Human Development Initiative for the years 2019 and 2021 to predict multidimensional poverty before and during the pandemic. Several poverty indicators under health, education and living standards are taken into consideration. The work applies data analysis techniques such as feature correlation, feature selection, and graphical visualization to answer research questions about poverty. Various machine learning algorithms, namely Multiple Linear Regression, Decision Tree Regressor, Random Forest Regressor, XGBoost, AdaBoost, Gradient Boosting, Linear Support Vector Regressor (SVR), Ridge Regression, Lasso Regression, ElasticNet Regression, and K-Nearest Neighbor Regression, are implemented to predict poverty across four datasets at the national and subnational levels. Regularization is used to improve the performance of the models, and cross-validation is used to estimate that performance. Through a rigorous analysis and comparison of the models, this work identifies important poverty determinants and concludes that, overall, the Ridge Regression model performs best, with the highest R² score.
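The comparison described in the abstract, several regressors ranked by cross-validated R², can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_regression` as a stand-in for the poverty indicators, covers only a subset of the listed models, and every hyperparameter (`alpha`, `l1_ratio`, `n_neighbors`, the five folds) as well as the helper name `compare_models` is an illustrative assumption.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an indicator table (NOT the OPHI MPI data).
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=5, noise=10.0, random_state=0
)

# A subset of the regressors named in the abstract; hyperparameters are assumed.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),                           # L2 regularization
    "lasso": Lasso(alpha=0.1),                           # L1 regularization
    "elasticnet": ElasticNet(alpha=0.1, l1_ratio=0.5),   # mix of L1 and L2
    "knn": KNeighborsRegressor(n_neighbors=5),
}

# Shuffled 5-fold split with a fixed seed so the comparison is repeatable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)


def compare_models(models, X, y, cv):
    """Return the mean cross-validated R^2 for each model (hypothetical helper)."""
    scores = {}
    for name, model in models.items():
        # Scaling lives inside the pipeline, so it is refit on each training fold.
        pipeline = make_pipeline(StandardScaler(), model)
        fold_r2 = cross_val_score(pipeline, X, y, cv=cv, scoring="r2")
        scores[name] = float(np.mean(fold_r2))
    return scores


scores = compare_models(models, X, y, cv)
for name, r2 in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>10s}: mean R^2 = {r2:.3f}")
```

Placing the scaler inside the pipeline keeps each held-out fold out of the preprocessing step, so the reported R² is not inflated by information leaking from the evaluation folds. The ranking on synthetic data says nothing about the ranking reported in the paper.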

Keywords: Feature selection; Machine learning; Multidimensional; Poverty; Prediction; Regression.