Unsupervised Learning Applied to the Stratification of Preterm Birth Risk in Brazil with Socioeconomic Data

Int J Environ Res Public Health. 2022 May 5;19(9):5596. doi: 10.3390/ijerph19095596.

Abstract

Preterm birth (PTB) poses risks and challenges for the survival of the newborn. Despite many advances in research, not all causes of PTB are yet clear. PTB risk is understood to be multifactorial and can also be associated with socioeconomic factors. This article therefore uses unsupervised learning techniques to stratify PTB risk in Brazil using only socioeconomic data. From datasets made publicly available by the Federal Government of Brazil, a new dataset was generated containing municipality-level socioeconomic data and a PTB occurrence rate. This dataset was processed using several unsupervised learning techniques, including k-means, principal component analysis (PCA), and density-based spatial clustering of applications with noise (DBSCAN). After validation, four clusters with high levels of PTB occurrence were identified, as well as three with low levels. The clusters with high PTB consisted mostly of municipalities with lower levels of education, lower-quality public services (such as basic sanitation and garbage collection), and a smaller proportion of white residents. The regional distribution of the clusters was also examined: clusters with high PTB were located mostly in the North and Northeast regions of Brazil. The results indicate that quality of life and the provision of public services have a positive influence on reducing PTB risk.
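To make the stratification idea concrete, the following is a minimal sketch of one step named in the abstract: clustering municipalities on socioeconomic features with k-means, then comparing the mean PTB rate of each cluster. It is not the authors' pipeline. The two feature columns, the six rows of values, the PTB rates, and all function names are hypothetical toy data invented for illustration, and the PCA and DBSCAN steps are omitted.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [mean(c) for c in zip(*members)]
    return labels

def ptb_by_cluster(labels, ptb_rates):
    """Mean PTB rate of the municipalities assigned to each cluster."""
    groups = {}
    for l, r in zip(labels, ptb_rates):
        groups.setdefault(l, []).append(r)
    return {l: mean(rs) for l, rs in groups.items()}

# Hypothetical toy data (NOT from the study): one row per municipality,
# columns = [share of adults with basic education, sanitation coverage].
features = [
    [0.90, 0.95], [0.88, 0.92], [0.91, 0.97],
    [0.55, 0.40], [0.60, 0.35], [0.52, 0.45],
]
# Hypothetical PTB occurrence rates, one per municipality. They are kept
# out of the clustering input, mirroring the idea of stratifying on
# socioeconomic data only and inspecting PTB afterwards.
ptb_rates = [9.0, 9.5, 8.8, 12.5, 13.0, 12.1]

labels = kmeans(standardize(features), k=2)
rates = ptb_by_cluster(labels, ptb_rates)
for cluster, rate in sorted(rates.items()):
    print(f"cluster {cluster}: mean PTB rate {rate:.2f}")
```

In this sketch the PTB rate is used only to describe the clusters after they are formed, so any separation in PTB between clusters comes from the socioeconomic features alone. On real data the number of clusters would need to be chosen and validated, for example with a silhouette or elbow analysis, rather than fixed at two.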

Keywords: Brazil; PTB risk; clustering; preterm birth; unsupervised learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brazil / epidemiology
  • Female
  • Humans
  • Infant, Newborn
  • Pregnancy
  • Premature Birth* / epidemiology
  • Premature Birth* / etiology
  • Quality of Life
  • Risk Factors
  • Socioeconomic Factors
  • Unsupervised Machine Learning

Grants and funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)—Finance Code 001.