Population cluster data to assess the urban-rural split and electrification in Sub-Saharan Africa

Sci Data. 2021 Apr 23;8(1):117. doi: 10.1038/s41597-021-00897-9.


Human settlements are usually nucleated around man-made central points or distinctive natural features, forming clusters that vary in shape and size. However, population distribution in the geosciences is often represented in the form of pixelated rasters. Rasters indicate population density at predefined spatial resolutions, but cannot capture the actual shape or size of settlements. Here we suggest a methodology that translates high-resolution raster population data into vector-based population clusters. We use open-source data and develop an open-access algorithm tailored for low- and middle-income countries with data scarcity issues. Each cluster carries attributes indicating its population, electrification rate and urban-rural categorization. Results are validated against national electrification rates provided by the World Bank and data from selected Demographic and Health Surveys (DHS). We find that our modeled national electrification rates are consistent with the rates reported by the World Bank, while the modeled urban/rural classification is 88% accurate. By delineating settlements, this dataset can complement existing raster population data in studies such as energy planning, urban planning and disease response.
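
To illustrate the general idea of turning a population raster into discrete clusters, the following is a minimal sketch and not the algorithm developed in the paper, whose details are not given in this abstract. It assumes that a cluster is any group of populated cells touching each other (including diagonally), and it sums the population of each group. The function name `cluster_raster`, the `threshold` parameter and the toy grid are hypothetical, chosen only for illustration; converting the cell groups into vector polygons and attaching electrification or urban-rural attributes is left out.

```python
from collections import deque


def cluster_raster(grid, threshold=0.0):
    """Group adjacent populated cells into clusters (8-connectivity).

    Assumption for illustration only: a cell belongs to a settlement
    if its population value is strictly greater than `threshold`.
    Returns a list of dicts with the member cells and total population.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    clusters = []

    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from an unvisited populated cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            population = 0.0
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                population += grid[cr][cc]
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and not seen[nr][nc]
                                and grid[nr][nc] > threshold):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            clusters.append({"cells": cells, "population": population})
    return clusters


# Hypothetical 4 x 5 population raster (people per cell).
toy_raster = [
    [5, 3, 0, 0, 0],
    [2, 0, 0, 0, 7],
    [0, 0, 0, 0, 4],
    [0, 1, 0, 0, 0],
]

clusters = cluster_raster(toy_raster)
for i, cluster in enumerate(clusters):
    print(i, len(cluster["cells"]), cluster["population"])
```

On this toy grid the sketch yields three clusters of 3, 2 and 1 cells. In practice the same grouping is usually done with a raster library and followed by polygonization, so that each cluster gets an outline reflecting the settlement's shape.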

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Electric Power Supplies*
  • Humans
  • Population Density*
  • Rural Population / statistics & numerical data*
  • Urban Population / statistics & numerical data*