A data science based standardized Gini index as a Lorenz dominance preserving measure of the inequality of distributions

PLoS One. 2017 Aug 10;12(8):e0181572. doi: 10.1371/journal.pone.0181572. eCollection 2017.


The Gini index is a measure of the inequality of a distribution that can be derived from Lorenz curves. While commonly used in, e.g., economic research, it suffers from ambiguity because it does not preserve Lorenz dominance. Here, investigation of large sets of empirical income distributions of the world's countries over several years indicated, first, that the Gini indices are centered on a value of 33.33%, corresponding to the Gini index of the uniform distribution, and second, that the Lorenz curves of these distributions are consistent with Lorenz curves of log-normal distributions. This can be employed to provide a Lorenz dominance preserving equivalent of the Gini index. Therefore, a modified measure based on log-normal approximation and standardization of Lorenz curves is proposed. The so-called UGini index provides a meaningful and intuitive standardization on the uniform distribution, as this distribution characterizes societies that provide equal chances. The novel UGini index preserves Lorenz dominance. Analysis of the probability density distributions of the UGini index of the world's countries' income data indicated multimodality in two independent data sets. Applying Bayesian statistics provided a data-based classification of the world's countries' income distributions. The UGini index can be converted back into the classical index to preserve comparability with previous research.
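The two reference facts in the abstract can be checked numerically. The following is a minimal sketch in Python, not the authors' implementation and not the UGini index itself, whose exact definition is not given in this abstract. It computes the classical Gini index of a sample, confirms that evenly spaced values on (0, 1] give a value close to 1/3, and uses the standard closed form for a log-normal distribution, G = erf(σ/2). The function names `gini`, `lognormal_gini`, and `lognormal_sigma_for_gini` are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def gini(values):
    """Classical Gini index of a sample of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sample with a positive sum")
    # Rank-weighted sum over the sorted sample, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def lognormal_gini(sigma):
    """Gini index of a log-normal distribution with log-scale standard deviation sigma."""
    # Standard closed form: G = 2 * Phi(sigma / sqrt(2)) - 1 = erf(sigma / 2).
    return math.erf(sigma / 2.0)


def lognormal_sigma_for_gini(g):
    """Inverse of lognormal_gini: the sigma whose log-normal has Gini index g (0 < g < 1)."""
    return math.sqrt(2.0) * NormalDist().inv_cdf((g + 1.0) / 2.0)


# Evenly spaced values on (0, 1] stand in for a uniform distribution (an assumption
# made here to keep the example deterministic).
uniform_sample = [i / 10000 for i in range(1, 10001)]
g_uniform = gini(uniform_sample)

# The log-normal shape parameter whose Gini index equals the uniform reference 1/3.
sigma_third = lognormal_sigma_for_gini(1.0 / 3.0)

print(f"Gini of uniform sample: {g_uniform:.4f}")
print(f"sigma of log-normal with Gini 1/3: {sigma_third:.4f}")
```

The log-normal family is convenient in this context because its Lorenz curves are ordered by the single parameter σ and do not cross, so a Gini value mapped through this family identifies one curve unambiguously.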

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Humans
  • Income / statistics & numerical data
  • Models, Economic*
  • Models, Statistical*
  • Probability
  • Socioeconomic Factors*
  • Statistical Distributions

Grant support

This work has been funded by the Landesoffensive zur Entwicklung wissenschaftlich-ökonomischer Exzellenz (LOEWE), LOEWE-Zentrum für Translationale Medizin und Pharmakologie (JL). In particular, the work was related to the project „Datenbionische wissensentdeckende Arzneimittelforschung“, which aims at developing data science methods for knowledge discovery and for which the present work provides a theoretical advancement on a generic topic. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.