Name analysis to classify populations by ethnicity in public health: validation of Onomap in Scotland

Public Health. 2011 Oct;125(10):688-96. doi: 10.1016/j.puhe.2011.05.003. Epub 2011 Sep 9.


Objectives: Health inequalities between ethnic minorities and the general population persist, and efforts to address them are hampered by the inability to classify individuals' ethnicity accurately. 'Onomap', a new name-based ethnicity classification methodology, is intended to address this problem. This paper evaluates the diagnostic accuracy of Onomap in identifying population groups by ethnicity, and discusses applications to public health practice.

Study design: Onomap was applied to three independent individual-level reference datasets (birth registration, pupil census and a register of Polish health professionals) collected in Britain and Poland (n = 260,748).

Methods: Onomap classifications were compared with the ethnicity recorded in each reference dataset, which was treated as the 'gold standard'. Outcome measures included sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). Ninety-five percent confidence intervals were calculated and Chi-squared tests were applied.
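
The four outcome measures named in the Methods can be illustrated with a minimal Python sketch. The counts below are hypothetical and are not taken from the study, the function names are invented for illustration, and the normal-approximation (Wald) interval is an assumption, since the abstract does not state which confidence interval method was used.

```python
from math import sqrt


def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp: classified as the group, and in the group per the gold standard
    fp: classified as the group, but not in it per the gold standard
    fn: not classified as the group, but in it per the gold standard
    tn: not classified as the group, and not in it per the gold standard
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def wald_ci(successes, total, z=1.96):
    """Approximate 95% CI for a proportion (assumed method, not the paper's)."""
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts for one ethnic group, for illustration only.
tp, fp, fn, tn = 90, 10, 30, 870
metrics = diagnostic_metrics(tp, fp, fn, tn)
sens_low, sens_high = wald_ci(tp, tp + fn)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"sensitivity 95% CI: {sens_low:.3f} to {sens_high:.3f}")
```

With these made-up counts, a group that is a small share of the dataset shows high specificity and NPV alongside lower sensitivity, the same pattern the abstract reports for the non-British groupings.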

Results: Onomap identified the majority of those in the British participant group with high sensitivity and PPV (>95%) and low misclassification (<5%), although specificity and NPV were lowest in this group (56-87%). For all non-British groupings, specificity and NPV were high (>98%), but sensitivity and PPV were variable (17-89%). Differences in misclassification by gender were statistically significant. Using maiden name rather than married name in women improved classification outcomes for those born in the British Isles (0.53%, 95% confidence interval 0.26-0.8%; P < 0.001) but not for South Asian or Polish groups.
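
A comparison of misclassification between two groups, such as the gender comparison reported above, can be sketched as a Pearson Chi-squared test on a 2x2 table. This is an illustrative sketch only: the counts are hypothetical, the function name is invented, and the test is computed without a continuity correction, which may differ from the analysis in the paper.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson Chi-squared test for a 2x2 table, no continuity correction.

    Table layout (hypothetical example):
                 misclassified   correct
        men            a            b
        women          c            d

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # For 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, for illustration only.
statistic, p_value = chi2_2x2(50, 950, 80, 920)
print(f"chi-squared = {statistic:.2f}, p = {p_value:.4f}")
```

The closed-form p-value holds only for a 2x2 table (one degree of freedom); larger tables would need a general Chi-squared distribution function from a statistics library.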

Conclusions: Onomap offers an effective methodology for identifying population groups in both health-related and educational datasets, categorizing populations into a variety of ethnic groups. This evaluation suggests that it can successfully assist health researchers, planners and policy makers in identifying and addressing health inequalities.

Publication types

  • Validation Study

MeSH terms

  • Asia
  • Censuses
  • Ethnicity / classification*
  • Female
  • Health Personnel
  • Health Status Disparities
  • Humans
  • Male
  • Names*
  • Poland
  • Registries / statistics & numerical data
  • Reproducibility of Results
  • Scotland
  • Sensitivity and Specificity