[Name-based identification of cases of Turkish origin in the childhood cancer registry in Mainz]

Gesundheitswesen. 2006 Oct;68(10):643-9. doi: 10.1055/s-2006-927166.
[Article in German]


To date, few analyses of routine data on the health of migrants have been conducted in Germany. A major obstacle is that most data sources do not provide reliable information on the origin of migrants. While some sources record the nationality of the persons registered, this information does not allow one to identify migrants who have taken up German citizenship, a group that includes a substantial part of second-generation migrants. In this paper we demonstrate how a computer-aided, name-based algorithm can be used to identify persons of Turkish origin in the German Childhood Cancer Registry in Mainz, Germany. The performance of the algorithm, assessed against the gold standard of manual name assessment, was very good (sensitivity and specificity ≥ 0.975). In total, we identified 1774 of the 37,259 cases in the registry as being of Turkish origin. The name algorithm proved to be a useful tool for identifying Turkish migrants in routine data sources, thus avoiding potential bias due to changes in citizenship. This approach aims to improve migrant-sensitive health reporting and research in Germany. In the future, additional information on migrant status should be obtained during primary data collection so that health data can be provided for all migrant groups.
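The two ideas in the abstract, flagging cases by name and checking those flags against manual assessment, can be illustrated with a minimal Python sketch. The abstract does not describe the rules or name lists the algorithm actually uses, so everything here is an assumption made for illustration: the tiny example name lists, the rule that a match on either name part flags a case, and the function names `normalize`, `flag_by_name` and `sensitivity_specificity`. The sample cases and their manual labels are made up and are not registry data.

```python
import unicodedata

# Hypothetical, deliberately tiny name lists for illustration only.
# The abstract does not publish the lists or rules the authors used.
EXAMPLE_SURNAMES = {"yilmaz", "ozturk", "kaya"}
EXAMPLE_GIVEN_NAMES = {"mehmet", "ayse", "emre"}


def normalize(name):
    """Lowercase a name and strip diacritics so spelling variants compare equal."""
    # The dotless i has no Unicode decomposition, so map it explicitly.
    lowered = name.strip().lower().replace("ı", "i")
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def flag_by_name(given_name, surname):
    """Assumed rule: flag a case if either name part is on an example list."""
    return (normalize(surname) in EXAMPLE_SURNAMES
            or normalize(given_name) in EXAMPLE_GIVEN_NAMES)


def sensitivity_specificity(predicted, gold):
    """Compare algorithm flags with manual (gold standard) flags.

    Assumes the gold standard contains at least one positive and one negative.
    """
    pairs = list(zip(predicted, gold))
    tp = sum(p and g for p, g in pairs)          # flagged, truly of Turkish origin
    tn = sum(not p and not g for p, g in pairs)  # not flagged, truly not
    fp = sum(p and not g for p, g in pairs)      # flagged in error
    fn = sum(not p and g for p, g in pairs)      # missed by the algorithm
    return tp / (tp + fn), tn / (tn + fp)


# Made-up cases (given name, surname) and made-up manual labels.
cases = [("Ayşe", "Öztürk"), ("Anna", "Müller"), ("Emre", "Schmidt"), ("Lena", "Yılmaz")]
gold = [True, False, False, True]
predicted = [flag_by_name(given, surname) for given, surname in cases]
sensitivity, specificity = sensitivity_specificity(predicted, gold)
print(predicted, sensitivity, specificity)

# Share of registry cases identified as being of Turkish origin (figures from the abstract).
share = 1774 / 37259
print(round(share, 3))
```

In this toy example the third case is flagged by its given name although the made-up manual label is negative, which is the kind of false positive that lowers specificity. The last line uses the figures reported in the abstract: 1774 of 37,259 cases is roughly 4.8% of the registry.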

Publication types

  • English Abstract

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Child
  • Emigration and Immigration / classification*
  • Emigration and Immigration / statistics & numerical data*
  • Germany / epidemiology
  • Humans
  • Names*
  • Natural Language Processing
  • Neoplasms / ethnology*
  • Registries*
  • Turkey / ethnology