Issues in integrating epidemiology and research information in oncology: experience with ICD-O3 and the NCI Thesaurus

AMIA Annu Symp Proc. 2007 Oct 11;85-9.


Integrating the International Classification of Diseases for Oncology (ICD-O) with the NCI Thesaurus (NCIT) is expected to make it easier to combine epidemiology data (cancer registries) with basic and clinical research data. We evaluated the degree to which ICD-O and NCIT provide consistent representations of neoplasms. A total of 1,550 concepts (515 for topography and 1,035 for morphology) are shared by ICD-O and NCIT. Only 366 relations (about 1%) between these topography and morphology concepts are shared between ICD-O and NCIT. Two relationships that represent the anatomical site of a disease, Disease Has Primary Anatomic Site and Disease Has Associated Anatomic Site, account for about 78% of the 1,376 relations between shared topography and morphology concepts in ICD-O and NCIT. In addition to these two roles, nine other NCIT relationships are found between topography and morphology concepts. Several issues are discussed, including incomplete representations in NCIT, mapping issues, systematic polysemy, and the use of post-coordinated vs. pre-coordinated terms. The methods proposed provide a framework for analyzing inconsistencies.
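
The kind of comparison summarized in the abstract, counting morphology-topography relations that two terminologies have in common, can be illustrated with a set intersection. The following is a minimal sketch and not the authors' method: the function name `compare_relations`, the placeholder identifiers (`M1`, `T1`, and so on), and the choice of the union of both sets as the denominator are all assumptions made for illustration. The abstract does not state how its percentages were computed.

```python
from typing import Dict, Set, Tuple

# A relation is modeled as a (morphology concept, topography concept) pair.
Pair = Tuple[str, str]


def compare_relations(source_a: Set[Pair], source_b: Set[Pair]) -> Dict[str, object]:
    """Compare two sets of morphology-topography pairs.

    Hypothetical helper: the shared fraction uses the union of both sets
    as its denominator, which is an assumption, not the paper's definition.
    """
    shared = source_a & source_b
    union = source_a | source_b
    return {
        "shared": shared,
        "only_a": source_a - source_b,
        "only_b": source_b - source_a,
        "shared_fraction": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers only; these are not real ICD-O or NCIT codes.
icdo_pairs = {("M1", "T1"), ("M2", "T1"), ("M3", "T2")}
ncit_pairs = {("M1", "T1"), ("M3", "T3")}

result = compare_relations(icdo_pairs, ncit_pairs)
print(len(result["shared"]), result["shared_fraction"])
```

Pairs that appear in only one source (`only_a`, `only_b`) are the candidates for the kinds of inconsistency the abstract lists, such as incomplete representations or mapping issues.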

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Biomedical Research / classification*
  • Epidemiologic Methods
  • Humans
  • International Classification of Diseases*
  • Medical Oncology / classification
  • National Cancer Institute (U.S.)
  • Neoplasms / classification*
  • Neoplasms / epidemiology
  • Registries
  • United States
  • Vocabulary, Controlled*