Extending an open-source tool to measure data quality: case report on Observational Health Data Science and Informatics (OHDSI)

BMJ Health Care Inform. 2020 Mar;27(1):e100054. doi: 10.1136/bmjhci-2019-100054.


Introduction: As the health system seeks to leverage large-scale data to inform population outcomes, the informatics community is developing tools for analysing these data. To support data quality assessment within such a tool, we extended the open-source software Observational Health Data Sciences and Informatics (OHDSI) to incorporate new functions useful for population health.

Methods: We developed and tested methods to measure the completeness, timeliness and entropy of information. The new data quality methods were applied to over 100 million clinical messages received from emergency department information systems for use in public health syndromic surveillance systems.
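The three measures named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the OHDSI implementation: the abstract does not define the metrics, so completeness is assumed here to be the share of non-missing values in a field, entropy the Shannon entropy (in bits) of the non-missing values, and timeliness the median delay between when an event occurred and when its message was received. All function names are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime
from statistics import median


def _is_present(value):
    """Treat None and blank strings as missing (an assumed convention)."""
    return value is not None and str(value).strip() != ""


def completeness(values):
    """Fraction of values in a field that are present (0.0 for an empty field)."""
    if not values:
        return 0.0
    return sum(1 for v in values if _is_present(v)) / len(values)


def shannon_entropy(values):
    """Shannon entropy, in bits, of the distribution of present values."""
    present = [v for v in values if _is_present(v)]
    if not present:
        return 0.0
    total = len(present)
    counts = Counter(present)
    return abs(-sum((c / total) * math.log2(c / total) for c in counts.values()))


def timeliness_hours(event_times, received_times):
    """Median delay in hours between each event and receipt of its message."""
    delays = [
        (received - event).total_seconds() / 3600
        for event, received in zip(event_times, received_times)
    ]
    return median(delays)
```

A field that always carries the same value has zero entropy, which can flag a default or placeholder being sent in place of real data. Completeness alone would not catch that case, which is one reason to report the two measures together.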

Discussion: The OHDSI community implemented the completeness and entropy methods, but did not adopt timeliness because its context did not fit within the existing OHDSI domains. The case report examines the process by which ideas proposed to an open-source community like OHDSI are accepted or rejected, and the reasons behind those decisions.

Keywords: information systems; medical informatics; public health.

Publication types

  • Evaluation Study

MeSH terms

  • Data Accuracy*
  • Data Science*
  • Information Storage and Retrieval*
  • Population Surveillance
  • Software*