A system for solution-orientated reporting of errors associated with the extraction of routinely collected clinical data for research and quality improvement

Stud Health Technol Inform. 2010;160(Pt 1):724-8.


Background: We have used routinely collected clinical data in epidemiological and quality improvement research for over 10 years. We extract, pseudonymise and link data from heterogeneous distributed databases, inevitably encountering errors and problems.

Objective: To develop a solution-orientated system of error reporting which enables appropriate corrective action.

Method: Review of the 94 errors that occurred in 2008/9. Previously we had described failures in terms of the data missing from our response files; however, this provided little information about causation. We therefore developed a taxonomy based on the IT component limiting data extraction.

Results: Our final taxonomy categorised errors as: (A) Data Extraction Method and Process; (B) Translation Layer and Proxy Specification; (C) Shape and Complexity of the Original Schema; (D) Communication and System (mainly Software-based) Faults; (E) Hardware and Infrastructure; (F) Generic/Uncategorised and/or Human Errors. We found 79 distinct errors among the 94 reported, and the categories were generally predictive of the time needed to develop fixes.
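
As a minimal sketch of how a taxonomy like this might be recorded in software, the Python below encodes categories A to F as an enumeration, collapses repeat reports of the same error, and tallies the distinct errors per category. Only the six category names come from the abstract. The ErrorReport fields, the "signature" key used to spot repeats, the helper functions and the sample reports are all hypothetical and are not data or methods from the study.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ErrorCategory(Enum):
    """Categories A-F as named in the Results."""
    A = "Data Extraction Method and Process"
    B = "Translation Layer and Proxy Specification"
    C = "Shape and Complexity of the Original Schema"
    D = "Communication and System (mainly Software-based) Faults"
    E = "Hardware and Infrastructure"
    F = "Generic/Uncategorised and/or Human Errors"


@dataclass(frozen=True)
class ErrorReport:
    report_id: int           # hypothetical identifier for one reported error
    signature: str           # hypothetical key identifying the underlying error
    category: ErrorCategory


def distinct_errors(reports):
    """Keep the first report for each (category, signature) pair."""
    seen = {}
    for report in reports:
        seen.setdefault((report.category, report.signature), report)
    return list(seen.values())


def count_by_category(reports):
    """Tally reports per taxonomy category."""
    return Counter(report.category for report in reports)


# Made-up reports for illustration only; not data from the study.
reports = [
    ErrorReport(1, "query timeout", ErrorCategory.D),
    ErrorReport(2, "query timeout", ErrorCategory.D),
    ErrorReport(3, "unmapped code field", ErrorCategory.B),
]
unique = distinct_errors(reports)
counts = count_by_category(unique)
```

Keying distinct errors on category plus signature is one possible choice; the abstract does not say how repeat reports were identified.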

Conclusions: Taking a systematic approach to errors and linking them to problem solving has improved project efficiency and enabled us to better predict any associated delays.

MeSH terms

  • Biomedical Research / statistics & numerical data*
  • Data Mining / methods*
  • Medical Errors / classification*
  • Medical Errors / prevention & control
  • Medical Errors / statistics & numerical data*
  • Medical Records Systems, Computerized / statistics & numerical data*
  • Missouri
  • Quality Assurance, Health Care / standards*
  • Risk Management / organization & administration*