Background: UK general practice is computerised, and quality targets based on computer data provide a further incentive to improve data quality. A National Programme for Information Technology is standardising the technical infrastructure and removing some of the barriers to data aggregation. Routinely collected data are an underused resource, yet little has been written about the wide range of factors that must be taken into account if we are to infer meaning from general practice data.
Objective: To report the complexity of general practice computer data and the factors that need to be taken into account in processing and interpreting them.
Method: We run clinically focused programmes that provide clinically relevant feedback to clinicians, and overview statistics to localities and researchers. To take account of the complexity of these data, we have devised a system of process stages and process controls to maintain referential integrity, improve data quality and reduce errors; these are integrated into our design and processing stages. Our systems document the query and the reference code set, and create a unique patient ID. The design stage is followed by appraisal of: data entry issues; how concepts might be represented in clinical systems; coding ambiguities; the use of surrogates where needed; and validation and piloting. The subsequent processing of data includes extraction, migration and integration of data from different sources, cleaning, processing and analysis.
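Two of the process controls above, creating a unique (pseudonymous) patient ID and checking referential integrity, can be illustrated in a minimal sketch. This is not the authors' actual system; the salt value, field names and hash scheme are hypothetical, chosen only to show the general idea of a stable, non-reversible ID and a control that flags unmatched records.

```python
# Illustrative sketch only: pseudonymous patient IDs and a referential
# integrity check. SALT, field names and truncation length are assumptions.
import hashlib

SALT = "project-specific-secret"  # hypothetical per-project salt


def pseudonymous_id(identifier: str) -> str:
    """Derive a stable, non-reversible patient ID from a source identifier."""
    digest = hashlib.sha256((SALT + identifier).encode("utf-8")).hexdigest()
    return digest[:16]  # truncated for readability; length is arbitrary here


def check_referential_integrity(records, known_ids):
    """Process control: return records whose patient ID has no known match."""
    return [r for r in records if r["patient_id"] not in known_ids]


pid = pseudonymous_id("9434765919")  # example identifier, not a real patient
records = [{"patient_id": pid}, {"patient_id": "unlinked-id"}]
orphans = check_referential_integrity(records, {pid})
print(len(orphans))  # number of records failing the integrity check
```

The same ID function applied at every extraction yields consistent IDs across sources, which is what allows migrated and integrated data to be linked without holding the original identifier.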
Results: Results are presented to illustrate issues with the population denominator, data entry problems, identification of people with unmet needs, and how routine data can be used for real-world testing of pharmaceuticals.
Conclusions: Routinely collected primary care data could contribute more to the process of health improvement; however, those working with these data need to understand fully the complexity of the context within which data entry takes place.