Construction and validation of a scoring system for the selection of high-quality data in a Spanish population primary care database (SIDIAP)

Inform Prim Care. 2011;19(3):135-45. doi: 10.14236/jhi.v19i3.806.


Background: Computerised databases of primary care clinical records are widely used for epidemiological research. In Catalonia, the Information System for the Development of Research in Primary Care (SIDIAP) aims to promote the development of research based on high-quality validated data from primary care electronic medical records.

Objective: The purpose of this study is to create and validate a scoring system (Registry Quality Score, RQS) that will enable all primary care practices (PCPs) to be selected as providers of research-usable data based on the completeness of their registers.

Methods: Diseases likely to be representative of common diagnoses seen in primary care were selected for RQS calculations. The observed/expected cases ratio was calculated for each disease. Once an estimated value of this ratio had been obtained for each of the selected conditions, the ratios were added up to obtain a final RQS. Rate comparisons between observed and published prevalences of diseases not included in the RQS calculations (atrial fibrillation, diabetes, obesity, schizophrenia, stroke, urinary incontinence and Crohn's disease) were used to set the RQS cut-off, which will enable researchers to select PCPs with research-usable data.
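
The score described in the Methods can be illustrated with a minimal Python sketch. It assumes the RQS of a practice is the plain, unweighted sum of observed/expected case ratios. The abstract does not list the conditions used for the score or say how expected counts were derived, so the condition names, the counts and the function name `registry_quality_score` are hypothetical.

```python
from typing import Dict


def registry_quality_score(observed: Dict[str, int],
                           expected: Dict[str, float]) -> float:
    """Sum the observed/expected case ratios over the selected conditions.

    Assumption: an unweighted sum, as the abstract's wording suggests.
    """
    score = 0.0
    for condition, expected_cases in expected.items():
        if expected_cases <= 0:
            raise ValueError(f"expected cases must be positive for {condition!r}")
        # A condition with no recorded cases contributes a ratio of 0.
        score += observed.get(condition, 0) / expected_cases
    return score


# Hypothetical counts for one practice (not taken from the study).
observed = {"hypertension": 180, "asthma": 40, "depression": 60}
expected = {"hypertension": 200.0, "asthma": 50.0, "depression": 60.0}

rqs = registry_quality_score(observed, expected)
print(f"RQS = {rqs:.2f}")
```

With these made-up counts the three ratios are 0.9, 0.8 and 1.0, so a practice whose registers were complete for every condition would score close to the number of conditions included.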

Results: Apart from Crohn's disease, all prevalences were the same as those published from the RQS fourth quintile (60th percentile) onwards. This RQS cut-off provided a total population of 1 936 443 (39.6% of the total SIDIAP population).
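
As a rough illustration of applying such a cut-off, the Python sketch below keeps the practices whose RQS lies at or above the 60th percentile. It covers only the selection step, not the prevalence comparison used to choose the cut-off. The practice labels, the scores and the choice of the standard library's inclusive quantile method are assumptions made for the example.

```python
import statistics
from typing import Dict, List, Tuple


def select_practices(scores: Dict[str, float]) -> Tuple[float, List[str]]:
    """Return the 60th-percentile RQS and the practices at or above it."""
    # quantiles(..., n=5) returns the four quintile boundaries;
    # index 2 is the boundary at the 60th percentile.
    cutoff = statistics.quantiles(scores.values(), n=5, method="inclusive")[2]
    selected = sorted(name for name, rqs in scores.items() if rqs >= cutoff)
    return cutoff, selected


# Hypothetical RQS values for ten practices (not taken from the study).
scores = {name: float(i) for i, name in enumerate("ABCDEFGHIJ", start=1)}

cutoff, selected = select_practices(scores)
print(cutoff, selected)
```

In this toy example four of the ten practices are retained, a proportion similar to the 39.6% of the SIDIAP population reported above.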

Conclusions: SIDIAP is highly representative of the population of Catalonia in terms of geographical, age and sex distributions. Rate comparison proved a useful and valid method for identifying research-usable data within primary care electronic medical records.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Adult
  • Age Distribution
  • Aged
  • Clinical Coding
  • Cross-Sectional Studies
  • Databases, Factual / standards*
  • Electronic Health Records / organization & administration*
  • Female
  • Humans
  • Information Storage and Retrieval / standards*
  • Male
  • Middle Aged
  • Prevalence
  • Primary Health Care / organization & administration*
  • Sex Distribution
  • Spain
  • Validation Studies as Topic