A selection-bias free method to estimate the prevalence of hypertension from an administrative primary health care database in the Girona Health Region, Spain

Comput Methods Programs Biomed. 2009 Mar;93(3):228-40. doi: 10.1016/j.cmpb.2008.10.010. Epub 2008 Dec 6.


The prevalence of common illnesses can be estimated from general practice databases, which offer certain advantages over alternative sources of information, in particular being relatively more cost-effective. Their main limitation is the threat of selection bias. Some individuals have a higher probability of having used primary health care, so the potential outcome, 'contact registration', is overrepresented in the observed sample. This selection bias would yield inconsistent estimators of prevalence. The objective of this study is to propose a selection-bias-free method to estimate prevalence using an administrative primary health care database. The method re-weights the prevalence estimates obtained from the database according to individuals' probability of being present in it. These probabilities are estimated from a health survey using a treatment effects model with a discrete response, i.e. a hurdle model. As an application, the prevalence of hypertension was estimated in the population covered by public primary health care providers in the Girona Health Region, Spain, in 2005. With this method, an estimated 15.5% of individuals aged 15 and over (14.1% of men and 16.9% of women) suffer from hypertension. Likewise, the prevalence is estimated at 31.1% (30.3% men and 32.0% women) in individuals aged 45 and over; 48.3% (44.1% men and 51.9% women) among those aged 65 and over; and 13.1% (11.8% men and 13.9% women) in the general population. The proposed method provides estimators of the prevalence of hypertension very close to those obtained directly from the 2006 Catalan Health Survey. It is concluded that the proposed method can be used to estimate the prevalence of hypertension in an approximately unbiased form.
Because it draws on the administrative primary health care database covering all users of primary health care in the 23 publicly managed Health Areas of the Girona Health Region during 2005, the proposed method will be more cost-effective and will provide much more population information than health questionnaires.
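The re-weighting idea described in the abstract can be illustrated with a generic inverse-probability-weighting sketch in Python. This is not the authors' implementation: the function names, the toy records, and the inclusion probabilities are all hypothetical, and the probabilities are simply supplied as inputs here, whereas the study estimates them from a health survey with a hurdle model.

```python
from typing import Sequence


def naive_prevalence(has_condition: Sequence[bool]) -> float:
    """Unweighted share of database records with the condition."""
    return sum(has_condition) / len(has_condition)


def reweighted_prevalence(
    has_condition: Sequence[bool],
    p_in_database: Sequence[float],
) -> float:
    """Inverse-probability-weighted prevalence.

    Each record is weighted by 1 / p, where p is the (assumed, externally
    estimated) probability that the individual appears in the database.
    """
    if len(has_condition) != len(p_in_database):
        raise ValueError("inputs must have the same length")
    weights = []
    for p in p_in_database:
        if not 0.0 < p <= 1.0:
            raise ValueError("each probability must lie in (0, 1]")
        weights.append(1.0 / p)
    # Weighted count of cases divided by the weighted count of all records.
    weighted_cases = sum(w for w, y in zip(weights, has_condition) if y)
    return weighted_cases / sum(weights)


# Hypothetical toy data: hypertensive individuals are assumed to be twice
# as likely to have a registered contact as non-hypertensive individuals.
records = [True, True, False, False]
probabilities = [0.8, 0.8, 0.4, 0.4]

print(f"naive:      {naive_prevalence(records):.3f}")
print(f"reweighted: {reweighted_prevalence(records, probabilities):.3f}")
```

In this toy example the unweighted estimate is 0.5, while the reweighted estimate falls to about 0.333, because records from individuals who are unlikely to appear in the database count for more. Normalising by the sum of the weights rather than by a known population size is a design choice of this sketch, not something stated in the abstract.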

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Cross-Sectional Studies
  • Databases, Factual / statistics & numerical data
  • Female
  • Health Surveys
  • Humans
  • Hypertension / epidemiology*
  • Male
  • Middle Aged
  • Models, Statistical*
  • Prevalence
  • Primary Health Care / organization & administration
  • Primary Health Care / statistics & numerical data*
  • Probability
  • Public Health / methods*
  • Selection Bias
  • Spain / epidemiology