A highly specific algorithm for identifying asthma cases and controls for genome-wide association studies

AMIA Annu Symp Proc. 2009 Nov 14;2009:497-501.


Our aim was to identify asthmatic patients as cases, and healthy patients as controls, for genome-wide association studies (GWAS), using readily available data from electronic medical records. GWAS require high specificity in case and control selection so that genotype-phenotype correlations can be identified accurately. We developed two algorithms using a combination of diagnoses, medications, and smoking history. By applying stringent criteria for the source and specificity of the data, we achieved a 95% positive predictive value and a 96% negative predictive value for identification of asthma cases and controls, compared against clinician review. This high specificity came at a cost: approximately 24% of the potential asthma cases initially identified were lost. However, standardizing the algorithm and applying it across multiple sites could yield the large number of cases needed for a GWAS.
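
As a reading aid, the two predictive values reported above can be computed from a cross-tabulation of algorithm labels against clinician review. The following is a minimal sketch, not the authors' code: the class name, field names, and counts are hypothetical, and the counts are invented only so that the arithmetic reproduces the reported percentages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewCounts:
    """Algorithm labels cross-tabulated against clinician review (hypothetical structure)."""
    true_cases: int      # algorithm: case, clinician: asthma
    false_cases: int     # algorithm: case, clinician: not asthma
    true_controls: int   # algorithm: control, clinician: not asthma
    false_controls: int  # algorithm: control, clinician: asthma


def positive_predictive_value(c: ReviewCounts) -> float:
    """Fraction of algorithm-labeled cases that the clinician confirmed."""
    return c.true_cases / (c.true_cases + c.false_cases)


def negative_predictive_value(c: ReviewCounts) -> float:
    """Fraction of algorithm-labeled controls that the clinician confirmed."""
    return c.true_controls / (c.true_controls + c.false_controls)


# Invented counts for illustration only; these are NOT the study's data.
counts = ReviewCounts(true_cases=95, false_cases=5, true_controls=96, false_controls=4)
ppv = positive_predictive_value(counts)
npv = negative_predictive_value(counts)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Both values are conditioned on the algorithm's label rather than on the true disease status, so they depend on how common asthma is in the reviewed sample.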

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Asthma / diagnosis*
  • Asthma / genetics
  • Databases, Nucleic Acid
  • Electronic Health Records
  • Genome-Wide Association Study*
  • Humans
  • Information Storage and Retrieval / methods*