Design patterns for the development of electronic health record-driven phenotype extraction algorithms

J Biomed Inform. 2014 Oct:51:280-6. doi: 10.1016/j.jbi.2014.06.007. Epub 2014 Jun 21.


Background: In software development and ontology engineering, design patterns provide generalized approaches and guidance for solving commonly occurring problems or addressing common situations, and are typically informed by intuition, heuristics, and experience. While the biomedical literature broadly covers specific phenotype algorithm implementations, no work to date has attempted to generalize common approaches into design patterns that could be distributed to the informatics community to help develop more accurate phenotype algorithms more efficiently.

Methods: Using phenotyping algorithms stored in the Phenotype KnowledgeBase (PheKB), we conducted an independent iterative review to identify recurrent elements within the algorithm definitions. We extracted and generalized these recurrent elements into candidate patterns. We then assessed the candidate patterns for validity by group consensus and annotated them with attributes.

Results: A total of 24 electronic Medical Records and Genomics (eMERGE) phenotypes available in PheKB as of January 25, 2013 were downloaded and reviewed. From these, 21 phenotyping patterns were identified; they are available as an online data supplement.

Conclusions: Repeatable patterns exist within phenotyping algorithms and, when codified and cataloged, may help to educate both experienced and novice algorithm developers. The dissemination and application of these patterns have the potential to decrease the time needed to develop algorithms while improving portability and accuracy.

Keywords: Algorithms; Design patterns; Electronic health record; Phenotype; Software design.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Biological Ontologies*
  • Data Curation / methods
  • Data Mining / methods*
  • Electronic Health Records / classification*
  • Electronic Health Records / organization & administration
  • Genomics / classification*
  • Genomics / organization & administration
  • Natural Language Processing*
  • Pattern Recognition, Automated / methods*
  • Phenotype