Challenges and Insights in Using HIPAA Privacy Rule for Clinical Text Annotation

AMIA Annu Symp Proc. 2015 Nov 5;2015:707-16. eCollection 2015.


The Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) requires that clinical documents be stripped of personally identifying information before they can be released to researchers and others. We have been manually annotating clinical text since 2008 in order to test and evaluate an algorithmic clinical text de-identification tool, NLM Scrubber, which we have been developing in parallel. Although HIPAA provides some guidance about what must be de-identified, translating those guidelines into practice is not straightforward, especially when dealing with free text. As a result, we have changed our manual annotation labels and methods six times. This paper explains why we made those annotation choices, which have evolved over seven years of practice in this field. The aim of this paper is to start a community discussion toward developing standards for clinical text annotation, with the end goal of studying and comparing clinical text de-identification systems more accurately.

MeSH terms

  • Algorithms
  • Confidentiality* / legislation & jurisprudence
  • Data Anonymization* / standards
  • Electronic Health Records*
  • Health Insurance Portability and Accountability Act*
  • Humans
  • Personally Identifiable Information
  • Privacy / legislation & jurisprudence
  • United States