Automatic annotation of medical records

Stud Health Technol Inform. 2005:116:817-22.


One of the research projects running at the medical informatics department of the Institute of Computer Science AS CR explores the representation of medical information and the development of an electronic health record (EHR). This effort raises an interesting problem: how to transfer knowledge from a medical record written as free text into the structured electronic format of the EHR. Until now, this task has been solved by writing extraction rules (regular expressions) for every element of information to be extracted from the medical record. However, such an approach is very time-consuming and requires the supervision of a skilled programmer whenever the target area of medicine changes. In this article we explore the possibility of mechanizing this process by automatically generating the extraction rules from a pre-annotated corpus of medical records. Since we are currently in the phase of data acquisition and preliminary tests, we do not present any final results; rather, we sketch the technologies we intend to use and describe the tools developed so far as part of this project.
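
To illustrate the hand-written approach the abstract describes, the following is a minimal sketch of per-element extraction rules expressed as regular expressions. The field names, patterns, and sample record are hypothetical and chosen only for illustration; they are not taken from the project itself.

```python
import re

# Hypothetical extraction rules: one regular expression per EHR element.
# Field names and patterns are illustrative assumptions, not the project's rules.
EXTRACTION_RULES = {
    "systolic_bp": re.compile(r"\bBP\s*:?\s*(\d{2,3})\s*/\s*\d{2,3}\b"),
    "diastolic_bp": re.compile(r"\bBP\s*:?\s*\d{2,3}\s*/\s*(\d{2,3})\b"),
    "weight_kg": re.compile(
        r"\bweight\s*:?\s*(\d{2,3}(?:\.\d)?)\s*kg\b", re.IGNORECASE
    ),
}


def extract(record_text):
    """Apply every rule to the free text and collect the first match per field."""
    result = {}
    for field, pattern in EXTRACTION_RULES.items():
        match = pattern.search(record_text)
        if match:
            result[field] = match.group(1)
    return result


# Made-up free-text record used only to exercise the rules.
record = "Patient admitted. BP: 130/85, weight 82.5 kg, no fever."
print(extract(record))
```

Each new element, and each new area of medicine, needs another hand-written pattern of this kind, which is the maintenance cost that generating rules from an annotated corpus is meant to reduce.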

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Electronic Health Records*
  • Humans
  • Medical Informatics*