An adaptive annotation approach for biomedical entity and relation recognition

Seid Muhie Yimam; Chris Biemann; Ljiljana Majnaric; Šefket Šabanović; Andreas Holzinger

doi:10.1007/s40708-016-0036-4

Brain Inform. 2016 Sep;3(3):157-168. doi: 10.1007/s40708-016-0036-4. Epub 2016 Feb 27.

Authors

Seid Muhie Yimam¹, Chris Biemann², Ljiljana Majnaric³, Šefket Šabanović³, Andreas Holzinger⁴

Affiliations

¹ TU Darmstadt CS Department, FG Language Technology, 64289, Darmstadt, Germany. seidymam@gmail.com.
² TU Darmstadt CS Department, FG Language Technology, 64289, Darmstadt, Germany.
³ Josip Juraj Strossmayer University of Osijek Faculty of Medicine Osijek, Osijek, Croatia.
⁴ Research Unit HCI-KDD Institute for Medical Informatics, Statistics and Documentation Medical University Graz, Auenbruggerplatz 2, 8036, Graz, Austria.

Abstract

In this article, we demonstrate the impact of interactive machine learning: we develop a biomedical entity recognition dataset using a human-in-the-loop approach. In contrast to classical machine learning, human-in-the-loop approaches do not operate on predefined training or test sets, but assume that human input regarding system improvement is supplied iteratively. Here, during annotation, a machine learning model is built on previous annotations and used to propose labels for subsequent annotation. To demonstrate that such interactive and iterative annotation speeds up the development of high-quality annotated datasets, we conduct three experiments. In the first experiment, we carry out a simulation of iterative annotation and show that only a handful of medical abstracts need to be annotated to produce suggestions that increase annotation speed. In the second experiment, clinical doctors conducted a case study annotating medical terms in documents relevant to their research. The third experiment explores the annotation of semantic relations with relation instance learning across documents. The experiments validate our method qualitatively and quantitatively, and give rise to a more personalized, responsive information extraction technology.
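
The iterative loop described above (train on previous annotations, propose labels for the next document, let the annotator confirm or correct) can be illustrated with a minimal sketch. This is not the authors' system or model: the lookup-table "model", the function names, the label set (DISEASE, CHEMICAL, O), and the toy documents are all assumptions chosen only to keep the example self-contained.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each document is a list of (token, gold_label) pairs.
DOCS = [
    [("Obesity", "DISEASE"), ("raises", "O"), ("insulin", "CHEMICAL")],
    [("Insulin", "CHEMICAL"), ("treats", "O"), ("diabetes", "DISEASE")],
    [("Obesity", "DISEASE"), ("and", "O"), ("diabetes", "DISEASE")],
]


def train(annotated_docs):
    """Build a token -> most frequent label lookup from all annotations so far."""
    counts = defaultdict(Counter)
    for doc in annotated_docs:
        for token, label in doc:
            counts[token.lower()][label] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def suggest(model, tokens):
    """Propose one label per token; unseen tokens default to 'O' (no entity)."""
    return [model.get(t.lower(), "O") for t in tokens]


def simulate(gold_docs):
    """Annotate documents one at a time, retraining before each one.

    Returns, per document, the fraction of proposed labels that match the
    gold labels, i.e. suggestions a simulated annotator would simply accept.
    """
    annotated, accepted = [], []
    for doc in gold_docs:
        tokens = [t for t, _ in doc]
        gold = [label for _, label in doc]
        proposed = suggest(train(annotated), tokens)
        hits = sum(p == g for p, g in zip(proposed, gold))
        accepted.append(hits / max(len(gold), 1))
        annotated.append(doc)  # the annotator's corrected labels join the training data
    return accepted


rates = simulate(DOCS)
print(rates)  # acceptance rate per document, in annotation order
```

On this toy corpus the share of acceptable suggestions rises with each annotated document, which is the effect the simulation experiment sets out to measure; a real setup would replace the lookup table with a sequence tagger and the gold labels with a human annotator.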

Keywords: Biomedical entity recognition; Data mining; Human in the loop; Interactive annotation; Knowledge discovery; Machine learning; Relation learning.