Proc AMIA Symp. 2002;757-61.

Identification of patient name references within medical documents using semantic selectional restrictions

Free PMC article

Ricky K Taira et al. Proc AMIA Symp. 2002.

Abstract

De-identification of a patient's personal data from medical records is a protective legal requirement imposed before medical documents can be used for research purposes or transferred to other healthcare providers (e.g., teachers, students, tele-consultations). This de-identification process is tedious when performed manually and is known to be error-prone when it relies on direct search-and-replace strategies [9]. In this paper, we report on the identification step of this process. The proposed algorithm is based on estimating how well candidate patient name references fit a set of semantic selectional restrictions. The semantic restrictions place tight contextual requirements upon candidate words in the report text and are determined automatically from a manually tagged corpus of training reports. Maximum entropy classifiers are used to provide a probabilistic measure of how strongly a given candidate token satisfies a given semantic restriction. We report on the design and preliminary evaluation of the system within the domain of pediatric urology.
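
As a rough illustration of the classification step described in the abstract, the sketch below scores a candidate token with a two-class maximum entropy model (equivalent to binary logistic regression) trained on context features. It is not the authors' system: the feature set (previous word, next word, capitalization), the toy training sentences, and the names `features`, `predict`, and `train` are all assumptions made for this example, and neighboring-word features are only a crude stand-in for the semantic selectional restrictions the paper learns from its tagged corpus.

```python
import math
from collections import defaultdict

# Hypothetical, hand-made examples: (tokens, candidate index, is_patient_name).
# The paper's real training data is a manually tagged corpus of reports.
TRAIN = [
    ("the patient John was seen in clinic".split(), 2, 1),
    ("mother of Mary reports fever".split(), 2, 1),
    ("Mr. Smith returns for follow-up".split(), 1, 1),
    ("the left kidney was normal".split(), 2, 0),
    ("a renal ultrasound was performed".split(), 2, 0),
    ("the bladder appears normal today".split(), 1, 0),
]


def features(tokens, i):
    """Binary context features for the candidate token at position i (assumed set)."""
    prev_tok = tokens[i - 1].lower() if i > 0 else "<s>"
    next_tok = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    return [
        f"prev={prev_tok}",
        f"next={next_tok}",
        f"cap={tokens[i][0].isupper()}",
        "bias",
    ]


def predict(weights, feats):
    """Probability that the candidate is a patient name under the model."""
    z = sum(weights[f] for f in feats)
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5, l2=0.01):
    """Fit feature weights by stochastic gradient ascent on the
    L2-regularized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for tokens, i, label in examples:
            feats = features(tokens, i)
            p = predict(weights, feats)
            for f in feats:
                weights[f] += lr * ((label - p) - l2 * weights[f])
    return weights


weights = train(TRAIN)
p_name = predict(weights, features("the patient Alice was seen".split(), 2))
p_other = predict(weights, features("the left ureter was normal".split(), 2))
print(f"P(name | 'Alice')  = {p_name:.2f}")
print(f"P(name | 'ureter') = {p_other:.2f}")
```

The output is a probability rather than a hard decision, which matches the abstract's description of a "probabilistic measure"; a de-identification pipeline could then choose a threshold that favors recall, since a missed patient name is usually costlier than a false alarm.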

References

    1. Acad Radiol. 2002 Jun;9(6):670-8
    2. Methods Inf Med. 1998 Sep;37(3):271-7
    3. Semin Nucl Med. 1978 Oct;8(4):283-98