A De-Identification Pipeline for Ultrasound Medical Images in DICOM Format

J Med Syst. 2017 May;41(5):89. doi: 10.1007/s10916-017-0736-1. Epub 2017 Apr 13.


Clinical data sharing between healthcare institutions and between practitioners is often hindered by privacy protection requirements. This problem is critical in collaborative scenarios where data sharing is fundamental for establishing a workflow among parties. Anonymizing patient information burned into DICOM images requires more elaborate processing than de-identifying textual information: usually, before sharing, the specific image areas containing sensitive information must be removed manually. In this paper, we present a pipeline for ultrasound medical image de-identification, provided as a free anonymization REST service for medical image applications and as a Software-as-a-Service, freely available to end users, that streamlines automatic de-identification of medical images. The proposed approach combines image processing functions and machine-learning models into an automatic system for anonymizing medical images. For character recognition, we evaluated several machine-learning models and selected Convolutional Neural Networks (CNN) as the best approach. To assess the quality of the system, 500 processed images were manually inspected, showing an anonymization rate of 89.2%. The tool can be accessed at https://bioinformatics.ua.pt/dicom/anonymizer and works with the most recent versions of Google Chrome, Mozilla Firefox and Safari. A Docker image containing the proposed service is also publicly available to the community.
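The abstract does not spell out the individual pipeline stages, so the following is only a minimal sketch of one step such a pipeline plausibly needs: blanking the pixel regions where burned-in text was found. It assumes the text regions have already been located (for example by the character-recognition stage) and are given as rectangular boxes. The function name `mask_regions`, the box layout and the toy frame are hypothetical and are not taken from the published service.

```python
from typing import List, Tuple

# Hypothetical box layout: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def mask_regions(pixels: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of a 2-D grayscale frame with every box overwritten by `fill`.

    Boxes are clipped to the frame, so a box that extends past the border
    is masked only where it overlaps the image.
    """
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    masked = [row[:] for row in pixels]  # copy so the input frame is left untouched
    for top, left, bottom, right in boxes:
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                masked[y][x] = fill
    return masked


# Toy 4x6 frame standing in for an ultrasound image; the first row is
# assumed to carry a burned-in patient banner in columns 0 to 3.
frame = [[9] * 6 for _ in range(4)]
clean = mask_regions(frame, [(0, 0, 1, 4)])
```

Masking a copy rather than the original keeps the source frame available for the kind of manual inspection the evaluation relied on. In a real DICOM workflow the pixel data would be read from and written back to the file with a DICOM library, and the identifying header attributes would be handled separately.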

Keywords: De-identification; Deep-learning; Medical imaging; Neural networks; OCR.

MeSH terms

  • Confidentiality
  • Data Anonymization*
  • Image Processing, Computer-Assisted
  • Information Dissemination
  • Privacy
  • Software
  • Ultrasonography