Clinical data sharing between healthcare institutions and between practitioners is often hindered by privacy protection requirements. This problem is critical in collaborative scenarios where data sharing is fundamental for establishing a workflow among parties. Anonymizing patient information burned into DICOM images requires elaborate processes that are more complex than simple de-identification of textual information. Usually, the specific image areas containing sensitive information must be removed manually before sharing. In this paper, we present a pipeline for ultrasound medical image de-identification, provided both as a free anonymization REST service for medical image applications and as a Software-as-a-Service that streamlines automatic de-identification of medical images and is freely available to end users. The proposed approach combines image processing functions and machine-learning models to anonymize medical images automatically. For character recognition, we evaluated several machine-learning models, and Convolutional Neural Networks (CNNs) were selected as the best approach. To assess the system quality, 500 processed images were manually inspected, showing an anonymization rate of 89.2%. The tool can be accessed at https://bioinformatics.ua.pt/dicom/anonymizer and works with the most recent versions of Google Chrome, Mozilla Firefox, and Safari. A Docker image containing the proposed service is also publicly available to the community.
Keywords: De-identification; Deep learning; Medical imaging; Neural networks; OCR.
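
As a minimal illustration of the final step such a pipeline implies, the sketch below blanks out rectangular regions of a grayscale pixel grid once burned-in text has been located. It is not the implementation behind the service: the function names `mask_regions` and `anonymization_rate`, the `(top, left, height, width)` box format, and the use of plain nested lists instead of DICOM pixel data are all assumptions made for illustration. The text detection and CNN-based recognition stages are not shown.

```python
from typing import List, Tuple

# Hypothetical box format: (top, left, height, width) in pixel units.
Box = Tuple[int, int, int, int]


def mask_regions(pixels: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of a grayscale image with every box overwritten by `fill`."""
    masked = [row[:] for row in pixels]  # copy rows so the input stays untouched
    n_rows = len(masked)
    n_cols = len(masked[0]) if masked else 0
    for top, left, height, width in boxes:
        # Clip each box to the image bounds so partially outside boxes still apply.
        for r in range(max(top, 0), min(top + height, n_rows)):
            for c in range(max(left, 0), min(left + width, n_cols)):
                masked[r][c] = fill
    return masked


def anonymization_rate(n_anonymized: int, n_inspected: int) -> float:
    """Percentage of manually inspected images judged fully anonymized."""
    if n_inspected <= 0:
        raise ValueError("n_inspected must be positive")
    return 100.0 * n_anonymized / n_inspected


if __name__ == "__main__":
    image = [[255] * 4 for _ in range(4)]
    cleaned = mask_regions(image, [(0, 0, 2, 3)])  # assumed text region, top-left
    for row in cleaned:
        print(row)
    # 446 of 500 is the count implied by the reported 89.2% rate.
    print(anonymization_rate(446, 500))
```

Returning a masked copy rather than modifying the input keeps the original pixels available for the kind of manual before-and-after inspection described in the evaluation.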