Documenting the de-identification process of clinical and imaging data for AI for health imaging projects

Insights Imaging. 2024 May 31;15(1):130. doi: 10.1186/s13244-024-01711-x.


Artificial intelligence (AI) is revolutionizing the field of medical imaging, holding the potential to shift medicine from a reactive "sick-care" approach to a proactive focus on healthcare and prevention. The successful development of AI in this domain relies on access to large, comprehensive, and standardized real-world datasets that accurately represent diverse populations and diseases. However, medical images and the accompanying clinical data are sensitive, so they must be de-identified before any use in order to protect patient privacy. This paper explores the approaches taken by five EU projects working on the creation of ethically compliant and GDPR-regulated European medical imaging platforms focused on cancer-related data. It presents each project's approach to the de-identification of imaging data and describes the problems encountered and the solutions adopted in each case. It also provides lessons learned, to help future projects handle data de-identification effectively.

CRITICAL RELEVANCE STATEMENT: This paper presents key approaches from five flagship EU projects for the de-identification of imaging and clinical data, offering valuable insights and guidelines in the domain.

KEY POINTS: AI models for health imaging require access to large amounts of data. Access to large imaging datasets requires an appropriate de-identification process. This paper provides de-identification guidelines from the AI for health imaging (AI4HI) projects.

Keywords: Data anonymization; Radiological imaging; Radiology.