Automated prediction of acute promyelocytic leukemia from flow cytometry data using a graph neural network pipeline

Am J Clin Pathol. 2024 Mar 1;161(3):264-272. doi: 10.1093/ajcp/aqad145.

Abstract

Objectives: Our study aimed to develop a machine learning (ML) model that accurately distinguishes acute promyelocytic leukemia (APL) from other types of acute myeloid leukemia (other AML) using multicolor flow cytometry (MFC) data. MFC is used to determine immunophenotypes that serve as disease signatures for diagnosis.

Methods: We used a data set of MFC files from 27 patients with APL and 41 patients with other AML, including those with uncommon immunophenotypes. Our ML pipeline involved training a graph neural network (GNN) to output graph-level labels and using an input perturbation method to identify the MFC parameters and cells most important to its predictions.
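As an illustration of what "graph-level labels" means here, the following is a minimal sketch and not the authors' architecture. It assumes each patient sample is turned into a k-nearest-neighbor graph whose nodes are cells and whose node features are MFC parameters, followed by one message-passing layer and a mean-pool readout. The graph construction, the layer sizes, the random weights, and the names `knn_adjacency` and `graph_forward` are all hypothetical; the abstract does not specify any of them.

```python
import numpy as np


def knn_adjacency(cells, k=5):
    """Row-normalized, symmetric k-nearest-neighbor adjacency with self-loops."""
    n = cells.shape[0]
    # Pairwise squared Euclidean distances between cells in parameter space.
    diff = cells[:, None, :] - cells[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a cell is never its own neighbor here
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)  # make the graph undirected
    adj += np.eye(n)              # add self-loops
    return adj / adj.sum(axis=1, keepdims=True)


def graph_forward(cells, w_hidden, w_out, k=5):
    """One message-passing layer, mean-pool readout, sigmoid output."""
    adj = knn_adjacency(cells, k)
    # Each cell averages its neighborhood, then applies a linear map and ReLU.
    hidden = np.maximum(adj @ cells @ w_hidden, 0.0)
    # Readout: collapse all cells into a single vector for the whole sample.
    pooled = hidden.mean(axis=0)
    logit = pooled @ w_out
    return float(1.0 / (1.0 + np.exp(-logit)))


# Synthetic stand-in for one sample: 200 cells x 8 MFC parameters (assumed sizes).
rng = np.random.default_rng(0)
cells = rng.normal(size=(200, 8))
w_hidden = rng.normal(scale=0.1, size=(8, 16))  # untrained, random weights
w_out = rng.normal(scale=0.1, size=16)

p_apl = graph_forward(cells, w_hidden, w_out)
print(f"graph-level score for this sample: {p_apl:.3f}")
```

Because the readout pools over all cells, the score belongs to the sample as a whole and does not depend on the order in which cells are listed, which is the property a graph-level label relies on.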

Results: The top-performing GNN achieved 100% accuracy on the training/validation and test sets in classifying APL from other AML and used MFC parameters similarly to expert pathologists. The pipeline's performance makes it suitable for use in a clinical decision support system, and our deep learning architecture readily enables prediction explanations.
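The input perturbation idea behind such explanations can be sketched as follows. This is a hedged illustration only: the abstract does not say how inputs were perturbed, so shuffling one parameter across cells is an assumption, and `parameter_importance` and `toy_predict` are hypothetical names. The toy predictor stands in for a trained model and simply reports the fraction of cells positive for two parameters at once.

```python
import numpy as np


def parameter_importance(predict, cells, rng, n_repeats=10):
    """Mean absolute change in the prediction when one parameter is shuffled."""
    baseline = predict(cells)
    scores = np.zeros(cells.shape[1])
    for j in range(cells.shape[1]):
        deltas = []
        for _ in range(n_repeats):
            perturbed = cells.copy()
            # Shuffle column j across cells: its marginal distribution is kept,
            # but its relationship to the other parameters is destroyed.
            perturbed[:, j] = rng.permutation(perturbed[:, j])
            deltas.append(abs(predict(perturbed) - baseline))
        scores[j] = np.mean(deltas)
    return scores


def toy_predict(cells):
    """Stand-in model: fraction of cells positive for parameters 0 and 1."""
    return float(np.mean((cells[:, 0] > 0) & (cells[:, 1] > 0)))


# Synthetic sample: parameters 0 and 1 are co-expressed, parameter 2 is noise.
rng = np.random.default_rng(0)
p0 = rng.normal(size=500)
p1 = p0 + 0.1 * rng.normal(size=500)
p2 = rng.normal(size=500)
cells = np.column_stack([p0, p1, p2])

scores = parameter_importance(toy_predict, cells, rng)
for j, s in enumerate(scores):
    print(f"parameter {j}: importance {s:.3f}")
```

In this toy setup the unused parameter scores exactly zero, while the two co-expressed parameters score well above it. The same loop could be applied to subsets of cells rather than parameters to ask which cells drive a prediction.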

Conclusions: Our ML pipeline shows robust performance in predicting APL and could be used to screen for APL using MFC data. It also allows clinicians to intuitively interrogate the model's predictions.

Keywords: acute myeloid leukemia; acute promyelocytic leukemia; flow cytometry; graph neural network; immunophenotype; machine learning.

MeSH terms

  • Decision Support Systems, Clinical*
  • Flow Cytometry
  • Humans
  • Immunophenotyping
  • Leukemia, Promyelocytic, Acute* / diagnosis
  • Neural Networks, Computer