Purpose: Immune effector cell-associated hematotoxicity (ICAHT) is a major cause of nonrelapse mortality after chimeric antigen receptor (CAR) T-cell therapy. We hypothesized that unsupervised time-series clustering could identify archetypal patterns of early hematotoxicity better than the early ICAHT (eICAHT) grading system.
Methods: We applied unsupervised k-means time-series clustering based on Euclidean distances to longitudinal absolute neutrophil count (ANC) data from days +0 through +30 post-CAR T-cell infusion in 691 patients treated at our center (training set: n = 483, 70%; test set: n = 208, 30%).
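As an illustration of the clustering step, the following is a minimal sketch of k-means applied to fixed-length ANC trajectories with Euclidean distance, in which each patient's day +0 to +30 series is treated as a single 31-dimensional point. It is not the study's implementation: the toy trajectories, the helper names (`toy_trajectory`, `kmeans_trajectories`), and the deterministic farthest-first initialization are assumptions chosen to keep the example small and reproducible.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_trajectory(rows):
    """Pointwise mean of a non-empty list of equal-length trajectories."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def kmeans_trajectories(series, k, n_iter=100):
    """Lloyd's k-means; every trajectory is one point in day-space."""
    # Assumption: farthest-first initialization, used here only so the
    # sketch is deterministic. The source does not state its initialization.
    centroids = [list(series[0])]
    while len(centroids) < k:
        farthest = max(
            series, key=lambda s: min(euclidean(s, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = [-1] * len(series)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: euclidean(s, centroids[c]))
            for s in series
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(series, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = mean_trajectory(members)
    return labels, centroids

def toy_trajectory(nadir, recovery_day, recovered):
    """Hypothetical ANC curve (x10^9/L) for days +0 through +30."""
    return [
        3.0 if d < 3 else nadir if d < recovery_day else recovered
        for d in range(31)
    ]

# Hypothetical data: three early recoveries and three prolonged neutropenias.
early = [toy_trajectory(0.3, 10, 2.5 + off) for off in (0.0, 0.2, 0.4)]
late = [toy_trajectory(0.1, 25, 1.0 + off) for off in (0.0, 0.2, 0.4)]

labels, centroids = kmeans_trajectories(early + late, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because the distance is computed day by day, this approach assumes that every patient has a value for each day in the window; how missing ANC measurements were handled is not described here.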
Results: In the training set, the optimal solution comprised four ANC recovery clusters, which we labeled very good, good, poor, and very poor. We trained a random forest (RF) model restricted to the five most important features (ANC values on days +3, +4, +5, +26, and +27) to predict cluster assignments and then applied it to the test set. Compared with the eICAHT criteria, the RF-predicted clusters were more compact and better separated (Dunn index: 0.078 v 0.034; average silhouette width: 0.12 v 0.010). In addition, the RF model identified patients in the good recovery cluster as having intermediate overall survival (hazard ratio [HR], 1.70 [95% CI, 1.05 to 2.74]; P = .029; reference, very good), a distinction that grade 2 eICAHT did not capture (HR, 1.37 [95% CI, 0.80 to 2.35]; P = .25; reference, grade 0-1).
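The two cluster-quality measures reported above can be illustrated with a short sketch. It assumes the classic definitions: the Dunn index as the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, and the silhouette width of a point as (b - a) / max(a, b), where a is its mean distance to its own cluster and b is its mean distance to the nearest other cluster. The exact variants used in the study are not stated, and the one-dimensional toy points and both labelings are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dunn_index(points, labels):
    """Minimum between-cluster distance over maximum within-cluster distance.

    Assumes at least two clusters and at least one cluster with two
    distinct points; higher values mean compact, well-separated clusters.
    """
    min_between = math.inf
    max_within = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = euclidean(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_within

def average_silhouette_width(points, labels):
    """Mean of per-point silhouette widths, each in the range [-1, 1]."""
    widths = []
    for i, p in enumerate(points):
        # Mean distance from point i to every cluster, excluding itself.
        mean_dist = {}
        for c in set(labels):
            others = [
                q for j, q in enumerate(points) if labels[j] == c and j != i
            ]
            if others:
                mean_dist[c] = sum(euclidean(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_dist:
            widths.append(0.0)  # singleton cluster: width is taken as 0
            continue
        a = mean_dist[own]
        b = min(v for c, v in mean_dist.items() if c != own)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)

# Hypothetical data: two tight, distant groups on a line.
points = [[0.0], [1.0], [10.0], [11.0]]
good_labels = [0, 0, 1, 1]  # labeling that matches the groups
poor_labels = [0, 1, 0, 1]  # labeling that splits each group

dunn_good = dunn_index(points, good_labels)
dunn_poor = dunn_index(points, poor_labels)
sil_good = average_silhouette_width(points, good_labels)
sil_poor = average_silhouette_width(points, poor_labels)
```

In this toy case the labeling that matches the groups gives a Dunn index of 9.0 and the split labeling gives 0.1, and the average silhouette width drops from roughly 0.90 to -0.45. For both measures, larger values indicate a partition that is more compact and better separated, which is the sense in which the two grouping schemes are compared.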
Conclusion: Unsupervised time-series clustering identified distinct and clinically relevant patterns of hematotoxicity after CAR T-cell therapy. We trained and tested an RF model that accurately predicted cluster assignments using only five features. Predictions can be generated using our web application.