Multivariate anomaly detection models enhance identification of errors in routine clinical chemistry testing

Clin Chem Lab Med. 2024 Jun 12. doi: 10.1515/cclm-2024-0484. Online ahead of print.


Objectives: Conventional autoverification rules evaluate analytes independently, potentially missing unusual patterns of results indicative of errors such as serum contamination by collection tube additives. This study assessed whether multivariate anomaly detection algorithms could enhance the detection of such errors.

Methods: Multivariate Gaussian, k-nearest neighbours (KNN) distance, and one-class support vector machine (SVM) anomaly detection models, along with conventional limit checks, were developed using a training dataset of 127,451 electrolyte, urea, and creatinine (EUC) results, with a 5 % flagging rate targeted for all approaches. The models were compared with limit checks for their ability to detect atypical EUC results from samples spiked with additives from collection tubes: EDTA, fluoride, sodium citrate, or acid citrate dextrose (n=200 per contaminant). Because errors confined to a single analyte are a potential weakness of multivariate models, the study additionally assessed the ability of the models to identify 127,449 single-analyte errors.
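
As a rough illustration of the KNN distance idea described in the Methods, the sketch below scores a result by its distance to its k-th nearest neighbour in a reference set and calibrates the cut-off so that about 5 % of reference results are flagged. It is not the study's implementation: the two-analyte synthetic data, k = 5, the z-score scaling, and the per-analyte 2.5th and 97.5th percentile limits are all assumptions made for this example.

```python
import math

def quantile(values, permille):
    """Order statistic at roughly the given per-mille rank (no interpolation)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, len(ordered) * permille // 1000)]

def make_training_set(n=200):
    """Synthetic (sodium, chloride) pairs in mmol/L that rise and fall together."""
    rows = []
    for i in range(n):
        t = -2 + 4 * i / (n - 1)
        jitter = ((i * 7) % 11 - 5) / 25  # deterministic noise in [-0.2, 0.2]
        rows.append((140 + 3 * t, 102 + 3 * t + jitter))
    return rows

def fit_scaler(rows):
    """Per-analyte mean and standard deviation for z-score scaling."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return means, sds

def scale(row, means, sds):
    return tuple((v - m) / s for v, m, s in zip(row, means, sds))

def knn_score(point, reference, k, skip_index=None):
    """Euclidean distance from point to its k-th nearest reference point."""
    dists = sorted(math.dist(point, r)
                   for j, r in enumerate(reference) if j != skip_index)
    return dists[k - 1]

K = 5  # assumed neighbour count; the abstract does not state the value used
train = make_training_set()
means, sds = fit_scaler(train)
scaled = [scale(r, means, sds) for r in train]

# Score each training point against the others, then place the cut-off
# at the 95th percentile so that at most 5 % of training results exceed it.
train_scores = [knn_score(p, scaled, K, skip_index=i)
                for i, p in enumerate(scaled)]
knn_threshold = quantile(train_scores, 950)

# Conventional limit checks: each analyte is judged on its own (assumed limits).
limits = [(quantile(c, 25), quantile(c, 975)) for c in zip(*train)]

def limit_flag(row):
    return any(v < lo or v > hi for v, (lo, hi) in zip(row, limits))

def knn_flag(row):
    return knn_score(scale(row, means, sds), scaled, K) > knn_threshold

# Hypothetical result: each value is unremarkable alone, but the
# combination (high sodium with low chloride) never occurs in training.
suspect = (144.5, 97.5)
print(limit_flag(suspect), knn_flag(suspect))  # False True
```

In this toy setting the limit checks pass the suspect result because each analyte sits inside its own limits, whereas the KNN distance flags it because the pair of values lies far from every pattern seen in the reference data.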

Results: The KNN distance and SVM models outperformed limit checks for detecting all contaminants (p-values <0.05). The multivariate Gaussian model did not surpass limit checks for detecting EDTA contamination but was superior for detecting the other additives. All models surpassed limit checks for identifying single-analyte errors, with the KNN distance model demonstrating the highest overall sensitivity.

Conclusions: Multivariate anomaly detection models, particularly the KNN distance model, were superior to the conventional approach for detecting serum contamination and single-analyte errors. Developing multivariate approaches to autoverification is warranted to optimise error detection and improve patient safety.

Keywords: anomaly detection; autovalidation; autoverification; sample contamination.