Objectives: Artificial intelligence (AI) models are increasingly being developed for clinical chemistry applications; however, it is not understood whether human interaction with the models, which may occur once they are implemented, improves or worsens their performance. This study examined the effect of human supervision on an artificial neural network trained to identify wrong blood in tube (WBIT) errors.
Methods: De-identified patient data for current and previous (within seven days) electrolytes, urea and creatinine (EUC) results were used in the computer simulation of WBIT errors at a rate of 50%. Laboratory staff volunteers reviewed the AI model's predictions, and the EUC results on which they were based, before making a final decision regarding the presence or absence of a WBIT error. The performance of this approach was compared to the performance of the AI model operating without human supervision.
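As a rough illustration of how WBIT errors might be simulated from paired results, the sketch below replaces a sample's current EUC results with those of a different patient for about half of the records. The record layout, the field names (`patient_id`, `previous`, `current`), the `simulate_wbit` function and the random swapping scheme are assumptions made for this example; the abstract does not describe the study's actual simulation procedure.

```python
import random


def simulate_wbit(records, error_rate=0.5, seed=0):
    """Return (previous, current, label) tuples; label 1 marks a simulated WBIT.

    `records` is a hypothetical list of dicts with keys 'patient_id',
    'previous' and 'current' (each result set is a dict of analyte values).
    Requires at least two distinct patients.
    """
    rng = random.Random(seed)
    simulated = []
    for rec in records:
        if rng.random() < error_rate:
            # Simulated error: pair this patient's previous results with
            # the current results of a randomly chosen different patient.
            donors = [r for r in records if r["patient_id"] != rec["patient_id"]]
            donor = rng.choice(donors)
            simulated.append((rec["previous"], donor["current"], 1))
        else:
            # No error: keep the patient's own current results.
            simulated.append((rec["previous"], rec["current"], 0))
    return simulated


# Hypothetical example data: four patients with made-up sodium and creatinine values.
records = [
    {"patient_id": i,
     "previous": {"sodium": 135 + i, "creatinine": 60 + 10 * i},
     "current": {"sodium": 136 + i, "creatinine": 62 + 10 * i}}
    for i in range(4)
]
pairs = simulate_wbit(records, error_rate=0.5, seed=1)
```

Each tuple can then be passed to a classifier, or shown to a human reviewer, with the label withheld.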
Results: Laboratory staff supervised the classification of 510 sets of EUC results. This workflow identified WBIT errors with an accuracy of 81.2%, sensitivity of 73.7% and specificity of 88.6%. However, the AI model classifying these samples autonomously was superior on all metrics (all p-values <0.05), including accuracy (92.5%), sensitivity (90.6%) and specificity (94.5%).
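For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity and specificity follow from confusion-matrix counts. The counts used are hypothetical: they assume the 510 samples split evenly into 255 with and 255 without a simulated error, and were chosen only because they reproduce the percentages reported for the supervised workflow. The abstract does not report the underlying counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp: errors correctly flagged       fn: errors missed
    tn: non-errors correctly passed    fp: non-errors wrongly flagged
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical counts (NOT from the study), assuming a 255/255 split.
supervised = classification_metrics(tp=188, fn=67, tn=226, fp=29)
percentages = {name: round(100 * value, 1) for name, value in supervised.items()}
print(percentages)
# {'accuracy': 81.2, 'sensitivity': 73.7, 'specificity': 88.6}
```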
Conclusions: Human interaction with AI models can significantly alter their performance. For computationally complex tasks such as WBIT error identification, best performance may be achieved by autonomously functioning AI models.
Keywords: artificial intelligence; artificial neural network; decision support; machine learning; mislabeled samples; wrong blood in tube.
© 2021 Walter de Gruyter GmbH, Berlin/Boston.