Automated machine learning for endemic active tuberculosis prediction from multiplex serological data

Sci Rep. 2021 Sep 9;11(1):17900. doi: 10.1038/s41598-021-97453-7.

Abstract

Serological diagnosis of active tuberculosis (TB) is enhanced by detection of multiple antibodies because immune responses vary among patients. Clinical interpretation of these complex datasets requires the development of suitable algorithms, a time-consuming and tedious undertaking that is addressed here by the automated machine learning platform MILO (Machine Intelligence Learning Optimizer). MILO integrates data processing, feature selection, model training, and model validation to generate and evaluate thousands of models simultaneously. These models were then tested for generalizability on out-of-sample secondary and tertiary datasets. Of the 31 antigens evaluated, a 23-antigen model was the most robust on both the secondary dataset (TB vs healthy) and the tertiary dataset (TB vs COPD), with a sensitivity of 90.5% and respective specificities of 100.0% and 74.6%. MILO represents a user-friendly, end-to-end solution for automated generation and deployment of optimized models, ideal for applications where rapid clinical implementation is critical, such as emerging infectious diseases.
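
The sensitivity and specificity figures quoted above follow the standard definitions for a binary classifier: sensitivity is the fraction of true TB cases the model flags, and specificity is the fraction of non-TB controls it correctly clears. The sketch below illustrates only that arithmetic. It is not MILO code; the function name and the toy labels are hypothetical, and they are not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = TB, 0 = control)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")

    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical toy labels for illustration only (not study data).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(toy_true, toy_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under these definitions, a model evaluated against two different control groups (for example healthy individuals versus COPD patients) can share one sensitivity while showing a different specificity for each group, since specificity depends only on the controls.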

MeSH terms

  • Adult
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Models, Theoretical*
  • Retrospective Studies
  • Tuberculosis / epidemiology*
  • Young Adult