Background: We investigated the potential of proteomic fingerprinting with mass spectrometric serum profiling, coupled with pattern recognition methods, to identify biomarkers that could improve diagnosis of tuberculosis.
Methods: We obtained serum proteomic profiles from patients with active tuberculosis and from controls by surface-enhanced laser desorption/ionisation time-of-flight (SELDI-TOF) mass spectrometry. A supervised machine-learning approach based on the support vector machine (SVM) was used to obtain a classifier that distinguished between the groups in two independent test sets. We assessed the classifier further by k-fold cross-validation and by random sampling. Relevant mass peaks were selected by correlational analysis and assessed with SVM. Candidate biomarkers were identified by peptide mass fingerprinting; we tested their diagnostic potential with conventional immunoassays and with SVM classifiers trained on the immunoassay data.
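As an illustration of two of the steps named in the Methods (selecting peaks by correlation with the class label, and k-fold cross-validation), the following is a minimal sketch and not the study's actual pipeline. The function names, the toy intensities, the choice of Pearson correlation, and the fold construction are all assumptions made for the example; the SVM training step itself is omitted.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_peaks(spectra, labels, top_n):
    """Return indices of the top_n peaks with the largest |correlation| with the label."""
    n_peaks = len(spectra[0])
    scores = []
    for j in range(n_peaks):
        column = [row[j] for row in spectra]
        scores.append((abs(pearson(column, labels)), j))
    # Sort by descending score; ties are broken by the lower peak index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top_n]]


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for f, fold in enumerate(folds) if f != i for idx in fold]
        yield train, test


# Hypothetical toy data: 6 sera x 3 peak intensities (NOT values from the study).
# Label 1 = active tuberculosis, 0 = control.
spectra = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 5.5, 0.1],
    [3.0, 4.5, 0.8],
    [3.2, 5.2, 0.3],
    [2.9, 4.8, 0.7],
]
labels = [0, 0, 0, 1, 1, 1]

print("most informative peak indices:", rank_peaks(spectra, labels, top_n=2))
for train, test in k_fold_indices(len(spectra), k=3):
    print("train:", sorted(train), "test:", sorted(test))
```

In a sketch like this, peak ranking would normally be repeated inside each training fold rather than on the full data set, so that the held-out samples do not influence which peaks are selected.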
Findings: Our SVM classifier discriminated the proteomic profile of patients with active tuberculosis from that of controls with overlapping clinical features. Diagnostic accuracy was 94% (sensitivity 93.5%, specificity 94.9%) for patients with tuberculosis and was unaffected by HIV status. A classifier trained on the 20 most informative peaks achieved diagnostic accuracy of 90%. From these peaks, two peptides (serum amyloid A protein and transthyretin) were identified and quantitated by immunoassay. Because these peptides reflect inflammatory states, we also quantitated neopterin and C-reactive protein. SVM classifiers applied to combinations of these four measurements gave diagnostic accuracies of up to 84% for tuberculosis. Validation on a second, prospectively collected testing set gave similar accuracies using the whole proteomic signature and the 20 selected peaks. Using combinations of the four biomarkers, we achieved diagnostic accuracies of up to 78%.
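For readers unfamiliar with the measures reported in the Findings, the sketch below shows how sensitivity, specificity, and overall accuracy follow from the counts of a two-by-two confusion matrix. The function name and the counts are hypothetical and are not taken from the study. Because accuracy is a weighted average of sensitivity and specificity (weighted by the numbers of cases and controls), it always lies between the two.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts.

    tp: cases classified as cases        fn: cases classified as controls
    tn: controls classified as controls  fp: controls classified as cases
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of controls correctly excluded
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all samples correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (NOT the study's data).
sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```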
Interpretation: The potential biomarkers for tuberculosis that we identified through proteomic fingerprinting and pattern recognition have a plausible biological connection with the disease and could be used to develop new diagnostic tests.