Background: A combination of biomarkers in a multivariate model may predict disease with greater accuracy than a single biomarker employed alone. We developed a non-linear method of multivariate analysis, weighted digital analysis (WDA), and evaluated its ability to predict lung cancer employing volatile biomarkers in the breath.
Methods: WDA generates a discriminant function to predict membership in disease vs no-disease groups by determining a weight, a cutoff value, and a sign for each predictor variable employed in the model. The weight of each predictor variable was the area under the curve (AUC) of the receiver operating characteristic (ROC) curve minus a fixed offset of 0.55, where the AUC was obtained by employing that predictor variable alone as the sole marker of disease. The sign (+/-) was used to invert the predictor variable if a lower value indicated a higher probability of disease. When employed to predict the presence of disease in a particular patient, the value of the discriminant function was the sum of the weights of all predictor variables that exceeded their cutoff values. The algorithm that generates the discriminant function is deterministic because its parameters are calculated from each individual predictor variable without any optimization or adjustment. We employed WDA to re-evaluate data from a recent study of breath biomarkers of lung cancer, comprising the volatile organic compounds (VOCs) in the alveolar breath of 193 subjects with primary lung cancer and 211 controls with a negative chest CT.
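The procedure described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how each cutoff value is chosen, so the sketch assumes the cutoff that maximizes Youden's J (sensitivity + specificity - 1) for that predictor alone; it also assumes that weights are used as computed, even when a single-variable AUC falls below 0.55. All function names and the toy data are hypothetical.

```python
from typing import List, NamedTuple, Sequence

AUC_OFFSET = 0.55  # fixed offset stated in the Methods


class Predictor(NamedTuple):
    sign: int      # +1, or -1 when lower values indicate disease
    cutoff: float  # threshold applied to the sign-adjusted value
    weight: float  # single-variable AUC minus AUC_OFFSET


def auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC by pairwise comparison (Mann-Whitney); ties count 0.5."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_predictor(values: Sequence[float], labels: Sequence[int]) -> Predictor:
    """Derive sign, cutoff, and weight from one predictor on its own."""
    sign, a = 1, auc(values, labels)
    if a < 0.5:  # lower values indicate disease: invert the variable
        sign, a = -1, 1.0 - a
    signed = [sign * v for v in values]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # ASSUMPTION: the cutoff maximizes Youden's J; the abstract does not
    # specify the cutoff rule.
    best_cut, best_j = signed[0], -1.0
    for c in sorted(set(signed)):
        tp = sum(1 for v, y in zip(signed, labels) if y == 1 and v > c)
        fp = sum(1 for v, y in zip(signed, labels) if y == 0 and v > c)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_cut, best_j = c, j
    return Predictor(sign, best_cut, a - AUC_OFFSET)


def fit_wda(rows: Sequence[Sequence[float]], labels: Sequence[int]) -> List[Predictor]:
    """Fit every predictor (column) independently; no joint optimization."""
    return [fit_predictor(col, labels) for col in zip(*rows)]


def wda_score(model: Sequence[Predictor], row: Sequence[float]) -> float:
    """Sum the weights of predictors whose signed value exceeds its cutoff."""
    return sum(p.weight for p, v in zip(model, row) if p.sign * v > p.cutoff)


# Hypothetical toy data: two predictors, three controls (0), three cases (1).
# The second predictor is lower in cases, so it is fitted with sign = -1.
rows = [[1.0, 9.0], [2.0, 8.0], [3.0, 7.0],
        [6.0, 3.0], [7.0, 2.0], [8.0, 1.0]]
labels = [0, 0, 0, 1, 1, 1]
model = fit_wda(rows, labels)
case_score = wda_score(model, [7.0, 2.0])
control_score = wda_score(model, [2.0, 8.0])
```

Because each predictor is fitted without reference to the others, refitting on the same data always yields the same parameters, which is the sense in which the abstract calls the algorithm deterministic.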
Results: The WDA discriminant function accurately identified patients with lung cancer in a model employing 30 breath VOCs (ROC curve AUC=0.90; sensitivity=84.5%, specificity=81.0%). These results were superior to multilinear regression analysis of the same data set (AUC=0.74; sensitivity=68.4%, specificity=73.5%). WDA test accuracy did not vary appreciably with TNM (tumor, node, metastasis) stage of disease, and results were not affected by tobacco smoking (ROC curve AUC=0.92 in current smokers, 0.90 in former smokers). WDA was a robust predictor of lung cancer: random removal of 1/3 of the VOCs did not reduce the AUC of the ROC curve by >10% (99.7% CI).
Conclusions: A test employing WDA of breath VOCs predicted lung cancer with accuracy similar to that of chest computed tomography. The algorithm identified dependencies that were not apparent with traditional linear methods. WDA appears to provide a useful new technique for non-linear multivariate analysis of data.