SIMCA Modeling for Overlapping Classes: Fixed or Optimized Decision Threshold?

Anal Chem. 2018 Sep 18;90(18):10738-10747. doi: 10.1021/acs.analchem.8b01270. Epub 2018 Sep 6.

Abstract

An approach exploiting the principles of Receiver Operating Characteristic (ROC) curves is proposed here for the simultaneous optimization of both the complexity and the decision threshold of Soft Independent Modeling of Class Analogy (SIMCA) classification models. The analysis of two simulated and four real case studies shows that, when the categories of samples overlap strongly, the implemented method can lead to better classification efficiency in external validation than fixing the threshold a priori, and therefore to higher robustness toward class dispersion. On the other hand, when the groups of observations are more clearly separated, the optimized and the fixed threshold give equally satisfactory classification performance for test samples.
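
The idea of choosing the complexity and the decision threshold together from ROC-type information can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that, for each candidate number of components, the distances of class members and of non-members to the class model are already available (for example from cross-validation), and that the criterion to maximize is the geometric mean of sensitivity and specificity. The function names, the criterion, and the numbers are hypothetical and serve only as an illustration.

```python
from math import sqrt


def roc_points(target_d, other_d):
    """Yield (threshold, sensitivity, specificity) for every observed distance.

    A sample is accepted by the class model when its distance <= threshold.
    """
    for t in sorted(set(target_d) | set(other_d)):
        sens = sum(d <= t for d in target_d) / len(target_d)
        spec = sum(d > t for d in other_d) / len(other_d)
        yield t, sens, spec


def best_threshold(target_d, other_d):
    """Return (threshold, efficiency) maximizing sqrt(sens * spec).

    Assumed criterion: geometric mean of sensitivity and specificity.
    Ties are resolved in favor of the smallest threshold.
    """
    t, sens, spec = max(roc_points(target_d, other_d),
                        key=lambda p: sqrt(p[1] * p[2]))
    return t, sqrt(sens * spec)


def select_model(distances_by_complexity):
    """Scan all complexities and return (n_components, threshold, efficiency)."""
    best = None
    for n_pc, (target_d, other_d) in sorted(distances_by_complexity.items()):
        t, eff = best_threshold(target_d, other_d)
        if best is None or eff > best[2]:
            best = (n_pc, t, eff)
    return best


# Made-up distances: {n_components: (class members, non-members)}
distances = {
    1: ([0.5, 0.8, 1.1, 1.6], [0.9, 1.4, 2.0, 2.5]),  # overlapping
    2: ([0.4, 0.6, 0.9, 1.0], [1.2, 1.5, 2.2, 2.8]),  # separated
}
result = select_model(distances)
print(result)
```

With a fixed threshold, only the complexity would be scanned. Here every observed distance is also tried as a cutoff, so the selected pair reflects the actual overlap between the class and the other samples.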