Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data

PLoS One. 2015 Jun 10;10(6):e0129767. doi: 10.1371/journal.pone.0129767. eCollection 2015.

Abstract

Objective: Multivariate data sets often differ in several factors or derived statistical parameters, from which a subset has to be selected for a valid interpretation. Basing this selection on traditional statistical limits occasionally gives the impression that information in the data set is being lost. This paper proposes a novel method for calculating precise limits for the selection of parameter sets.

Methods: The algorithm is based on an ABC analysis and calculates these limits on the basis of the mathematical properties of the distribution of the analyzed items. The limits implement the aim of any ABC analysis, i.e., comparing the increase in yield to the required additional effort. In particular, the limit for set A, the "important few", is optimized such that both the effort and the yield for the other sets (B and C) are minimized and the additional gain is optimized.
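
The idea of weighing yield against effort can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose exact limit definitions are not given in this abstract. It assumes non-negative item values, sorts them in descending order, and takes as the limit of set A the point on the cumulative yield curve that lies closest to the ideal point of zero effort and full yield. The function names `abc_curve` and `set_a_size` and the distance criterion are illustrative assumptions.

```python
import numpy as np


def abc_curve(values):
    """Return (effort, yield) fractions for items sorted by descending value.

    effort[i] is the fraction of items used after taking the i+1 largest,
    yield_[i] is the fraction of the total value those items account for.
    """
    v = np.sort(np.asarray(values, dtype=float))[::-1]
    if v.size == 0 or np.any(v < 0) or v.sum() == 0:
        raise ValueError("values must be non-negative with a positive sum")
    effort = np.arange(1, v.size + 1) / v.size
    yield_ = np.cumsum(v) / v.sum()
    return effort, yield_


def set_a_size(values):
    """Number of items in set A under the assumed criterion.

    Assumption (not taken from the paper): set A ends at the curve point
    with the smallest Euclidean distance to (effort=0, yield=1), i.e. the
    point that jointly minimizes effort and the yield left to sets B and C.
    """
    effort, yield_ = abc_curve(values)
    distance = np.hypot(effort, 1.0 - yield_)
    return int(np.argmin(distance)) + 1


if __name__ == "__main__":
    # Hypothetical item values, e.g. explained variance per component.
    example = [50, 30, 10, 5, 3, 2]
    print(set_a_size(example))
```

For the hypothetical values above, the two largest items account for 80% of the total at one third of the effort, and the sketch assigns exactly these two items to set A.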

Results: As a typical example from biomedical research, the ABC analysis is shown to be a feasible, objective replacement for classical subjective limits when selecting highly relevant variance components of pain thresholds. The proposed method improved the biological interpretation of the results and increased the fraction of valid information that was obtained from the experimental data.

Conclusions: The method is applicable to many further biomedical problems including the creation of diagnostic complex biomarkers or short screening tests from comprehensive test batteries. Thus, the ABC analysis can be proposed as a mathematically valid replacement for traditional limits to maximize the information obtained from multivariate research data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Datasets as Topic*
  • Multivariate Analysis

Grants and funding

The research received funding, in particular for the necessary computing equipment, from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 602919 (JL). The funders had no role in method design, data selection and analysis, decision to publish, or preparation of the manuscript.