Combining markers with and without the limit of detection

Stat Med. 2014 Apr 15;33(8):1307-20. doi: 10.1002/sim.6027. Epub 2013 Oct 17.

Abstract

In this paper, we consider the combination of markers with and without a limit of detection (LOD). An LOD is often encountered when measuring proteomic markers: because of the limited detection ability of the equipment or instrument, it is difficult to measure markers at relatively low levels. Suppose that, after some monotonic transformation, the marker values approximately follow multivariate normal distributions. We propose to estimate the distribution parameters while taking the LOD into account, and then to combine the markers using results from linear discriminant analysis. Our simulation results show that the ROC curve parameter estimates generated by the proposed method are much closer to the truth than those obtained by simply using linear discriminant analysis to combine markers without considering the LOD. In addition, we propose a procedure to select and combine a subset of markers when many candidate markers are available. The procedure, which is based on the correlation among markers, departs from the common understanding that a subset of the most accurate markers should be selected for the combination. The simulation studies show that the accuracy of a combined marker can be strongly affected by the correlation among marker measurements. Our methods are applied to a protein pathway dataset to combine proteomic biomarkers to distinguish cancer patients from non-cancer patients.
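
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: a single marker for the censored likelihood (the paper works with multivariate normal distributions), exactly two markers for the combination, and hypothetical function names (`censored_normal_loglik`, `combine_two_markers`, `equicorrelated`). The combination uses a standard result for binormal markers: the coefficients are proportional to (Sigma_D + Sigma_H)^{-1} (mu_D - mu_H), and the AUC of the resulting score is Phi of the square root of the corresponding quadratic form.

```python
import math
from statistics import NormalDist


def censored_normal_loglik(values, lod, mu, sigma):
    """Log-likelihood of one normal marker that is left-censored at `lod`.

    Assumption for this sketch: entries that are None or below `lod` are
    non-detects and contribute log P(X < lod); detected values contribute
    the usual normal log density.
    """
    dist = NormalDist(mu, sigma)
    total = 0.0
    for x in values:
        if x is None or x < lod:
            total += math.log(dist.cdf(lod))
        else:
            total += math.log(dist.pdf(x))
    return total


def combine_two_markers(mu_d, mu_h, cov_d, cov_h):
    """Linear combination of two binormal markers.

    Returns the coefficients a = (cov_d + cov_h)^{-1} (mu_d - mu_h) and the
    AUC of the score a'X, which equals Phi(sqrt(delta' a)).
    """
    # Sum of the diseased and healthy covariance matrices (2 x 2).
    s = [[cov_d[i][j] + cov_h[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    delta = [mu_d[0] - mu_h[0], mu_d[1] - mu_h[1]]
    # Explicit 2 x 2 inverse applied to the mean difference.
    a = [
        (s[1][1] * delta[0] - s[0][1] * delta[1]) / det,
        (-s[1][0] * delta[0] + s[0][0] * delta[1]) / det,
    ]
    auc = NormalDist().cdf(math.sqrt(delta[0] * a[0] + delta[1] * a[1]))
    return a, auc


def equicorrelated(rho):
    """Unit-variance 2 x 2 covariance matrix with correlation rho."""
    return [[1.0, rho], [rho, 1.0]]


# Same marginal accuracy for both markers; only the correlation changes.
for rho in (-0.5, 0.0, 0.5):
    cov = equicorrelated(rho)
    coef, auc = combine_two_markers([1.0, 1.0], [0.0, 0.0], cov, cov)
    print(f"rho={rho:+.1f}  coefficients={coef}  AUC={auc:.3f}")
```

In practice the censored log-likelihood would be maximized over `mu` and `sigma` with a numerical optimizer, and the resulting estimates would be plugged into the combination step. The loop at the end shows why correlation matters for marker selection: with identical marginal accuracy, the combined AUC in this toy setting is higher when the two markers are negatively correlated than when they are positively correlated.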

Keywords: ROC curve; diagnostic accuracy; limit of detection; linear discriminant analysis.

MeSH terms

  • Area Under Curve
  • Biomarkers / analysis*
  • Carcinoma, Non-Small-Cell Lung / diagnosis
  • Computer Simulation
  • Discriminant Analysis*
  • Humans
  • Limit of Detection*
  • Lung Neoplasms / diagnosis
  • Neoplasm Proteins / analysis
  • Proteomics / methods*
  • ROC Curve*
  • Reproducibility of Results
  • Tissue Array Analysis

Substances

  • Biomarkers
  • Neoplasm Proteins