Analysis of biomarker data: logs, odds ratios, and receiver operating characteristic curves

Curr Opin HIV AIDS. 2010 Nov;5(6):473-9. doi: 10.1097/COH.0b013e32833ed742.


Purpose of review: We discuss two data analysis issues for studies that use binary clinical outcomes (whether or not an event occurred): choosing an appropriate scale and transformation when biomarkers are evaluated as explanatory factors in logistic regression, and assessing the ability of biomarkers to improve prediction accuracy for event risk.

Recent findings: Biomarkers with skewed distributions should be transformed before they are included as continuous covariates in logistic regression models. The utility of new biomarkers may be assessed by measuring the improvement in predicting event risk after adding the biomarkers to an existing model. The area under the receiver operating characteristic (ROC) curve (C-statistic) is often cited; it was developed for a different purpose, however, and may not address the clinically relevant questions. Measures of risk reclassification and risk prediction accuracy may be more appropriate.
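As an illustration of the distinction drawn above, the following minimal sketch computes the C-statistic for a risk model before and after a biomarker is added, alongside a categorical net reclassification improvement (NRI). The NRI is used here only as one example of a risk reclassification measure; the abstract does not name a specific one. The predicted risks, event indicators, and risk cutoffs (10% and 20%) are invented for illustration and are not data from the review.

```python
from itertools import product


def c_statistic(risks, events):
    """Area under the ROC curve, computed as the fraction of
    (event, non-event) pairs in which the event case has the higher
    predicted risk; tied pairs count as one half."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    score = 0.0
    for r_case, r_control in product(cases, controls):
        if r_case > r_control:
            score += 1.0
        elif r_case == r_control:
            score += 0.5
    return score / (len(cases) * len(controls))


def risk_category(risk, cutoffs):
    """Index of the risk category: the number of cutoffs at or below the risk."""
    return sum(risk >= c for c in cutoffs)


def net_reclassification(old_risks, new_risks, events, cutoffs):
    """Categorical NRI: net proportion of events moved to a higher risk
    category plus net proportion of non-events moved to a lower one."""
    up_e = down_e = up_n = down_n = 0
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    for old, new, e in zip(old_risks, new_risks, events):
        shift = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if e == 1:
            up_e += shift > 0
            down_e += shift < 0
        else:
            up_n += shift > 0
            down_n += shift < 0
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Hypothetical data: 1 = event occurred, 0 = no event.
events = [1, 1, 1, 0, 0, 0, 0, 0]
# Predicted risks from the existing model and from the model with the biomarker.
old_risks = [0.30, 0.15, 0.08, 0.25, 0.12, 0.06, 0.05, 0.04]
new_risks = [0.35, 0.22, 0.09, 0.18, 0.11, 0.06, 0.04, 0.03]
cutoffs = (0.10, 0.20)  # assumed clinical risk thresholds

c_old = c_statistic(old_risks, events)
c_new = c_statistic(new_risks, events)
nri = net_reclassification(old_risks, new_risks, events, cutoffs)
print(f"C-statistic: {c_old:.3f} -> {c_new:.3f}; NRI: {nri:.3f}")
```

In this toy example the C-statistic changes only through the ordering of predicted risks, whereas the NRI responds to whether individuals cross the chosen thresholds, which is why the two measures can answer different questions.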

Summary: The appropriate analysis of biomarkers depends on the research question. Odds ratios obtained from logistic regression describe associations of biomarkers with clinical events; failure to transform the markers appropriately, however, may result in misleading estimates. Although the C-statistic is often used to assess the ability of new biomarkers to improve the prediction of event risk, other measures may be more suitable.
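To make the point about transformation concrete, the sketch below compares the skewness of a right-skewed biomarker before and after a log2 transformation and shows how an odds ratio is read on that scale. It assumes a logistic model in which the biomarker enters as log2(value), so that the exponentiated coefficient is the odds ratio per doubling. The biomarker values and the coefficient are invented for illustration and are not taken from the review.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean cubed standardized deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def odds_ratio_log2(beta, low, high):
    """Odds ratio comparing biomarker level `high` with `low` when the
    model term is beta * log2(biomarker). It depends only on the fold
    change high / low, not on the absolute difference."""
    return math.exp(beta * math.log2(high / low))


# Hypothetical right-skewed biomarker values (arbitrary units).
raw = [12, 15, 18, 22, 30, 41, 60, 95, 180, 650]
log2_values = [math.log2(v) for v in raw]

print(f"skewness, raw scale:  {skewness(raw):.2f}")
print(f"skewness, log2 scale: {skewness(log2_values):.2f}")

# Assumed coefficient, chosen so that each doubling multiplies the odds by 1.5.
beta = math.log(1.5)
print(f"OR, 10 vs 20:   {odds_ratio_log2(beta, 10, 20):.2f}")
print(f"OR, 100 vs 200: {odds_ratio_log2(beta, 100, 200):.2f}")
print(f"OR, 10 vs 40:   {odds_ratio_log2(beta, 10, 40):.2f}")
```

On the log2 scale the same odds ratio applies to any doubling of the marker, whether from 10 to 20 or from 100 to 200, which is usually a more plausible description for a skewed marker than a fixed odds ratio per unit on the raw scale.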

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Biomarkers*
  • Data Interpretation, Statistical
  • Humans
  • Logistic Models*
  • Odds Ratio*
  • Predictive Value of Tests*
  • ROC Curve*

