Logistic regression analysis of biomarker data subject to pooling and dichotomization

Stat Med. 2012 Sep 28;31(22):2473-84. doi: 10.1002/sim.4367. Epub 2011 Sep 23.

Abstract

There is growing interest in pooling specimens across subjects in epidemiologic studies, especially those involving biomarkers. This paper is concerned with regression analysis of epidemiologic data where a binary exposure is subject to pooling and the pooled measurement is dichotomized to indicate either that no subjects in the pool are exposed or that some are exposed, without revealing further information about the exposed subjects in the latter case. The pooling process may be stratified on the disease status (a binary outcome) and possibly other variables but is otherwise assumed random. We propose methods for estimating parameters in a prospective logistic regression model and illustrate these with data from a population-based case-control study of colorectal cancer. Simulation results show that the proposed methods perform reasonably well in realistic settings and that pooling can lead to sizable gains in cost efficiency. We make recommendations with regard to the choice of design for pooled epidemiologic studies.
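The pooling-and-dichotomization design described above can be illustrated with a small simulation. This is a minimal sketch and not the estimation method proposed in the paper. It assumes equal-sized pools formed at random within each disease-status stratum, independent exposures, and no measurement error, so that a pool of size k tests positive with probability 1 - (1 - p)^k when the individual exposure prevalence is p. The function names, the prevalences, the pool size, and the sample sizes are all hypothetical choices made for illustration.

```python
import random


def pool_and_dichotomize(exposures, pool_size, rng):
    """Randomly group binary exposures into pools of equal size and
    report 1 for a pool if any member is exposed, 0 otherwise."""
    shuffled = exposures[:]
    rng.shuffle(shuffled)
    n_pools = len(shuffled) // pool_size  # leftover subjects are dropped in this sketch
    return [
        int(any(shuffled[i * pool_size:(i + 1) * pool_size]))
        for i in range(n_pools)
    ]


def prevalence_from_pools(pool_results, pool_size):
    """Invert P(pool positive) = 1 - (1 - p)^k to recover the
    individual-level exposure prevalence p from dichotomized pools."""
    positive_rate = sum(pool_results) / len(pool_results)
    return 1.0 - (1.0 - positive_rate) ** (1.0 / pool_size)


def odds(p):
    return p / (1.0 - p)


rng = random.Random(0)

# Hypothetical exposure prevalences; pooling is stratified on disease status.
true_prevalence = {"cases": 0.30, "controls": 0.20}
pool_size = 3

estimates = {}
for group, p in true_prevalence.items():
    exposures = [int(rng.random() < p) for _ in range(30000)]
    pools = pool_and_dichotomize(exposures, pool_size, rng)
    estimates[group] = prevalence_from_pools(pools, pool_size)

# Crude (unadjusted) odds ratio recovered from pooled, dichotomized data.
crude_odds_ratio = odds(estimates["cases"]) / odds(estimates["controls"])
print(estimates, crude_odds_ratio)
```

In this sketch each stratum needs one assay per pool rather than one per subject, which is the source of the cost savings the abstract refers to. Adjusting for covariates would require a full regression model such as the one the paper develops.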

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Biomarkers / analysis*
  • Colorectal Neoplasms / genetics
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Humans
  • Logistic Models*
  • Polymorphism, Single Nucleotide
  • Prospective Studies

Substances

  • Biomarkers