Influence function based variance estimation and missing data issues in case-cohort studies

Lifetime Data Anal. 2001 Dec;7(4):331-44. doi: 10.1023/a:1012533130596.


Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.

MeSH terms

  • Cohort Studies*
  • Confounding Factors, Epidemiologic
  • Esophageal Neoplasms
  • Health Care Rationing
  • Humans
  • Proportional Hazards Models*
  • Research Design
  • Stomach Neoplasms
  • United States