Lung cancer risk prediction: Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial models and validation

J Natl Cancer Inst. 2011 Jul 6;103(13):1058-68. doi: 10.1093/jnci/djr173. Epub 2011 May 23.

Abstract

Introduction: Identification of individuals at high risk for lung cancer should be of value to individuals, patients, clinicians, and researchers. Existing prediction models have only modest capabilities to classify persons at risk accurately.

Methods: Prospective data from 70 962 control subjects in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) were used in models for the general population (model 1) and for a subcohort of ever-smokers (N = 38 254) (model 2). Both models included age, socioeconomic status (education), body mass index, family history of lung cancer, chronic obstructive pulmonary disease, recent chest x-ray, smoking status (never, former, or current), pack-years smoked, and smoking duration. Model 2 also included smoking quit-time (time in years since ever-smokers permanently quit smoking). External validation was performed with 44 223 PLCO intervention arm participants who completed a supplemental questionnaire and were subsequently followed. Known available risk factors were included in logistic regression models. Bootstrap optimism-corrected estimates of predictive performance were calculated (internal validation). Nonlinear relationships for age, pack-years smoked, smoking duration, and quit-time were modeled using restricted cubic splines. All reported P values are two-sided.
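As an illustration of the restricted cubic spline terms mentioned in the Methods, the sketch below builds the standard truncated-power basis, which is cubic between knots and constrained to be linear beyond the outermost knots. The abstract does not report the number or location of knots, so the function name `rcs_basis` and the knot values in `AGE_KNOTS` are hypothetical and serve only to make the example runnable.

```python
def rcs_basis(x, knots):
    """Return the restricted cubic spline basis [x, s_1(x), ..., s_{k-2}(x)].

    Truncated-power form with k knots; the nonlinear terms are divided by
    (last knot - first knot)^2 so they stay on roughly the scale of x.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def cube_pos(u):
        # (u)_+^3: the cube of u when u is positive, otherwise zero
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2
    span = t[-1] - t[-2]
    basis = [float(x)]
    for j in range(k - 2):
        term = (cube_pos(x - t[j])
                - cube_pos(x - t[-2]) * (t[-1] - t[j]) / span
                + cube_pos(x - t[-1]) * (t[-2] - t[j]) / span)
        basis.append(term / scale)
    return basis


# Hypothetical knot locations for age in years (not taken from the paper).
AGE_KNOTS = [55, 60, 65, 70]

for age in (50, 62, 75):
    print(age, rcs_basis(age, AGE_KNOTS))
```

Each basis column would enter the logistic regression as its own covariate, so a predictor with k knots contributes k - 1 coefficients: one linear term and k - 2 nonlinear terms.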

Results: During follow-up (median 9.2 years) of the control arm subjects, 1040 lung cancers occurred. During follow-up of the external validation sample (median 3.0 years), 213 lung cancers occurred. For models 1 and 2, the bootstrap optimism-corrected areas under the receiver operating characteristic curve were 0.857 and 0.805, and calibration slopes (model-predicted probabilities vs observed probabilities) were 0.987 and 0.979, respectively. In the external validation sample, models 1 and 2 had areas under the curve of 0.841 and 0.784, respectively. These models had high discrimination in women, men, whites, and nonwhites.
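To make the phrase "bootstrap optimism-corrected area under the curve" concrete, here is a minimal sketch of the general procedure: fit the model on the full sample, refit it on bootstrap resamples, and subtract the average gap between each refit's performance on its own resample and on the original sample. It is not the authors' code. The data are synthetic, the predictors are arbitrary, and all function names (`auc`, `fit_logistic`, `optimism_corrected_auc`) and settings (50 resamples, 15 Newton steps) are assumptions chosen for illustration.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve as the Mann-Whitney statistic (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(rows, labels, iters=15):
    """Logistic regression with an intercept, fitted by Newton-Raphson."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        # Tiny diagonal jitter keeps the Hessian invertible (an assumption).
        hess = [[1e-8 if i == j else 0.0 for j in range(k)] for i in range(k)]
        for x, y in zip(X, labels):
            z = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        beta = [b + d for b, d in zip(beta, solve(hess, grad))]
    return beta


def predict(beta, rows):
    """Linear predictor (log-odds); its ranking is all the AUC needs."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]


def optimism_corrected_auc(rows, labels, n_boot=50, seed=0):
    """Return (apparent AUC, bootstrap optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(rows)
    apparent = auc(predict(fit_logistic(rows, labels), rows), labels)
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        b_beta = fit_logistic(b_rows, b_labels)
        # Optimism: performance on the resample minus on the original data.
        total += (auc(predict(b_beta, b_rows), b_labels)
                  - auc(predict(b_beta, rows), labels))
    return apparent, apparent - total / n_boot


# Synthetic data: one informative predictor and two pure-noise predictors.
_rng = random.Random(1)
rows, labels = [], []
for _ in range(200):
    x1, x2, x3 = _rng.gauss(0, 1), _rng.gauss(0, 1), _rng.gauss(0, 1)
    p_true = 1.0 / (1.0 + math.exp(-(-1.0 + 1.2 * x1)))
    rows.append([x1, x2, x3])
    labels.append(1 if _rng.random() < p_true else 0)

apparent, corrected = optimism_corrected_auc(rows, labels)
print(f"apparent AUC {apparent:.3f}, optimism-corrected AUC {corrected:.3f}")
```

External validation, as reported for the PLCO intervention arm, needs no such correction: the fitted coefficients are applied unchanged to the new participants and `auc` is computed directly on their predictions.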

Conclusion: The PLCO lung cancer risk models demonstrate high discrimination and calibration.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Aged
  • Area Under Curve
  • Canada / epidemiology
  • Clinical Trials as Topic
  • Colorectal Neoplasms / diagnosis
  • Colorectal Neoplasms / epidemiology
  • Confounding Factors, Epidemiologic
  • Early Detection of Cancer*
  • Female
  • Humans
  • Logistic Models
  • Lung Neoplasms / diagnosis*
  • Lung Neoplasms / epidemiology*
  • Lung Neoplasms / etiology
  • Lung Neoplasms / mortality
  • Male
  • Middle Aged
  • Models, Statistical*
  • Ovarian Neoplasms / diagnosis
  • Ovarian Neoplasms / epidemiology
  • Predictive Value of Tests
  • Proportional Hazards Models
  • Prospective Studies
  • Prostatic Neoplasms / diagnosis
  • Prostatic Neoplasms / epidemiology
  • ROC Curve
  • Reproducibility of Results
  • Research Design
  • Risk Assessment
  • Risk Factors
  • Smoking / adverse effects
  • Smoking / epidemiology*
  • Smoking Cessation