Adaptive regression modeling of biomarkers of potential harm in a population of U.S. adult cigarette smokers and nonsmokers

BMC Med Res Methodol. 2010 Mar 16;10:19. doi: 10.1186/1471-2288-10-19.


Background: This article describes the data mining analysis of a clinical exposure study of 3585 adult smokers and 1077 nonsmokers. The analysis focused on developing models for four biomarkers of potential harm (BOPH): white blood cell count (WBC), 24 h urine 8-epi-prostaglandin F2alpha (EPI8), 24 h urine 11-dehydro-thromboxane B2 (DEH11), and high-density lipoprotein cholesterol (HDL).

Methods: Random Forest was used for initial variable selection, and Multivariate Adaptive Regression Splines (MARS) was used to develop the final statistical models.
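To make the adaptive-spline idea concrete, the following is a minimal sketch of the kind of basis function a MARS model is built from: a hinge term max(0, x - knot) whose knot is chosen from the data. It is not the authors' implementation. The abstract does not name the software or settings used, and the function names (hinge, fit_line, fit_one_hinge) and the toy data are hypothetical. A full MARS fit adds mirrored hinge pairs and interactions in a forward pass and then prunes terms, none of which is shown here.

```python
# Illustrative sketch only: one MARS-style hinge term fitted by knot search.
# All names and data below are hypothetical and are not taken from the study.

def hinge(x, knot):
    """Hinge basis function max(0, x - knot)."""
    return max(0.0, x - knot)

def fit_line(u, y):
    """Ordinary least squares for y = a + b*u; returns (a, b, sse)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    # If the transformed feature is constant, fall back to an intercept-only fit.
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_u
    sse = sum((yi - (a + b * ui)) ** 2 for ui, yi in zip(u, y))
    return a, b, sse

def fit_one_hinge(x, y):
    """Try each observed x value as a knot; keep the one with the smallest SSE."""
    best = None
    for knot in sorted(set(x)):
        u = [hinge(xi, knot) for xi in x]
        a, b, sse = fit_line(u, y)
        if best is None or sse < best["sse"]:
            best = {"knot": knot, "intercept": a, "slope": b, "sse": sse}
    return best

# Toy data: flat at 1.0 up to x = 3, then rising with slope 2.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0 + 2.0 * hinge(xi, 3.0) for xi in x]
model = fit_one_hinge(x, y)
print(model["knot"], model["intercept"], model["slope"])
```

On this toy data the search recovers the knot at x = 3. In a two-stage workflow like the one described, a search of this kind would be run only over the variables retained by the Random Forest screening step.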

Results: The analysis produced models that predict each BOPH as a function of selected variables from the smokers and nonsmokers. The statistically significant variables in the models were: platelet count, hemoglobin, C-reactive protein, triglycerides, race and biomarkers of exposure to cigarette smoke for WBC (R-squared = 0.29); creatinine clearance, liver enzymes, weight, vitamin use and biomarkers of exposure for EPI8 (R-squared = 0.41); creatinine clearance, urine creatinine excretion, liver enzymes, use of nonsteroidal anti-inflammatory drugs, vitamin use and biomarkers of exposure for DEH11 (R-squared = 0.29); and triglycerides, weight, age, sex, alcohol consumption and biomarkers of exposure for HDL (R-squared = 0.39).

Conclusions: Levels of WBC, EPI8, DEH11 and HDL were statistically associated with biomarkers of exposure to cigarette smoking and with demographic and lifestyle factors. Together, the predictors explain 29% to 41% of the variability in the BOPH, depending on the biomarker.
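The "variability explained" figures are R-squared values. As a reminder of what that statistic measures, here is a short sketch that computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The function name r_squared and the example numbers are hypothetical and are not data from the study. The abstract also does not say whether its R-squared values were computed on training data or on held-out data.

```python
# Illustrative sketch: R-squared as the share of variance explained by a model.
# The values below are made up for demonstration, not taken from the study.

def r_squared(observed, predicted):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot

observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(r_squared(observed, predicted))  # about 0.8: SS_res = 1.0, SS_tot = 5.0
```

On this scale, an R-squared of 0.29 means the model's residual sum of squares is 71% of the total, so most of the variation in that biomarker is left unexplained by the selected predictors.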

MeSH terms

  • Adult
  • Algorithms
  • Biomarkers / blood
  • Biomarkers / urine
  • Cholesterol, HDL / blood*
  • Data Mining
  • Dinoprost / analogs & derivatives*
  • Dinoprost / urine
  • Female
  • Humans
  • Leukocyte Count
  • Male
  • Regression Analysis
  • Smoking / blood*
  • Smoking / urine*
  • Thromboxane B2 / analogs & derivatives*
  • Thromboxane B2 / urine
  • United States


Substances

  • Biomarkers
  • Cholesterol, HDL
  • 8-epi-prostaglandin F2alpha
  • Thromboxane B2
  • 11-dehydro-thromboxane B2
  • Dinoprost