An adjusted partial least squares regression framework to utilize additional exposure information in environmental mixture data analysis

J Appl Stat. 2022 Mar 5;50(8):1790-1811. doi: 10.1080/02664763.2022.2043254. eCollection 2023.

Abstract

In a large-scale environmental health population study composed of subprojects, different fractions of the enrolled participants often have measurements of specific outcomes. It is conceptually reasonable to assume that an association study would benefit from the additional exposure information of participants for whom a specific outcome was not measured. Partial least squares regression is a practical approach for determining exposure-outcome associations in mixture data. Like a typical regression approach, however, partial least squares regression requires each observation to have complete covariate and outcome data for model fitting. In this paper, we propose novel adjustments to general partial least squares regression to estimate and examine the association effects of individual environmental exposures on an outcome within a more complete context of the study population's environmental mixture exposures. The proposed framework takes advantage of the bilinear model structure. It allows information from all participants, with or without outcome values, to contribute to model fitting and the assessment of association effects. Under the proposed framework, incorporating the additional information leads to smaller root mean square errors in the estimation of association effects and improves the ability to assess the significance of the effects.
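
For orientation, the bilinear structure referred to above can be written in the standard partial least squares form. This is a sketch in conventional textbook notation; the symbols are assumptions for illustration and are not taken from the paper. Here X is the exposure matrix, y the outcome vector, T the score matrix, P the exposure loadings, q the outcome loadings, R the weight matrix, and E and f the residuals.

```latex
\begin{aligned}
X &= T P^{\top} + E, \\
y &= T q + f, \\
T &= X R .
\end{aligned}
```

Because the scores T = XR are defined from the exposures alone, they can be computed for any participant with measured exposures, whether or not the outcome was observed. The abstract does not specify how the proposed adjustments use this property, so the display should be read only as the general model on which they build.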

Keywords: Adjusted SIMPLS; Birth Cohort; Navajo; metal mixture exposure; mixture analysis.