Background: Previous literature has shown significant health disparities between the Native American population and other populations, such as Non-Hispanic Whites. Most existing studies of Native American health were based on non-probability samples, which suffer from selection bias. In this paper, we are the first to evaluate the effectiveness of data integration methods, including calibration and sequential mass imputation, in improving the representativeness of the Tribal Behavioral Risk Factor Surveillance System (TBRFSS) by reducing the bias of the raw estimates.
Methods: We evaluated the benefits of our proposed data integration methods, including calibration and sequential mass imputation, using the 2019 TBRFSS and the 2018 and 2019 Behavioral Risk Factor Surveillance System (BRFSS). We combined the 2018 and 2019 BRFSS data by composite weighting. Demographic and general health variables were used as predictors for data integration. The following health-related variables were used to evaluate bias: smoking status, arthritis status, cardiovascular disease status, chronic obstructive pulmonary disease status, asthma status, cancer status, stroke status, diabetes status, and health coverage status.
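As an illustration of the calibration step, the sketch below uses raking (iterative proportional fitting), one common way to calibrate weights so that weighted sample margins match benchmark totals. The abstract does not state which calibration algorithm was used, so raking is an assumption here, and the function name `rake_weights`, the variable names, and the toy data are all hypothetical. The sketch also assumes every category with a nonzero target appears at least once in the sample.

```python
import numpy as np

def rake_weights(categories, targets, base_weights=None, max_iter=100, tol=1e-10):
    """Rake weights so weighted category totals match benchmark totals.

    categories: dict mapping variable name -> integer-coded array (one code per unit)
    targets:    dict mapping variable name -> benchmark total for each code
    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = len(next(iter(categories.values())))
    w = np.ones(n) if base_weights is None else np.asarray(base_weights, dtype=float).copy()
    for _ in range(max_iter):
        max_change = 0.0
        for name, codes in categories.items():
            codes = np.asarray(codes)
            target = np.asarray(targets[name], dtype=float)
            # Current weighted total in each category of this variable.
            current = np.bincount(codes, weights=w, minlength=len(target))
            # Scale each unit's weight by target/current for its category.
            factors = target / current
            w *= factors[codes]
            max_change = max(max_change, float(np.max(np.abs(factors - 1.0))))
        # Stop once a full pass over all variables barely changes the weights.
        if max_change < tol:
            break
    return w

# Toy data (made up): four sample units, two calibration variables.
sex = np.array([0, 0, 0, 1])
age = np.array([0, 1, 0, 1])
benchmarks = {"sex": [50.0, 50.0], "age": [40.0, 60.0]}

weights = rake_weights({"sex": sex, "age": age}, benchmarks)
print(np.bincount(sex, weights=weights))  # weighted sex margin after raking
print(np.bincount(age, weights=weights))  # weighted age margin after raking
```

In the setting described above, the benchmark totals would come from the weighted BRFSS and the units from the TBRFSS; the calibrated weights could then be used to re-estimate the health outcomes.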
Results: For most health-related variables, the data integration methods yielded smaller biases than the unadjusted TBRFSS estimates. After calibration, the estimates for the demographic and general health variables matched the corresponding BRFSS benchmarks.
Conclusion: Data integration procedures, including calibration and sequential mass imputation methods, hold promise for improving the representativeness of the TBRFSS.
Keywords: Data integration; Nonprobability sample; Selection bias.
© 2023. The Author(s).