Background: When administrative databases are used for epidemiologic research, a subsample of subjects can be interviewed to elicit information on confounders that the databases do not document. This article presents a thorough investigation of the validity of a two-stage sample, encompassing an assessment of nonparticipation and quantification of the extent of bias.
Methods: Established through record linkage of administrative databases, the Québec Birth Cohort on Immunity and Health (n = 81,496) aims to study the association between Bacillus Calmette-Guérin vaccination and asthma. Among 76,623 subjects classified in four Bacillus Calmette-Guérin-asthma strata, a two-stage sampling strategy with a balanced design was used to randomly select individuals for interviews. We compared stratum-specific sociodemographic characteristics and healthcare utilization of stage 2 participants (n = 1,643) with those of eligible nonparticipants (n = 74,980) and nonrespondents (n = 3,157). We used logistic regression to determine whether participation varied across strata according to these characteristics. The effect of nonparticipation was described by the relative odds ratio (ROR = OR in participants / OR in the source population) for the association between sociodemographic characteristics and asthma.
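As an illustration of the ROR defined above, the following minimal Python sketch computes an odds ratio from a 2x2 table for participants and for the source population, then takes their ratio. The counts are hypothetical and are not study data. The function names, the unweighted calculation (which ignores stage 2 sampling fractions), and the expression of bias as the percent deviation of the ROR from one are all assumptions made for illustration, not the authors' stated method.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: has characteristic, asthma       b: has characteristic, no asthma
    c: lacks characteristic, asthma     d: lacks characteristic, no asthma
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    return (a * d) / (b * c)


def relative_odds_ratio(participants, source):
    """ROR = OR among participants / OR in the source population."""
    return odds_ratio(*participants) / odds_ratio(*source)


# Hypothetical counts for illustration only (not taken from the study).
participants = (45, 55, 30, 70)
source = (400, 600, 250, 750)

ror = relative_odds_ratio(participants, source)

# Assumed convention: bias expressed as percent deviation of the ROR from 1.
percent_bias = (ror - 1) * 100

print(f"ROR = {ror:.3f}, bias = {percent_bias:+.1f}%")
```

Under this convention, an ROR below one means the association is weaker among participants than in the source population, and an ROR of exactly one indicates no bias from nonparticipation.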
Results: Parental age at childbirth, area of residence, family income, and healthcare utilization were comparable between groups. Participants were slightly more likely to be women and to have a mother born in Québec. Participation did not vary across strata by sex, parental birthplace, or material and social deprivation. Nonparticipation did not materially bias estimates: most RORs were below one, and bias never exceeded 20%.
Conclusions: Our analyses provide a detailed evaluation of the validity of a two-stage sample and can serve as a demonstration for researchers assembling similar research infrastructures.