Background: Respondent-driven sampling (RDS) was designed for sampling "hidden" populations and intended as a means of generating unbiased population estimates. Its widespread use has been accompanied by increasing scrutiny as researchers attempt to understand the extent to which the population estimates produced by RDS are, in fact, generalizable to the actual population of interest. In this study we compare two methods of seed selection to determine whether the choice of method influences recruitment and RDS measures.
Methods: Two seed groups were established. The first was selected following the standard RDS approach, in which study staff purposefully select a small number of individuals to initiate recruitment chains. The second consisted of individuals who self-presented to study staff during data collection. Recruitment was allowed to unfold from each group, and RDS estimates were compared between the groups. Variables associated with HIV were also compared.
Results: Three analytic groups were used for the majority of the analyses: RDS recruits originating from study staff-selected seeds (n = 196); self-presenting seeds (n = 118); and recruits of self-presenting seeds (n = 264). Multinomial logistic regression demonstrated significant differences between the three groups on six of the ten sociodemographic and risk behaviour variables examined. Examination of homophily values also revealed differences in recruitment from the two seed groups (e.g. in one arm of the study sex workers and solvent users tended not to recruit others like themselves, while the opposite was true in the second arm). RDS estimates of population proportions also differed between the two recruitment arms; in some cases the corresponding confidence intervals did not overlap. Further differences were revealed when HIV prevalence was compared.
Conclusions: RDS is a cost-effective tool for data collection; however, seed selection has the potential to influence which subgroups within a population are accessed. Our findings indicate that using multiple methods for seed selection may improve access to hidden populations. Our results further highlight the need for a greater understanding of RDS to ensure that appropriate, accurate and representative estimates of a population can be obtained from an RDS sample.
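The homophily values referred to in the Results can be illustrated with a minimal sketch. The abstract does not state which formulation was used, so the code below assumes a widely used one from the RDS literature: the index compares the share of a group's recruits who belong to the same group with that group's estimated population share, giving a value between -1 (members never recruit others like themselves) and 1 (members recruit only others like themselves). The function names and the recruiter/recruit pairs are hypothetical and are not data from the study.

```python
from typing import Iterable, Tuple


def same_group_share(pairs: Iterable[Tuple[str, str]], group: str) -> float:
    """Share of recruits made by members of `group` who are also in `group`.

    `pairs` holds (recruiter_group, recruit_group) tuples.
    """
    recruits = [recruit for recruiter, recruit in pairs if recruiter == group]
    if not recruits:
        raise ValueError(f"no recruitments made by group {group!r}")
    return sum(recruit == group for recruit in recruits) / len(recruits)


def homophily_index(in_group_share: float, population_share: float) -> float:
    """Homophily index in [-1, 1] (assumed formulation, see lead-in).

    Positive values: in-group recruitment exceeds the population share.
    Negative values: the group tends not to recruit others like itself.
    """
    if not 0.0 < population_share < 1.0:
        raise ValueError("population_share must lie strictly between 0 and 1")
    if not 0.0 <= in_group_share <= 1.0:
        raise ValueError("in_group_share must lie between 0 and 1")
    diff = in_group_share - population_share
    if diff >= 0:
        return diff / (1.0 - population_share)
    return diff / population_share


# Illustrative (made-up) recruitment pairs for two groups, "A" and "B".
example_pairs = [("A", "A"), ("A", "B"), ("A", "B"), ("A", "B"), ("B", "B")]
share_a = same_group_share(example_pairs, "A")
h_a = homophily_index(share_a, population_share=0.5)
print(f"in-group share = {share_a:.2f}, homophily = {h_a:.2f}")
```

In this toy example group A recruits its own members less often than its assumed population share would predict, so the index is negative, the pattern the Results describe for sex workers and solvent users in one arm of the study.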