Background: Because large-scale health surveys usually have complex sampling schemes, analysts often ask whether the sampling design must be taken into account when the data are analyzed. A recent disagreement over the analysis of a body iron stores-cancer association found in the first National Health and Nutrition Examination Survey and its follow-up is used to highlight the issues.
Methods: We explain and illustrate the importance of two aspects of the sampling design: clustering and weighting of observations. The body iron stores-cancer data are reanalyzed with and without accounting for various aspects of the sampling design. Simple formulas are given to describe how incorporating a survey's sampling design into the analysis affects the conclusions of that analysis.
Results: The different analyses of the body iron stores-cancer data lead to very different conclusions. Applying the simple formulas suggests that accounting for the sample clustering in the analysis is appropriate, but that using the sample weights in the standard way leads to an uninformative analysis. The recommended analysis incorporates the sampling weights in a nonstandard way and the sample clustering in the standard way.
Conclusions: Which particular aspects of the sampling design to use in the analysis of complex survey data and how to use them depend on certain features of the design. We give some guidelines for when to use the sample clustering and sample weights in the analysis.
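
As a rough illustration of the kind of calculation involved, the sketch below computes two widely used design-effect approximations attributed to Kish: one for clustering and one for unequal weighting. These are assumptions chosen for illustration, since the abstract does not state the formulas it refers to. The function names and all numeric inputs are hypothetical and are not taken from the survey data.

```python
def clustering_design_effect(mean_cluster_size, intraclass_corr):
    """Approximate variance inflation from cluster sampling: 1 + (m - 1) * rho."""
    return 1.0 + (mean_cluster_size - 1.0) * intraclass_corr


def weighting_design_effect(weights):
    """Approximate variance inflation from unequal weights:
    n * sum(w^2) / (sum(w))^2, which equals 1 + CV^2 of the weights."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(n, design_effect):
    """Sample size of a simple random sample with the same variance."""
    return n / design_effect


# Hypothetical inputs, for illustration only.
weights = [1.0, 1.0, 4.0, 4.0]
deff_weighting = weighting_design_effect(weights)
deff_clustering = clustering_design_effect(mean_cluster_size=11, intraclass_corr=0.05)

print("weighting design effect:", round(deff_weighting, 3))
print("clustering design effect:", round(deff_clustering, 3))
print("effective n under weighting:", round(effective_sample_size(len(weights), deff_weighting), 3))
```

A design effect well above 1 for the weights indicates that a standard weighted analysis gives up a large share of the effective sample size, which is one way an analysis can become uninformative. A design effect above 1 for clustering indicates that ignoring the clustering would understate the variance.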