Purpose: The health care system in the United States is inherently hierarchical. Patients are "nested" within physicians, who in turn are "nested" within practices. Research data gathered in practice-based research networks (PBRNs) often show similar patterns of nesting (clustering). When research data are nested, statistical analyses must account for their multilevel nature or risk errors in interpretation. We illustrate the concept of multilevel structure and provide examples with implications for practice-based research.
Methods: We present a selection of multilevel (hierarchical) models and contrast them with traditional linear regression models, using an example of a simulated observational study to illustrate increasingly complex statistical approaches, as well as to explore the consequences of ignoring clustering in data. Additionally, we discuss other types of outcome data and designs, and the effects of clustering on sample size and power.
Results: Multilevel models demonstrate that the effects of physician-level activities may differ from clinic to clinic as well as between rural and urban settings; this variability would go undetected by traditional linear regression approaches. Study conclusions differed when the data were analyzed with multilevel methods rather than traditional linear regression methods. Clustering also affected sample size: as the intraclass correlation and the number of patients per cluster increased, the required number of patients increased dramatically.
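The sample size relationship noted in the Results can be illustrated with the standard design effect for equal-sized clusters, DEFF = 1 + (m - 1) x ICC, where m is the number of patients per cluster and ICC is the intraclass correlation. The Python sketch below is illustrative only: the function names and the baseline of 200 patients are hypothetical, and the study itself may have used a different calculation.

```python
import math


def design_effect(icc: float, cluster_size: int) -> float:
    """Standard design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * icc


def clustered_sample_size(n_independent: int, icc: float, cluster_size: int) -> int:
    """Inflate a sample size computed under independence by the design effect.

    n_independent is the number of patients a conventional (unclustered)
    power calculation would require; it is an input, not computed here.
    """
    inflated = n_independent * design_effect(icc, cluster_size)
    # Round first so floating-point noise does not push ceil() up by one.
    return math.ceil(round(inflated, 6))


# Hypothetical baseline: 200 patients required if observations were independent.
for icc in (0.01, 0.05, 0.10):
    for patients_per_cluster in (10, 50):
        n = clustered_sample_size(200, icc, patients_per_cluster)
        print(f"ICC={icc:.2f}, patients/cluster={patients_per_cluster}: n={n}")
```

Under this formula the inflation grows with both the intraclass correlation and the cluster size, which is the pattern the Results describe.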
Conclusions: Recognizing and accounting for multilevel structure when analyzing data from PBRN studies can lead to more accurate conclusions, as well as offer opportunities to explore contextual effects and differences across sites. Accommodating multilevel structure in planning research studies can result in more appropriate estimation of required sample size.