Causal analyses of existing databases: no power calculations required

J Clin Epidemiol. 2021 Aug 27;S0895-4356(21)00273-0. doi: 10.1016/j.jclinepi.2021.08.028. Online ahead of print.

Abstract

Observational databases are often used to study causal questions. Before being granted access to data or funding, researchers may need to prove that "the statistical power of their analysis will be high." Analyses expected to have low power, and hence to yield imprecise estimates, will not be approved. This restrictive attitude towards observational analyses is misguided. A key misunderstanding is the belief that the goal of a causal analysis is to "detect" an effect. Causal effects are not binary signals that are either detected or undetected; they are numerical quantities that need to be estimated. Because the goal is to quantify the effect as unbiasedly and precisely as possible, the solution to imprecise effect estimates is not to avoid the observational analyses that produce them, but to encourage the conduct of many observational analyses. It is preferable to have multiple studies with imprecise estimates than to have no study at all. After several studies become available, we will meta-analyze them and provide a more precise pooled effect estimate. Therefore, the justification for withholding an observational analysis of preexisting data cannot be that our estimates will be imprecise. Ethical arguments for power calculations before conducting a randomized trial, which places individuals at risk, are not transferable to observational analyses of existing databases. If a causal question is important, analyze your data, publish your estimates, encourage others to do the same, and then meta-analyze. The alternative is an unanswered question.
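
The claim that pooling several imprecise estimates yields a more precise one can be illustrated with a minimal sketch. The abstract does not specify a pooling method; the code below assumes a fixed-effect, inverse-variance meta-analysis, and the study estimates (log hazard ratios and their standard errors) are hypothetical numbers chosen only for illustration. The function names are likewise illustrative.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of study-level estimates.

    Each study is weighted by 1 / SE^2, so the pooled standard error is
    sqrt(1 / sum of weights), which is smaller than any single study's SE.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

def ci95(estimate, std_error):
    """Normal-approximation 95% confidence interval."""
    z = 1.959963984540054  # 97.5th percentile of the standard normal
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical log hazard ratios from four small observational analyses.
log_hr = [-0.25, -0.10, -0.40, -0.18]
se = [0.30, 0.28, 0.35, 0.32]

pooled, pooled_se = pool_fixed_effect(log_hr, se)

for est, s in zip(log_hr, se):
    lo, hi = ci95(est, s)
    print(f"single study: HR {math.exp(est):.2f} "
          f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")

lo, hi = ci95(pooled, pooled_se)
print(f"pooled:       HR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

Under these assumptions, each single-study interval is wide, while the pooled interval is narrower than any of them. With k studies of equal standard error, the pooled standard error shrinks by a factor of the square root of k. A random-effects model, which allows for heterogeneity between studies, would generally give a wider pooled interval than this sketch.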

Keywords: Causal analysis; Causal inference; Meta-analysis; Observational analysis; Observational studies; Sample size; Statistical power; Statistical significance.