PLoS One. 2017 Nov 20;12(11):e0184923.
doi: 10.1371/journal.pone.0184923. eCollection 2017.

The relation between statistical power and inference in fMRI


Henk R Cremers et al. PLoS One. 2017.

Abstract

Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI, the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial, especially with regard to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20-30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation on subsequent replications. Empirical data from the Human Connectome Project resemble the weak diffuse scenario much more than the strong localized scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region of interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches.
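To make the scale of the power problem concrete, the sketch below approximates the power to detect a single brain-behavior correlation at an uncorrected two-sided p < .01 threshold. It is not the authors' simulation: it uses the standard Fisher z approximation, and the function name `correlation_power` and the example effect sizes (r = 0.1 for a "weak" effect, r = 0.5 for a "strong" effect) are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist


def correlation_power(true_r, n, alpha=0.01):
    """Approximate two-sided power for testing H0: r = 0.

    Uses the Fisher z-transform: atanh(sample r) is roughly normal with
    mean atanh(true_r) and standard deviation 1 / sqrt(n - 3).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = math.atanh(true_r) * math.sqrt(n - 3)
    # Probability that the test statistic lands in either rejection tail.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Illustrative (assumed) effect sizes; not values taken from the paper.
for label, r in [("weak", 0.1), ("strong", 0.5)]:
    for n in (30, 150):
        print(f"{label:6s} r={r:.1f} n={n:3d} power={correlation_power(r, n):.3f}")
```

Under these assumptions a weak effect is almost never detected at n = 30, while a strong effect is detected in a majority of samples, which mirrors the contrast between the two scenarios described in the abstract.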

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Examples of the sampling simulation.
Two different brain-behavior correlation distributions (full sample n = 10,000): (a) Weak Diffuse (WD) and (b) Strong Localized (SL). The scatter plots display the voxel (indicated by a green circle in the image) with the strongest absolute correlation between its brain activity (y-axis) and the behavioral variable (x-axis). The color bar indicates the effect size (Pearson’s r). From these two full samples, four random subsamples are drawn with two different sample sizes: (c) n = 30 from the WD full sample; (d) n = 30 from the SL full sample; (e) n = 150 from the WD full sample; (f) n = 150 from the SL full sample. For each subsample a p < .01 uncorrected threshold is applied. The distribution of the maximum effect sizes (absolute values) obtained from the subsampling procedure is shown for the (g) WD scenario and (h) SL scenario. The thick black lines indicate the maximum correlation of the full sample. Small samples (n = 30) of the WD scenario show a large discrepancy with the full sample effects (overestimation of the effect sizes and underestimation of the spatial extent of brain-behavior correlations), and the WD subsamples appear more similar to the SL scenario. Larger samples of the WD scenario (n = 150) show a smaller but still substantial discrepancy with the full sample effects. For the SL scenario there is a small discrepancy between subsamples of either sample size and the full sample effects.
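The overestimation of the maximum effect size in small WD samples can be reproduced with a toy simulation. The sketch below is a simplified stand-in for the procedure in the caption, not the authors' code: it draws subjects directly from a population model instead of subsampling a fixed full sample of 10,000, it treats voxels as independent (no spatial smoothness), and the 200 voxels with a true r of 0.1 each are assumed values. The names `pearson_r`, `max_abs_r` and `mean_max_abs_r` are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_abs_r(true_rs, n, rng):
    """Largest |r| across voxels in one simulated sample of n subjects."""
    behavior = [rng.gauss(0, 1) for _ in range(n)]
    best = 0.0
    for r in true_rs:
        noise_sd = math.sqrt(1 - r * r)
        # Each voxel correlates with behavior at the population level r.
        brain = [r * b + noise_sd * rng.gauss(0, 1) for b in behavior]
        best = max(best, abs(pearson_r(behavior, brain)))
    return best


def mean_max_abs_r(true_rs, n, n_samples, seed=0):
    """Average of the maximum |r| over repeated simulated samples."""
    rng = random.Random(seed)
    return sum(max_abs_r(true_rs, n, rng) for _ in range(n_samples)) / n_samples


# Assumed weak diffuse scenario: 200 independent voxels, true r = 0.1 each.
WEAK_DIFFUSE = [0.1] * 200
small = mean_max_abs_r(WEAK_DIFFUSE, n=30, n_samples=20)
large = mean_max_abs_r(WEAK_DIFFUSE, n=150, n_samples=20)
print(f"true max r = {max(WEAK_DIFFUSE):.2f}")
print(f"mean max |r| at n=30:  {small:.2f}")
print(f"mean max |r| at n=150: {large:.2f}")
```

In this toy setting the strongest observed correlation exceeds the true effect at both sample sizes, and the gap shrinks as n grows, which is the pattern panels (g) and (h) describe.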
Fig 2. Results of the sampling simulations.
Relation between sample size (n) and (a) average (solid line) and at least one (alo; dashed line) statistical power; (b) detected effect size (Pearson’s r) in the samples, and the full population effect size range (shown as a colored transparent bar); (c) percentage of voxels below the threshold (p < .01); (d) mean Dice coefficient: spatial overlap of significant (p < .01) voxels between two subsequent replications. SL: Strong Localized effects; WD: Weak Diffuse effects. The shaded grey area around the estimates reflects the 95% confidence intervals based on the sampling distribution.
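The Dice coefficient in panel (d) measures how much two sets of significant voxels overlap: twice the size of the intersection divided by the sum of the two set sizes. A minimal sketch follows; the function name `dice_coefficient`, the representation of voxels as integer indices, and the choice to return 0.0 when neither replication has a significant voxel are assumptions of this example, not details from the paper.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two sets of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Assumed convention: no detections in either replication counts as no overlap.
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Two hypothetical replications that share two of their three significant voxels.
replication_1 = {10, 11, 12}
replication_2 = {11, 12, 13}
print(round(dice_coefficient(replication_1, replication_2), 3))
```

A value of 1 means both replications flag exactly the same voxels; a value near 0 means the thresholded maps barely overlap.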
Fig 3. Main effect of the HCP social cognition task.
(a) Effects for the TOM>CON contrast in the full sample (n = 485). The colors reflect effect size (Cohen's d). (b) Number of significant voxels and mean d in significant voxels as a function of subsample size. (c) Results of the TOM>CON contrast for 16 random subsamples of n = 15. TOM: theory of mind; CON: control condition.
Fig 4. Correlation between the HCP social cognition contrast and the personality trait agreeableness.
The top row shows the full sample (n = 485); the scatterplot on the right displays the single strongest voxel (r = .25). The bottom row shows 8 random subsamples of n = 30. Each one shows a slice through the brain where effects are "detected", and the scatterplot shows the single most strongly correlated voxel.
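One reason "detected" voxels in subsamples of n = 30 look stronger than the full-sample maximum of r = .25 is that a small sample can only reach significance with a large correlation. The sketch below estimates that minimum |r| using the Fisher z approximation. The p < .01 two-sided uncorrected threshold is an assumption carried over from Figs 1 and 2 (this caption does not state the threshold), and `critical_r` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha=0.01):
    """Approximate smallest |r| that is significant at a two-sided alpha.

    Inverts the Fisher z-transform; the exact t-based critical value
    differs slightly for small n.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))


for n in (30, 150, 485):
    print(f"n={n:3d}: |r| must exceed about {critical_r(n):.2f}")
```

Under this assumed threshold, any voxel that passes at n = 30 must show a sample correlation well above .25, so the thresholded subsample maps necessarily overstate the full-sample effect.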


