A Bayesian Perspective on the Reproducibility Project: Psychology

Alexander Etz et al. PLoS One. 2016 Feb 26;11(2):e0149794. doi: 10.1371/journal.pone.0149794. eCollection 2016.
Abstract

We revisit the results of the recent Reproducibility Project: Psychology by the Open Science Collaboration. We compute Bayes factors (a quantity that can be used to express comparative evidence not only for a hypothesis but also for the null hypothesis) for a large subset (N = 72) of the original papers and their corresponding replication attempts. In our computation, we take into account the likely scenario that publication bias had distorted the originally published results. Overall, 75% of studies gave qualitatively similar results in terms of the amount of evidence provided. However, the evidence was often weak (i.e., Bayes factor < 10). The majority of the studies (64%) did not provide strong evidence for either the null or the alternative hypothesis in either the original or the replication, and no replication attempt provided strong evidence in favor of the null. In all cases where the original paper provided strong evidence but the replication did not (15%), the sample size in the replication was smaller than in the original. Where the replication provided strong evidence but the original did not (10%), the replication sample size was larger. We conclude that the apparent failure of the Reproducibility Project to replicate many target effects can be adequately explained by overestimation of effect sizes (or overestimation of evidence against the null hypothesis) due to small sample sizes and publication bias in the psychological literature. We further conclude that traditional sample sizes are insufficient and that a more widespread adoption of Bayesian methods is desirable.
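The central quantity here is a Bayes factor computed from a reported t statistic and its sample size. As a rough illustration only (this is not the authors' bias-mitigated procedure, and the function name is ours), the following Python sketch computes the standard default (JZS) Bayes factor of Rouder et al. for a one-sample t test, assuming a Cauchy(0, r) prior on effect size:

    import numpy as np
    from scipy import integrate

    def jzs_bf10(t, n, r=1.0):
        """Default (JZS) Bayes factor BF10 for a one-sample t statistic
        with sample size n and a Cauchy(0, r) prior on effect size,
        i.e. g ~ inverse-gamma(1/2, r^2/2) (Rouder et al., 2009)."""
        nu = n - 1
        # Marginal likelihood kernel under H0 (central t; constants cancel in the ratio)
        m0 = (1.0 + t**2 / nu) ** (-(nu + 1) / 2.0)

        def integrand(g):
            # t-likelihood kernel under H1 given g, times the prior density of g
            like = (1.0 + n * g) ** -0.5 * (
                1.0 + t**2 / ((1.0 + n * g) * nu)
            ) ** (-(nu + 1) / 2.0)
            prior = r / np.sqrt(2.0 * np.pi) * g**-1.5 * np.exp(-(r**2) / (2.0 * g))
            return like * prior

        m1, _ = integrate.quad(integrand, 0.0, np.inf)
        return m1 / m0

    # A just-significant t in a small study yields only weak evidence,
    # well below the 'strong' threshold of 10 used in the abstract.
    print(jzs_bf10(t=2.1, n=20))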


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Predicted distributions of t statistics in the literature.
Predicted distributions are shown under the four censoring mechanisms we consider (columns) and two possible states of nature (top row: H0 true (δ = 0); bottom row: H0 false (δ ≠ 0)).
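To make these censoring mechanisms concrete, here is a minimal simulation (our own construction, not the paper's code) of the most extreme censor one might posit: only results significant at the two-sided .05 level reach the literature. It reproduces the figure's two states of nature in miniature:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    def published_t(delta, n, n_studies, alpha=0.05):
        """Draw t statistics for one-sample studies of size n with true
        standardized effect delta, keeping only those 'published' under
        an extreme censor: two-sided p < alpha."""
        nu = n - 1
        t_crit = stats.t.ppf(1 - alpha / 2, nu)
        t = stats.nct.rvs(nu, delta * np.sqrt(n), size=n_studies, random_state=rng)
        return t[np.abs(t) > t_crit]

    for delta in (0.0, 0.5):  # H0 true (top row) vs. H0 false (bottom row)
        t_pub = published_t(delta, n=20, n_studies=100_000)
        print(f"delta={delta}: fraction published = {t_pub.size / 100_000:.3f}, "
              f"mean published |t| = {np.abs(t_pub).mean():.2f}")

Under H0 the surviving t values are all beyond the significance threshold by construction, so a naive reanalysis of the published record overstates the evidence against the null.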
Fig 2. Evidence resulting from replicated studies plotted against evidence resulting from the original publications.
For the original publications, evidence for the alternative hypothesis was calculated taking into account the possibility of publication bias. Small crosses indicate cases where neither the replication nor the original gave strong evidence. Circles indicate cases where one or the other gave strong evidence, with the size of each circle proportional to the ratio of the replication sample size to the original sample size (a reference circle appears in the lower right). The area labeled ‘replication uninformative’ contains cases where the original provided strong evidence but the replication did not, and the area labeled ‘original uninformative’ contains cases where the reverse was true. Two studies that fell beyond the limits of the figure in the top right area (i.e., that yielded extremely large Bayes factors both times) and two that fell above the top left area (i.e., large Bayes factors in the replication only) are not shown. The effect that relative sample size has on Bayes factor pairs is shown by the systematic size difference of circles going from the bottom right to the top left. All values in this figure can be found in S1 Table.
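Reading the figure programmatically: each study contributes an (original, replication) pair of Bayes factors, and the labeled regions correspond to which member of the pair clears the strong-evidence threshold (BF > 10 or, symmetrically, BF < 1/10). A hypothetical helper making that bookkeeping explicit:

    def classify_pair(bf_orig, bf_rep, threshold=10.0):
        """Label a Bayes factor pair using the strong-evidence cutoff
        from the abstract; region names follow the Fig 2 labels."""
        def strong(bf):
            return bf > threshold or bf < 1.0 / threshold
        if strong(bf_orig) and strong(bf_rep):
            return "both strong"
        if strong(bf_orig):
            return "replication uninformative"
        if strong(bf_rep):
            return "original uninformative"
        return "neither strong"  # plotted as small crosses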

References

    1. Open Science Collaboration. Estimating the reproducibility of psychological science. Science. 2015;349(6251):aac4716. doi: 10.1126/science.aac4716.
    2. Handwerk B. Scientists replicated 100 psychology studies, and fewer than half got the same results; 2015. Accessed 2015-10-31. http://bit.ly/1OYZVHY
    3. Jump P. More than half of psychology papers are not reproducible; 2015. Accessed 2015-10-31. http://bit.ly/1GwLHGh
    4. Connor S. Study reveals that a lot of psychology research really is just ‘psycho-babble’; 2015. Accessed 2015-10-31. http://ind.pn/1R07hby
    5. Feldman Barrett L. Psychology is not in crisis; 2015. Accessed 2015-10-31. http://nyti.ms/1PInTEg

Grants and funding

This work was partly funded by the National Science Foundation grants #1230118 and #1534472 from the Methods, Measurements, and Statistics panel (www.nsf.gov) and the John Templeton Foundation grant #48192 (www.templeton.org). This publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
