The number of scholarly documents on the public web

PLoS One. 2014 May 9;9(5):e93949. doi: 10.1371/journal.pone.0093949. eCollection 2014.


We estimate the number of scholarly documents available on the web using capture/recapture methods, by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search. Our estimates show that at least 114 million English-language scholarly documents are accessible on the web, of which Google Scholar has nearly 100 million. Of the 114 million, we estimate that at least 27 million (24%) are freely available, in that they do not require a subscription or payment of any kind. At a finer scale, we also estimate the number of scholarly documents on the web for fifteen fields, as defined by Microsoft Academic Search: Agricultural Science, Arts and Humanities, Biology, Chemistry, Computer Science, Economics and Business, Engineering, Environmental Sciences, Geosciences, Material Science, Mathematics, Medicine, Physics, Social Sciences, and Multidisciplinary. Among these fields, the percentage of documents that are freely available varies significantly, from 12% to 50%.
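
To illustrate the capture/recapture idea named in the abstract, here is a minimal Python sketch of the classic two-sample Lincoln-Petersen estimator. The abstract does not say which estimator the authors used, so the choice of Lincoln-Petersen is an assumption. The function name and the counts are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
def lincoln_petersen(n_first: int, n_second: int, n_both: int) -> float:
    """Estimate a total population size from two overlapping samples.

    n_first  -- items found in the first sample (e.g. one search engine)
    n_second -- items found in the second sample (e.g. another search engine)
    n_both   -- items found in both samples

    Assumption: the two samples are independent, so the share of the
    second sample that also appears in the first approximates the share
    of the whole population covered by the first.
    """
    if n_both <= 0:
        raise ValueError("the samples must overlap to produce an estimate")
    return n_first * n_second / n_both


# Hypothetical counts, chosen only to illustrate the calculation.
estimate = lincoln_petersen(n_first=1000, n_second=500, n_both=250)
print(estimate)  # 2000.0
```

The smaller the overlap relative to the two sample sizes, the larger the estimated total. If the two sources are positively correlated (both tend to index the same documents), this kind of estimate tends to understate the true total.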

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Economics / statistics & numerical data
  • Humanities / statistics & numerical data
  • Humans
  • Information Dissemination*
  • Internet / statistics & numerical data*
  • Medicine / statistics & numerical data
  • Physics / statistics & numerical data
  • Publications / statistics & numerical data*
  • Science / statistics & numerical data
  • Search Engine / statistics & numerical data*

Grant support

This work was partially funded by the National Science Foundation, grants 0958143, 1348712, and 1143921. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. There has been no additional external funding received for this study.