Wikipedia and medicine: quantifying readership, editors, and the significance of natural language

J Med Internet Res. 2015 Mar 4;17(3):e62. doi: 10.2196/jmir.4069.


Background: Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public.

Objective: This paper quantifies the production and consumption of Wikipedia's medical content along 4 dimensions. First, we measured the amount of medical content in both articles and bytes and, second, the citations that supported that content. Third, we analyzed the medical readership against that of other health care websites between Wikipedia's natural language editions and its relationship with disease prevalence. Fourth, we surveyed the quantity/characteristics of Wikipedia's medical contributors, including year-over-year participation trends and editor demographics.

Methods: Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis.

Results: Wikipedia's medical content (at the end of 2013) was made up of more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of if not the most viewed medical resource(s) globally. The core editor community numbered less than 300 and declined over the past 5 years. The members of this community were half health care providers and 85.5% (100/117) had a university education.

Conclusions: Although Wikipedia has a considerable volume of multilingual medical content that is extensively read and well-referenced, the core group of editors that contribute and maintain that content is small and shrinking in size.

Keywords: Internet; Web 2.0; consumer health information; cooperative behavior; health information systems; information networks; information science; information sharing.

MeSH terms

  • Consumer Health Information / organization & administration*
  • Encyclopedias as Topic*
  • Humans
  • Internet / organization & administration*
  • Internet / statistics & numerical data
  • Language
  • Medicine*