PLoS One, 15 (2), e0228713

Science Through Wikipedia: A Novel Representation of Open Knowledge Through Co-Citation Networks

Wenceslao Arroyo-Machado et al. PLoS One.

Abstract

This study provides an overview of science from the perspective of Wikipedia. We establish a methodology for analysing how Wikipedia editors regard science through their references to scientific papers. The co-citation method is adapted to this context to generate Pathfinder networks (PFNET) that highlight the most relevant scientific journals and categories and their interactions, in order to find out how scientific literature is consumed through this open encyclopaedia. In addition, the obsolescence of the cited literature is studied using the Price index. A total of 1 433 457 references available at Altmetric.com were initially taken into account. After pre-processing and linking them to the data from Elsevier's CiteScore Metrics, the sample was reduced to 847 512 references made by 193 802 Wikipedia articles to 598 746 scientific articles belonging to 14 149 journals indexed in Scopus. Among the main results, we found a significant presence of "Medicine" and "Biochemistry, Genetics and Molecular Biology" papers and that the most important journals are multidisciplinary in nature, which also suggests that journals with a high impact factor were more likely to be cited. Furthermore, only 13.44% of Wikipedia citations are to Open Access journals.
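As a rough illustration of the co-citation idea described in the abstract, the sketch below counts how often two journals are referenced by the same Wikipedia article. It is not the authors' implementation: the function name `cocitation_counts`, the toy data, and the choice to count each journal pair at most once per Wikipedia article are assumptions made here for illustration, and the Pathfinder pruning step is not shown.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(references):
    """Count journal co-citations.

    `references` maps a Wikipedia article title to the journals it cites.
    Assumption: a pair of journals is counted once per citing article,
    no matter how many times each journal is cited within that article.
    """
    counts = Counter()
    for journals in references.values():
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(journals)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy data, not taken from the study.
toy = {
    "Article A": ["Nature", "Science", "Nature"],
    "Article B": ["Nature", "Science", "PLoS One"],
    "Article C": ["PLoS One"],
}
pairs = cocitation_counts(toy)
```

In this toy input, "Nature" and "Science" appear together in two Wikipedia articles, so that pair gets the largest weight. A weighted network built from such counts is the kind of input a Pathfinder pruning step would then simplify.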

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Methodological process of collecting the massive dataset of papers referenced in Wikipedia and assigning them to different scientific categories.
Fig 2. Annual number of references made in Wikipedia and of unique articles cited.
Fig 3. Box and violin plots for the years of publication of the scientific articles referenced in Wikipedia (outliers are shown in red).
Fig 4. Literature obsolescence of Wikipedia article references using the Price index for 5, 10, 15 and 20 years.
Fig 5. Scatter plot of journals by citations received in Scopus and Wikipedia in 2016 for articles published between 2013 and 2015.
The size of the points corresponds to the number of articles published in that period and the color corresponds to the ratio between citation percentiles: red (more on Scopus) and blue (more on Wikipedia).
Fig 6. Co-citation network of journals based on Wikipedia article references.
A) Main component of the full network; B) Pathfinder network of the full network. Each node represents one journal, and node size corresponds to the total number of citations received; color corresponds to the subject area, with journals belonging to more than one area shown in white; the thickness of the edges corresponds to the degree of co-citation between the two journals. The titles of the 10 journals with the highest betweenness centrality are shown.
Fig 7. Co-citation network of journals based on Wikipedia article references.
This network is produced by applying the Pathfinder algorithm (based on a minimum of 50 co-citations) and shows a total of 629 relationships. Each node represents one journal, and node size corresponds to the total number of citations received; color corresponds to the subject area or combination of subject areas to which the journal belongs; the thickness of the edges corresponds to the degree of co-citation between the two journals. The titles of the 20 journals with the highest betweenness centrality are shown.
Fig 8. Comparison of the percentage of articles by main field in Scopus and Wikipedia.
Fig 9. Price index for Wikipedia main fields.
Fig 10. Co-citation network of the 27 main fields after applying the Pathfinder algorithm.
Each node represents one main field; node size corresponds to the total number of citations received, color corresponds to eigenvector centrality, and the thickness of the edges corresponds to the degree of co-citation.
Fig 11. Co-citation network of the 330 fields after applying the Pathfinder algorithm.
Each node represents one field; node size corresponds to the total number of citations received, color corresponds to the thematic area or areas, and the thickness of the edges corresponds to the degree of co-citation. Titles are given for the 15 fields with the highest betweenness centrality.
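The Price index reported in Figs 4 and 9 measures how recent the cited literature is: the percentage of references that fall within a given age window. The sketch below is only an approximation of that idea. It assumes a reference's age is the year it was cited minus the year the paper was published, and that the window boundary is inclusive; neither detail is confirmed by this record. The function name `price_index` and the toy ages are hypothetical.

```python
def price_index(ages, window):
    """Percentage of references whose age (in years) is at most `window`.

    Assumptions: `ages` holds citing year minus publication year for each
    reference, and a reference exactly `window` years old still counts.
    """
    ages = list(ages)
    if not ages:
        raise ValueError("at least one reference is required")
    recent = sum(1 for age in ages if 0 <= age <= window)
    return 100.0 * recent / len(ages)


# Hypothetical reference ages, not taken from the study.
toy_ages = [1, 3, 4, 8, 12, 18, 25, 40]
indices = {w: price_index(toy_ages, w) for w in (5, 10, 15, 20)}
```

Computing the index for several windows, as the figures do with 5, 10, 15 and 20 years, shows how quickly the share of "recent" references grows as the window widens.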

Grant support

This study was made possible by financial support from “Knowmetrics: knowledge evaluation in digital society”, a project funded by scientific research team grants from the BBVA Foundation, 2016, and by grant TIN2016-75850-R from the Spanish Ministry of Economy and Competitiveness with FEDER funds.