Nucleic Acids Res. 2012 Jan;40(Database issue):D1255-61.
doi: 10.1093/nar/gkr925. Epub 2011 Nov 10.

The Gene Wiki in 2011: Community Intelligence Applied to Human Gene Annotation

Free PMC article

Benjamin M Good et al. Nucleic Acids Res.

Abstract

The Gene Wiki is an open-access and openly editable collection of Wikipedia articles about human genes. Initiated in 2008, it has grown to include articles about more than 10,000 genes that, collectively, contain more than 1.4 million words of gene-centric text with extensive citations back to the primary scientific literature. This growing body of useful, gene-centric content is the result of the work of thousands of individuals throughout the scientific community. Here, we describe recent improvements to the automated system that keeps the structured data presented on Gene Wiki articles in sync with the data from trusted primary databases. We also describe the expanding contents, editors and users of the Gene Wiki. Finally, we introduce a new automated system, called WikiTrust, which can effectively compute the quality of Wikipedia articles, including Gene Wiki articles, at the word level. All articles in the Gene Wiki can be freely accessed and edited at Wikipedia, and additional links and information can be found at the project's Wikipedia portal page: http://en.wikipedia.org/wiki/Portal:Gene_Wiki.

Figures

Figure 1.
The Gene Wiki article about Cyclin-dependent kinase 2. Programmatically gathered and updated data such as protein structure diagrams, Gene Ontology annotations and links to related database entries are displayed in the information box on the right. The manually authored text that forms the main body of the article is organized into subsections as indicated by the table of contents on the top left. Note that the article had to be truncated for space considerations and thus the bottom portion, including the references section, is not displayed.
Figure 2.
The 100 most heavily linked genes, their connections to topics classified as diseases or drugs, and to the people who contributed most heavily to each article (‘anon’ represents anonymous editors, and ‘all bots’ aggregates automated edits to the articles). The thickness of the band connecting a gene to a person indicates the relative number of edits that person made to the article for that gene.
Figure 3.
Monthly growth of words in Gene Wiki articles, page views per month and edits per month between 1 September 2009 and 1 September 2011.
Figure 4.
Trust distributions of Gene Wiki revisions versus general (non-Gene Wiki) Wikipedia revisions.
Figure 5.
Screenshot of Firefox plugin displaying WikiTrust information for the Wikipedia article on the gene CDK2.
Figure 6.
Cumulative revisions to Gene Wiki articles with trust assessments between 1 September 2009 and 1 September 2011. Note that the figure includes both manual and bot edits. The sharp edit spikes near the beginning of the chart are the result of bots. Overall, about half of the edits are manual and half are automated.
Figure 7.
Longevity distribution, in seconds, of vandalism on the Gene Wiki and on the general (non-Gene Wiki) Wikipedia.
