Literome: PubMed-scale genomic knowledge base in the cloud

Bioinformatics. 2014 Oct;30(19):2840-2. doi: 10.1093/bioinformatics/btu383. Epub 2014 Jun 17.


Motivation: Advances in sequencing technology have led to an exponential growth of genomics data, yet it remains a formidable challenge to interpret such data for identifying disease genes and drug targets. There has been increasing interest in adopting a systems approach that incorporates prior knowledge such as gene networks and genotype-phenotype associations. The majority of such knowledge resides in text such as journal publications, which has been undergoing its own exponential growth. It has thus become a significant bottleneck to identify relevant knowledge for genomic interpretation as well as to keep up with new genomics findings.

Results: In the Literome project, we have developed an automatic curation system to extract genomic knowledge from PubMed articles and made this knowledge available in the cloud with a Web site to facilitate browsing, searching and reasoning. Currently, Literome focuses on two types of knowledge most pertinent to genomic medicine: directed genic interactions such as pathways and genotype-phenotype associations. Users can search for interacting genes and the nature of the interactions, as well as diseases and drugs associated with a single nucleotide polymorphism or gene. Users can also search for indirect connections between two entities, e.g. a gene and a disease might be linked because an interacting gene is associated with a related disease.
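
The indirect-connection search described above can be pictured as a path search over a graph of extracted facts. The following is a minimal sketch under stated assumptions, not Literome's actual implementation: the entity names (GENE_A, GENE_B, DISEASE_X, DISEASE_Y), the three relation tables, and the functions build_graph and find_connection are all hypothetical, and breadth-first search is simply one plausible way to find such a link.

```python
from collections import deque

# Hypothetical toy facts; the identifiers are placeholders, not Literome data.
INTERACTIONS = {("GENE_A", "GENE_B")}      # directed genic interaction: A acts on B
ASSOCIATIONS = {("GENE_B", "DISEASE_X")}   # genotype-phenotype association
RELATED = {("DISEASE_X", "DISEASE_Y")}     # related diseases (treated as symmetric)


def build_graph():
    """Combine the three relation tables into one labeled adjacency list."""
    graph = {}

    def link(a, b, label):
        graph.setdefault(a, []).append((b, label))

    for a, b in INTERACTIONS:
        link(a, b, "interacts_with")
    for gene, disease in ASSOCIATIONS:
        link(gene, disease, "associated_with")
    for d1, d2 in RELATED:
        link(d1, d2, "related_to")
        link(d2, d1, "related_to")
    return graph


def find_connection(graph, source, target, max_hops=3):
    """Breadth-first search; returns the shortest labeled path or None."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for nxt, label in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, label, nxt)]))
    return None


if __name__ == "__main__":
    for step in find_connection(build_graph(), "GENE_A", "DISEASE_Y"):
        print(step)
```

In this toy graph, GENE_A reaches DISEASE_Y only through an interacting gene (GENE_B) that is associated with a related disease (DISEASE_X), which mirrors the example given in the text. The hop limit keeps the search from returning long, weakly meaningful chains.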

Availability and implementation: Literome is freely available online. Download for non-commercial use is available via Web services.

MeSH terms

  • Algorithms
  • Automation
  • Computational Biology / methods*
  • Genetic Association Studies
  • Genome
  • Genomics / methods*
  • Genotype
  • Humans
  • Internet
  • Knowledge Bases
  • Phenotype
  • Polymorphism, Single Nucleotide*
  • PubMed*
  • Software