Data, information, knowledge and principle: back to metabolism in KEGG

Nucleic Acids Res. 2014 Jan;42(Database issue):D199-205. doi: 10.1093/nar/gkt1076. Epub 2013 Nov 7.


In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but on their own they are less effective at compiling knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process, the KEGG modules, which are defined by Boolean expressions of K numbers, have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated, revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society.
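
The idea of evaluating a module as a Boolean expression of K numbers can be illustrated with a minimal sketch. It is not KEGG's own implementation. It assumes a simplified notation in which a comma means "or", a space means "and", and parentheses group sub-expressions; any other notation KEGG may use is ignored. The function name `module_complete` and the module definitions in the usage note are hypothetical and chosen only for illustration.

```python
import re

# Tokens: K numbers, parentheses, commas, and runs of whitespace.
# Any other character is silently skipped in this simplified sketch.
TOKEN = re.compile(r"K\d{5}|[(),]|\s+")


def module_complete(definition, annotated):
    """Return True if the set of annotated K numbers satisfies the definition.

    Assumed grammar (a simplification made for this sketch):
        expr := seq (',' seq)*       comma  = OR between alternatives
        seq  := atom (SPACE atom)*   space  = AND between steps
        atom := K number | '(' expr ')'
    """
    tokens = TOKEN.findall(definition.strip())
    pos = 0

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expr()
            pos += 1  # skip the closing ")"
            return value
        return tok in annotated

    def seq():
        nonlocal pos
        value = atom()
        while pos < len(tokens) and tokens[pos].isspace():
            pos += 1
            rhs = atom()  # always parse, so the position advances
            value = value and rhs
        return value

    def expr():
        nonlocal pos
        value = seq()
        while pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            rhs = seq()  # always parse, so the position advances
            value = value or rhs
        return value

    return expr()
```

For example, with the made-up definition `"(K00001,K00002) K00003"`, a genome annotated with `{"K00002", "K00003"}` satisfies the module, while a genome annotated only with `{"K00001"}` does not, because the second step is missing.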

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Chemical*
  • Drug-Related Side Effects and Adverse Reactions
  • Genome
  • Internet
  • Knowledge Bases
  • Metabolic Networks and Pathways* / genetics
  • Pharmaceutical Preparations / chemistry
  • Pharmaceutical Preparations / classification
  • Phenotype


Substances

  • Pharmaceutical Preparations