The KEGG resource for deciphering the genome

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D277-80. doi: 10.1093/nar/gkh063.


A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end, we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database.
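The idea of network prediction from a gene set can be illustrated with a minimal sketch. It assumes that the reference protein network is a graph whose nodes are ortholog groups, and that each gene in a genome has already been assigned to one such group. The class `Graph`, the function `predict_network` and identifiers such as `KO_A` and `gene1` are hypothetical names chosen for illustration; they are not KEGG identifiers and not part of any KEGG interface.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Graph:
    """Undirected graph stored as an adjacency map (illustrative only)."""
    edges: Dict[str, Set[str]] = field(default_factory=dict)

    def add_edge(self, a: str, b: str) -> None:
        # Record the edge in both directions.
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def induced(self, nodes: Iterable[str]) -> "Graph":
        # Keep only the edges whose two endpoints are both in `nodes`.
        keep = set(nodes)
        sub = Graph()
        for a, neighbours in self.edges.items():
            if a in keep:
                for b in neighbours & keep:
                    sub.add_edge(a, b)
        return sub


def predict_network(reference: Graph, gene_to_ortholog: Dict[str, str]) -> Graph:
    """Project a genome onto the reference network.

    Assumption: an interaction is predicted only when both ortholog
    groups it connects are represented by at least one gene.
    """
    present = set(gene_to_ortholog.values())
    return reference.induced(present)


# Hypothetical reference network: KO_A - KO_B - KO_C.
reference = Graph()
reference.add_edge("KO_A", "KO_B")
reference.add_edge("KO_B", "KO_C")

# Hypothetical genome that lacks any gene assigned to KO_C.
genome = {"gene1": "KO_A", "gene2": "KO_B"}

predicted = predict_network(reference, genome)
```

In this toy case the predicted network keeps the KO_A to KO_B interaction and drops the one involving KO_C, because no gene in the genome maps to that group.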

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Carbohydrate Sequence
  • Chemical Phenomena
  • Chemistry*
  • Computational Biology
  • Databases, Factual*
  • Databases, Genetic
  • Genes
  • Genome
  • Genomics*
  • Humans
  • Internet
  • Molecular Biology*
  • Molecular Sequence Data
  • Protein Binding
  • Proteins / genetics
  • Proteins / metabolism


Substances

  • Proteins