Identification of Enzyme Genes Using Chemical Structure Alignments of Substrate-Product Pairs

J Chem Inf Model. 2016 Mar 28;56(3):510-6. doi: 10.1021/acs.jcim.5b00216. Epub 2016 Feb 17.


Although several databases contain data on many metabolites and reactions in biochemical pathways, a large gap remains between the number of known metabolites and the number of experimentally identified enzymes. Many genes encoding catalytic enzymes are thought to be still unknown. Previous studies have estimated the number of candidate enzyme genes, but they required information beyond the structures of metabolites, such as gene expression and gene order in the genome. In this study, we developed a novel method to identify candidate enzyme genes for a reaction using the chemical structures of its substrate-product pair (reactant pair). The proposed method searches a reference database for similar reactant pairs and proposes ortholog groups that may mediate the given reaction. We applied the method to two experimentally validated reactions. Histidine transaminase was correctly identified. Although the method could not directly identify asparagine oxo-acid transaminase, it found the paralog gene most similar to the correct enzyme gene. We also applied the method to infer candidate enzyme genes in the mesaconate pathway. The advantage of the method lies in predicting possible genes for orphan enzyme reactions, for which no associated gene sequences have yet been determined. We believe that this approach will facilitate experimental identification of genes for orphan enzymes.
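The search step described in the abstract (compare a query reactant pair against reference pairs, then report the ortholog groups attached to the best matches) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring, so the sketch swaps in a Tanimoto coefficient over hypothetical sets of structural features in place of a chemical structure alignment, and combines substrate and product similarity by taking the minimum. The feature names, the reference entries, and the ortholog group labels are all invented for illustration.

```python
from typing import NamedTuple


class ReferencePair(NamedTuple):
    """A reference reactant pair with its ortholog group (all values hypothetical)."""
    substrate: frozenset
    product: frozenset
    ortholog_group: str


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two feature sets (assumed stand-in for alignment scoring)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pair_similarity(q_sub: frozenset, q_prod: frozenset, ref: ReferencePair) -> float:
    """Score a query pair against a reference pair.

    Taking the minimum (an assumption) requires BOTH the substrates and
    the products to be similar for the pair to score well.
    """
    return min(tanimoto(q_sub, ref.substrate), tanimoto(q_prod, ref.product))


def rank_ortholog_groups(q_sub, q_prod, references, top_k=3):
    """Return up to top_k (ortholog_group, score) tuples, best score first."""
    best = {}
    for ref in references:
        score = pair_similarity(q_sub, q_prod, ref)
        # Keep only the best-scoring reference pair for each ortholog group.
        if score > best.get(ref.ortholog_group, -1.0):
            best[ref.ortholog_group] = score
    # Sort by descending score, breaking ties alphabetically by group name.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy reference database; feature sets and group labels are hypothetical.
REFERENCES = [
    ReferencePair(frozenset({"amino", "carboxyl", "phenyl"}),
                  frozenset({"keto", "carboxyl", "phenyl"}),
                  "OG_transaminase"),
    ReferencePair(frozenset({"hydroxyl", "carboxyl"}),
                  frozenset({"keto", "carboxyl"}),
                  "OG_dehydrogenase"),
    ReferencePair(frozenset({"amino", "carboxyl", "indole"}),
                  frozenset({"amino", "indole"}),
                  "OG_decarboxylase"),
]

# Query: an amino acid converted to its keto acid (features are hypothetical).
query_sub = frozenset({"amino", "carboxyl", "imidazole"})
query_prod = frozenset({"keto", "carboxyl", "imidazole"})

ranking = rank_ortholog_groups(query_sub, query_prod, REFERENCES)
for group, score in ranking:
    print(f"{group}\t{score:.2f}")
```

In this toy example the transaminase-like reference ranks first because it shares the amino-to-keto change with the query on both sides of the reaction. A realistic version would replace the feature sets with alignments of actual chemical structures and the toy list with a curated reaction database.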

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Databases, Protein
  • Enzymes / genetics*
  • Enzymes / metabolism
  • Substrate Specificity


Substances

  • Enzymes