Distribution patterns of small-molecule ligands in the protein universe and implications for origin of life and drug discovery

Genome Biol. 2007;8(8):R176. doi: 10.1186/gb-2007-8-8-r176.

Abstract

Background: Extant life depends greatly on the binding of small-molecule ligands to macromolecules such as proteins, and a single ligand can bind multiple proteins. However, little is known about the global patterns of ligand-protein mapping.

Results: By examining 2,186 well-defined small-molecule ligands and thousands of protein domains derived from a database of druggable binding sites, we show that a few ligands bind tens of protein domains or folds, whereas most ligands bind only one, which indicates that ligand-protein mapping follows a power law. By assigning protein-binding orders (early or late) to bio-ligands, we demonstrate that the preferential attachment principle holds for the power-law relation between ligands and proteins. We also find that polar molecular surface area, H-bond acceptor count, H-bond donor count and partition coefficient are potential factors for discriminating ligands from ordinary molecules and for differentiating super ligands (those shared by three or more folds) from other ligands.
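The power-law pattern and the super-ligand definition above can be illustrated with a minimal Python sketch. It is not the authors' analysis: the `ligand_folds` mapping is invented toy data, the function names are hypothetical, and the least-squares fit on log-log values is only a crude stand-in for a proper power-law fit.

```python
from collections import Counter
import math

# Hypothetical toy mapping (NOT data from the study):
# ligand ID -> set of protein folds the ligand binds.
ligand_folds = {
    "ATP": {f"fold{i}" for i in range(8)},
    "NAD": {f"fold{i}" for i in range(4)},
    "FMN": {"fold0", "fold1"},
}
# Most ligands bind a single fold, as the abstract reports.
for i in range(8):
    ligand_folds[f"LIG{i}"] = {f"fold{10 + i}"}


def fold_counts(mapping):
    """Number of distinct folds bound by each ligand."""
    return {ligand: len(folds) for ligand, folds in mapping.items()}


def degree_distribution(counts):
    """Map k -> number of ligands that bind exactly k folds."""
    return dict(Counter(counts.values()))


def loglog_slope(dist):
    """Least-squares slope of log(frequency) against log(k).

    A clearly negative slope is consistent with, but does not prove,
    a power-law distribution.
    """
    xs = [math.log(k) for k in dist]
    ys = [math.log(n) for n in dist.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct fold counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def super_ligands(counts, min_folds=3):
    """Ligands shared by min_folds or more folds (three in the abstract)."""
    return sorted(lig for lig, n in counts.items() if n >= min_folds)


counts = fold_counts(ligand_folds)
dist = degree_distribution(counts)
print("distribution:", dist)
print("log-log slope:", loglog_slope(dist))
print("super ligands:", super_ligands(counts))
```

On real data, a maximum-likelihood fit with a goodness-of-fit test would be a more defensible check than a regression on log-transformed counts.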

Conclusion: These findings have significant implications for evolution and drug discovery. First, the chronology of ligand-protein binding can be inferred from the power-law feature of ligand-protein mapping. Some nucleotide-containing ligands, such as ATP, ADP, GDP, NAD, FAD, dihydro-nicotinamide-adenine-dinucleotide phosphate (NDP), nicotinamide-adenine-dinucleotide phosphate (NAP), flavin mononucleotide (FMN) and AMP, are found to be the earliest cofactors bound to proteins, in agreement with the current understanding of evolutionary history. Second, the finding that about 30% of ligands are shared by two or more domains will help with drug discovery, for example by finding new uses for old drugs, developing promiscuous drugs and relying more on natural products.
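The chronology argument rests on preferential attachment: ligands that entered the network earlier tend to have accumulated more protein partners. A minimal Python sketch of that reasoning follows. The counts are invented toy values, the function names are hypothetical, and ranking by partner count is only a rough proxy for the inference described here.

```python
# Hypothetical toy counts (NOT data from the study):
# ligand ID -> number of protein domains that bind it.
partner_counts = {"ATP": 8, "NAD": 4, "FMN": 2}
# Seven invented ligands that each bind a single domain.
partner_counts.update({f"LIG{i}": 1 for i in range(7)})


def rank_by_partners(counts):
    """Order ligands from most to fewest partners.

    Under preferential attachment, a higher rank suggests an earlier
    binding order. Ties are broken alphabetically.
    """
    return sorted(counts, key=lambda lig: (-counts[lig], lig))


def shared_fraction(counts, min_partners=2):
    """Fraction of ligands bound by min_partners or more domains."""
    if not counts:
        return 0.0
    shared = sum(1 for n in counts.values() if n >= min_partners)
    return shared / len(counts)


print("putative early binders:", rank_by_partners(partner_counts)[:3])
print("fraction shared by 2+ domains:", shared_fraction(partner_counts))
```

The toy data were chosen so that three of ten ligands are shared, mirroring the roughly 30% figure in the text; the real proportion depends on the full ligand and domain set.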

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Drug Design*
  • Evolution, Molecular*
  • Humans
  • Ligands
  • Origin of Life*
  • Protein Folding
  • Protein Structure, Tertiary
  • Proteins / chemistry*

Substances

  • Ligands
  • Proteins