Annotating nucleic acid-binding function based on protein structure

J Mol Biol. 2003 Feb 28;326(4):1065-79. doi: 10.1016/s0022-2836(03)00031-7.

Abstract

Many of the targets of structural genomics will be proteins with little or no structural similarity to those currently in the database. Therefore, novel function prediction methods that do not rely on sequence or fold similarity to other known proteins are needed. We present an automated approach to predict nucleic-acid-binding (NA-binding) proteins, specifically DNA-binding proteins. The method is based on characterizing the structural and sequence properties of large, positively charged electrostatic patches on DNA-binding protein surfaces, which typically coincide with the DNA-binding sites. Using an ensemble of features extracted from these electrostatic patches, we predict DNA-binding proteins with high accuracy. We show that our method does not rely on sequence or structure homology and is capable of predicting proteins with novel binding motifs and protein structures solved in an unbound state. Our method can also distinguish NA-binding proteins from other proteins that have similar, large positive electrostatic patches on their surfaces, but that do not bind nucleic acids.
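
The patch-based idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the potential threshold, the rule for joining surface points into a patch, or the feature set, so the function names, the `threshold` and `cutoff` values, and the three features below are all assumptions chosen for illustration. The sketch takes surface points with precomputed electrostatic potentials, finds the largest connected group of positively charged points, and summarizes it as a small feature dictionary that could be passed to a classifier.

```python
from collections import deque
from math import dist


def largest_positive_patch(points, potentials, threshold=2.0, cutoff=1.5):
    """Return sorted indices of the largest connected positive patch.

    Assumptions (not from the abstract): a point is "positive" if its
    potential exceeds `threshold`, and two positive points belong to the
    same patch if they lie within `cutoff` distance units of each other.
    """
    unvisited = {i for i, v in enumerate(potentials) if v > threshold}
    best = []
    while unvisited:
        seed = unvisited.pop()
        patch, queue = [seed], deque([seed])
        # Breadth-first search over positive points within the cutoff.
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited if dist(points[i], points[j]) <= cutoff]
            for j in near:
                unvisited.remove(j)
                patch.append(j)
                queue.append(j)
        if len(patch) > len(best):
            best = patch
    return sorted(best)


def patch_features(points, potentials, patch):
    """Summarize a patch with a few illustrative (hypothetical) features."""
    n = len(patch)
    return {
        "size": n,
        "fraction_of_surface": n / len(points) if points else 0.0,
        "mean_potential": sum(potentials[i] for i in patch) / n if n else 0.0,
    }


if __name__ == "__main__":
    # Toy surface: two separate positive clusters and one neutral point.
    pts = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (20, 0, 0)]
    pot = [3.0, 3.0, 3.0, 3.0, 3.0, 0.0]
    patch = largest_positive_patch(pts, pot)
    print(patch, patch_features(pts, pot, patch))
```

In a real pipeline the potentials would come from a continuum electrostatics calculation on the protein structure, and the feature vector would be richer, for example including sequence-derived properties of the patch residues, which the abstract mentions but does not enumerate.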

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Binding Sites
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / metabolism*
  • Hydrogen Bonding
  • Models, Molecular
  • Molecular Sequence Data
  • Neural Networks, Computer*
  • Protein Binding
  • Protein Conformation*
  • Software
  • Static Electricity

Substances

  • DNA-Binding Proteins