A graph kernel method for DNA-binding site prediction

BMC Syst Biol. 2014;8 Suppl 4(Suppl 4):S10. doi: 10.1186/1752-0509-8-S4-S10. Epub 2014 Dec 8.

Abstract

Background: Protein-DNA interactions play important roles in many biological processes. Computational methods that can accurately predict DNA-binding sites on proteins will greatly expedite research on problems involving protein-DNA interactions.

Results: This paper presents a method for predicting DNA-binding sites on protein structures. The method represents protein surface patches as labeled graphs and uses a graph kernel to calculate the similarities between graphs. A new surface patch is predicted to be an interface or a non-interface patch based on its similarities to known DNA-binding and non-DNA-binding patches. The proposed method achieved high accuracy when tested on a representative set of 146 protein-DNA complexes using leave-one-out cross-validation. The method was then applied to identify DNA-binding sites on 13 unbound structures of DNA-binding proteins. In each of the unbound structures, the top-ranked patch predicted by the proposed method precisely indicated the location of the DNA-binding site. Comparisons with other methods showed that the proposed method was competitive in predicting DNA-binding sites on unbound proteins.

Conclusions: The proposed method uses graphs to encode the distribution of features in 3-dimensional (3D) space. Thus, compared with vector-based methods, it has the advantage of taking into account the spatial distribution of features on the proteins. By using an efficient kernel method to compare graphs, the proposed method also avoids the demanding computations required for comparing 3D objects. It provides a competitive method for predicting DNA-binding sites without requiring structure alignment.
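The idea of comparing labeled graphs with a kernel, then labeling a new patch by its similarity to known patches, can be illustrated with a minimal sketch. The abstract does not say which graph kernel, vertex labels, or classifier the authors used, so everything below is an assumption made for illustration: the sketch swaps in a Weisfeiler-Lehman subtree kernel, uses residue names as hypothetical vertex labels, and decides by mean similarity to each class instead of a trained classifier. The toy patches are invented and are not data from the paper.

```python
from collections import Counter
from math import sqrt


def wl_features(labels, edges, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one labeled graph.

    labels: dict mapping node -> label string (residue names here, a
            hypothetical choice; the paper's labels are not given).
    edges:  iterable of undirected (u, v) pairs.
    """
    neighbors = {node: [] for node in labels}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    current = dict(labels)
    counts = Counter(current.values())
    for _ in range(iterations):
        # Relabel each node with its own label plus its sorted neighbor labels.
        current = {
            node: current[node]
            + "("
            + ",".join(sorted(current[n] for n in neighbors[node]))
            + ")"
            for node in current
        }
        counts.update(current.values())
    return counts


def kernel(g1, g2):
    """Normalized kernel: cosine similarity of the two label-count vectors."""
    f1, f2 = wl_features(*g1), wl_features(*g2)
    dot = sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())
    norm = sqrt(sum(v * v for v in f1.values()) * sum(v * v for v in f2.values()))
    return dot / norm if norm else 0.0


def predict(patch, binding, non_binding):
    """Label a patch by its mean similarity to each set of known patches.

    This nearest-mean rule is an assumption, not the paper's classifier.
    """
    s_pos = sum(kernel(patch, g) for g in binding) / len(binding)
    s_neg = sum(kernel(patch, g) for g in non_binding) / len(non_binding)
    return "interface" if s_pos > s_neg else "non-interface"


# Invented toy patches: (vertex labels, edges between nearby residues).
binding = [({0: "ARG", 1: "LYS", 2: "SER"}, [(0, 1), (1, 2)])]
non_binding = [({0: "LEU", 1: "VAL", 2: "ASP"}, [(0, 1), (1, 2)])]
query = ({0: "ARG", 1: "LYS", 2: "THR"}, [(0, 1), (1, 2)])

print(predict(query, binding, non_binding))  # prints "interface"
```

Because the kernel only counts matching label patterns, two patches can be compared without superimposing their 3D coordinates, which is the property the Conclusions paragraph highlights. In practice a kernel matrix like this would more likely be passed to a kernel classifier than to the simple mean-similarity rule used here.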

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Binding Sites
  • Computational Biology / methods*
  • Computer Graphics*
  • DNA / metabolism*
  • DNA-Binding Proteins / metabolism*
  • Protein Binding
  • Surface Properties

Substances

  • DNA-Binding Proteins
  • DNA