DBSI: DNA-binding site identifier

Nucleic Acids Res. 2013 Sep;41(16):e160. doi: 10.1093/nar/gkt617. Epub 2013 Jul 19.


In this study, we present the DNA-Binding Site Identifier (DBSI), a new structure-based method for predicting protein interaction sites for DNA binding. DBSI was trained and validated on a data set of 263 proteins (TRAIN-263), tested on an independent set of protein-DNA complexes (TEST-206) and data sets of 29 unbound (APO-29) and 30 bound (HOLO-30) protein structures distinct from the training data. We computed 480 candidate features for identifying protein residues that bind DNA, including new features that capture the electrostatic microenvironment within shells near the protein surface. Our iterative feature selection process identified features important in other models, as well as features unique to the DBSI model, such as a banded electrostatic feature with spatial separation comparable with the canonical width of the DNA minor groove. Validations and comparisons with established methods using a range of performance metrics clearly demonstrate the predictive advantage of DBSI, and its comparable performance on unbound (APO-29) and bound (HOLO-30) conformations demonstrates robustness to binding-induced protein conformational changes. Finally, we offer our feature data table to others for integration into their own models or for testing improved feature selection and model training strategies based on DBSI.
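The abstract does not define the shell-based electrostatic features, so the following is only a minimal sketch of the general idea, not the DBSI implementation. It sums a simple Coulomb-like term q / r over point charges whose distance from a reference point falls within each distance band. The function names, the band edges, the toy charges and the omission of any dielectric constant or unit conversion are all illustrative assumptions.

```python
import math

def banded_potential(center, charges, r_inner, r_outer):
    """Sum q / r over point charges whose distance r from `center`
    lies in the half-open band [r_inner, r_outer).

    Hypothetical helper: constants and dielectric screening are omitted,
    so the result is in arbitrary units (charge / distance).
    """
    total = 0.0
    for position, q in charges:
        r = math.dist(center, position)
        # Skip charges outside the band, and any charge sitting on the center.
        if r > 0.0 and r_inner <= r < r_outer:
            total += q / r
    return total

def shell_features(center, charges, edges):
    """Return one banded value per consecutive pair of shell edges."""
    return [banded_potential(center, charges, lo, hi)
            for lo, hi in zip(edges, edges[1:])]

# Toy data (assumed, not from the paper): (position in angstroms, charge in e).
charges = [
    ((2.0, 0.0, 0.0), 1.0),
    ((0.0, 5.0, 0.0), -1.0),
    ((0.0, 0.0, 10.0), 1.0),
]

# Three assumed shells: [0, 4), [4, 8) and [8, 12) angstroms from the origin.
features = shell_features((0.0, 0.0, 0.0), charges, [0.0, 4.0, 8.0, 12.0])
print(features)
```

Each shell yields one number, so a residue described by several shells contributes several candidate features. A selection procedure can then keep the bands that help prediction and drop the rest.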

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Binding Sites
  • DNA / chemistry*
  • DNA / metabolism
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / metabolism
  • Models, Molecular
  • Nucleic Acid Conformation
  • Protein Binding
  • Protein Conformation
  • Static Electricity
  • Support Vector Machine*

Substances

  • DNA-Binding Proteins
  • DNA