Interaction of DNA with clusters of amino acids in proteins

Nucleic Acids Res. 2004 Aug 9;32(14):4109-18. doi: 10.1093/nar/gkh733. Print 2004.


Protein-DNA interactions facilitate the fundamental functions of living cells and are universal in all living organisms. Several investigations have been carried out, essentially identifying pairwise interactions between the amino acid residues in proteins and the bases in DNA. In the present study, we have detected the recognition motifs that may constitute a cluster of spatially interacting residues in proteins, which interact with the bases of DNA. A graph spectral algorithm has been used to detect side chain clusters comprising Arg, Lys, Asn, Gln and aromatic residues from proteins interacting with DNA. We find that the interaction of proteins with DNA is through clusters in about half of the proteins in the dataset and through individual residues in the rest. Furthermore, inspection of the clusters has revealed additional interactions in a few cases, which have not been reported earlier. The geometry of the interaction between the DNA base and the protein residue is quantified by the distance d and the angle theta. These parameters have been identified for the cation-pi/H-bond stair motif that was reported earlier. Among the Arg, Lys, Asn and Gln residues, the range of (d, theta) values of the interacting Arg clearly falls into the cation-pi and the hydrogen bond interactions of the 'cation-pi/H-bond' stair motif. Analysis of the cluster composition reveals that the Arg residue is more prevalent than the Lys, Asn and Gln residues. The clusters are classified into Type I and Type II based on the presence or absence of aromatic residues (Phe, Tyr) in them. Residue conservation in these clusters has been examined. Apart from the conserved residues identified previously, a few more residues, mainly Phe, Tyr and Arg, have also been identified as conserved and interacting with the DNA. Interestingly, a few residues that are part of interacting clusters but do not interact directly with the DNA are also conserved. This emphasizes the importance of recognizing the protein side chain cluster motifs interacting with the DNA, which could serve as signatures of protein-DNA recognition in the families of DNA binding proteins.
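
The abstract does not spell out the graph spectral algorithm, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that each side chain is reduced to a single 3D point, that two points are connected when they lie within a distance cutoff, and that clusters are read off from the null space of the graph Laplacian (one zero eigenvalue per connected group). The function name `spectral_clusters`, the 6.5 cutoff and the minimum cluster size of 3 are hypothetical placeholders.

```python
import numpy as np

def spectral_clusters(coords, cutoff=6.5, min_size=3):
    """Group points into clusters using the Laplacian spectrum of a contact graph.

    coords   : sequence of (x, y, z) points, one per side chain (assumed input)
    cutoff   : contact distance threshold (placeholder value, not from the paper)
    min_size : smallest group reported as a cluster (placeholder value)
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)

    # Adjacency matrix: 1 where two distinct points are within the cutoff.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = ((dist <= cutoff) & ~np.eye(n, dtype=bool)).astype(float)

    # Graph Laplacian L = D - A, where D holds the node degrees.
    lap = np.diag(adj.sum(axis=1)) - adj

    # Eigenvectors with zero eigenvalue span the indicator vectors of the
    # connected components of the graph.
    evals, evecs = np.linalg.eigh(lap)
    null = evecs[:, evals < 1e-8]

    # The projector onto that null space has entry 1/|C| for two nodes in the
    # same component C and 0 otherwise, independent of the basis eigh returns.
    proj = null @ null.T

    clusters, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        members = [int(k) for k in np.where(proj[i] > 0.5 / n)[0]]
        seen.update(members)
        if len(members) >= min_size:
            clusters.append(members)
    return clusters

# Toy input: two tight groups of three points and one isolated point.
points = [(0, 0, 0), (3, 0, 0), (0, 3, 0),
          (50, 0, 0), (53, 0, 0), (50, 3, 0),
          (100, 100, 100)]
clusters = spectral_clusters(points)
```

On real structures the points would come from side chain atoms of the Arg, Lys, Asn, Gln and aromatic residues named above, and the contact criterion would need to follow whatever definition the full paper uses.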

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Amino Acids / chemistry*
  • Animals
  • DNA / chemistry*
  • DNA / metabolism
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / metabolism
  • Humans
  • Hydrogen Bonding
  • Mice
  • Models, Molecular
  • Molecular Structure
  • Proto-Oncogene Proteins / chemistry
  • Proto-Oncogene Proteins / metabolism
  • Proto-Oncogene Proteins c-ets
  • Transcription Factors / chemistry
  • Transcription Factors / metabolism

Substances

  • Amino Acids
  • DNA-Binding Proteins
  • Proto-Oncogene Proteins
  • Proto-Oncogene Proteins c-ets
  • Transcription Factors
  • DNA