Data mining of metal ion environments present in protein structures

J Inorg Biochem. 2008 Sep;102(9):1765-76. doi: 10.1016/j.jinorgbio.2008.05.006. Epub 2008 May 28.


Analysis of metal-protein interaction distances, coordination numbers, B-factors (displacement parameters), and occupancies of metal-binding sites in protein structures determined by X-ray crystallography and deposited in the PDB shows many unusual values and unexpected correlations. By measuring the frequency of each amino acid in metal ion-binding sites, we identified the positive or negative preference of each residue for each type of cation. Our approach may be used for fast identification of metal-binding structural motifs that cannot be identified on the basis of sequence similarity alone. The analysis compares data derived separately from high- and medium-resolution structures from the PDB with those from very high-resolution small-molecule structures in the Cambridge Structural Database (CSD). For high-resolution protein structures, the distribution of metal-protein or metal-water interaction distances agrees quite well with data from the CSD, but the distribution is unrealistically wide for medium-resolution (2.0-2.5 Å) data. Our analysis of cation B-factors versus average B-factors of atoms in the cation environment reveals that a substantial number of structures contain either an incorrect metal ion assignment or an unusual coordination pattern. A correlation between data resolution and completeness of the metal coordination spheres is also found.
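
The two per-site quantities mentioned in the abstract, the cation B-factor relative to its environment and the per-residue preference, can be illustrated with a minimal Python sketch. This is not the authors' code. The atom tuple layout, the 3.0 Å cutoff, the restriction to N/O/S donor atoms, the log2 enrichment score, and the function names are all assumptions made for illustration; the abstract does not state the exact definitions used in the paper.

```python
import math
from collections import Counter

# Hypothetical atom record: (element, residue_name, x, y, z, b_factor).
# A real analysis would parse these from PDB or mmCIF files.
DONOR_ELEMENTS = {"N", "O", "S"}  # assumption: only these count as ligand atoms


def environment(metal, atoms, cutoff=3.0):
    """Return (atom, distance) pairs for donor atoms within `cutoff` angstroms."""
    near = []
    for atom in atoms:
        if atom is metal or atom[0] not in DONOR_ELEMENTS:
            continue
        d = math.dist(metal[2:5], atom[2:5])
        if d <= cutoff:
            near.append((atom, d))
    return near


def site_summary(metal, atoms, cutoff=3.0):
    """Coordination number and ratio of metal B-factor to mean ligand B-factor."""
    near = environment(metal, atoms, cutoff)
    if not near:
        return {"coordination_number": 0, "b_ratio": None}
    mean_b = sum(atom[5] for atom, _ in near) / len(near)
    return {"coordination_number": len(near), "b_ratio": metal[5] / mean_b}


def residue_preference(site_residues, background_freq):
    """log2 of (frequency in binding sites / background frequency) per residue.

    Positive values mean enrichment in binding sites, negative values depletion.
    Residues never seen in a site are omitted rather than scored as -infinity.
    """
    counts = Counter(site_residues)
    total = sum(counts.values())
    return {
        res: math.log2((counts[res] / total) / background_freq[res])
        for res in counts
        if background_freq.get(res)
    }


# Toy data, invented for the example.
zn = ("ZN", "ZN", 0.0, 0.0, 0.0, 20.0)
atoms = [
    zn,
    ("N", "HIS", 2.0, 0.0, 0.0, 10.0),
    ("S", "CYS", 0.0, 2.3, 0.0, 10.0),
    ("O", "HOH", 0.0, 0.0, 2.1, 10.0),
    ("C", "HIS", 2.9, 0.0, 0.0, 10.0),  # within cutoff but not a donor atom
    ("O", "ASP", 5.0, 0.0, 0.0, 10.0),  # donor atom but too far away
]
summary = site_summary(zn, atoms)
prefs = residue_preference(
    ["HIS", "HIS", "CYS", "ASP"],
    {"HIS": 0.25, "CYS": 0.25, "ASP": 0.5},
)
print(summary)
print(prefs)
```

In a sketch like this, a B-factor ratio far from 1 would flag a site for closer inspection, in the spirit of the comparison described above. Any threshold for "far" would have to be chosen from real data and is not given here.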

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acids / chemistry
  • Binding Sites
  • Databases, Protein*
  • Metalloproteins / chemistry*
  • Metals / chemistry*
  • Molecular Structure
  • Protein Binding


Substances

  • Amino Acids
  • Metalloproteins
  • Metals