Waterless structures in the Protein Data Bank

IUCrJ. 2024 Nov 1;11(Pt 6):966-976. doi: 10.1107/S2052252524009928.

Abstract

The absence of solvent molecules in high-resolution protein crystal structure models deposited in the Protein Data Bank (PDB) contradicts the fact that, for proteins crystallized from aqueous media, water molecules are always expected to bind to the protein surface, as well as to some sites in the protein interior. An analysis of the contents of the PDB indicated that the expected ratio of the number of water molecules to the number of amino-acid residues exceeds 1.5 in atomic resolution structures, decreasing to 0.25 at around 2.5 Å resolution. Nevertheless, almost 800 protein crystal structures determined at a resolution of 2.5 Å or higher are found in the current release of the PDB without any water molecules, whereas some other depositions have unusually low or high occupancies of modeled solvent. Detailed analysis of these depositions revealed that the lack of solvent molecules might be an indication of problems with either the diffraction data, the refinement protocol, the deposition process or a combination of these factors. It is postulated that problems with solvent structure should be flagged by the PDB and addressed by the depositors.
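The water-to-residue ratio discussed above can be illustrated with a minimal sketch for a single legacy PDB-format file. This is not the procedure used by the authors: the function name `water_to_residue_ratio`, the helper `make_line`, the choice of water residue names (`HOH`, `DOD`, `WAT`), and the restriction to the 20 standard amino acids are all assumptions made for illustration. The parsing relies only on the fixed column positions of ATOM and HETATM records.

```python
# Illustrative sketch only: the residue-name sets below are assumptions,
# not the criteria used in the paper.
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
WATER_NAMES = {"HOH", "DOD", "WAT"}  # assumed set of water residue names


def water_to_residue_ratio(pdb_text):
    """Return (unique water molecules) / (unique amino-acid residues)."""
    residues, waters = set(), set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()           # columns 18-20
        res_id = (
            line[21:22],                         # chain ID, column 22
            line[22:26].strip(),                 # residue number, columns 23-26
            line[26:27].strip(),                 # insertion code, column 27
        )
        # Using a set means each residue or water is counted once,
        # regardless of how many atoms or alternate conformations it has.
        if res_name in WATER_NAMES:
            waters.add(res_id)
        elif res_name in STANDARD_AMINO_ACIDS:
            residues.add(res_id)
    if not residues:
        raise ValueError("no amino-acid residues found")
    return len(waters) / len(residues)


def make_line(record, serial, atom, res_name, chain, res_seq):
    """Hypothetical helper: build a truncated record with the ID columns aligned."""
    return f"{record:<6}{serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}"


# Toy input: two residues (ALA 1, GLY 2) and one water molecule.
SAMPLE = "\n".join([
    make_line("ATOM", 1, "N", "ALA", "A", 1),
    make_line("ATOM", 2, "CA", "ALA", "A", 1),
    make_line("ATOM", 3, "N", "GLY", "A", 2),
    make_line("HETATM", 4, "O", "HOH", "A", 101),
])

ratio = water_to_residue_ratio(SAMPLE)
print(f"water/residue ratio: {ratio:.2f}")
```

A ratio of zero for a structure determined at 2.5 Å or higher resolution would correspond to the "waterless" case described in the abstract. A real analysis would also need to handle mmCIF files and modified residues, which this sketch ignores.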

Keywords: PDB-REDO; Protein Data Bank; X-ray crystallography; electron density; modeling errors; protein hydration; protein structure; structure-factor statistics; structure-quality statistics.

MeSH terms

  • Crystallography, X-Ray
  • Databases, Protein*
  • Models, Molecular
  • Protein Conformation*
  • Proteins* / chemistry
  • Solvents* / chemistry
  • Water* / chemistry

Substances

  • Proteins
  • Water
  • Solvents

Grants and funding

This project was supported in part by the Intramural Research Program of the NIH, National Cancer Institute, and Center for Cancer Research (to ZD and AW). The work of PR and WM was supported by Harrison Family Funds.