Functional characterization of 3D protein structures informed by human genetic diversity

Proc Natl Acad Sci U S A. 2019 Apr 30;116(18):8960-8965. doi: 10.1073/pnas.1820813116. Epub 2019 Apr 15.


Sequence variation data of the human proteome can be used to analyze 3D protein structures to derive functional insights. We used genetic variant data from nearly 140,000 individuals to analyze 3D positional conservation in 4,715 proteins and 3,951 homology models using 860,292 missense and 465,886 synonymous variants. Sixty percent of protein structures harbor at least one intolerant 3D site as defined by significant depletion of observed over expected missense variation. Structural intolerance data correlated with deep mutational scanning functional readouts for PPARG, MAPK1/ERK2, UBE2I, SUMO1, PTEN, CALM1, CALM2, and TPK1 and with shallow mutagenesis data for 1,026 proteins. The 3D structural intolerance analysis revealed different features for ligand binding pockets and orthosteric and allosteric sites. Large-scale data on human genetic variation support a definition of functional 3D sites proteome-wide.
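The idea of an "intolerant 3D site" (significant depletion of observed over expected missense variation) can be illustrated with a minimal sketch. It is not the authors' pipeline, and the abstract does not specify the statistical model. The sketch makes several assumptions: each residue is represented by one coordinate, a 3D site is every residue within a hypothetical 5 Å radius of a central residue, per-residue expected missense counts are already available from some background mutation model, and depletion is scored with a one-sided Poisson test. The function names, the radius, and the toy data are all illustrative.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation of the pmf."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def site_intolerance(coords, observed, expected, radius=5.0):
    """Score every residue-centred 3D neighbourhood for missense depletion.

    coords   : list of (x, y, z) tuples, one per residue (assumed, e.g. C-alpha)
    observed : observed missense variant count per residue
    expected : expected missense count per residue (assumed to be given)
    radius   : neighbourhood radius in angstroms (illustrative value)
    """
    results = []
    for i, centre in enumerate(coords):
        # Residues close in space, regardless of their distance in sequence.
        members = [j for j, c in enumerate(coords) if math.dist(centre, c) <= radius]
        obs = sum(observed[j] for j in members)
        exp = sum(expected[j] for j in members)
        ratio = obs / exp if exp > 0 else float("nan")
        # One-sided test: probability of seeing this few variants or fewer.
        p_depletion = poisson_cdf(obs, exp) if exp > 0 else 1.0
        results.append({
            "residue": i,
            "members": members,
            "observed": obs,
            "expected": exp,
            "ratio": ratio,
            "p_depletion": p_depletion,
        })
    return results


# Toy structure: three residues packed together and one far away.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (20.0, 0.0, 0.0)]
observed = [0, 0, 1, 4]
expected = [2.0, 2.0, 2.0, 2.0]

results = site_intolerance(coords, observed, expected)
for r in results:
    print(r["residue"], r["members"], round(r["ratio"], 3), round(r["p_depletion"], 4))
```

In this toy example the neighbourhood around residue 0 carries 1 observed variant against 6 expected, so it scores as depleted, while the isolated residue 3 does not. A proteome-wide analysis would also need multiple-testing correction, which this sketch omits.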

Keywords: deep mutational scanning; exome; genome constraint; protein structure.

MeSH terms

  • Binding Sites
  • Calmodulin / genetics
  • DNA Mutational Analysis / methods
  • Genetic Variation / genetics*
  • Humans
  • Imaging, Three-Dimensional / methods*
  • Ligands
  • Mitogen-Activated Protein Kinase 1 / genetics
  • Models, Molecular
  • Molecular Conformation
  • Mutation
  • PPAR gamma / genetics
  • PTEN Phosphohydrolase / genetics
  • Protein Conformation
  • Proteome / genetics*
  • SUMO-1 Protein / genetics
  • Ubiquitin-Activating Enzymes / genetics

Substances


  • CALM1 protein, human
  • Calmodulin
  • Ligands
  • PPAR gamma
  • PPARG protein, human
  • Proteome
  • SUMO-1 Protein
  • SUMO1 protein, human
  • MAPK1 protein, human
  • Mitogen-Activated Protein Kinase 1
  • PTEN Phosphohydrolase
  • PTEN protein, human
  • UBA7 protein, human
  • Ubiquitin-Activating Enzymes