Catalytic and binding sites prediction in globular proteins through discrete Markov chains and network centrality measures

Phys Biol. 2021 Sep 23;18(6). doi: 10.1088/1478-3975/ac211b.

Abstract

In this work we use a discrete Markov chain approach combined with network centrality measures to identify and predict the location of active sites in globular proteins. To accomplish this, we use a three-dimensional network with protein Cα atoms as nodes, connected through weighted edges that represent the varying degree of interaction between the protein's atoms. We compute the mean first passage time matrix H = {Hji} for this Markov chain and evaluate the average number of steps ⟨Hj⟩ needed to reach a single node nj, in order to identify the residues that are, on average, the least distant from every other node. We also carry out a graph theory analysis to evaluate the closeness centrality Cc, betweenness centrality Cb and eigenvector centrality Ce, measures which provide relevant information about the connectivity structure and topology of the Cα protein networks. Finally, we analyze equivalent random and regular networks of the same size N in terms of the average path length L and the average clustering coefficient ⟨C⟩, comparing these with the corresponding values for the Cα protein networks. Our results show that the mean first passage time matrix H and its related quantity ⟨Hj⟩, together with Cc, Cb and Ce, can not only predict the location of active sites in globular proteins with relatively high accuracy but can also feasibly be used to predict new regions in a protein's structure with potential binding or catalytic activity or, in some cases, the presence of new allosteric pathways.
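
As an illustration of the quantity ⟨Hj⟩ described above, the following minimal sketch computes mean first passage times for a random walk on a small weighted graph. It is not the authors' implementation: the abstract does not give the edge-weighting scheme or distance cutoff, so the three-node path graph, the symmetric weight matrix W, the function name, and the index convention (H[i, j] is the expected number of steps from node i to node j) are all assumptions made for illustration. The standard fundamental-matrix formula for irreducible chains is used.

```python
import numpy as np


def mean_first_passage_times(W):
    """Mean first passage times for a random walk on a weighted graph.

    W is assumed (for this sketch) to be a symmetric, non-negative weight
    matrix of a connected graph. Returns H with H[i, j] = expected number
    of steps to reach node j for the first time when starting at node i.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    strength = W.sum(axis=1)
    # Row-stochastic transition matrix: step to a neighbor with
    # probability proportional to the edge weight.
    P = W / strength[:, None]
    # For symmetric W the stationary distribution is proportional
    # to the node strength.
    pi = strength / strength.sum()
    # Fundamental matrix Z = (I - P + 1 pi^T)^(-1).
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    # H[i, j] = (Z[j, j] - Z[i, j]) / pi[j]; the diagonal is zero.
    return (np.diag(Z)[None, :] - Z) / pi[None, :]


# Hypothetical toy network: a path 0 - 1 - 2 with unit edge weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

H = mean_first_passage_times(W)

# Average number of steps needed to reach each node j from the other nodes.
avg_steps = H.sum(axis=0) / (W.shape[0] - 1)

# The node with the smallest average is, in this sense, the least distant
# from every other node.
most_accessible = int(np.argmin(avg_steps))
```

In this toy case the middle node has the smallest average, which matches the intuition that central residues are reached fastest. For a real protein, W would be built from Cα coordinates under whatever interaction weighting the study specifies.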

Keywords: active sites; discrete Markov chains; globular proteins; hitting time matrix; network centrality.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Cluster Analysis
  • Markov Chains
  • Protein Binding
  • Protein Folding*
  • Protein Interaction Maps
  • Proteins / chemistry*

Substances

  • Proteins