Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations
- PMID: 28002465
- PMCID: PMC5225019
- DOI: 10.1371/journal.pcbi.1005294
Abstract
Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high-dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this approach reveals sequence and structural features indicative of functionally important, yet generally unknown, biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).
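The idea that subgroup-specific residue patterns appear as correlations between alignment columns can be illustrated with a small sketch. This is not the hiHMM/MCMC procedure described in the abstract; it substitutes plain pairwise mutual information, a much simpler statistic, and the toy alignment and the function names `column` and `mutual_information` are invented here purely for illustration.

```python
from collections import Counter
from math import log2

# Hypothetical toy alignment (invented for illustration, not real data):
# two subgroups of three sequences each. Columns 1 and 3 both track
# subgroup membership, so they co-vary; column 2 varies independently.
alignment = [
    "ARGDL",  # subgroup 1
    "ARKDL",
    "ARGDL",
    "AEGYL",  # subgroup 2
    "AEKYL",
    "AEGYL",
]


def column(aln, i):
    """Return the residues found at column i of the alignment."""
    return [seq[i] for seq in aln]


def mutual_information(aln, i, j):
    """Mutual information (in bits) between columns i and j,
    estimated from raw residue frequencies with no pseudocounts."""
    n = len(aln)
    ci, cj = column(aln, i), column(aln, j)
    count_i = Counter(ci)
    count_j = Counter(cj)
    count_ij = Counter(zip(ci, cj))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Columns that follow the subgroup split are strongly correlated.
    print("MI(col 1, col 3):", mutual_information(alignment, 1, 3))
    # A column that varies independently of the split carries no signal.
    print("MI(col 1, col 2):", mutual_information(alignment, 1, 2))
```

In this toy case the two subgroup-defining columns share one full bit of information, while the independently varying column shares essentially none. Real superfamilies involve many overlapping subgroups and phylogenetic bias, which is part of why a full probabilistic model is needed rather than a pairwise score like this one.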
Conflict of interest statement
The authors have declared that no competing interests exist.
Similar articles
- Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model. BMC Bioinformatics. 2004 Oct 25;5:157. doi: 10.1186/1471-2105-5-157. PMID: 15504234. Free PMC article.
- Bayesian models and Markov chain Monte Carlo methods for protein motifs with the secondary characteristics. J Comput Biol. 2005 Sep;12(7):952-70. doi: 10.1089/cmb.2005.12.952. PMID: 16201915. Review.
- A novel member of the GCN5-related N-acetyltransferase superfamily from Caenorhabditis elegans preferentially catalyses the N-acetylation of thialysine [S-(2-aminoethyl)-L-cysteine]. Biochem J. 2004 Nov 15;384(Pt 1):129-37. doi: 10.1042/BJ20040789. PMID: 15283700. Free PMC article.
- Bayesian Top-Down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties. PLoS Comput Biol. 2016 May 18;12(5):e1004936. doi: 10.1371/journal.pcbi.1004936. eCollection 2016 May. PMID: 27192614. Free PMC article.
- Structure of histone acetyltransferases. J Mol Biol. 2001 Aug 17;311(3):433-44. doi: 10.1006/jmbi.2001.4859. PMID: 11492997. Review.
Cited by
- Statistical investigations of protein residue direct couplings. PLoS Comput Biol. 2018 Dec 31;14(12):e1006237. doi: 10.1371/journal.pcbi.1006237. eCollection 2018 Dec. PMID: 30596639. Free PMC article.
- Inferring joint sequence-structural determinants of protein functional specificity. Elife. 2018 Jan 16;7:e29880. doi: 10.7554/eLife.29880. PMID: 29336305. Free PMC article.
- Initial Cluster Analysis. J Comput Biol. 2018 Feb;25(2):121-129. doi: 10.1089/cmb.2017.0050. Epub 2017 Aug 3. PMID: 28771374. Free PMC article.
- Deep Analysis of Residue Constraints (DARC): identifying determinants of protein functional specificity. Sci Rep. 2020 Feb 3;10(1):1691. doi: 10.1038/s41598-019-55118-6. PMID: 32015389. Free PMC article.
- Highly regulated, diversifying NTP-dependent biological conflict systems with implications for the emergence of multicellularity. Elife. 2020 Feb 26;9:e52696. doi: 10.7554/eLife.52696. PMID: 32101166. Free PMC article.
