D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions
- PMID: 34536380
- PMCID: PMC8586911
- DOI: 10.1016/j.cels.2021.08.010
Abstract
We combine advances in neural language modeling and structurally motivated design to develop D-SCRIPT, an interpretable and generalizable deep-learning model that predicts interaction between two proteins using only their sequences and maintains high accuracy with limited training data and across species. We show that a D-SCRIPT model trained on 38,345 human protein-protein interactions (PPIs) enables significantly improved functional characterization of fly proteins compared with the state-of-the-art approach. Evaluating the same D-SCRIPT model on protein complexes with known 3D structure, we find that the inter-protein contact map output by D-SCRIPT has significant overlap with the ground truth. We apply D-SCRIPT to screen for PPIs in cow (Bos taurus) at a genome-wide scale and, focusing on rumen physiology, identify functional gene modules related to metabolism and immune response. The predicted interactions can then be leveraged for function prediction at scale, addressing the genome-to-phenome challenge, especially in species where little data are available.
Keywords: cow rumen; deep learning; embedding; function prediction; genome to phenome; interpretability; language models; metabolism; module detection; protein-protein interaction.
Copyright © 2021 The Authors. Published by Elsevier Inc. All rights reserved.
Conflict of interest statement
The authors declare no competing interests.
Comment in
- Protein matchmaking through representation learning. Cell Syst. 2021 Oct 20;12(10):948-950. doi: 10.1016/j.cels.2021.09.007. PMID: 34672956
