DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening

Genomics Proteomics Bioinformatics. 2019 Oct;17(5):478-495. doi: 10.1016/j.gpb.2019.04.003. Epub 2020 Feb 6.

Abstract

Accurate identification of compound-protein interactions (CPIs) in silico may deepen our understanding of the underlying mechanisms of drug action and thus substantially facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features from the large-scale unlabeled compound and protein data that are currently available, and they are often limited to relatively small-scale datasets. In the present study, we propose DeepCPI, a novel, general, and scalable computational framework that combines effective feature embedding (a representation learning technique) with powerful deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations on the measured CPIs in large-scale databases, such as ChEMBL and BindingDB, as well as on the known drug-target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions between small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) predicted using DeepCPI were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from https://github.com/FangpingWan/DeepCPI.
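Predictive performance for interaction prediction of this kind is commonly summarized by the area under the ROC curve, which this record also lists among its MeSH terms. The following is a minimal sketch of that metric only, not the DeepCPI evaluation code: the function name `roc_auc` and the example labels and scores are hypothetical, and the computation uses the standard rank-sum (Mann-Whitney U) identity with average ranks for tied scores.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for an interacting compound-protein pair, 0 otherwise.
    scores: predicted interaction scores (higher means more likely to interact).
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative pair")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical toy data: four compound-protein pairs with made-up scores.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(example_labels, example_scores))  # 0.75
```

A value of 1.0 means every interacting pair is scored above every non-interacting pair, and 0.5 corresponds to random ranking.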

Keywords: Compound–protein interaction prediction; Deep learning; Drug discovery; In silico drug screening; Machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Databases, Chemical
  • Deep Learning*
  • Pharmaceutical Preparations / chemistry
  • Pharmaceutical Preparations / metabolism
  • Proteins / chemistry
  • Proteins / metabolism
  • ROC Curve
  • User-Computer Interface*

Substances

  • Pharmaceutical Preparations
  • Proteins