TEFDTA: a transformer encoder and fingerprint representation combined prediction method for bonded and non-bonded drug-target affinities

Bioinformatics. 2024 Jan 2;40(1):btad778. doi: 10.1093/bioinformatics/btad778.

Abstract

Motivation: Predicting the binding affinity between a drug and its target is crucial in drug discovery, yet the accuracy of current methods still needs to be improved. Moreover, most deep learning methods focus only on non-covalent (non-bonded) binding systems and neglect covalent binding, which has gained increasing attention in drug development.

Results: In this work, we propose TEFDTA (a Transformer Encoder and Fingerprint combined prediction method for Drug-Target Affinity), a new attention-based model that predicts binding affinity for both bonded and non-bonded drug-target interactions. To handle both cases, we used different representations for proteins and drug molecules. An initial framework was first built by training the model on datasets of non-bonded protein-ligand interactions. For the widely used Davis dataset, an additional contribution of this study is a manually corrected version of the database. The model was subsequently fine-tuned on a smaller dataset of covalent interactions from the CovalentInDB database to optimize performance. The results demonstrate a significant improvement over existing approaches, with an average improvement of 7.6% in predicting non-covalent binding affinity and an average improvement of 62.9% in predicting covalent binding affinity compared to using BindingDB data alone. Finally, the model's potential to identify activity cliffs was investigated through a case study. The predictions indicate that the model is sensitive to differences in binding affinity that arise from small variations in compound structure.
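
To make the idea of using different representations for the two inputs concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the protein sequence is turned into integer tokens (the kind of input a transformer encoder consumes) and the drug into a fixed-length bit vector; the abstract itself does not state which representation is paired with which molecule. The names `encode_protein` and `toy_fingerprint`, the padding convention, and the hashed character n-gram scheme are all hypothetical. In particular, the hashed n-gram vector is only a stand-in for a real chemical fingerprint, which would normally be computed with a cheminformatics toolkit.

```python
import hashlib

# The 20 standard amino acids, one letter each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical convention: 0 is reserved for padding, 1 for unknown residues.
PAD, UNK = 0, 1
TOKEN_ID = {aa: i + 2 for i, aa in enumerate(AMINO_ACIDS)}


def encode_protein(seq, max_len):
    """Map a protein sequence to a fixed-length list of integer token ids.

    Sequences longer than max_len are truncated; shorter ones are padded.
    """
    ids = [TOKEN_ID.get(aa, UNK) for aa in seq.upper()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def toy_fingerprint(smiles, n_bits=64, ngram=2):
    """Hash character n-grams of a SMILES string into a fixed-length bit vector.

    This is an illustrative stand-in only. It is NOT a chemically meaningful
    fingerprint and is not the fingerprint used by TEFDTA.
    """
    bits = [0] * n_bits
    for i in range(len(smiles) - ngram + 1):
        fragment = smiles[i:i + ngram]
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_bits
        bits[index] = 1
    return bits


if __name__ == "__main__":
    # Made-up inputs for illustration: a short peptide and ethanol.
    print(encode_protein("MKTAYIAK", max_len=12))
    print(toy_fingerprint("CCO", n_bits=16))
```

Both functions return fixed-length vectors regardless of input size, which is the property that lets two differently shaped inputs be fed to separate branches of one model and combined for a single affinity prediction.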

Availability and implementation: The codes and datasets of TEFDTA are available at https://github.com/lizongquan01/TEFDTA.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Drug Delivery Systems*
  • Drug Development*
  • Drug Discovery