[Resampling combined with stacking learning for prediction of blood-brain barrier permeability of compounds]

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2023 Aug 25;40(4):753-761. doi: 10.7507/1001-5515.202210067.
[Article in Chinese]

Abstract

Improving the blood-brain barrier (BBB) permeability of central nervous system (CNS) drugs is a significant challenge in their development. Compared with traditional pharmacokinetic property tests, machine learning techniques have been shown to predict the BBB permeability of CNS drugs effectively and at low cost. In this study, we introduce a high-performance BBB permeability prediction model named the balanced-stacking-learning based BBB permeability predictor (BSL-B3PP). First, we screen out the feature sets that strongly influence BBB permeability from the perspectives of medicinal chemistry and machine learning, respectively, and summarize the quantification intervals for BBB-positive (BBB+) compounds. Then, resampling algorithms are combined with a stacking learning (SL) algorithm to predict the BBB permeability of CNS drugs. The BSL-B3PP model is built on a large-scale BBB database (B3DB). Experimental validation shows an area under the curve (AUC) of 97.8% and a Matthews correlation coefficient (MCC) of 85.5%. The model demonstrates good BBB permeability prediction capability and, in particular, maintains high prediction accuracy for drugs that cannot penetrate the BBB, which helps reduce CNS drug development costs and accelerate the CNS drug development process.
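The general idea of balancing the training classes by resampling and then fitting a stacked ensemble can be sketched as follows. This is a minimal illustration under stated assumptions, not the BSL-B3PP implementation: the abstract does not name the resampling algorithms, base learners, or features used, so the random oversampling helper `oversample_minority`, the choice of random forest and k-nearest-neighbour base learners with a logistic-regression meta-learner, and the synthetic data standing in for B3DB descriptors are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils import resample


def oversample_minority(X, y, random_state=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest.

    Hypothetical helper: plain random oversampling, used here only as a
    stand-in for whatever resampling algorithms the paper applies.
    """
    classes, counts = np.unique(y, return_counts=True)
    target = int(counts.max())
    X_parts, y_parts = [], []
    for label, count in zip(classes, counts):
        X_c = X[y == label]
        if count < target:
            X_c = resample(X_c, replace=True, n_samples=target,
                           random_state=random_state)
        X_parts.append(X_c)
        y_parts.append(np.full(len(X_c), label))
    return np.vstack(X_parts), np.concatenate(y_parts)


def run_demo(random_state=0):
    # Synthetic, imbalanced binary data (assumption: label 1 plays the role
    # of BBB+, label 0 of BBB-; the features are not real descriptors).
    X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                               weights=[0.3, 0.7], random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state)

    # Resample the training split only, so the test set stays untouched.
    X_bal, y_bal = oversample_minority(X_train, y_train, random_state)

    # Stacking: base learners feed out-of-fold predictions to a meta-learner.
    model = StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100,
                                          random_state=random_state)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    model.fit(X_bal, y_bal)

    proba = model.predict_proba(X_test)[:, 1]
    pred = model.predict(X_test)
    return {
        "auc": roc_auc_score(y_test, proba),
        "mcc": matthews_corrcoef(y_test, pred),
        "train_counts": np.bincount(y_bal),
    }


if __name__ == "__main__":
    print(run_demo())
```

AUC and MCC are reported here because they are the two metrics quoted in the abstract; MCC in particular accounts for errors on both classes, which matters when one class is underrepresented. Note that in this sketch the duplicated rows can appear in more than one of the stacking model's internal folds, so a more careful setup would resample inside each fold.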

Keywords: Blood-brain barrier permeability; Blood-brain barrier positive quantification intervals; Machine learning; Resampling algorithms; Stacking learning algorithm.

Publication types

  • English Abstract

MeSH terms

  • Algorithms*
  • Area Under Curve
  • Blood-Brain Barrier*
  • Databases, Factual
  • Permeability