Computational prediction and interpretation of druggable proteins using a stacked ensemble-learning framework

Phasit Charoenkwan; Nalini Schaduangrat; Pietro Lio'; Mohammad Ali Moni; Watshara Shoombuatong; Balachandran Manavalan

doi:10.1016/j.isci.2022.104883

iScience. 2022 Aug 5;25(9):104883. doi: 10.1016/j.isci.2022.104883. eCollection 2022 Sep 16.

Authors

Phasit Charoenkwan¹, Nalini Schaduangrat², Pietro Lio'³, Mohammad Ali Moni⁴, Watshara Shoombuatong², Balachandran Manavalan⁵

Affiliations

¹ Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.
² Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
³ Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, UK.
⁴ Artificial Intelligence & Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia.
⁵ Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.

Abstract

Discovery of potential drugs requires rapid and precise identification of drug targets. Although traditional experimental methodologies can accurately identify drug targets, they are time-consuming and unsuitable for high-throughput screening. Computational approaches based on machine learning (ML) algorithms can expedite the prediction of druggable proteins; however, the performance of existing computational methods remains unsatisfactory. This study proposes a computational tool, SPIDER, to improve the accuracy of druggable protein prediction. SPIDER combines feature descriptors covering several aspects, including physicochemical properties, compositional information, and composition-transition-distribution information, with well-known ML algorithms to construct the final meta-predictor. On the independent test dataset, SPIDER predicted druggable proteins more precisely and robustly than the baseline models and existing methods. A web server was established and made freely available online.
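To illustrate the kind of compositional and composition-transition descriptors the abstract mentions, the following is a minimal sketch in Python. It is not SPIDER's implementation: the function names, the three-class hydrophobicity grouping, and the choice to compute only the composition and transition parts are assumptions made for illustration, and the abstract does not specify which descriptors or parameters the authors used.

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Assumption: a three-class hydrophobicity grouping chosen for illustration;
# the abstract does not state which physicochemical groupings SPIDER uses.
GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}
GROUP_NAMES = list(GROUPS)
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def amino_acid_composition(seq):
    """Fraction of each of the 20 standard residues in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def group_composition(seq):
    """Fraction of residues falling in each physicochemical group."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(RESIDUE_TO_GROUP.get(aa) for aa in seq)
    return [counts[name] / len(seq) for name in GROUP_NAMES]


def group_transitions(seq):
    """Fraction of adjacent residue pairs that switch between two groups."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    labels = [RESIDUE_TO_GROUP.get(aa) for aa in seq]
    # Unordered group pairs: a->b and b->a count as the same transition.
    pair_counts = Counter(frozenset(p) for p in zip(labels, labels[1:]))
    n_pairs = max(len(seq) - 1, 1)  # avoid division by zero for length-1 input
    return [
        pair_counts[frozenset((a, b))] / n_pairs
        for a, b in combinations(GROUP_NAMES, 2)
    ]


def describe(seq):
    """Concatenate the descriptors into one fixed-length feature vector."""
    return amino_acid_composition(seq) + group_composition(seq) + group_transitions(seq)


if __name__ == "__main__":
    features = describe("MKTAYIAKQRQISFVKSHFSRQ")  # arbitrary example sequence
    print(len(features))
```

Each sequence maps to a vector of fixed length (20 + 3 + 3 values here), regardless of protein length, which is what allows descriptors of this kind to be fed to standard ML algorithms. In a stacked design such as the one the abstract describes, the outputs of several such base models would in turn become the inputs of a meta-predictor.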

Keywords: Artificial intelligence; Artificial intelligence applications; Computational chemistry; Drugs.