Using Bibliometric Analysis and Machine Learning to Identify Compounds Binding to Sialidase-1

ACS Omega. 2021 Jan 20;6(4):3186-3193. doi: 10.1021/acsomega.0c05591. eCollection 2021 Feb 2.

Abstract

Rare diseases impact hundreds of millions of individuals worldwide. However, few therapies exist to treat the rare disease population because financial resources are limited, the number of patients affected is low, bioactivity data is often nonexistent, and very few animal models exist to support preclinical development efforts. Sialidosis is an ultrarare lysosomal storage disorder in which mutations in the NEU1 gene result in the deficiency of the lysosomal enzyme sialidase-1. This enzyme catalyzes the removal of sialic acid moieties from glycoproteins and glycolipids. Therefore, the defective or deficient protein leads to the buildup of sialylated glycoproteins as well as several characteristic symptoms of sialidosis including visual impairment, ataxia, hepatomegaly, dysostosis multiplex, and developmental delay. In this study, we used a bibliometric tool to generate links between lysosomal storage disease (LSD) targets and existing bioactivity data that could be curated in order to build machine learning models and screen compounds in silico. We focused on sialidase as an example, and we used the data curated from the literature to build a Bayesian model, which was then used to score compound libraries and rank these molecules for in vitro testing. In vitro testing using microscale thermophoresis identified two compounds, sulfameter (Kd 2.15 ± 1.02 μM) and mexenone (Kd 8.88 ± 4.02 μM). These results validate our approach to identifying new molecules that bind to this protein; such molecules could represent possible drug candidates to be evaluated further as potential chaperones for this ultrarare lysosomal disease, for which there is currently no treatment. Combining bibliometric and machine learning approaches can assist in curating small molecule data and model building, respectively, for rare disease drug discovery. This approach can also identify new compounds that are potential drug candidates.
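
The scoring-and-ranking step described above can be illustrated with a minimal sketch. The abstract does not specify the Bayesian implementation or the molecular descriptors used, so the following assumes a plain Bernoulli naive Bayes classifier with Laplace smoothing over binary fingerprint bits. The function names, the four-bit fingerprints, the training labels, and the compound names (`cmpd_A` and so on) are all hypothetical toy values chosen for illustration, not data from the study.

```python
import math


def train_bernoulli_nb(fingerprints, labels, n_bits, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary fingerprints.

    fingerprints: list of sets holding the indices of 'on' bits
                  (hypothetical descriptors, not those used in the study)
    labels:       list of 0/1 flags (1 = active in the curated data)
    alpha:        Laplace smoothing constant
    """
    n_act = sum(labels)
    n_inact = len(labels) - n_act
    on_act = [0] * n_bits
    on_inact = [0] * n_bits
    for fp, y in zip(fingerprints, labels):
        counts = on_act if y == 1 else on_inact
        for bit in fp:
            counts[bit] += 1
    # Smoothed estimates of P(bit is on | class)
    p_act = [(c + alpha) / (n_act + 2 * alpha) for c in on_act]
    p_inact = [(c + alpha) / (n_inact + 2 * alpha) for c in on_inact]
    # Smoothed log prior odds of being active
    prior = math.log((n_act + alpha) / (n_inact + alpha))
    return p_act, p_inact, prior


def score(fp, model):
    """Return the log-odds that a compound is active; higher ranks first."""
    p_act, p_inact, prior = model
    total = prior
    for bit in range(len(p_act)):
        if bit in fp:
            total += math.log(p_act[bit] / p_inact[bit])
        else:
            total += math.log((1 - p_act[bit]) / (1 - p_inact[bit]))
    return total


def rank_library(library, model):
    """Sort compound names by descending model score."""
    return sorted(library, key=lambda name: score(library[name], model),
                  reverse=True)


# Toy training set: three 'actives' and three 'inactives' (made-up values).
train_fps = [{0, 1}, {0, 2}, {0, 1, 3}, {2, 3}, {3}, {2}]
train_labels = [1, 1, 1, 0, 0, 0]
model = train_bernoulli_nb(train_fps, train_labels, n_bits=4)

# Toy screening library with hypothetical compound names.
library = {"cmpd_A": {0, 1}, "cmpd_B": {2, 3}, "cmpd_C": {0, 3}}
ranking = rank_library(library, model)
print(ranking)  # ['cmpd_A', 'cmpd_C', 'cmpd_B']
```

In a workflow of this kind, the top of the ranked list would be the natural place to pick candidates for in vitro testing. A real application would replace the toy bit sets with fingerprints computed by a cheminformatics toolkit and would validate the model before use.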