UDP-glucuronosyltransferases (UGTs) play a critical role in drug metabolism by catalyzing the glucuronidation of structurally diverse compounds. However, accurately predicting UGT-mediated sites of metabolism (SOMs) remains challenging because annotated data are scarce. In this study, we introduce UGTformer, a unified graph transformer-based framework that simultaneously performs UGT substrate classification and SOM prediction. UGTformer employs a hierarchical architecture that integrates multi-hop message propagation with hop-aware and node-level transformer encoders. The model was pretrained on large-scale molecular graphs using chemically informed self-supervised tasks and then fine-tuned on a manually curated UGT metabolism data set covering four major metabolic reaction categories. In five-fold cross-validation, UGTformer achieved an AUC of 0.833 for substrate classification and 0.884 for SOM identification, outperforming multiple graph neural network (GNN) baselines. On an independent external validation set, it maintained robust performance, demonstrating strong generalization to previously unseen molecules. By integrating chemically meaningful structural encodings and a joint learning paradigm, UGTformer delivers interpretable and biologically consistent predictions, offering a reliable and scalable approach for UGT-related metabolism prediction. The UGTformer model is freely accessible at https://lmmd.ecust.edu.cn/UGTformer/.
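The abstract does not specify how multi-hop message propagation is computed, so the following is only a minimal sketch of the general idea, not the UGTformer implementation. It assumes mean aggregation over each atom's neighbors with a self-loop, and the function names `multi_hop_features` and `hop_tokens` are hypothetical. Each atom ends up with one feature vector per hop, a sequence of the kind that a hop-aware encoder could then attend over.

```python
from typing import List

Vector = List[float]


def multi_hop_features(
    adj: List[List[int]], feats: List[Vector], num_hops: int
) -> List[List[Vector]]:
    """Return hops, where hops[k][i] is atom i's features after k propagation rounds.

    Assumption for illustration: each round replaces an atom's features with the
    mean over itself and its bonded neighbors.
    """
    hops = [feats]  # hop 0 is the raw atom features
    for _ in range(num_hops):
        prev = hops[-1]
        nxt = []
        for i, nbrs in enumerate(adj):
            group = [i] + nbrs  # include a self-loop
            dim = len(prev[i])
            nxt.append(
                [sum(prev[j][d] for j in group) / len(group) for d in range(dim)]
            )
        hops.append(nxt)
    return hops


def hop_tokens(hops: List[List[Vector]]) -> List[List[Vector]]:
    """Regroup hop-wise features into one token sequence per atom."""
    num_atoms = len(hops[0])
    return [[hops[k][i] for k in range(len(hops))] for i in range(num_atoms)]


# Toy example: heavy atoms of ethanol (C-C-O) with a single "is oxygen" feature.
adjacency = [[1], [0, 2], [1]]
features = [[0.0], [0.0], [1.0]]
tokens = hop_tokens(multi_hop_features(adjacency, features, num_hops=2))
```

In this toy graph, the hydroxyl oxygen's signal reaches the terminal carbon only at the second hop, which illustrates why keeping hop-wise features separate preserves information about how far away a functional group lies.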
Keywords: Drug metabolism; Graph transformer; Site of metabolism prediction; Substrate classification; UDP-glucuronosyltransferase (UGT).
© 2026. The Author(s).