Identification of Drug-Induced Liver Injury Biomarkers from Multiple Microarrays Based on Machine Learning and Bioinformatics Analysis

Int J Mol Sci. 2022 Oct 8;23(19):11945. doi: 10.3390/ijms231911945.


Drug-induced liver injury (DILI) is the most common adverse effect of numerous drugs and a leading cause of drug withdrawal from the market. In recent years, the incidence of DILI has increased. However, diagnosing DILI remains challenging because specific biomarkers are lacking. Hence, we used machine learning (ML) to mine multiple microarrays and identify genes that could contribute to diagnosing DILI. In this prospective study, we screened six eligible microarrays from the Gene Expression Omnibus (GEO) database. First, 21 differentially expressed genes (DEGs) were identified in the training set. Subsequently, a functional enrichment analysis of the DEGs was performed. We then used six ML algorithms to identify potentially useful genes. Based on receiver operating characteristic (ROC) analysis, four genes, DDIT3, GADD45A, SLC3A2, and RBM24, were identified. The average area under the curve (AUC) values for these four genes were higher than 0.8 in both the training and testing sets. In addition, immune cell correlation analysis showed that these four genes had highly significant correlations with multiple immune cell types. Our study indicates that DDIT3, GADD45A, SLC3A2, and RBM24 are candidate biomarkers that could help identify patients with DILI.
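
As an illustration of the ROC-based screening step described in the abstract, the following is a minimal sketch of how a per-gene AUC could be computed and compared against the 0.8 cutoff. It is not the authors' pipeline: the function names (`auc_from_scores`, `screen_genes`), the gene labels `GENE_A` and `GENE_B`, and all expression values are hypothetical and synthetic, and the direction-agnostic flip (taking the larger of AUC and 1 − AUC) is an assumption made here so that down-regulated genes are not discarded.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (case, control) pairs in which the case
    has the higher score; ties count as half a win (Mann-Whitney form)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def screen_genes(expression, labels, threshold=0.8):
    """Return {gene: auc} for genes whose AUC exceeds the threshold.

    Assumption: a gene is useful whether it is up- or down-regulated,
    so the AUC is flipped to max(auc, 1 - auc) before thresholding.
    """
    kept = {}
    for gene, values in expression.items():
        auc = auc_from_scores(values, labels)
        auc = max(auc, 1.0 - auc)
        if auc > threshold:
            kept[gene] = auc
    return kept


# Synthetic toy data (NOT from the study): 4 DILI samples (1), 4 controls (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_expression = {
    "GENE_A": [8.1, 7.9, 8.4, 7.2, 6.0, 6.3, 5.8, 7.5],  # separates the groups
    "GENE_B": [5.0, 6.1, 5.5, 6.0, 5.2, 6.2, 5.4, 5.9],  # does not
}

selected = screen_genes(toy_expression, labels)
print(selected)
```

In practice this check would be run separately on the training and testing sets, and a gene would be retained only if it clears the cutoff in both, mirroring the criterion stated in the abstract.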

Keywords: biomarker; diagnosis; drug-induced liver injury; machine learning; multiple microarrays.

MeSH terms

  • Biomarkers / metabolism
  • Chemical and Drug Induced Liver Injury* / diagnosis
  • Chemical and Drug Induced Liver Injury* / genetics
  • Computational Biology* / methods
  • Humans
  • Machine Learning
  • Prospective Studies
  • RNA-Binding Proteins


Substances

  • Biomarkers
  • RBM24 protein, human
  • RNA-Binding Proteins