Bioinformatic analysis of underlying mechanisms of Kawasaki disease via Weighted Gene Correlation Network Analysis (WGCNA) and the Least Absolute Shrinkage and Selection Operator method (LASSO) regression model

BMC Pediatr. 2023 Feb 24;23(1):90. doi: 10.1186/s12887-023-03896-4.

Abstract

Background: Kawasaki disease (KD) is a febrile systemic vasculitis affecting children younger than five years old. However, its specific biomarkers and precise mechanisms are not fully understood, which can delay timely treatment. This study therefore aimed to identify potential biomarkers and pathophysiological processes of KD through bioinformatic analysis.

Methods: RNA sequencing data from KD patients were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between KD patients and healthy controls (HCs) were screened with the "limma" R package. Weighted gene correlation network analysis (WGCNA) was performed to identify the module most closely associated with KD and its hub genes. Node genes were obtained by combining the least absolute shrinkage and selection operator (LASSO) regression model with the top 5 genes from five algorithms in CytoHubba, and were further validated with receiver operating characteristic (ROC) curves. CIBERSORTx was employed to estimate the composition of immune cells in KD patients and HCs. Functional enrichment analysis was performed to understand the biological implications of the modular genes. Finally, competing endogenous RNA (ceRNA) networks of the node genes were predicted using online databases.
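As a rough illustration of the node-gene selection and ROC validation steps described above, the following Python sketch uses only the standard library. It rests on several assumptions that the abstract does not state: that "combination" means intersecting the LASSO-selected genes with the top-ranked genes of every ranking algorithm, and that each gene is validated by its area under the ROC curve (AUC) computed from raw expression values. The function names, gene labels (G1, G2, ...) and numbers are hypothetical placeholders, not data or code from the study.

```python
from itertools import product


def consensus_genes(ranked_lists, lasso_selected, top_k=5):
    """Return genes kept by LASSO that also sit in the top_k of every ranking.

    Intersection is an assumed combination rule, chosen for illustration.
    """
    shared = set(lasso_selected)
    for ranking in ranked_lists:
        shared &= set(ranking[:top_k])
    return sorted(shared)


def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs in which the case scores
    higher than the control; tied pairs count as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            wins += 1.0
        elif case_score == control_score:
            wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy, made-up inputs with placeholder gene names (not data from the study).
rankings = [
    ["G1", "G2", "G3", "G4", "G5", "G6"],
    ["G2", "G1", "G4", "G3", "G7", "G5"],
    ["G3", "G1", "G2", "G8", "G4", "G5"],
]
lasso_kept = ["G1", "G2", "G4", "G9"]
node_genes = consensus_genes(rankings, lasso_kept)

expression = [2.1, 3.4, 1.2, 2.9, 0.8, 1.2]  # hypothetical expression of one gene
is_kd = [1, 1, 0, 1, 0, 1]                   # 1 = KD patient, 0 = healthy control
auc = roc_auc(expression, is_kd)

print(node_genes, auc)
```

The pairwise AUC is equivalent to the normalized Mann-Whitney U statistic. It is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation would be preferable for large cohorts.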

Results: A total of 267 DEGs were identified between 153 KD patients and 92 HCs in the training set, and WGCNA grouped them into two modules. The turquoise module was identified as the hub module. It was mainly enriched in the biological processes of cell activation involved in immune response, myeloid leukocyte activation, myeloid leukocyte mediated immunity, secretion and leukocyte mediated immunity; enriched pathways included type II diabetes mellitus, nicotinate and nicotinamide metabolism, O-glycan biosynthesis, and glycerolipid and glutathione metabolism. The node genes were ADM, ALPL, HK3, MMP9 and S100A12, which performed well in the validation studies. Immune cell infiltration analysis revealed that gamma delta T cells, monocytes, M0 macrophages, activated dendritic cells, activated mast cells and neutrophils were elevated in KD patients. Regarding the ceRNA networks, three intact networks were constructed: NEAT1/NORAD/XIST-hsa-miR-524-5p-ADM, NEAT1/NORAD/XIST-hsa-miR-204-5p-ALPL, and NEAT1/NORAD/XIST-hsa-miR-524-5p/hsa-miR-204-5p-MMP9.

Conclusion: The five-gene signature and three ceRNA networks constructed in our study are of great value in the early diagnosis of KD and might help to improve our understanding of KD at the RNA regulatory level.

Keywords: CIBERSORT; Kawasaki disease; LASSO regression model; Weighted gene correlation network analysis; ceRNA network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child, Preschool
  • Computational Biology
  • Diabetes Mellitus, Type 2*
  • Fever
  • Humans
  • Matrix Metalloproteinase 9
  • MicroRNAs*
  • Mucocutaneous Lymph Node Syndrome*

Substances

  • Matrix Metalloproteinase 9
  • MIRN204 microRNA, human
  • MicroRNAs
  • MIRN-524 microRNA, human