Background: Immune cells are involved in rheumatoid arthritis (RA), but the link between other blood cell indices and the disease activity of RA, along with the underlying mechanisms, is unclear.
Objective: This study aimed to develop an interpretable machine learning model based on blood cell parameters to assess RA disease severity and assist in personalized treatment decisions.
Methods: A retrospective case-control study was conducted using routine blood test and biochemical data from 4401 patients at the First Affiliated Hospital of Guangxi Medical University, collected between January 1, 2018, and January 1, 2024. The primary outcome was disease severity stratification. Recursive feature elimination was applied to identify key variables, and 10 machine learning algorithms were benchmarked on 55 clinical features with internal validation. Model interpretability was assessed with SHapley Additive exPlanations (SHAP), while logistic regression and restricted cubic spline models were used to examine linear and nonlinear associations between blood cell indices and disease severity. In addition, Mendelian randomization analysis was performed to explore potential causal relationships.
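The feature selection and benchmarking steps described in Methods can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the clinical records. The sample size, the number of retained features (`N_SELECTED`), and all hyperparameters are illustrative assumptions, not values reported by the study. Only one of the 10 benchmarked algorithms (Random Forest) is shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical table: 55 features and a binary
# severity label. These are NOT the study's data.
X, y = make_classification(
    n_samples=600, n_features=55, n_informative=8, n_redundant=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Recursive feature elimination: repeatedly fit the estimator and drop the
# least important features (5 per round) until N_SELECTED remain.
N_SELECTED = 10  # assumption: the abstract does not state how many were kept
selector = RFE(
    RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=N_SELECTED,
    step=5,
)
selector.fit(X_train, y_train)
n_kept = int(selector.support_.sum())

# Refit a Random Forest on the retained features and score the held-out set.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(selector.transform(X_train), y_train)
probs = model.predict_proba(selector.transform(X_test))[:, 1]
test_auc = roc_auc_score(y_test, probs)

print(f"features kept: {n_kept}, test AUC: {test_auc:.3f}")
```

Fitting the selector on the training split only keeps the held-out set out of the feature selection step, so the reported AUC is not inflated by information leakage.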
Design: This was a retrospective case-control study.
Results: Blood cell indices were identified as the primary factors associated with RA severity. In model evaluation, the Random Forest achieved the best performance, with test set AUCs of 0.870 and 0.874. Mendelian randomization supported a causal relationship between blood cell indices and RA risk.
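For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case. The following is a minimal pure-Python sketch of that definition. The function name `roc_auc` and the six example patients are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities of severe RA (label 1 = severe).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of 9 pairs are ranked correctly -> 0.889
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the metric describes ranking quality rather than the share of patients classified correctly.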
Conclusion: These results reinforce the associations between blood cell indices and RA severity. The machine learning model demonstrates good predictive capabilities for RA severity and may assist clinicians in developing personalized treatment strategies.
Keywords: Mendelian randomization; blood cell indices; machine learning; predictive model; rheumatoid arthritis.
Blood tests predict rheumatoid arthritis severity to customize treatments
What is already known on this topic? Rheumatoid arthritis (RA) is a chronic disease in which the immune system attacks the joints, leading to pain, swelling, and potentially permanent damage. Blood tests track inflammation through proteins such as CRP, but they are not precise enough to gauge RA severity. Immune cells drive RA, yet the links between routine blood cell counts (e.g., red and white cells, platelets) and worsening disease, and the mechanisms behind them, are unclear.
What this study adds? We analyzed records from 4,401 RA patients at a Chinese hospital (2018–2024). We built a computer tool that uses blood test results to classify RA as mild or severe, with the aim of supporting tailored treatment. Using 10 algorithms on 55 features, we selected the most informative blood indicators (e.g., lymphocytes, platelets), explained the predictions with SHAP, and examined causality and nonlinear patterns with Mendelian randomization and restricted cubic spline models.
What was found? Blood cell indices were the top predictors of severity. The Random Forest model performed best, with a test set AUC of about 0.87. Low lymphocyte counts and high platelet variation were associated with higher odds of severe RA (reported ORs ranged from 0.54 to 2.17). Mendelian randomization supported a causal link between blood cell indices and RA risk. Nonlinear patterns showed that extreme counts, either too low or too high, were associated with worse outcomes.
What the results mean for patients and the public? This tool relies on affordable blood tests for fast severity checks, which could support personalized medication choices that limit joint damage and pain. Patients could receive more targeted care, with less trial and error and a better daily life. The approach may also promote equitable RA management in resource-limited settings and encourage blood-focused prevention strategies.
© The Author(s), 2026.