Machine-learning-based knowledge discovery in rheumatoid arthritis-related registry data to identify predictors of persistent pain

Pain. 2020 Jan;161(1):114-126. doi: 10.1097/j.pain.0000000000001693.


Early detection of patients with chronic diseases at risk of developing persistent pain is clinically desirable for timely initiation of multimodal therapies. Quality follow-up registries may provide the necessary clinical data; however, their design is not focused on a specific research aim, which poses challenges for the data analysis strategy. Here, machine learning was used to identify early parameters that provide information about the future development of persistent pain in rheumatoid arthritis (RA). Data from 288 patients were queried from a registry based on the Swedish Epidemiological Investigation of RA. Unsupervised data analyses identified the following 3 distinct patient subgroups: low-, median-, and high-persistent pain intensity. Next, supervised machine learning, implemented as random forests followed by computed ABC analysis-based item categorization, was used to select predictive parameters among 21 different demographic, patient-rated, and objective clinical factors. The selected parameters were used to train machine-learned algorithms to assign patients to pain-related subgroups (1000 random resamplings, 2/3 training, and 1/3 test data). Algorithms trained with 3-month data of the patient global assessment and health assessment questionnaire provided pain group assignment at a balanced accuracy of 70%. When the predictors were restricted to objective clinical parameters of disease severity, swollen joint count and tender joint count acquired at 3 months provided a balanced accuracy of 59%. Results indicate that machine learning is suited to extracting knowledge from data queried from pain- and disease-related registries. Early functional parameters of RA are informative for the development and degree of persistent pain.
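The evaluation scheme named in the abstract (random 2/3 training and 1/3 test splits, scored by balanced accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy labels, and the fixed random seed are assumptions made for illustration, and the classifier itself is omitted. Balanced accuracy is taken here in its common definition as the mean of the per-class recalls.

```python
import random
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each subgroup counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def split_two_thirds(n_items, rng):
    """Shuffle indices and return (train, test) holding 2/3 and 1/3 of them."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = (2 * n_items) // 3
    return indices[:cut], indices[cut:]


# Hypothetical toy labels for the three pain subgroups (not registry data).
y_true = ["low", "low", "low", "low", "median", "median", "high", "high"]
y_pred = ["low", "low", "low", "median", "median", "high", "high", "high"]
score = balanced_accuracy(y_true, y_pred)

# One random split of 288 patient indices; the study repeated this 1000 times.
rng = random.Random(0)  # seed chosen arbitrarily for reproducibility
train_idx, test_idx = split_two_thirds(288, rng)
```

In a full analysis, each of the 1000 splits would train a classifier on the training indices and score it on the test indices, and the balanced accuracies would then be summarized across splits.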

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Algorithms
  • Arthritis, Rheumatoid / physiopathology*
  • Female
  • Humans
  • Knowledge Discovery
  • Machine Learning*
  • Male
  • Middle Aged
  • Pain / diagnosis
  • Pain / physiopathology*
  • Registries
  • Risk Factors
  • Sweden
  • Young Adult