Early detection of patients with chronic diseases who are at risk of developing persistent pain is clinically desirable because it allows timely initiation of multimodal therapies. Quality follow-up registries may provide the necessary clinical data; however, they are not designed around a specific research aim, which poses challenges for the data analysis strategy. Here, machine-learning was used to identify early parameters that provide information about the future development of persistent pain in rheumatoid arthritis (RA). Data of 288 patients were queried from a registry based on the Swedish Epidemiological Investigation of RA. Unsupervised data analyses identified 3 distinct patient subgroups: low-, median-, and high-persistent pain intensity. Next, supervised machine-learning, implemented as random forests followed by computed ABC analysis-based item categorization, was used to select predictive parameters among 21 different demographic, patient-rated, and objective clinical factors. The selected parameters were used to train machine-learned algorithms to assign patients to pain-related subgroups (1000 random resamplings, each with 2/3 training and 1/3 test data). Algorithms trained with 3-month data of the patient global assessment and the health assessment questionnaire provided pain group assignment at a balanced accuracy of 70%. When the predictors were restricted to objective clinical parameters of disease severity, swollen joint count and tender joint count acquired at 3 months provided a balanced accuracy of 59%. The results indicate that machine-learning is suited to extracting knowledge from data queried from pain- and disease-related registries. Early functional parameters of RA are informative for the development and degree of persistent pain.
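
The evaluation scheme named above (repeated random splits into 2/3 training and 1/3 test data, scored by balanced accuracy) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `balanced_accuracy`, `resampled_balanced_accuracy`, and `majority_class` are hypothetical, the toy data are invented for illustration, and the majority-class predictor is only a stand-in for the random forests used in the study. Balanced accuracy is taken here in its common sense, the mean of the per-class recall values.

```python
import random
from collections import Counter, defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every pain group counts equally."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += (truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def resampled_balanced_accuracy(X, y, fit_predict, n_rounds=1000, seed=0):
    """Score a classifier on repeated random 2/3 training, 1/3 test splits.

    fit_predict(X_train, y_train, X_test) is a hypothetical callable that
    trains on the first two arguments and returns predictions for the third.
    Returns one balanced accuracy per round; how to summarize them
    (median, mean, interval) is left to the caller.
    """
    rng = random.Random(seed)
    idx = list(range(len(y)))
    cut = (2 * len(idx)) // 3
    scores = []
    for _ in range(n_rounds):
        rng.shuffle(idx)
        train, test = idx[:cut], idx[cut:]
        y_pred = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        scores.append(balanced_accuracy([y[i] for i in test], y_pred))
    return scores


def majority_class(X_train, y_train, X_test):
    """Stand-in classifier (assumption): always predict the most frequent group."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * len(X_test)


if __name__ == "__main__":
    # Invented toy data: one feature per patient, three pain groups.
    X = [[i] for i in range(30)]
    y = ["low", "median", "high"] * 10
    scores = resampled_balanced_accuracy(X, y, majority_class, n_rounds=1000)
    print(len(scores), min(scores), max(scores))
```

Because every class contributes equally to the score, a classifier that always predicts a single group cannot exceed chance level for the classes present in the test set, which is why balanced accuracy is a reasonable choice when the pain subgroups differ in size.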