Estimating Prevalence, Demographics, and Costs of ME/CFS Using Large Scale Medical Claims Data and Machine Learning

Front Pediatr. 2019 Jan 8;6:412. doi: 10.3389/fped.2018.00412. eCollection 2018.


Data mining and machine learning techniques were applied to a large database of medical and facility claims from commercially insured patients to determine the prevalence, gender demographics, and costs for individuals with provider-assigned diagnosis codes for myalgic encephalomyelitis (ME) or chronic fatigue syndrome (CFS). The frequency of diagnosis was 519-1,038/100,000, and the relative risk of females being diagnosed compared to males was 1.238 for ME and 1.178 for CFS. While the percentage of women diagnosed with ME/CFS is higher than the percentage of men, ME/CFS is not a "women's disease." Thirty-five to forty percent of diagnosed patients are men. Extrapolating from this frequency of diagnosis and based on the estimated 2017 population of the United States, a rough estimate for the number of patients who may be diagnosed with ME or CFS in the U.S. is 1.7 million to 3.38 million. Patients diagnosed with CFS appear to represent a more heterogeneous group than those diagnosed with ME. A machine learning model based on characteristics of individuals diagnosed with ME was developed and applied, resulting in a predicted prevalence of 857/100,000 (p > 0.01), or roughly 2.8 million in the U.S. Average annual costs for individuals with a diagnosis of ME or CFS were compared with those for lupus (all categories) and multiple sclerosis (MS), and were found to be 50% higher for ME and CFS than for lupus or MS, and three to four times higher than for the general insured population. A separate aspect of the study attempted to determine whether a diagnosis of ME or CFS could be predicted from symptom codes in the insurance claims records. Due to the absence of specific codes for some core symptoms, we were unable to validate that the information in insurance claims records is sufficient to identify diagnosed patients, or to suggest that a diagnosis of ME or CFS should be considered based solely on the presence of those symptoms.
These results show that a prevalence rate of 857/100,000 for ME/CFS is not unreasonable; therefore, it is not a rare disease, but in fact a relatively common one.
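
The extrapolation from a per-100,000 rate to a national patient count can be reproduced with simple arithmetic. The sketch below is illustrative only and is not the authors' code: the function name `estimate_patients` is hypothetical, and the population figure of about 325.7 million is an assumed value for the 2017 U.S. population estimate, which the abstract does not state explicitly.

```python
# Illustrative sketch only; not the authors' method or code.
# ASSUMPTION: the 2017 U.S. population estimate is taken as ~325.7 million.
# The abstract refers to "the estimated 2017 population" without giving a number.
US_POPULATION_2017 = 325_700_000


def estimate_patients(rate_per_100k: int, population: int = US_POPULATION_2017) -> int:
    """Scale a per-100,000 rate to an absolute count (hypothetical helper)."""
    return rate_per_100k * population // 100_000


if __name__ == "__main__":
    # Rates reported in the abstract: observed frequency of diagnosis
    # (519 to 1,038 per 100,000) and the model-predicted prevalence (857).
    for label, rate in [("low", 519), ("high", 1038), ("model", 857)]:
        count = estimate_patients(rate)
        print(f"{label:>5}: {rate}/100,000 -> about {count / 1e6:.2f} million")
```

Under the assumed population, the three rates give about 1.69, 3.38, and 2.79 million, consistent with the rounded figures of 1.7 million, 3.38 million, and 2.8 million quoted in the abstract.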

Keywords: ME/CFS; chronic fatigue syndrome; costs; data mining; machine learning; myalgic encephalomyelitis; prevalence.