Online doctor reviews: do they track surgeon volume, a proxy for quality of care?

J Med Internet Res. 2012 Apr 10;14(2):e50. doi: 10.2196/jmir.2005.

Abstract

Background: Consumers are increasingly turning to the Internet for health information, and many use online doctor review websites to help select a physician. Such websites tally numerical ratings and free-text comments from past patients. To our knowledge, no study has analyzed whether doctors with positive online reputations on doctor review websites actually deliver the higher quality of care typically associated with better clinical outcomes and better safety records.

Objective: For a number of procedures, surgeons who perform a higher volume of cases have better clinical outcomes and safety records than those who perform fewer. Our objective was to determine whether surgeon volume, as a proxy for clinical outcomes and patient safety, correlates with online reputation.

Methods: We investigated the numerical ratings and comments on 9 online review websites for high- and low-volume surgeons for three procedures: lumbar surgery, total knee replacement, and bariatric surgery. High-volume surgeons were randomly selected from the group within the highest quartile of claims submitted for reimbursement under the procedures' relevant Current Procedural Terminology (CPT) codes. Low-volume surgeons were randomly selected from the lowest quartile of submitted claims for the procedures' relevant CPT codes. Claims were collated within the Normative Health Information Database, which covers multiple payers for more than 25 million insured patients.
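The quartile-based sampling described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual pipeline: the surgeon names and claim counts below are invented, and the real study drew on the Normative Health Information Database rather than random toy data.

```python
import random
from statistics import quantiles

random.seed(0)  # reproducible toy example

# Hypothetical per-surgeon claim counts for one CPT code (all values invented).
claims = {f"surgeon_{i:02d}": random.randint(1, 500) for i in range(1, 41)}

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, _, q3 = quantiles(claims.values(), n=4)

# Partition surgeons into the highest and lowest quartiles of claim volume.
high_volume = [s for s, n in claims.items() if n >= q3]
low_volume = [s for s, n in claims.items() if n <= q1]

# Randomly select study subjects from each quartile group,
# mirroring the random selection step described in the Methods.
sampled_high = random.sample(high_volume, k=min(5, len(high_volume)))
sampled_low = random.sample(low_volume, k=min(5, len(low_volume)))
```

The two sampled lists would then be the surgeons whose online ratings and comments are collected and compared.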

Results: Numerical ratings were found for the majority of physicians in our sample (547/600, 91.2%), and comments were found for 385/600 (64.2%) of the physicians. We found that high-volume (HV) surgeons could be differentiated from low-volume (LV) surgeons independently by analyzing: (1) the total number of numerical ratings per website (HV: mean = 5.85; LV: mean = 4.87, P<.001); (2) the total number of text comments per website (HV: mean = 2.74; LV: mean = 2.30, P=.05); (3) the proportion of glowing praise/total comments about quality of care (HV: mean = 0.64; LV: mean = 0.51, P=.002); and (4) the proportion of scathing criticism/total comments about quality of care (HV: mean = 0.14; LV: mean = 0.23, P=.005). Even when these features were combined, the effect size, although significant, was still weak: discriminant and classification analysis correctly identified a physician's patient-volume group 61.6% of the time. We also found that high-volume surgeons could not be differentiated from low-volume surgeons by analyzing (1) standardized z score numerical ratings (HV: mean = 0.07; LV: mean = 0, P=.27); (2) the proportion of glowing praise/total comments about customer service (HV: mean = 0.24; LV: mean = 0.22, P=.52); and (3) the proportion of scathing criticism/total comments about customer service (HV: mean = 0.19; LV: mean = 0.21, P=.48).
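The "standardized z score" metric above normalizes raw ratings within each website so that scores from sites with different rating scales become comparable. A minimal sketch, assuming only that each site's ratings are converted to (x − mean) / SD within that site; the site names and rating values are invented for illustration:

```python
from statistics import mean, pstdev

# Toy ratings keyed by website (all values invented).
# siteB uses a 10-point scale, siteA a 5-point scale.
ratings = {
    "siteA": [4.5, 3.0, 5.0, 4.0],
    "siteB": [8.0, 9.5, 7.0, 9.0],
}

def z_scores(values):
    """Convert raw ratings to z scores: (x - mean) / population SD."""
    m, s = mean(values), pstdev(values)
    return [(x - m) / s for x in values]

standardized = {site: z_scores(vals) for site, vals in ratings.items()}
# After standardization, each site's ratings have mean 0 and unit variance,
# so a surgeon's z score is comparable across websites regardless of scale.
```

On this standardized scale the study found no significant HV/LV difference, which is why raw rating magnitude alone did not separate the groups.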

Conclusions: Online review websites provide a rich source of data that may be able to track quality of care, although the effect size is weak and not consistent for all review website metrics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • General Surgery*
  • Humans
  • Internet*
  • Physicians / standards*
  • Quality of Health Care*
  • Workforce