Health Serv Res. 2011 Dec;46(6pt1):1778-802.
doi: 10.1111/j.1475-6773.2011.01299.x. Epub 2011 Aug 11.

Examining multiple sources of differential item functioning on the Clinician & Group CAHPS® survey

Hector P Rodriguez et al. Health Serv Res. 2011 Dec.

Abstract

Objective: To evaluate psychometric properties of a widely used patient experience survey.

Data sources: English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups.

Methods: We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician-patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact.

Principal findings: The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent.

Conclusions: The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified.
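
The hybrid ordinal logistic regression/IRT approach named in the Methods is commonly formulated as a comparison of three nested models for each item. The block below is a sketch of that general formulation, not the exact specification or thresholds used in this study, which the abstract does not state. The symbols are assumptions introduced for illustration: Y is the ordinal item response, k a response category, θ the IRT estimate of overall primary care experience, and G the covariate under test (for example, a race-ethnicity indicator).

```latex
\begin{align*}
\text{Model 1:}\quad & \operatorname{logit} P(Y \ge k) = \alpha_k + \beta_1 \theta \\
\text{Model 2:}\quad & \operatorname{logit} P(Y \ge k) = \alpha_k + \beta_1 \theta + \beta_2 G \\
\text{Model 3:}\quad & \operatorname{logit} P(Y \ge k) = \alpha_k + \beta_1 \theta + \beta_2 G + \beta_3 (\theta \times G)
\end{align*}
```

Under this formulation, a meaningful improvement of Model 2 over Model 1 points to uniform DIF (a shift that is constant across θ), and a meaningful improvement of Model 3 over Model 2 points to non-uniform DIF (a shift that varies with θ). In an iterative algorithm, θ is re-estimated with group-specific parameters for flagged items and the comparison is repeated until the set of flagged items stabilizes.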

Figures

Figure 1
Test Characteristic (A) and Information (B) Curves. Notes. (a) The test characteristic curve plots the most likely standard score associated with each level of overall primary care experience. The curve shows that the distribution of items is not uniform across the range measured by the scale, as its slope is steeper to the left of about 0 than to the right. This finding suggests problems with using standard scores in regression models; item response theory (IRT) scores should be used instead (Crane et al. 2008a). (b) The test information curve (black curve) and the standard error of measurement curve (gray curve) are shown at each level of overall primary care experience. These curves further document the uneven distribution of items across the scale. Test information is adequate to the left of 0 but drops to the right of 0. This is reflected in the standard error of measurement curve, which shows large amounts of measurement error at the top end of the scale. This figure is analogous to the alpha coefficient commonly reported using classical test theory. Unlike classical test theory, however, IRT does not assume that measurement precision is consistent across the entire scale, and it does not summarize measurement with a single omnibus statistic such as the alpha coefficient. See McDonald (1999) for further discussion.
Figure 2
Differential Item Functioning (DIF) Impact, by Patient Race, Ethnicity, and Primary Language Spoken at Home. Notes. This figure plots the distributions of differences between naive scores ignoring DIF and scores that account for all sources of DIF across the six race-ethnicity groups evaluated in the study. Differences of 0 indicate no DIF impact. We use the median standard error of measurement for the scale to demarcate levels of DIF impact that can be distinguished from negligible effects (dark vertical lines); DIF impact greater than this level is referred to as “salient” DIF. The box represents the 25th and 75th percentiles of the distribution, and the whiskers extend 1½ times the interquartile range (the width of the box). Observations more extreme than the whiskers are shown with dots. The graph shows that the boxes denoting the interquartile range are very close to 0 and that the whiskers are well within the dark vertical lines denoting the median standard error of measurement. A small and negligible number of people have salient DIF impact when accounting for all the sources of DIF considered here. See text for further details.
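
The salience rule described in the caption can be sketched as follows. This is a minimal illustration, assuming each respondent has a naive score, a DIF-adjusted score, and a standard error of measurement; the function name and the sample values are hypothetical and are not taken from the study data.

```python
from statistics import median


def salient_dif_share(naive_scores, adjusted_scores, sems):
    """Return (share of respondents with salient DIF impact, threshold).

    The threshold is the median standard error of measurement, and a
    respondent counts as salient when the absolute difference between the
    naive and the DIF-adjusted score exceeds it.
    """
    threshold = median(sems)
    differences = [n - a for n, a in zip(naive_scores, adjusted_scores)]
    salient = sum(1 for d in differences if abs(d) > threshold)
    return salient / len(differences), threshold


# Hypothetical scores for five respondents (illustration only).
naive = [0.10, -0.40, 1.20, 0.55, -1.00]
adjusted = [0.12, -0.38, 0.70, 0.50, -1.01]
sems = [0.30, 0.28, 0.45, 0.33, 0.31]

if __name__ == "__main__":
    share, threshold = salient_dif_share(naive, adjusted, sems)
    print(f"threshold={threshold:.2f}  salient share={share:.0%}")
```

In this toy sample only the third respondent's difference exceeds the median standard error, so one of five is flagged. In the study the corresponding share was 0.14 percent of participants.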

References

    1. Ballard C, Margallo-Lana M, Juszczak E, Douglas S, Swann A, Thomas A, O'Brien J, Everratt A, Sadler S, Maddison C, Lee L, Bannister C, Elvish R, Jacoby R. “Quetiapine and Rivastigmine and Cognitive Decline in Alzheimer's Disease: Randomised Double Blind Placebo Controlled Trial.” British Medical Journal. 2005;330(7496):874.
    2. Bann CM, Iannacchione VG, Sekscenski ES. “Evaluating the Effect of Translation on Spanish Speakers’ Ratings of Medicare.” Health Care Financing Review. 2005;26(4):51–65.
    3. Beal A, Hernandez S, Doty M. “Latino Access to the Patient-Centered Medical Home.” Journal of General Internal Medicine. 2009;24(Suppl 3):514–20.
    4. Beattie PF, Nelson RM, Lis A. “Spanish-Language Version of the Medrisk Instrument for Measuring Patient Satisfaction with Physical Therapy Care (MRPS): Preliminary Validation.” Physical Therapy. 2007;87(6):793–800.
    5. Beauducel A, Herzberg PY. “On the Performance of Maximum Likelihood versus Means and Variance Adjusted Weighted Least Squares Estimation in CFA.” Structural Equation Modeling. 2006;13(2):186–203.
