This study examined the relative precision (RP) of two methods of scoring the 10-item Physical Functioning Scale (PF-10) in a large sample of patients (n = 3445) from the Medical Outcomes Study. The PF-10 summated scoring method, which is based on a Likert scaling model, was compared with a Rasch Item Response Theory (IRT) scaling model in which raw scores were transformed into a latent trait variable of physical functioning. Potential differences between scoring methods were hypothesized to be attributable to: (1) the logarithmic nature of the Rasch transformation; (2) the unevenness of the PF-10 item distributions; and (3) a reduction in within-group variance. RP ratios favored the Rasch model in discriminating between patients who differed in disease severity. The Rasch and Likert scoring models performed similarly in tests of sensitivity to change over a two-year follow-up period. In all comparisons, differences between methods were most apparent in clinical groups whose scores lay closest to the extremes of the score distribution. Further research is necessary to test for differences between scoring models in discrimination and sensitivity to change among clinical groups whose scores are sufficiently spread across the continuum of physical functioning, in particular among patients with either very high or very low physical functioning. The Rasch model of scoring may have important implications for the clinical interpretation of individual scores at all ranges of the scale.
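The comparison described above can be illustrated with a minimal sketch. The abstract does not define how RP was computed; the code below assumes one common convention, the ratio of one-way ANOVA F-statistics obtained when the same clinical groups are scored by each method. The function names (`f_statistic`, `relative_precision`, `log_odds`) are hypothetical, and `log_odds` is only a crude stand-in for a logarithmic rescaling of the raw summed score (assumed here to range from 10 to 30), not a fitted Rasch model.

```python
import math


def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups (each a list of scores)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_precision(groups_method, groups_reference):
    """Assumed RP definition: F for one scoring method over F for the reference."""
    return f_statistic(groups_method) / f_statistic(groups_reference)


def log_odds(raw, lo=10, hi=30):
    """Illustrative log-odds rescaling of a raw score; undefined at lo and hi.

    This is NOT a Rasch estimate. It only shows how a logarithmic
    transformation stretches the ends of the raw-score range.
    """
    p = (raw - lo) / (hi - lo)
    return math.log(p / (1 - p))


# Hypothetical raw scores for two severity groups (not study data).
mild = [27, 28, 29, 28, 29]
severe = [22, 24, 25, 23, 26]

likert_groups = [mild, severe]
logit_groups = [[log_odds(x) for x in g] for g in likert_groups]

# RP > 1 would indicate that the rescaled scores separate the groups
# more sharply than the raw summed scores do, for this toy data only.
rp = relative_precision(logit_groups, likert_groups)

# A one-point raw-score step near the ceiling spans more of the
# log-odds scale than the same step near the middle of the range.
step_near_ceiling = log_odds(29) - log_odds(28)
step_near_middle = log_odds(21) - log_odds(20)
```

Under these assumptions, equal raw-score steps are unequal on the log-odds scale, which is consistent with the abstract's observation that the two scoring methods differ most for groups scoring near the extremes.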