Indexes developed to measure physical functioning as an essential component of general health status are often based on sets of hierarchically structured items intended to represent a broad underlying concept. Rasch Item Response Theory (IRT) provides a methodology for examining the hierarchical structure, unidimensionality, and reproducibility of item positions (calibrations) along a scale. Data on the 10-item Physical Functioning Scale (PF-10) from a large sample of Medical Outcomes Study patients (N = 3445) were used to examine these three properties. Rasch-IRT analyses generated an empirical item hierarchy, confirmed the unidimensionality of the PF-10 for most patients, and established the reproducibility of item calibrations across patient populations and repeated tests. These findings support the content validity of the PF-10 as a measure of physical functioning and suggest that valid Rasch-IRT summary scores could be generated as an alternative to the current Likert summative scores. Unidimensionality and reproducibility of the item scale are essential prerequisites for developing Rasch-based person measures of physical functioning that can be used across populations and over repeated tests.