Review
Cardiovasc Diagn Ther. 2017 Jun;7(3):317-324.
doi: 10.21037/cdt.2017.03.12.

Assessing observer variability: a user's guide

Free PMC article

Zoran B Popović et al. Cardiovasc Diagn Ther. 2017 Jun.

Abstract

Some form of observer variability assessment may be the most frequent statistical task in the medical literature. Still, very little attempt is made to make the reported methods uniform and clear to the reader. This paper provides an overview of various measures of observer variability and a rationale for why the standard error of measurement (SEM) is preferable to other measures. The supplemental file contains examples of how to design a proper repeatability and reproducibility assessment, determine an appropriate sample size, and test its findings for significance.

Keywords: Observer variability; statistics.

Conflict of interest statement

Conflicts of Interest: The authors have no conflicts of interest to declare.

Figures

Figure 1
An illustration of how observer variability behaves when the measurement error correlates with the true value of the quantity measured, using systolic strain rate as an example. (A) Two longitudinal systolic strain rate measurements obtained by the same observer in mice, rats, rabbits, dogs and humans. All subjects were taken from healthy populations. Notice that, as systolic strain rate increases with decreasing animal size, the difference between the two measurements also increases (increased variability), illustrating the dependence of the error on the mean value of the measurement; (B) Bland-Altman plot of the same data, showing that the distribution of the data points widens as the average value increases; and (C) Bland-Altman plot of the data expressed as percentage differences, with a similar distribution throughout the range of average values.
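The contrast between panels (B) and (C) can be sketched numerically. The Python example below is a minimal illustration, not the authors' analysis: the paired values are hypothetical strain rate magnitudes chosen so that the two readings disagree by about 10% of their average, and the function name `bland_altman` is an assumption made for this sketch. It computes the quantities plotted in a Bland-Altman plot (pair averages, differences, bias, and limits of agreement as bias ± 1.96 SD), either in absolute units or as percentages of the pair average.

```python
import statistics

def bland_altman(first, second, as_percent=False):
    """Return (averages, differences, bias, lower_limit, upper_limit).

    Differences are first - second; with as_percent=True each difference
    is expressed as a percentage of the average of its pair.
    """
    averages = [(a + b) / 2 for a, b in zip(first, second)]
    diffs = [a - b for a, b in zip(first, second)]
    if as_percent:
        diffs = [100 * d / m for d, m in zip(diffs, averages)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Limits of agreement: bias +/- 1.96 standard deviations of the differences
    return averages, diffs, bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired measurements (illustrative only, not data from the paper):
# the disagreement grows in proportion to the magnitude being measured.
first = [1.05, 1.90, 4.20, 7.60, 10.50]
second = [0.95, 2.10, 3.80, 8.40, 9.50]

averages, abs_diffs, abs_bias, abs_lo, abs_hi = bland_altman(first, second)
_, pct_diffs, pct_bias, pct_lo, pct_hi = bland_altman(first, second, as_percent=True)

for avg, d, p in zip(averages, abs_diffs, pct_diffs):
    print(f"average={avg:5.2f}  difference={d:+.2f}  percent difference={p:+.1f}%")
```

With these hypothetical values the absolute differences grow from about 0.1 to 1.0 as the average rises, while the percentage differences stay at roughly ±10% across the whole range, which is the pattern the two panels describe.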
Figure 2
Intraclass correlation coefficient (ICC) as a measure of intraobserver variability plotted against the corresponding SEMintra. Data were obtained by 6 sonographers, each measuring left ventricular strain twice in 6 healthy subjects, and are shown in Supplemental Table S3. Note the absence of a relationship between the two measures of intraobserver variability, with wide fluctuations in ICC.
Figure 3
Relationships between the standard error of measurement (SEM), the width of the ±95% confidence interval (CI), and the minimum detectable difference (MDD), illustrated using the example of left ventricular end-diastolic dimension (LVEDD) measured by M-mode echocardiography. SEM is simply the standard deviation of the distribution of repeated measurements of LVEDD. The 95% CI is obtained by multiplying SEM by 1.96. MDD represents the minimum difference between two measurements (e.g., at baseline and at follow-up) obtained on the same patient that can be deemed significant, and is obtained by multiplying the CI by the square root of two. SEM is always lower when the repeated measurements are performed by the same person.
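The arithmetic in the Figure 3 caption reduces to two multiplications. The Python sketch below follows the caption's definitions; the SEM of 2.0 mm is a hypothetical value chosen for illustration (it is not taken from the paper), and the function names are likewise assumptions of this example.

```python
import math

def ci95(sem):
    """95% confidence interval (+/- value) around a single measurement: 1.96 x SEM."""
    return 1.96 * sem

def minimum_detectable_difference(sem):
    """Smallest difference between two measurements on the same patient
    that can be deemed significant: CI x sqrt(2)."""
    return ci95(sem) * math.sqrt(2)

sem_mm = 2.0  # hypothetical SEM of LVEDD in mm (illustrative, not from the paper)
ci = ci95(sem_mm)
mdd = minimum_detectable_difference(sem_mm)
print(f"95% CI: +/-{ci:.2f} mm, MDD: {mdd:.2f} mm")
# prints "95% CI: +/-3.92 mm, MDD: 5.54 mm"
```

Under this assumed SEM, a follow-up LVEDD would need to differ from baseline by more than about 5.5 mm before the change could be distinguished from measurement error.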
