Study objective: ST-segment elevation is used to make early decisions about using thrombolytic therapy in patients with suspected myocardial infarction. This study was performed to assess interobserver and intraobserver variation in subjects' measurements of ST-segment elevation in isolated ECG complexes.
Methods: We performed a masked, paired-sample experiment. Emergency physicians, emergency medicine residents, and senior medical students were asked to measure ST-segment elevation in a packet of 40 isolated single ECG complexes from patients with proven myocardial infarction. Each packet consisted of 20 duplicated pairs of ECG complexes in random order, and subjects were masked to the duplication. The two estimates in each pair were subtracted, and summary statistics were calculated for these differences. Agreement between first and second ST-segment estimates was analyzed by using weighted kappa values. Differences in the paired ST-segment estimates among groups were tested with 1-factor analysis of variance.
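The paired-difference step described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names, the use of absolute differences, and the sample readings are all assumptions made for the example, and the weighted kappa and analysis of variance steps are not shown.

```python
from statistics import mean, median


def paired_differences(first, second):
    """Absolute difference (mm) between the two estimates of each duplicated complex.

    Assumption: differences are taken as absolute values; the abstract
    does not say whether signed or absolute differences were summarized.
    """
    if len(first) != len(second):
        raise ValueError("each first estimate needs a matching second estimate")
    return [abs(a - b) for a, b in zip(first, second)]


def inconsistent_pairs(first, second, threshold_mm=2.0):
    """Count pairs where one estimate meets the threshold and the other does not."""
    return sum(
        (a >= threshold_mm) != (b >= threshold_mm)
        for a, b in zip(first, second)
    )


# Hypothetical readings in mm for five duplicated complexes (not study data).
first_read = [1.5, 2.0, 3.0, 0.5, 1.9]
second_read = [1.5, 1.8, 3.5, 0.5, 2.1]

diffs = paired_differences(first_read, second_read)
n_inconsistent = inconsistent_pairs(first_read, second_read)

print(f"mean difference:   {mean(diffs):.2f} mm")
print(f"median difference: {median(diffs):.2f} mm")
print(f"inconsistent at 2.0 mm: {n_inconsistent} of {len(diffs)} pairs")
```

In this made-up sample, two pairs differ by only about 0.2 mm yet straddle the 2.0-mm threshold, which is the kind of inconsistent classification the study counts.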
Results: Fifty-two subjects completed the study and measured a total of 2,070 ST segments (1,035 pairs). The mean difference in segment height across all groups was 0.28 mm (95% confidence interval [CI], 0.27 to 0.30). The median difference was 0.2 mm, the 80th percentile was 0.5 mm, and the 95th percentile was 0.9 mm. Statistical agreement between paired ST-segment measurements was very good (kappa=0.85; 95% CI, 0.83 to 0.87). However, when a threshold of at least 2.0 mm was used for ST-segment elevation, 143 (14%) of 1,035 paired estimations (95% CI, 12% to 16%) produced inconsistent classifications. Physicians, residents, and students had similar median differences (P=.77). Some subjects exhibited more intraobserver variation than did others, and some complexes were associated with greater intraobserver variation than others (P<.001).
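The arithmetic behind the inconsistent-classification rate can be checked with a short sketch. The abstract does not state how its confidence interval was computed, so the normal-approximation (Wald) interval used here is an assumption, and `wald_interval` is a hypothetical helper name.

```python
from math import sqrt


def wald_interval(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval.

    Assumption: the abstract does not name its interval method; the Wald
    interval with z = 1.96 (about 95% coverage) is used for illustration.
    """
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts reported in the Results: 143 inconsistent pairs out of 1,035.
p, low, high = wald_interval(143, 1035)
print(f"{p:.1%} (95% CI, {low:.1%} to {high:.1%})")
```

Rounded to whole percentages, this gives 14% with an interval of 12% to 16%, consistent with the reported figures.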
Conclusion: One fifth of the time, a reader's two measurements of the same ST-segment elevation differed by more than half a millimeter. When the same reader independently measured the same ST segment twice, the two readings fell on opposite sides of the 2.0-mm threshold 14% of the time. Such variation could result in misclassification of candidates for thrombolytic therapy.