Objectives: To assess the interobserver variability of Apgar scores assigned from video recordings of neonatal resuscitation (AS(video)) and to compare these scores with the Apgar score given by staff attending the delivery (AS(del)).
Study design: Ten-second video clips of 30 newborns, recorded at 5 minutes after birth, were shown to observers. Infants were born at 23 to 40 weeks' gestation, received varying degrees of resuscitation, and were monitored with pulse oximetry. Forty-two observers (neonatal/obstetric medical/nursing staff) scored each infant's respiratory effort, muscle tone, reflex irritability, and color. The value for heart rate was assigned from the oximeter, which was masked in all clips. All 42 AS(video) and the AS(del) were represented graphically for each infant. Interobserver reliability was assessed by use of a variance components model.
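The abstract does not specify the form of the variance components model. As a minimal sketch of the general idea, assuming a fully crossed design in which every observer scores every infant once, the total variance of the scores can be split into an infant component, an observer component, and a residual, and the share attributable to infants can be read as a reliability coefficient. The function name and the small score table below are hypothetical and for illustration only; they are not data from the study.

```python
def variance_components(scores):
    """Two-way crossed decomposition (assumed design, not taken from the study).

    scores[i][j] is the score given to infant i by observer j; the table
    must be complete (no missing cells).
    """
    n = len(scores)        # number of infants (rows)
    k = len(scores[0])     # number of observers (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares for infants, observers, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Method-of-moments estimates, truncated at zero.
    var_infant = max((ms_rows - ms_err) / k, 0.0)
    var_observer = max((ms_cols - ms_err) / n, 0.0)
    var_residual = max(ms_err, 0.0)

    total = var_infant + var_observer + var_residual
    reliability = var_infant / total if total > 0 else float("nan")
    return {
        "infant": var_infant,
        "observer": var_observer,
        "residual": var_residual,
        "reliability": reliability,
    }


# Hypothetical scores: 3 infants (rows) rated by 3 observers (columns).
example = [
    [8, 6, 5],
    [9, 7, 6],
    [4, 2, 1],
]
result = variance_components(example)
```

A reliability near 1 would mean that most of the variation in scores reflects real differences between infants; a low value would mean that differences between observers and unexplained disagreement dominate.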
Results: AS(video) varied widely between observers. Variability was large for all 4 elements of the score that observers assigned and was seen irrespective of the infant's level of illness. AS(del) was greater than AS(video) in most cases, on average by 2.4 points. There was no evidence that the level of discrepancy differed substantially between staff groups.
Conclusion: The Apgar score has poor interobserver reliability. More objective and precise measures of newborns' condition are required.