Interobserver and intraobserver variation in day 3 embryo grading

Fertil Steril. 2006 Dec;86(6):1608-15. doi: 10.1016/j.fertnstert.2006.05.037. Epub 2006 Oct 30.


Objective: Variations in pregnancy rates (PR) between IVF programs are due to multiple factors, including embryo quality. Standardized embryo grading systems have been developed to improve communication between embryologists and clinicians. However, these grading systems have not been validated. We sought to quantify both interobserver and intraobserver variability using a standardized day 3 embryo grading system (Veeck scale).

Design: Prospective, sample-randomized, controlled, blinded study.

Setting: University hospital.

Patient(s): Twenty-six practicing embryologists.

Intervention(s): Observation and grading of 35 video clips of day 3 embryos.

Main outcome measure(s): Interobserver and intraobserver variability. Embryologists were also assessed by education level, years of experience, size of IVF program, and type of grading system used. Kappa scores and intraclass correlation coefficients were calculated.
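As an illustration of the agreement statistic named above, the following is a minimal sketch of unweighted Cohen's kappa for one observer against a reference grader. The abstract does not say which kappa variant was used, so the unweighted form is an assumption, and the function name `cohen_kappa` and the grades are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the identical grade.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement expected from each rater's marginal grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical grades (1 to 5) for eight embryos; NOT data from the study.
reference = [1, 2, 2, 3, 3, 4, 4, 5]
observer = [1, 2, 3, 3, 4, 4, 5, 5]
kappa = cohen_kappa(reference, observer)
print(f"kappa = {kappa:.2f}")
```

With these made-up grades the observer matches the reference on five of eight embryos, and kappa comes out near 0.53 once chance agreement is subtracted.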

Result(s): Practicing embryologists differed from control (Lucinda Veeck) by as much as two grades, despite using the same grading system (Kappa = 0.24, intraclass correlation coefficient = 0.98). There was also variability in grading the same embryo (Kappa = 0.69, intraclass correlation coefficient = 0.88). Programs with higher cycle numbers per year had lower variability.
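A kappa that is low alongside a correlation coefficient that is high can look contradictory. One general reason the two can diverge is that unweighted kappa counts only exact matches, whereas an intraclass correlation treats grades as numbers and gives credit for near misses. The sketch below computes a one-way random-effects ICC(1,1). The abstract does not state which ICC form the authors used, so this choice is an assumption, and the function name `icc_one_way` and the grades are hypothetical, not study data.

```python
def icc_one_way(ratings):
    """One-way random-effects ICC(1,1).

    `ratings` is a list with one row per item; each row holds one
    numeric score per rater.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 items, each scored by the same k >= 2 raters")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-item mean square: how far item means spread around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-item mean square: how far raters disagree on the same item.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when every score is identical")
    return (ms_between - ms_within) / denominator


# Hypothetical grades (1 to 5) for eight embryos; NOT data from the study.
# Each pair is (reference grade, observer grade).
pairs = [[1, 1], [2, 2], [2, 3], [3, 3], [3, 4], [4, 4], [4, 5], [5, 5]]
icc = icc_one_way(pairs)
print(f"ICC(1,1) = {icc:.2f}")
```

In this made-up example the observer disagrees on three of eight embryos, but never by more than one grade, and the ICC is about 0.90. This only illustrates how the two statistics behave; it is not a reconstruction of the study's analysis.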

Conclusion(s): There is substantial interobserver variability and moderate intraobserver variability among embryologists. Such variability could alter both the expected quality of embryos transferred and the number transferred, each of which directly affects IVF program success.

MeSH terms

  • Embryo Transfer / standards
  • Embryo, Mammalian / anatomy & histology*
  • Embryology / methods*
  • Embryology / standards
  • Fertilization in Vitro*
  • Humans
  • Observer Variation*
  • Quality Assurance, Health Care / methods*
  • Quality Assurance, Health Care / standards
  • Reproducibility of Results
  • Sensitivity and Specificity
  • United States