Background: Although the Pap test has been the standard screening method for detecting cervical precancer and cancer, it has been criticized for relatively low sensitivity and low reproducibility between pathologists. Little is known about inter-rater agreement, or about which clinical and demographic factors are associated with disagreement between pathologists reading the same Pap smear.
Methods: This study assessed inter- and intra-rater agreement of the Pap smear in 1619 cytologic slides with biopsy confirmation, using kappa statistics. Clinical and demographic factors associated with higher odds of inter-rater agreement were also examined, stratified by histologic diagnosis grade.
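The kappa statistics used here measure agreement beyond chance between two raters. As a minimal illustration (not the authors' analysis code), the sketch below computes Cohen's kappa from two raters' ordinal labels, with an optional linear-weighted variant; the abstract does not state which weighting scheme the study used, so linear weights are an assumption for illustration only.

```python
from collections import Counter

def cohens_kappa(r1, r2, weights=None):
    """Cohen's kappa for two raters over ordinal integer categories.

    weights=None gives the unweighted statistic; weights='linear' uses
    |i - j| disagreement weights (an illustrative choice, not necessarily
    the scheme used in the study).
    """
    assert len(r1) == len(r2)
    n = len(r1)
    cats = sorted(set(r1) | set(r2))
    p1 = Counter(r1)            # rater 1 marginal counts
    p2 = Counter(r2)            # rater 2 marginal counts
    obs = Counter(zip(r1, r2))  # joint (rater1, rater2) counts

    def w(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0  # all-or-nothing disagreement
        return abs(i - j)                  # linear disagreement weight

    # Observed vs chance-expected weighted disagreement
    d_obs = sum(w(i, j) * obs[(i, j)] / n for i in cats for j in cats)
    d_exp = sum(w(i, j) * p1[i] * p2[j] / n**2 for i in cats for j in cats)
    return 1.0 - d_obs / d_exp

# Perfect agreement yields kappa = 1; chance-level agreement yields 0.
print(cohens_kappa([0, 1, 2, 1], [0, 1, 2, 1]))  # 1.0
print(cohens_kappa([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

Values near 1 indicate near-perfect agreement; the 0.40 to 0.64 range reported below corresponds to moderate-to-substantial agreement on conventional benchmarks.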
Results: Using a five-grade classification system, the overall kappa statistics for the total, inter-rater, and intra-rater samples were 0.62, 0.57, and 0.88 (unweighted) and 0.83, 0.81, and 0.95 (weighted), respectively. In analyses stratified by histologic grade, total kappas ranged from 0.40 (atypia) to 0.64 (human papillomavirus/CIN 1). Referral for an abnormal Pap test (diagnostic vs screening population), recruiting site, and parity were associated with higher agreement between the two cytologic readings.
Conclusions: We observed higher levels of agreement than in other studies. However, variability was considerable and agreement was generally moderate, suggesting that cervical screening test accuracy and reproducibility need to be improved.
Keywords: IRR; Pap; cytologic diagnosis; inter-rater reliability; kappa.
© 2019 Wiley Periodicals, Inc.