Inter-rater reliability of sleep cyclic alternating pattern (CAP) scoring and validation of a new computer-assisted CAP scoring method

Clin Neurophysiol. 2005 Mar;116(3):696-707. doi: 10.1016/j.clinph.2004.09.021. Epub 2004 Nov 10.


Objective: To assess inter-rater reliability between scorers from different qualified sleep research groups in the visual scoring of the Cyclic Alternating Pattern (CAP), to evaluate the performance of a new tool for the computer-assisted detection of CAP, and to compare its output with the data from the different scorers.

Methods: CAP was scored in 11 normal sleep recordings by four different raters from three sleep laboratories. CAP was also scored in the same recordings by means of a new computer-assisted method, implemented in the Hypnolab 1.2 (SWS Soft, Italy) software. Data analysis was performed according to the following steps: (a) the inter-rater reliability of CAP parameters between the four scorers was assessed by means of the Kendall W coefficient of concordance; (b) the agreement between the results of the visual and computer-assisted analyses of CAP parameters was also assessed by means of the Kendall W coefficient; (c) a 'consensus' scoring was obtained for each recording from the four scorings provided by the raters, based on the score of the majority of scorers; (d) the degree of agreement between each scorer and the consensus score, and between the computer-assisted analysis and the consensus score, was quantified by means of Cohen's kappa coefficient; (e) the differences between the numbers of false positive and false negative detections obtained in the visual and in the computer-assisted analyses were evaluated by means of the non-parametric Wilcoxon test.
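
Steps (a) to (d) above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the data layout (one list of values or labels per rater), the absence of a tie correction in Kendall's W, and the `tie_label` used when the four scorers split evenly are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def rank_with_ties(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendall_w(ratings):
    """Kendall's W for m raters (rows) and n recordings (columns).

    Assumption: no correction for ties is applied.
    """
    m, n = len(ratings), len(ratings[0])
    ranked = [rank_with_ties(row) for row in ratings]
    totals = [sum(row[j] for row in ranked) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def majority_consensus(scorings, tie_label=None):
    """Label chosen by the most raters for each scored unit.

    Assumption: an even split yields tie_label; the abstract does
    not say how such cases were resolved.
    """
    consensus = []
    for votes in zip(*scorings):
        counts = Counter(votes).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append(tie_label)
        else:
            consensus.append(counts[0][0])
    return consensus


def cohen_kappa(a, b):
    """Cohen's kappa between two label sequences of equal length.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)
```

In this sketch, `kendall_w` would receive one row per rater holding a CAP parameter (for example CAP rate) for each recording, while `majority_consensus` and `cohen_kappa` would operate on per-unit labels such as hypothetical A phase subtype codes.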

Results: The inter-rater reliability of CAP parameters between the four scorers, quantified by the Kendall W coefficient of concordance, was high for all the parameters considered and showed values above 0.9 for total CAP time, CAP time in sleep stage 2 and percentage of A phases in sequence; CAP rate also showed a high value (0.829). The most important global parameters of CAP, including total CAP rate and CAP time, scored by the computer-assisted analysis showed a significant concordance with those obtained by the raters. The agreement between the computer-assisted analysis and the consensus scoring for the assignment of the CAP A phase subtype was not distinguishable from that expected from a human scorer. However, the computer-assisted analysis produced significantly more false positives and false negatives than the visual scoring of CAP.
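
The false positive and false negative counts mentioned above can be sketched as follows. This is an illustrative assumption about the counting, not the procedure used in the study: each scoring is represented as a hypothetical sequence of per-unit flags (truthy where an A phase was detected), and the consensus scoring is treated as the reference.

```python
def false_detections(scoring, consensus):
    """Count false positives and false negatives against a reference.

    Assumption: both inputs are equal-length sequences of per-unit
    flags, truthy where a detection is present.
    """
    false_pos = sum(1 for s, c in zip(scoring, consensus) if s and not c)
    false_neg = sum(1 for s, c in zip(scoring, consensus) if c and not s)
    return false_pos, false_neg
```

Computed per recording for each visual scorer and for the computer-assisted analysis, such counts would be the paired values compared with the Wilcoxon test in step (e).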

Conclusions: CAP scoring shows good inter-rater reliability, so results from different laboratories might be compared and also pooled together; however, caution should always be taken because of the variability which can be expected in the classical sleep staging. The computer-assisted detection of CAP can be used with some supervision and correction in large studies when only general parameters such as CAP rate are considered; more editing is necessary for the correct use of the other results.

Significance: This article describes the first attempt in the literature to evaluate in detail the inter-rater reliability of scoring CAP parameters in normal sleep and the performance of a human-supervised computerized automatic detection system.

Publication types

  • Comparative Study

MeSH terms

  • Adult
  • Brain
  • Electroencephalography*
  • Electronic Data Processing*
  • Electrooculography / methods
  • Female
  • Functional Laterality / physiology
  • Humans
  • Male
  • Polysomnography*
  • Reproducibility of Results
  • Sleep / physiology*
  • Wakefulness / physiology