Eye-tracking experiments rely heavily on good data quality of eye-trackers. Unfortunately, it is often the case that only the spatial accuracy and precision values are available from the manufacturers. These two values alone are not sufficient to serve as a benchmark for an eye-tracker: Eye-tracking quality deteriorates during an experimental session due to head movements, changing illumination or calibration decay. Additionally, different experimental paradigms require the analysis of different types of eye movements; for instance, smooth pursuit movements, blinks or microsaccades, which themselves cannot readily be evaluated by using spatial accuracy or precision alone. To obtain a more comprehensive description of properties, we developed an extensive eye-tracking test battery. In 10 different tasks, we evaluated eye-tracking related measures such as: the decay of accuracy, fixation durations, pupil dilation, smooth pursuit movement, microsaccade classification, blink classification, or the influence of head motion. For some measures, true theoretical values exist. For others, a relative comparison to a reference eye-tracker is needed. Therefore, we collected our gaze data simultaneously from a remote EyeLink 1000 eye-tracker as the reference and compared it with the mobile Pupil Labs glasses. As expected, the average spatial accuracy of 0.57° for the EyeLink 1000 eye-tracker was better than the 0.82° for the Pupil Labs glasses (N = 15). Furthermore, we classified less fixations and shorter saccade durations for the Pupil Labs glasses. Similarly, we found fewer microsaccades using the Pupil Labs glasses. The accuracy over time decayed only slightly for the EyeLink 1000, but strongly for the Pupil Labs glasses. Finally, we observed that the measured pupil diameters differed between eye-trackers on the individual subject level but not on the group level. To conclude, our eye-tracking test battery offers 10 tasks that allow us to benchmark the many parameters of interest in stereotypical eye-tracking situations and addresses a common source of confounds in measurement errors (e.g., yaw and roll head movements). All recorded eye-tracking data (including Pupil Labs' eye videos), the stimulus code for the test battery, and the modular analysis pipeline are freely available (https://github.com/behinger/etcomp).
Keywords: Accuracy and precision; Blinks; Calibration decay; Eye-tracker benchmark; EyeLink 1000; Head movements; Microsaccades; Pupil Labs glasses; Pupil dilation; Smooth pursuit.