In sport science, athlete tracking and motion analysis are essential for monitoring and optimizing training programs, with the goal of increasing success in competition and preventing injury. Contact-free, camera-based, multi-athlete detection and tracking have now become a reality, mainly due to advances in machine learning for computer vision and, specifically, in convolutional neural networks (CNNs) used for human pose estimation (HPE-CNN) in image sequences. Sport science in general, and coaches and athletes in particular, would greatly benefit from HPE-CNN-based tracking, but the sheer number of available HPE-CNNs, as well as their complexity, poses a hurdle to the adoption of this new technology. It is unclear how many of the currently available HPE-CNNs are ready for out-of-the-box inference on squash videos, to what extent they allow motion analysis, and whether their detections can easily be used to provide insight to coaches and athletes. Therefore, we conducted a systematic investigation of more than 250 HPE-CNNs. After applying our selection criteria (open-source, pre-trained, state-of-the-art and ready-to-use), five variants of three HPE-CNNs remained and were evaluated in the context of motion analysis for the racket sport of squash. Specifically, we are interested in detecting players' feet in videos from a single camera, and we investigated the detection accuracy of all selected HPE-CNNs. To that end, we created a ground-truth dataset from publicly available squash videos by developing our own annotation tool and manually labeling frames and events. We present heatmaps, which depict the court floor using a color scale and highlight areas according to the relative time for which a player occupied that location during matchplay. These are used to provide insight into detections.
Finally, we created a decision flow chart to help sport scientists, coaches and athletes decide which HPE-CNN is best suited for player detection and tracking in a given application scenario.
Keywords: human pose estimation; racket sports; sports analysis; video tracking.
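
To illustrate the kind of heatmap described above, the following is a minimal sketch rather than the implementation used in this work. It assumes that foot detections have already been projected from image pixels onto court-floor coordinates in metres, that one position is sampled per video frame at a constant frame rate, and that the court measures 6.4 m by 9.75 m (standard singles dimensions). The function name `occupancy_heatmap` and the `cell_m` parameter are hypothetical.

```python
import numpy as np

# Assumed standard singles squash court floor dimensions, in metres.
COURT_WIDTH_M = 6.4
COURT_LENGTH_M = 9.75


def occupancy_heatmap(foot_xy, cell_m=0.25):
    """Return a grid holding the fraction of frames spent in each floor cell.

    foot_xy: iterable of (x, y) floor positions in metres, one per frame,
             with x across the court width and y along the court length.
    cell_m:  approximate edge length of a grid cell in metres.
    """
    foot_xy = np.asarray(foot_xy, dtype=float).reshape(-1, 2)
    n_x = int(np.ceil(COURT_WIDTH_M / cell_m))
    n_y = int(np.ceil(COURT_LENGTH_M / cell_m))
    # Rows index the court length (y), columns index the court width (x).
    # Positions outside the court rectangle are ignored by histogram2d.
    counts, _, _ = np.histogram2d(
        foot_xy[:, 1],
        foot_xy[:, 0],
        bins=[n_y, n_x],
        range=[[0.0, COURT_LENGTH_M], [0.0, COURT_WIDTH_M]],
    )
    total = counts.sum()
    # With a constant frame rate, frame counts are proportional to time,
    # so normalising by the total gives relative occupancy time per cell.
    return counts / total if total > 0 else counts
```

The resulting array can be rendered with any color scale, for example with an image plot, so that frequently occupied areas of the court stand out.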