Sequential Order-Aware Coding-Based Robust Subspace Clustering for Human Action Recognition in Untrimmed Videos

IEEE Trans Image Process. 2023:32:13-28. doi: 10.1109/TIP.2022.3224877. Epub 2022 Dec 20.

Abstract

Human action recognition (HAR) is one of most important tasks in video analysis. Since video clips distributed on networks are usually untrimmed, it is required to accurately segment a given untrimmed video into a set of action segments for HAR. As an unsupervised temporal segmentation technology, subspace clustering learns the codes from each video to construct an affinity graph, and then cuts the affinity graph to cluster the video into a set of action segments. However, most of the existing subspace clustering schemes not only ignore the sequential information of frames in code learning, but also the negative effects of noises when cutting the affinity graph, which lead to inferior performance. To address these issues, we propose a sequential order-aware coding-based robust subspace clustering (SOAC-RSC) scheme for HAR. By feeding the motion features of video frames into multi-layer neural networks, two expressive code matrices are learned in a sequential order-aware manner from unconstrained and constrained videos, respectively, to construct the corresponding affinity graphs. Then, with the consideration of the existence of noise effects, a simple yet robust cutting algorithm is proposed to cut the constructed affinity graphs to accurately obtain the action segments for HAR. The extensive experiments demonstrate the proposed SOAC-RSC scheme achieves the state-of-the-art performance on the datasets of Keck Gesture and Weizmann, and provides competitive performance on the other 6 public datasets such as UCF101 and URADL for HAR task, compared to the recent related approaches.