Getting a good feature representation of data is paramount for Human Activity Recognition (HAR) using wearable sensors. An increasing number of feature learning approaches-in particular deep-learning based-have been proposed to extract an effective feature representation by analyzing large amounts of data. However, getting an objective interpretation of their performances faces two problems: the lack of a baseline evaluation setup, which makes a strict comparison between them impossible, and the insufficiency of implementation details, which can hinder their use. In this paper, we attempt to address both issues: we firstly propose an evaluation framework allowing a rigorous comparison of features extracted by different methods, and use it to carry out extensive experiments with state-of-the-art feature learning approaches. We then provide all the codes and implementation details to make both the reproduction of the results reported in this paper and the re-use of our framework easier for other researchers. Our studies carried out on the OPPORTUNITY and UniMiB-SHAR datasets highlight the effectiveness of hybrid deep-learning architectures involving convolutional and Long-Short-Term-Memory (LSTM) to obtain features characterising both short- and long-term time dependencies in the data.
Keywords: deep neural networks; evaluation framework; feature learning; human activity recognition; multimodal time series processing.