Automatic machine-learning based identification of jogging periods from accelerometer measurements of adolescents under field conditions

PLoS One. 2017 Sep 7;12(9):e0184216. doi: 10.1371/journal.pone.0184216. eCollection 2017.

Abstract

Background: Assessment of the health benefits associated with physical activity depends on the duration, intensity and frequency of the activity, so identifying these correctly is important in epidemiological and clinical studies. The aims of this study are to develop an algorithm for automatic identification of intended jogging periods, and to assess whether identification performance improves when using two accelerometers, at the hip and ankle, compared with using only one at either position.

Methods: The study used diary-recorded jogging periods and the corresponding accelerometer data from thirty-nine 15-year-old adolescents, collected under field conditions as part of the GINIplus study. The data were obtained from two accelerometers placed at the hip and ankle. An automated feature engineering technique was applied to extract features from the raw accelerometer readings and to select a subset of the most significant features. Four machine learning algorithms were used for classification: Logistic Regression, Support Vector Machines, Random Forest and Extremely Randomized Trees. Classification was performed using only data from the hip accelerometer, using only data from the ankle accelerometer, and using data from both accelerometers.
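As an illustration of the segmentation and feature extraction step, the following is a minimal sketch that cuts a one-dimensional accelerometer signal into sliding windows and computes a few summary statistics per window. The function names, the step size and the three features (mean, standard deviation, range) are assumptions chosen for illustration; the abstract does not specify the actual feature set or the tool used for automated feature engineering.

```python
from statistics import mean, pstdev


def sliding_windows(samples, window_len, step):
    """Yield (start_index, window) pairs over a 1-D sequence of readings."""
    for start in range(0, len(samples) - window_len + 1, step):
        yield start, samples[start:start + window_len]


def window_features(window):
    """Compute a few simple summary features (hypothetical feature set)."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "range": max(window) - min(window),
    }


def extract_features(samples, window_len, step):
    """Return one feature dictionary per sliding window."""
    return [
        window_features(window)
        for _, window in sliding_windows(samples, window_len, step)
    ]
```

Each feature dictionary would then form one input row for a classifier. Features from the hip and ankle signals could be used separately or concatenated to compare the three sensor configurations.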

Results: The reported jogging periods were verified by visual inspection and used as the gold standard. After feature selection and tuning of the classification algorithms, all options provided a classification accuracy of at least 0.99, independent of the applied segmentation strategy with sliding windows of either 60 s or 180 s. The best matching ratio, i.e. the length of correctly identified jogging periods relative to the total time including the missed periods, was 0.875. It could be further improved to as much as 0.967 by applying post-classification rules that considered the duration of breaks and jogging periods. There was no obvious benefit of using two accelerometers; almost the same performance could be achieved from either accelerometer position.
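The matching ratio and the post-classification rules can be illustrated with a short sketch over per-window labels (1 for jogging, 0 otherwise). It rests on several assumptions: the matching ratio is read here as correctly identified jogging windows divided by all reference jogging windows, the two rules (fill short breaks, then drop short jogging runs) are one plausible form of duration-based rules, and the thresholds `max_break` and `min_run` are hypothetical parameters not given in the abstract.

```python
from itertools import groupby


def runs(labels):
    """Return (value, start, length) tuples for consecutive equal labels."""
    out, pos = [], 0
    for value, group in groupby(labels):
        length = len(list(group))
        out.append((value, pos, length))
        pos += length
    return out


def fill_short_breaks(labels, max_break):
    """Relabel breaks of at most max_break windows between two jogging runs."""
    result = list(labels)
    all_runs = runs(labels)
    for i, (value, start, length) in enumerate(all_runs):
        interior = 0 < i < len(all_runs) - 1  # has a jogging run on each side
        if value == 0 and interior and length <= max_break:
            result[start:start + length] = [1] * length
    return result


def drop_short_runs(labels, min_run):
    """Relabel jogging runs shorter than min_run windows as non-jogging."""
    result = list(labels)
    for value, start, length in runs(labels):
        if value == 1 and length < min_run:
            result[start:start + length] = [0] * length
    return result


def smooth(labels, max_break, min_run):
    """Apply both duration rules: fill short breaks, then drop short runs."""
    return drop_short_runs(fill_short_breaks(labels, max_break), min_run)


def matching_ratio(predicted, reference):
    """Correctly identified jogging windows over all reference jogging windows."""
    correct = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 1)
    total = sum(reference)
    return correct / total if total else 0.0
```

Filling breaks before dropping short runs is a design choice in this sketch: it lets two jogging fragments separated by a brief pause merge into one period before the minimum-duration rule is applied.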

Conclusions: Machine learning techniques can be used for automatic activity recognition; they identify activity very accurately, and significantly more accurately than a diary. Identification of jogging periods in adolescents can be performed using only one accelerometer. In terms of performance, there is no significant benefit from using accelerometers at both locations.

MeSH terms

  • Accelerometry / instrumentation*
  • Adolescent
  • Algorithms
  • Automation
  • Databases as Topic
  • Humans
  • Jogging / physiology*
  • Machine Learning*
  • Models, Theoretical

Grants and funding

This study was part of the 15-year follow-up of the GINIplus cohort (German Infant Study on the influence of Nutrition Intervention PLUS environmental and genetic influences on allergy development). This study was partially financed by the Faculty of Computer Science and Engineering at the Ss. Cyril and Methodius University in Skopje, Macedonia. This study was partially supported by the European Commission, as part of the ERAWEB project (ERASMUS–WESTERN BALKANS), financed by the Erasmus Mundus Action 2 Partnerships. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.