Sensor networks in real-world environments, such as smart cities or ambient intelligence platforms, provide applications with large and heterogeneous sets of data streams. Detecting outliers (observations that do not conform to expected behavior) has therefore become a crucial task for establishing and maintaining secure and reliable databases on this kind of platform. However, the procedures used to obtain accurate models of erratic observations must operate with low complexity in terms of storage and computational time, in order to accommodate the limited processing and storage capabilities of the sensor nodes in these environments. In this work, we analyze three binary classifiers based on three statistical prediction models, namely ARIMA (Auto-Regressive Integrated Moving Average), GAM (Generalized Additive Model), and LOESS (LOcal RegrESSion), for outlier detection with low memory consumption and computational time. As a result, we provide (1) the best classifier and settings to detect outliers, based on the ARIMA model, and (2) two real-world classified datasets as ground truths for future research.
Keywords: Ambient Intelligence platform; abnormal data; binary classifier; outlier detection; prediction model; sensor.