QUADRIVEN: A Framework for Qualitative Taxi Demand Prediction Based on Time-Variant Online Social Network Data Analysis

Sensors (Basel). 2019 Nov 8;19(22):4882. doi: 10.3390/s19224882.

Abstract

Road traffic pollution is one of the key factors affecting urban air quality. There is a consensus in the community that the efficient use of public transport is the most effective solution. In that sense, much effort has been made in the data mining discipline to come up with solutions able to anticipate taxi demands in a city. This helps to optimize the trips made by such an important urban means of transport. However, most of the existing solutions in the literature define the taxi demand prediction as a regression problem based on historical taxi records. This causes serious limitations with respect to the required data to operate and the interpretability of the prediction outcome. In this paper, we introduce QUADRIVEN (QUalitative tAxi Demand pRediction based on tIme-Variant onlinE social Network data analysis), a novel approach to deal with the taxi demand prediction problem based on human-generated data widely available on online social networks. The result of the prediction is defined on the basis of categorical labels that allow obtaining a semantically-enriched output. Finally, this proposal was tested with different models in a large urban area, showing quite promising results with an F1 score above 0.8.

Keywords: air pollution; machine learning; online social networks; smart cities; social media analysis; taxi demand.