31/03/2017
The analysis of data streams offers a great opportunity for the development of new methodologies and applications in the area of Intelligent Transportation Systems. In this paper, we propose a new incremental learning approach to the travel time prediction problem for taxi GPS data streams in different scenarios and compare it with four existing methods. An extensive performance evaluation using
four real-life datasets indicates that, when the drop-off location is known and the training data size is small to moderate, the Support Vector Regression (SVR) method is the best choice considering both prediction accuracy and total computation time. However, when the training data size becomes large, the Randomized K-Nearest Neighbor Regression with Spherical Distance becomes the method of choice. When
the drop-off location is unknown, the Support Vector Regression method remains the best choice for small to moderate training data sizes, while for large training data sizes the Linear Regression method is a good choice. Finally, when continuous prediction of remaining travel time and continuous updating of total travel time along the trajectory of a trip are considered, we find that the Support Vector
Regression method has the best predictive accuracy. We also propose a new hybrid method that improves the prediction accuracy of the
SVR method in the later part of a trip.