Predicting the Next State of Traffic by Data Mining Classification Techniques


1 Department of Mathematical and Computer Science, Amirkabir University of Technology, Tehran, Iran.

2 Department of Computer Engineering, Isfahan University of Technology, Isfahan, Iran.

3 Young Researchers and Elite Club, Central Tehran Branch, Islamic Azad University, Tehran, Iran.


Traffic prediction systems can play an essential role in intelligent transportation systems (ITS). Predicting traffic characteristic parameters such as average speed, flow, and travel time, and making their patterns comprehensible, could benefit both advanced traveler information systems (ATIS) and ITS traffic control systems. However, because these parameters follow complex nonlinear patterns, building such prediction systems is difficult. In this paper, we apply several supervised data mining techniques (i.e., classification tree, random forest, naïve Bayes, and CN2) to predict the next state of traffic, expressed as a categorical traffic variable (level of service, LOS), over different short time intervals, and to produce simple, easy-to-handle if-then rules that reveal road facility characteristics. The analytical results show a prediction accuracy of 80% on average by using methods
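As a rough illustration of the prediction task described above, the following sketch frames next-state LOS prediction as supervised classification. It is not the authors' implementation: the lag-based features, the Laplace smoothing constant `alpha`, the helper names (`make_lagged_samples`, `CategoricalNaiveBayes`, `holdout_accuracy`), and the synthetic LOS series are all assumptions made for the example, and a small hand-written naïve Bayes classifier stands in for the four techniques named in the abstract.

```python
import math
from collections import Counter, defaultdict

LOS_LEVELS = ["A", "B", "C", "D", "E", "F"]  # level-of-service grades


def make_lagged_samples(los_sequence, n_lags=2):
    """Pair each run of n_lags consecutive LOS states with the state that follows."""
    samples = []
    for t in range(n_lags, len(los_sequence)):
        samples.append((tuple(los_sequence[t - n_lags:t]), los_sequence[t]))
    return samples


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with Laplace smoothing (assumed alpha)."""

    def __init__(self, levels=LOS_LEVELS, alpha=1.0):
        self.levels = list(levels)
        self.alpha = alpha
        self.class_counts = Counter()
        # (feature position, class label) -> counts of observed feature values
        self.feature_counts = defaultdict(Counter)

    def fit(self, samples):
        for features, label in samples:
            self.class_counts[label] += 1
            for pos, value in enumerate(features):
                self.feature_counts[(pos, label)][value] += 1
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())
        k = len(self.levels)
        best_label, best_score = None, float("-inf")
        for label in self.levels:
            n_label = self.class_counts[label]
            # Smoothed log prior of the candidate next state.
            score = math.log((n_label + self.alpha) / (total + self.alpha * k))
            # Add the smoothed log likelihood of each lagged state.
            for pos, value in enumerate(features):
                seen = self.feature_counts[(pos, label)][value]
                score += math.log((seen + self.alpha) / (n_label + self.alpha * k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def holdout_accuracy(los_sequence, n_lags=2, train_fraction=0.7):
    """Train on the earlier part of the series and score on the later part."""
    samples = make_lagged_samples(los_sequence, n_lags)
    split = int(len(samples) * train_fraction)
    model = CategoricalNaiveBayes().fit(samples[:split])
    held_out = samples[split:]
    hits = sum(model.predict(x) == y for x, y in held_out)
    return hits / len(held_out)


# Made-up LOS series for illustration only; this is not data from the paper.
synthetic_los = list("AABBCCDDEEFFEEDDCCBB") * 6
print(f"hold-out accuracy: {holdout_accuracy(synthetic_los):.2f}")
```

The chronological split keeps later observations out of training, which seems the natural choice when the goal is to predict a future state; the number of lags and the interval length would need to be chosen to match the short-time intervals studied.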


International Journal of Smart Electrical Engineering, Vol.1, No.3, Fall 2012 ISSN: 2251-9246