Airline delay prediction by machine learning algorithms

Document Type: Article

Authors

1 Department of Transportation Engineering and Planning, School of Civil Engineering, Iran University of Science & Technology, Tehran, Iran

2 Department of Transportation Engineering and Planning, School of Civil Engineering, Iran University of Science & Technology, Tehran, Iran

Abstract

Flight planning, one of the challenging issues in the industrial world, faces many uncertain conditions. One such condition is delay occurrence, which stems from various factors and imposes considerable costs on airlines, operators, and travelers. With these considerations in mind, we implemented flight delay prediction through proposed approaches based on machine learning algorithms. Parameters that enable the effective estimation of delay were identified, after which Bayesian modeling, decision tree, cluster classification, random forest, and a hybrid method were applied to estimate the occurrence and magnitude of delay in a network. These methods were tested on a U.S. flight dataset and then refined for a large Iranian airline network. Results showed that the parameters affecting delay in the U.S. network are visibility, wind, and departure time, whereas those affecting delay in Iranian airline flights are fleet age and aircraft type. The proposed approaches exhibited an accuracy of more than 70% in estimating delay occurrence and magnitude for both the whole U.S. network and the Iranian network. It is hoped that the techniques put forward in this work will enable airline companies to accurately predict delays, improve flight planning, and prevent delay propagation.
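To make the idea of classifying delay occurrence concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian naive Bayes classifier as a stand-in for the "Bayesian modeling" mentioned above (the abstract does not specify the model), and it uses the three U.S. factors named in the abstract (visibility, wind, departure time) as features. The function names, units, class labels, and all data values are hypothetical and synthetic, chosen only for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one flight."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Synthetic rows (assumed units): visibility in km, wind in km/h, departure hour.
train_rows = [
    (10.0, 8.0, 7.0), (9.0, 12.0, 9.0), (10.0, 5.0, 11.0), (8.0, 10.0, 8.0),
    (2.0, 35.0, 18.0), (3.0, 40.0, 20.0), (1.5, 30.0, 19.0), (4.0, 45.0, 21.0),
]
train_labels = ["on_time"] * 4 + ["delayed"] * 4

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, (9.5, 9.0, 8.0)))    # clear morning flight
print(predict(model, (2.5, 38.0, 19.0)))  # low-visibility, windy evening flight
```

A real study would replace the synthetic rows with historical flight and weather records and would evaluate accuracy on held-out flights rather than on hand-picked examples.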

