Autonomous Taxi Driving Environment Using Reinforcement Learning Algorithms

Author(s)

Showkat A. Dar 1,*, S. Palanivel 1, M. Kalaiselvi Geetha 1

1. Department of Computer Science and Engineering, Annamalai University, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2022.03.06

Received: 4 Jan. 2022 / Revised: 18 Feb. 2022 / Accepted: 25 Mar. 2022 / Published: 8 Jun. 2022

Index Terms

Environment learning, autonomous driving, reinforcement learning (RL), Q-learning, deep Q network (DQN), state action reward state action (SARSA), convolution deep Q network (ConvDQN), deep learning.

Abstract

Autonomous driving is predicted to alter the transportation industry in the near future. For decades, carmakers, researchers, and administrators have been working in this sector and have made tremendous progress. Nevertheless, many uncertainties and obstacles remain, not only in terms of technology but also in terms of human awareness, culture, and present traffic infrastructure. The technological challenges include precise route identification, avoiding improper locations, time delays, erroneous drop-offs, unsafe paths, and automated navigation in the environment, to name only a few. Owing to the advent of deep representation learning, RL (Reinforcement Learning) has evolved into a robust learning model that can handle complex problems in high-dimensional settings. Environment learning has been shown to reduce time delay, reduce the cost of travel, and improve the performance of the agent by discovering successful drop-offs. The major goal is to ensure that an autonomous vehicle can reach passengers, pick them up, and transport them to their drop-off points as quickly as possible. To perform this task, RL methods such as DQNs (Deep Q Networks), Q-LNs (Q-Learning networks), SARSA (state action reward state action), and ConvDQNs (convolution DQNs) are proposed for driving taxis autonomously. The RL agent's decisions are based on MDPs (Markov Decision Processes). Using these RL approaches, the agent effectively learns the closest, safest, and lowest-cost path, gradually acquiring the capacity to cover larger areas and complete successful drop-offs without incurring negative rewards on the way to the target. This scenario was chosen based on a set of requirements for simulating autonomous vehicles using RL algorithms. ConvDQNs are a combination of CNNs (Convolution Neural Networks) and DQNs, and this combination of procedures gives improved results over the other methods. Results indicate that ConvDQNs control a car navigating the Taxi-v2 environment more successfully than the existing RL methods.
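As a rough illustration of how the tabular methods named in the abstract differ, the following minimal sketch contrasts a single Q-learning update with a single SARSA update. It is not the authors' implementation: the learning rate `ALPHA`, the discount factor `GAMMA`, the function names, the toy state and action indices, and the step reward of -1 are all assumptions chosen for the example.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not taken from the paper)
GAMMA = 0.9  # discount factor (assumed value, not taken from the paper)


def q_learning_update(q, state, action, reward, next_state, n_actions, done):
    """Off-policy update: bootstrap from the best action in next_state."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action, done):
    """On-policy update: bootstrap from the action actually taken next."""
    next_value = 0.0 if done else q[(next_state, next_action)]
    q[(state, action)] += ALPHA * (reward + GAMMA * next_value - q[(state, action)])


# Hypothetical Q-table with two actions per state; unseen entries default to 0.
q = defaultdict(float)
q[(1, 0)] = 2.0
q[(1, 1)] = 5.0

# Same transition shape (state 0 -> state 1, reward -1), two different targets.
q_learning_update(q, state=0, action=0, reward=-1.0, next_state=1,
                  n_actions=2, done=False)   # uses max(2.0, 5.0) = 5.0
sarsa_update(q, state=0, action=1, reward=-1.0, next_state=1,
             next_action=0, done=False)      # uses q[(1, 0)] = 2.0

print(q[(0, 0)], q[(0, 1)])
```

The only difference between the two rules is the bootstrap term: Q-learning uses the greedy value of the next state, while SARSA uses the value of the action the agent actually selects next. DQN-style methods replace the table `q` with a neural network but keep the same kind of target.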

Cite This Paper

Showkat A. Dar, S. Palanivel, M. Kalaiselvi Geetha, "Autonomous Taxi Driving Environment Using Reinforcement Learning Algorithms", International Journal of Modern Education and Computer Science (IJMECS), Vol.14, No.3, pp. 88-102, 2022. DOI:10.5815/ijmecs.2022.03.06
