Traffic&Transportation Journal


Improving Traffic Efficiency in a Road Network by Adopting Decentralised Multi-Agent Reinforcement Learning and Smart Navigation
Hung Tuan Trinh, Sang-Hoon Bae, Quang Duy Tran
Keywords: multi-agent reinforcement learning (MARL), multi-agent advantage actor-critic (MA-A2C), deep reinforcement learning (DRL), deep neural network (DNN), connected and autonomous vehicles (CAVs), traffic signal control


In the future, mixed traffic flow will consist of human-driven vehicles (HDVs) and connected autonomous vehicles (CAVs). Effective traffic management is a global challenge, especially in urban areas with many intersections, and much research has sought to improve the performance of intersection networks. Reinforcement learning (RL) is a recent approach to optimising traffic signals that overcomes disadvantages of traditional methods. In this paper, we propose an integrated approach that combines multi-agent advantage actor-critic (MA-A2C) with smart navigation (SN) to address congestion in a road network under mixed traffic conditions. The A2C algorithm combines the strengths of value-based and policy-based methods and stabilises training by reducing variance. The multi-agent formulation also overcomes the limitations of centralised and independent MARL. In addition, the SN technique reroutes traffic load to alternative paths to avoid congestion at intersections. To evaluate the robustness of our approach, we compare our model against independent A2C (I-A2C) and max pressure (MP). The results show that the proposed approach outperforms both baselines in average waiting time, speed and queue length. The simulation results also suggest that the model is effective when the CAV penetration rate exceeds 20%.
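The variance reduction mentioned above can be illustrated with a minimal sketch of the advantage estimate used in actor-critic methods: the critic's value estimate serves as a baseline that is subtracted from the discounted return before the policy gradient is formed. The function names, the discount factor and the example numbers below are illustrative assumptions, not taken from the paper, and the abstract does not state the reward definition or network details.

```python
from typing import List


def discounted_returns(rewards: List[float], bootstrap_value: float,
                       gamma: float) -> List[float]:
    """Compute R_t = r_t + gamma * R_{t+1}, seeded with the critic's
    value estimate for the state that follows the last reward."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards: List[float], values: List[float],
               bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Advantage A_t = R_t - V(s_t): the critic's estimate acts as a
    baseline, which lowers the variance of the policy-gradient signal."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - v for ret, v in zip(returns, values)]


# Hypothetical three-step rollout for one signal agent.
example = advantages(rewards=[1.0, 0.0, 2.0], values=[1.0, 1.0, 1.0],
                     bootstrap_value=0.0, gamma=0.5)
print(example)
```

In a traffic-signal setting the rewards would typically be negative quantities such as queue length or waiting time, but that choice is an assumption here rather than something the abstract specifies.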


Copyright (c) 2023 Hung Tuan Trinh, Sang-Hoon Bae, Quang Duy Tran

Published by
University of Zagreb, Faculty of Transport and Traffic Sciences