Licheng Wen

I am currently a young researcher in ADLab at Shanghai AI Laboratory. I earned my MSc degree from Zhejiang University in 2022, where I was a member of the APRIL Lab under the guidance of Dr. Yong Liu. Prior to this, I got my bachelor’s degree from Zhejiang University in 2019.

My research centers on addressing complex interaction challenges through closed-loop knowledge-driven autonomous driving. I am passionate about the future of autonomous driving and artificial general intelligence (AGI), and I am thrilled to contribute to these rapidly advancing fields.

Research Interests

The following are my current research interests:

Autonomous Driving
Emobodied AI
Mobile Robots

News

Oct 2, 2024	🎉 Paper Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving is accepted by NeurIPS 2024 ! 📢 We have also developed a closed-loop high-fidelity simulation platform called DriveArena!
Apr 1, 2024	🎉 Paper LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving is aceepted by IV 2024 !
Jan 17, 2024	🎉 Paper DiLu🐴: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models is aceepted by ICLR 2024 !
Jan 2, 2024	🎉 Paper How drivers perform under different scenarios: Ability-related driving style extraction for large-scale dataset is published on Journal of Accident Analysis & Prevention.
Jul 13, 2023	🎉 Paper LimSim: A Long-term Interactive Multi-scenario Traffic Simulator is accepted by 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC) !

Selected Publications

2024

ICLR
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models

Licheng Wen*, Daocheng Fu*, Xin Li*, and 7 more authors

In The Eleventh International Conference on Learning Representations (ICLR), 2024

Abs arXiv Bib Demo Code

Recent advancements in autonomous driving have relied on data-driven approaches, which are widely adopted but face challenges including dataset bias, overfitting, and uninterpretability. Drawing inspiration from the knowledge-driven nature of human driving, we explore the question of how to instill similar capabilities into autonomous driving systems and summarize a paradigm that integrates an interactive environment, a driver agent, as well as a memory component to address this question. Leveraging large language models with emergent abilities, we propose the DiLu framework, which combines a Reasoning and a Reflection module to enable the system to perform decision-making based on common-sense knowledge and evolve continuously. Extensive experiments prove DiLu’s capability to accumulate experience and demonstrate a significant advantage in generalization ability over reinforcement learning-based methods. Moreover, DiLu is able to directly acquire experiences from real-world datasets which highlights its potential to be deployed on practical autonomous driving systems. To the best of our knowledge, we are the first to instill knowledge-driven capability into autonomous driving systems from the perspective of how humans drive.
@inproceedings{wen2023dilu, title = {DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models}, author = {Wen*, Licheng and Fu*, Daocheng and Li*, Xin and Cai, Xinyu and Ma, Tao and Cai, Pinlong and Dou, Min and Shi, Botian and He, Liang and Qiao, Yu}, booktitle = {The Eleventh International Conference on Learning Representations (ICLR)}, year = {2024}, demo = {https://pjlab-adg.github.io/DiLu/}, }
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

Licheng Wen*, Xuemeng Yang*, Daocheng Fu*, and 14 more authors

In ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024

Abs arXiv Bib Code

The development of autonomous driving technology depends on merging perception, decision, and control systems. Traditional strategies have struggled to understand complex driving environments and other road users’ intent. This bottleneck, especially in constructing common sense reasoning and nuanced scene understanding, affects the safe and reliable operations of autonomous vehicles. The introduction of Visual Language Models (VLM) opens up possibilities for fully autonomous driving. This report evaluates the potential of GPT-4V(ision), the latest state-of-the-art VLM, as an autonomous driving agent. The evaluation primarily assesses the model’s ultimate ability to act as a driving agent under varying conditions, while also considering its capacity to understand driving scenes and make decisions. Findings show that GPT-4V outperforms existing systems in scene understanding and causal reasoning. It has the potential in handling unexpected scenarios, understanding intentions, and making informed decisions. However, limitations remain in direction determination, traffic light recognition, vision grounding, and spatial reasoning tasks, highlighting the need for further research.
@inproceedings{wen2024road, title = {On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving}, author = {Wen*, Licheng and Yang*, Xuemeng and Fu*, Daocheng and Wang*, Xiaofeng and Cai, Pinlong and Li, Xin and Ma, Tao and Li, Yingxuan and Xu, Linran and Shang, Dengke and Zhu, Zheng and Sun, Shaoyan and Bai, Yeqi and Cai, Xinyu and Dou, Min and Hu, Shuanglu and Shi, Botian}, booktitle = {ICLR 2024 Workshop on Large Language Model (LLM) Agents}, year = {2024}, }

2023

ITSC
LimSim: A Long-term Interactive Multi-scenario Traffic Simulator

Licheng Wen*, Daocheng Fu*, Song Mao, and 3 more authors

In 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), 2023

Abs arXiv Bib Video Code

With the growing popularity of digital twin and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing. Existing simulation systems suffer from a lack of support for different types of scenarios, and the vehicle models used in these systems are too simplistic. Thus, such systems fail to represent driving styles and multi-vehicle interactions, and struggle to handle corner cases in the dataset. In this paper, we propose LimSim, the Long-term Interactive Multi-scenario traffic Simulator, which aims to provide a long-term continuous simulation capability under the urban road network. LimSim can simulate fine-grained dynamic scenarios and focus on the diverse interactions between multiple vehicles in the traffic flow. This paper provides a detailed introduction to the framework and features of the LimSim, and demonstrates its performance through case studies and experiments. LimSim is now open source on GitHub.
@inproceedings{wen2023limsim, title = {LimSim: A Long-term Interactive Multi-scenario Traffic Simulator}, author = {Wen*, Licheng and Fu*, Daocheng and Mao, Song and Cai, Pinlong and Dou, Min and Li, Yikang}, year = {2023}, booktitle = {2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC)}, video = {https://www.youtube.com/playlist?list=PLNeNtm096CAyYD1JJnkQ4gMaoFSdFLn2y}, }
AAMAS
Bringing Diversity to Autonomous Vehicles: An Interpretable Multi-Vehicle Decision-Making and Planning Framework

Licheng Wen, Pinlong Cai, Daocheng Fu, and 2 more authors

In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Abs arXiv Bib

With the development of autonomous driving, it is becoming increasingly common for autonomous vehicles (AVs) and human-driven vehicles (HVs) to share the same roads. We propose a hierarchical multi-vehicle decision-making and planning framework with several advantages. The framework makes decisions jointly for all vehicles within the traffic flow and reacts promptly to the dynamic environment through a high-frequency planning module. The decision module produces interpretable action sequences that can explicitly communicate self-intentions to the surrounding HVs. We also present the cooperation factor and the trajectory weight set, which bring diversity to autonomous vehicles in traffic at both the social and individual levels.
@inproceedings{10.5555/3545946.3599005, author = {Wen, Licheng and Cai, Pinlong and Fu, Daocheng and Mao, Song and Li, Yikang}, title = {Bringing Diversity to Autonomous Vehicles: An Interpretable Multi-Vehicle Decision-Making and Planning Framework}, year = {2023}, isbn = {9781450394321}, publisher = {International Foundation for Autonomous Agents and Multiagent Systems}, address = {Richland, SC}, booktitle = {Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems}, pages = {2571–2573}, numpages = {3}, keywords = {autonomous driving, monte-carlo tree search, trajectory generation, vehicle flow}, location = {London, United Kingdom}, series = {AAMAS '23}, }

2022

CL-MAPF: Multi-Agent Path Finding for Car-Like robots with kinematic and spatiotemporal constraints

Licheng Wen, Yong Liu, and Hongliang Li

Robotics and Autonomous Systems, 2022

Abs arXiv Bib Video Code

Multi-Agent Path Finding has been widely studied in the past few years due to its broad application in the field of robotics and AI. However, previous solvers rely on several simplifying assumptions. This limits their applicability in numerous real-world domains that adopt nonholonomic car-like agents rather than holonomic ones. In this paper, we give a mathematical formalization of the Multi-Agent Path Finding for Car-Like robots (CL-MAPF) problem. We propose a novel hierarchical search-based solver called Car-Like Conflict-Based Search to address this problem. It applies a body conflict tree to address collisions considering the shapes of the agents. We introduce a new algorithm called Spatiotemporal Hybrid-State A* as the single-agent planner to generate agents’ paths satisfying both kinematic and spatiotemporal constraints. We also present a sequential planning version of our method, sacrificing a small amount of solution quality to achieve a significant reduction in runtime. We compare our method with two baseline algorithms on a dedicated benchmark and validate it in real-world scenarios. The experiment results show that the planning success rate of both baseline algorithms is below 50% for all six scenarios, while our algorithm maintains that of over 98%. It also gives clear evidence that our algorithm scales well to 100 agents in 300 m × 300 m scenario and is able to produce solutions that can be directly applied to Ackermann-steering robots in the real world. The benchmark and source code are released in https://github.com/APRIL-ZJU/CL-CBS. The video of the experiments can be found on YouTube.
@article{10.1016/j.robot.2021.103997, title = {CL-MAPF: Multi-Agent Path Finding for Car-Like robots with kinematic and spatiotemporal constraints}, journal = {Robotics and Autonomous Systems}, volume = {150}, pages = {103997}, year = {2022}, issn = {0921-8890}, doi = {https://doi.org/10.1016/j.robot.2021.103997}, url = {https://www.sciencedirect.com/science/article/pii/S0921889021002530}, author = {Wen, Licheng and Liu, Yong and Li, Hongliang}, keywords = {Multi-agent systems, Path planning, Mobile robots}, video = {https://www.youtube.com/watch?v=KThsX04ABvc}, }

Talks

On October 31st, 2024, I had the honor of presenting a talk titled "Empowering Automated Driving with LLMs: A Knowledge-driven Paradigm" to SAE International as part of their AI Webinar series.