Title: LLM Test Generation
DNr: Berzelius-2025-67
Project Type: LiU Berzelius
Principal Investigator: Qunying Song <qunying.song@ucl.ac.uk>
Affiliation: University College London
Duration: 2025-02-24 – 2025-09-01
Classification: 10205
Keywords:

Abstract

Rigorous and comprehensive testing is essential to validate the functionality and safety of autonomous vehicles (AVs) and autonomous driving systems (ADS) before they are deployed on public roads. Scenario-based testing has become a widely adopted approach for evaluating AVs/ADS under diverse driving conditions in simulation environments such as CARLA and AirSim. However, generating realistic and relevant test scenarios remains a significant challenge: current methods rely heavily on domain experts manually crafting scenarios or on collecting real-world data from test vehicles, both of which are resource-intensive and costly. An emerging approach that has attracted increasing research interest is to leverage generative AI, in particular Large Language Models (LLMs), to automate the creation of test scenarios from existing information sources. By transforming the textual descriptions of real-world disengagement events in AV manufacturers' disengagement reports into executable test scenarios, LLMs (such as OpenAI's GPT models and Meta's Llama models) could offer an efficient and scalable solution for AV testing. This project therefore aims to assess the effectiveness of LLMs in generating executable test scenarios for AVs/ADS from real-world AV disengagement data. The ultimate goal is to strengthen AV/ADS testing and safety, facilitating their deployment in real-world traffic.
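
To illustrate the report-to-scenario step described above, the following is a minimal sketch in Python. It assumes the OpenAI Python client; the model name, prompt wording, example report text, and JSON scenario schema are illustrative placeholders rather than the project's actual pipeline, and a real system would map the structured output onto simulator API calls (e.g., in CARLA) and validate it before execution.

```python
# Sketch: convert a free-text disengagement report into a structured
# test-scenario description using an LLM. All names below (model,
# schema keys, report text) are hypothetical examples.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Paraphrased, hypothetical disengagement-report excerpt.
report = (
    "The safety driver disengaged when the AV failed to yield to a "
    "pedestrian entering a marked crosswalk at dusk in light rain."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; the project also considers Llama models
    response_format={"type": "json_object"},  # request well-formed JSON
    messages=[
        {
            "role": "system",
            "content": (
                "Convert the autonomous-vehicle disengagement report into "
                "a JSON test scenario with keys: weather, time_of_day, "
                "ego_maneuver, actors (list of {type, behavior}), and "
                "expected_ego_behavior."
            ),
        },
        {"role": "user", "content": report},
    ],
)

# Parse the structured scenario; downstream code would translate it
# into simulator commands (actor spawning, weather, triggers).
scenario = json.loads(response.choices[0].message.content)
print(json.dumps(scenario, indent=2))
```

In practice, the generated scenarios would be checked for schema validity and physical plausibility before being instantiated in the simulator, since LLM output cannot be assumed to be executable as-is.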