Abstract
Large Language Models (LLMs) are increasingly applied to automated unit test generation in software engineering. This paper presents a comprehensive empirical study of the effectiveness of LLM-generated tests under program code evolution. Using a mutation-driven framework covering 22,374 program variants and tests generated by 8 different LLMs, the researchers assess how the generated tests respond to semantic-altering and semantic-preserving changes. The study reveals that while LLMs achieve strong baseline performance, with 79% line coverage and 76% branch coverage on the original programs, test quality degrades significantly as the software evolves. This work provides critical insights into the limitations of current LLM-based testing approaches and highlights the need for more robust test generation methods that can adapt to code changes.
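A minimal sketch of the mutation-driven evaluation idea the abstract describes, in Python with hypothetical example functions (the paper's actual subject programs, mutation operators, and framework are not shown here): a good test suite should fail on a semantic-altering mutant, thereby "killing" it, while continuing to pass on a semantic-preserving variant of the same code.

```python
# Hypothetical illustration of mutation-driven test evaluation; not the
# paper's framework, just the underlying concept.

def clamp(x, lo, hi):
    """Original program under test."""
    return max(lo, min(x, hi))

def clamp_semantic_mutant(x, lo, hi):
    """Semantic-ALTERING mutant: min replaced by max, changing behavior."""
    return max(lo, max(x, hi))  # wrong: always returns at least hi

def clamp_refactored(x, lo, hi):
    """Semantic-PRESERVING variant: different structure, same behavior."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

def llm_generated_test(impl):
    """Stand-in for an LLM-generated unit test; True if it passes."""
    try:
        assert impl(5, 0, 10) == 5     # in-range value is unchanged
        assert impl(-3, 0, 10) == 0    # clamped to lower bound
        assert impl(42, 0, 10) == 10   # clamped to upper bound
        return True
    except AssertionError:
        return False

if __name__ == "__main__":
    # A robust suite passes on the original and on semantic-preserving
    # variants, but kills semantic-altering mutants.
    assert llm_generated_test(clamp)                      # baseline: passes
    assert llm_generated_test(clamp_refactored)           # preserved: passes
    assert not llm_generated_test(clamp_semantic_mutant)  # altered: killed
    print("suite kills the semantic mutant and tolerates the refactoring")
```

The study's degradation finding corresponds to the failure modes this sketch probes: tests that break on semantic-preserving refactorings, or that fail to kill semantic-altering mutants, lose value as the code evolves.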