Paper Detail
YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
Tags: cs.CV · Autonomous Driving · CV Trending · Object Detection · Multimodal
Author: Unknown
March 24, 2026
arXiv: 2603.23037v1

Authors: 1
Tag count: 5
Content status: PDF included


Abstract

This paper presents an interpretable object detection framework using Kolmogorov-Arnold networks to enhance trustworthiness in autonomous vehicle perception systems. The approach addresses a critical limitation: the lack of transparency in confidence scores during visually degraded or ambiguous driving scenarios. A Kolmogorov-Arnold network serves as an interpretable post-hoc surrogate model for YOLOv10 detections, using seven geometric and semantic features to assess detection reliability. The additive spline-based architecture enables direct visualization of each feature's contribution, revealing when confidence scores are well supported versus unreliable. Experimental validation on the COCO dataset and University of Bath campus images demonstrates accurate trustworthiness estimation for autonomous driving perception.
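The core idea of the abstract — an additive spline surrogate whose per-feature contributions can be inspected directly — can be sketched in a few lines. The paper's actual seven features, spline parameterization, and training procedure are not given here; the class name, knot grid, and feature list below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical feature set (assumed for illustration; the paper's exact
# seven geometric/semantic features are not reproduced here), e.g.:
# box area, aspect ratio, edge proximity, raw confidence, class prior,
# neighbour overlap, local contrast -- each normalized to [0, 1].
N_FEATURES = 7

class AdditiveSplineSurrogate:
    """Toy KAN-style additive model: trust(x) = sigmoid(sum_i f_i(x_i)),
    where each f_i is a 1-D piecewise-linear spline on a fixed knot grid.
    Because the model is a sum of univariate functions, each feature's
    contribution can be plotted and inspected on its own."""

    def __init__(self, n_features=N_FEATURES, n_knots=8, seed=0):
        rng = np.random.default_rng(seed)
        self.knots = np.linspace(0.0, 1.0, n_knots)            # shared knot grid
        self.values = rng.normal(0.0, 0.1, (n_features, n_knots))  # spline heights (would be learned)

    def feature_contributions(self, x):
        """Per-feature contributions f_i(x_i); their sum is the trust logit."""
        x = np.clip(x, 0.0, 1.0)
        return np.array([np.interp(xi, self.knots, vi)
                         for xi, vi in zip(x, self.values)])

    def trust_score(self, x):
        # Squash the additive logit into (0, 1) as a reliability estimate.
        return 1.0 / (1.0 + np.exp(-self.feature_contributions(x).sum()))

model = AdditiveSplineSurrogate()
x = np.full(N_FEATURES, 0.5)            # one detection's normalized feature vector
contribs = model.feature_contributions(x)
score = model.trust_score(x)
```

A dashboard could render `contribs` as a bar chart per detection, which is the sense in which an additive spline architecture "enables direct visualization of feature contributions."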


PDF Preview
View or download the PDF on arXiv

Categories

cs.CV, cs.AI

Deep Analysis

AI reads the paper in depth and generates an insightful structured summary.