Memory Over Maps: 3D Object Localization Without Reconstruction
cs.CV, CV, Trending, Embodied AI, SLAM, Multimodal
Anonymous
March 21, 2026
arXiv: 2603.20530v1



Abstract

This paper asks whether complete 3D scene reconstruction is necessary for object localization in embodied tasks. The authors propose a map-free pipeline that stores only posed RGB-D keyframes as a lightweight visual memory, with no global 3D representation. At query time, the method retrieves candidate views and re-ranks them with a vision-language model for semantic reasoning. It then builds a sparse, on-demand 3D estimate of the target by backprojecting depth, which localizes the object without expensive reconstruction. The authors report that this approach significantly reduces mapping time and storage overhead and eases scalability limits, while maintaining effective performance on navigation and manipulation tasks.
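The depth backprojection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model, a camera-to-world pose convention, a target region given as a pixel bounding box (for example from a detector or the vision-language model), and a per-axis median to summarize the target position. The names `Keyframe`, `backproject_pixel`, and `localize_target` are hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Keyframe:
    """Hypothetical container for one posed RGB-D keyframe (color image omitted)."""
    depth: List[List[float]]     # depth[v][u] in metres; values <= 0 mark missing readings
    rotation: List[List[float]]  # 3x3 camera-to-world rotation (assumed convention)
    translation: Vec3            # camera origin in world coordinates
    fx: float                    # pinhole intrinsics (assumed camera model)
    fy: float
    cx: float
    cy: float


def backproject_pixel(kf: Keyframe, u: int, v: int) -> Optional[Vec3]:
    """Lift pixel (u, v) to a 3D world point, or return None if depth is missing."""
    z = kf.depth[v][u]
    if z <= 0.0:
        return None
    # Pinhole model: pixel offset from the principal point, scaled by depth.
    cam = ((u - kf.cx) * z / kf.fx, (v - kf.cy) * z / kf.fy, z)
    # Rigid transform into the world frame: p_world = R * p_cam + t.
    w = [
        sum(kf.rotation[i][j] * cam[j] for j in range(3)) + kf.translation[i]
        for i in range(3)
    ]
    return (w[0], w[1], w[2])


def localize_target(kf: Keyframe, box: Tuple[int, int, int, int]) -> Optional[Vec3]:
    """Estimate a target position from the pixels inside box = (u0, v0, u1, v1).

    The upper bounds u1 and v1 are exclusive. The per-axis median is an
    illustrative choice that tolerates some depth outliers.
    """
    u0, v0, u1, v1 = box
    points = []
    for v in range(v0, v1):
        for u in range(u0, u1):
            p = backproject_pixel(kf, u, v)
            if p is not None:
                points.append(p)
    if not points:
        return None
    return (
        median(p[0] for p in points),
        median(p[1] for p in points),
        median(p[2] for p in points),
    )
```

Only the pixels inside the target region of one retrieved keyframe are lifted to 3D, so the cost scales with the size of that region and not with the size of the scene, which is the intuition behind a sparse, on-demand estimate.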



Categories

cs.CV, cs.RO
