Abstract
This paper presents OmniVTA, a world-model-based visuo-tactile manipulation framework for contact-rich robotic manipulation tasks such as wiping and assembly. It also introduces OmniViTac, a large-scale dataset of 21,000+ trajectories spanning 86 tasks and 100+ objects and covering six physics-grounded interaction patterns. The framework integrates four tightly coupled modules, including a self-supervised tactile encoder and a two-stream visuo-tactile world model that predicts contact dynamics. Whereas existing methods treat tactile signals as passive observations, OmniVTA uses them actively to model contact dynamics and to enable explicit closed-loop control.