UniCoD: Generalist Robot Policy Learning Based on Large-Scale Pretraining

cs.AI updates on arXiv.org, October 14, 12:18

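This paper proposes UniCoD, which is pretrained on large-scale instructional manipulation videos so that it learns to model the dynamics of high-dimensional visual features. It is then fine-tuned on data from the robot embodiment to learn a mapping from predictive representations to action tokens. Experiments show that the method outperforms baseline methods on tasks in both simulation environments and the real world.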

arXiv:2510.10642v1 Announce Type: cross

Abstract: Building generalist robot policies that can handle diverse tasks in open-ended environments is a central challenge in robotics. To leverage knowledge from large-scale pretraining, prior work has typically built generalist policies on top of either vision-language understanding models (VLMs) or generative models. However, both the semantic understanding gained from vision-language pretraining and the visual dynamics modeling gained from visual-generation pretraining are crucial for embodied robots. Recent unified models of generation and understanding have demonstrated strong capabilities in both comprehension and generation through large-scale pretraining. We posit that robotic policy learning can likewise benefit from the combined strengths of understanding, planning, and continuous future representation learning. Building on this insight, we introduce UniCoD, which acquires the ability to dynamically model high-dimensional visual features through pretraining on over 1M internet-scale instructional manipulation videos. Subsequently, UniCoD is fine-tuned on data collected from the robot embodiment, enabling it to learn mappings from predictive representations to action tokens. Extensive experiments show that our approach consistently outperforms baseline methods, by 9% and 12% across simulation environments and real-world out-of-distribution tasks, respectively.
