cs.AI updates on arXiv.org · October 8, 12:06
MetaVLA: A New Framework for Improving the Generalization of VLA Models

This article introduces MetaVLA, a unified post-training framework for improving the generalization of Vision-Language-Action (VLA) models. By combining Context-Aware Meta Co-Training with a lightweight meta-learning mechanism, it achieves efficient and scalable alignment. On the LIBERO benchmark, MetaVLA outperforms OpenVLA by up to 8.0% on long-horizon tasks while cutting training steps to 75K and reducing GPU time by roughly 76%.

arXiv:2510.05580v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models show promise in embodied reasoning, yet remain far from true generalists: they often require task-specific fine-tuning and generalize poorly to unseen tasks. We propose MetaVLA, a unified, backbone-agnostic post-training framework for efficient and scalable alignment. MetaVLA introduces Context-Aware Meta Co-Training, which consolidates diverse target tasks into a single fine-tuning stage while leveraging structurally diverse auxiliary tasks to improve in-domain generalization. Unlike naive multi-task SFT, MetaVLA integrates a lightweight meta-learning mechanism, derived from Attentive Neural Processes, to enable rapid adaptation from diverse contexts with minimal architectural change or inference overhead. On the LIBERO benchmark, MetaVLA with six auxiliary tasks outperforms OpenVLA by up to 8.0% on long-horizon tasks, reduces training steps from 240K to 75K, and cuts GPU time by ~76%. These results show that scalable, low-resource post-training is achievable, paving the way toward general-purpose embodied agents. Code will be available.
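The adaptation mechanism the abstract mentions comes from Attentive Neural Processes, in which a target embedding cross-attends over a set of context examples. The PyTorch sketch below shows one minimal way such an aggregator could be wired in; the class name, tensor shapes, and hyperparameters are illustrative assumptions, not MetaVLA's implementation (the paper's code is not yet released).

```python
# Minimal sketch of an Attentive-Neural-Process-style context aggregator,
# the kind of lightweight meta-learning mechanism the abstract describes.
# All names and shapes are illustrative assumptions, not MetaVLA's code.
import torch
import torch.nn as nn

class AttentiveContextAggregator(nn.Module):
    """Cross-attends a target embedding over context-task embeddings so a
    policy can adapt its representation from diverse contexts at low cost."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, target: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # target:  (B, 1, D) embedding of the current task/observation
        # context: (B, K, D) embeddings of K auxiliary/context examples
        attended, _ = self.attn(query=target, key=context, value=context)
        # Residual connection keeps the backbone representation as the
        # default path, adding minimal architectural change.
        return self.norm(target + attended)

# Usage on dummy data
agg = AttentiveContextAggregator(dim=256)
tgt = torch.randn(2, 1, 256)   # batch of 2 target embeddings
ctx = torch.randn(2, 6, 256)   # 6 context examples each (cf. six auxiliary tasks)
out = agg(tgt, ctx)            # (2, 1, 256) adapted embedding
```

The residual path makes the aggregator a small add-on rather than a rewiring of the backbone, which matches the abstract's claim of minimal architectural change and negligible inference overhead.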


Tags: MetaVLA, VLA models, generalization, meta-learning, training efficiency