RAVEN: Improving Weak-to-Strong Generalization

cs.AI updates on arXiv.org, October 27, 14:25

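This paper proposes RAVEN, a framework motivated by the concern that, as future superhuman models grow more complex, humans may no longer be able to supervise their behavior accurately. RAVEN dynamically learns the optimal combination of weak models to achieve robust weak-to-strong generalization. Experiments on image classification, text classification, and preference alignment show that RAVEN outperforms alternative baselines on out-of-distribution tasks while matching or surpassing existing methods on in-distribution tasks.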

arXiv:2510.21332v1 Announce Type: cross Abstract: As future superhuman models become increasingly complex, accurately supervising their behavior may exceed human capabilities. Recent works have demonstrated that in such scenarios, weak models can effectively supervise strong models, a phenomenon known as weak-to-strong generalization. However, we find that naive weak-to-strong generalization fails under distribution shifts, often leading to worse performance of the strong model than its weak supervisors. To address this, we propose RAVEN, a robust weak-to-strong generalization framework that dynamically learns the optimal combinations of weak models in addition to parameters of the strong model. We demonstrate the effectiveness of RAVEN on image classification, text classification, and preference alignment tasks. RAVEN outperforms alternative baselines by over 30% on out-of-distribution tasks while matching or surpassing existing methods on in-distribution tasks. Moreover, our results show that RAVEN assigns higher weights to more accurate weak models, demonstrating its ability to automatically identify trustworthy supervision.
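The abstract describes learning the combination of weak models but does not give the objective, so the following is only a minimal sketch of one way such weights could be learned, not RAVEN's actual algorithm. It assumes softmax-parameterized weights over the weak models' soft labels and a soft-label cross-entropy loss, and it holds the strong model's predictions fixed to isolate the weight update (the abstract says the strong model's parameters are learned as well). All names here (`combine_weak_labels`, `weight_update`, `theta`) are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combine_weak_labels(weak_probs, weights):
    """Convex combination of K weak-model class distributions for one example."""
    num_classes = len(weak_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, weak_probs))
        for c in range(num_classes)
    ]

def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy H(target, pred) between two probability vectors."""
    return -sum(t * math.log(max(q, eps)) for t, q in zip(target, pred))

def weight_update(theta, weak_probs_batch, strong_probs_batch, lr=0.5):
    """One gradient-descent step on the weight logits `theta`.

    Assumed loss (not taken from the paper): batch-mean cross-entropy
    between the weighted weak labels and the strong model's predictions.
    """
    weights = softmax(theta)
    n = len(strong_probs_batch)
    # per_model_loss[k]: mean cross-entropy using only weak model k's labels.
    per_model_loss = [
        sum(cross_entropy(weak_probs_batch[i][k], strong_probs_batch[i])
            for i in range(n)) / n
        for k in range(len(theta))
    ]
    # Cross-entropy is linear in the target, so the loss is a weighted sum.
    loss = sum(w * l for w, l in zip(weights, per_model_loss))
    # With softmax weights, d(loss)/d(theta_k) = w_k * (l_k - loss).
    grad = [w * (l - loss) for w, l in zip(weights, per_model_loss)]
    return [t - lr * g for t, g in zip(theta, grad)], loss

# Toy batch (made-up numbers): 2 examples, 2 weak models, 2 classes.
# weak[i][k] is weak model k's class distribution for example i.
weak = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.2, 0.8], [0.7, 0.3]],
]
strong = [[0.8, 0.2], [0.1, 0.9]]  # fixed strong-model predictions

theta = [0.0, 0.0]  # start from uniform weights
for _ in range(20):
    theta, loss = weight_update(theta, weak, strong)
weights = softmax(theta)
soft_target = combine_weak_labels(weak[0], weights)
```

In this toy setup the weight shifts toward the weak model whose labels agree more closely with the strong model's predictions. Whether RAVEN uses this criterion is not stated in the abstract, which reports only that more accurate weak models end up with higher weights.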

Related tags

RAVEN, weak supervision, strong models, generalization, robustness
