C^2-Eval: A Benchmark for Unified Evaluation of Creativity in Foundation Models

cs.AI updates on arXiv.org, October 7, 12:07

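This paper proposes C^2-Eval, a holistic benchmark for the unified assessment of creativity in foundation models (FMs). It distinguishes convergent creativity from divergent creativity, evaluates Usefulness, Originality, and Surprise using fine-grained criteria derived from social-science theory, and analyzes the strengths and challenges of current FMs in pursuing creative intelligence.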

arXiv:2510.04009v1 (Announce Type: new)

Abstract: The meteoric rise of foundation models (FMs) has expanded their capabilities far beyond conventional tasks. Creativity, long regarded as a hallmark of human intelligence and a driver of innovation, is now increasingly recognized as a critical dimension of machine intelligence in the era of generative FMs, complementing traditional measures of accuracy. However, existing evaluation frameworks for creativity remain fragmented, relying on ad hoc metrics not firmly grounded in established theories. To address this gap, we introduce C^2-Eval, a holistic benchmark for unified assessment of creativity in FMs. C^2-Eval distinguishes between two complementary forms of creativity: convergent creativity, where tasks admit constrained solutions (e.g., code generation), and divergent creativity, where tasks are open-ended (e.g., storytelling). It evaluates both dimensions using fine-grained criteria derived from social-science theory, focusing on Usefulness, Originality, and Surprise (U-O-S). Through extensive experiments on leading proprietary and open-source models, we analyze trade-offs in their creative capabilities. Our results highlight both the strengths and challenges of current FMs in pursuing a creative machine mind, showing that C^2-Eval is an effective lens for examining the evolving landscape of creative AI.

Related tags

Foundation models, creativity evaluation, C^2-Eval, creative AI, creativity benchmark
