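Trending
Articles related to "Model Limitations"
[Programmers] A third-grade math problem stumped the large models. Weren't they supposed to be great at mathematical reasoning?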
V2EX 2025-10-29T05:59:46.000000Z
WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
cs.AI updates on arXiv.org 2025-10-22T04:23:52.000000Z
Visible Yet Unreadable: A Systematic Blind Spot of Vision Language Models Across Writing Systems
cs.AI updates on arXiv.org 2025-09-18T05:09:27.000000Z
LLMs Don't Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
cs.AI updates on arXiv.org 2025-09-12T04:19:13.000000Z
Why Computer Science Is No Good, Redux
Communications of the ACM - Artificial Intelligence 2025-08-05T17:24:32.000000Z
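Terence Tao tests and praises o3-mini: an expert-level proof, "I received a perfect answer"
36Kr - Technology Channel 2025-03-11T04:07:34.000000Z
Is English AI's native language? Scientists find that models' multimodal reasoning relies entirely on it
MIT Technology Review - This Week's Top Stories 2025-02-23T16:16:47.000000Z
Qwen open-sources its visual reasoning model QVQ, for seeing the world more wisely!
ModelScope Community 2024-12-25T13:26:50.000000Z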
Multimodal Situational Safety Benchmark (MSSBench): A Comprehensive Benchmark to Analyze How AI Models Evaluate Safety and Contextual Awareness Across Varied Real-World Situations
MarkTechPost@AI 2024-10-11T19:36:22.000000Z
ReliabilityBench: Measuring the Unpredictable Performance of Shaped-Up Large Language Models Across Five Key Domains of Human Cognition
MarkTechPost@AI 2024-09-28T12:20:50.000000Z
Why were US employment figures revised so sharply?
Futubull (富途牛牛) Headlines 2024-07-28T08:48:53.000000Z
"Which is bigger, 13.11 or 13.8?" Why does this question make large models collectively lose their minds?
Huxiu 2024-07-17T07:06:38.000000Z