Trending
Articles related to "LLM benchmarking"
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology
cs.AI updates on arXiv.org 2025-10-10T04:21:08.000000Z
Import AI 423: Multilingual CLIP; anti-drone tracking; and Huawei kernel design
Import AI 2025-08-04T10:05:57.000000Z
Jim Fan revisits the pitfalls of benchmarking! Hugging Face's open-source suite LightEval opens a new chapter in LLM evaluation
智源社区 (BAAI Community) 2024-10-08T06:09:31.000000Z