Articles related to "Multilingual Evaluation"
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants
cs.AI updates on arXiv.org
2025-10-29T04:27:47.000000Z
VoiceAgentBench: Are Voice Assistants ready for agentic tasks?
cs.AI updates on arXiv.org
2025-10-10T04:07:41.000000Z
CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages
cs.AI updates on arXiv.org
2025-10-01T06:01:24.000000Z
Scaling Truth: The Confidence Paradox in AI Fact-Checking
cs.AI updates on arXiv.org
2025-09-11T15:51:45.000000Z
Multilingual Performance Biases of Large Language Models in Education
cs.AI updates on arXiv.org
2025-08-06T04:38:38.000000Z
Effective cross-lingual LLM evaluation with Amazon Bedrock
AWS Machine Learning Blog
2025-07-08T15:49:18.000000Z
Weekly AI Paper Digest (250421-250425)
Juejin Artificial Intelligence
2025-04-27T11:07:55.000000Z