Multilingual Evaluation_Fishai

Articles related to "Multilingual Evaluation"

Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants

cs.AI updates on arXiv.org 2025-10-29T04:27:47.000000Z

VoiceAgentBench: Are Voice Assistants ready for agentic tasks?

cs.AI updates on arXiv.org 2025-10-10T04:07:41.000000Z

CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages

cs.AI updates on arXiv.org 2025-10-01T06:01:24.000000Z

Scaling Truth: The Confidence Paradox in AI Fact-Checking

cs.AI updates on arXiv.org 2025-09-11T15:51:45.000000Z

Multilingual Performance Biases of Large Language Models in Education

cs.AI updates on arXiv.org 2025-08-06T04:38:38.000000Z

Effective cross-lingual LLM evaluation with Amazon Bedrock

AWS Machine Learning Blog 2025-07-08T15:49:18.000000Z

Weekly AI Paper Digest (250421-250425)

Juejin (掘金) AI 2025-04-27T11:07:55.000000Z
