Articles related to "Meta-Evaluation Benchmark"
MDSEval: A Meta-Evaluation Benchmark for Multimodal Dialogue Summarization
cs.AI updates on arXiv.org
2025-10-03T04:16:52.000000Z
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks
cs.AI updates on arXiv.org
2025-07-02T04:03:50.000000Z