Trending
Articles related to "direct comparison"
Arena-Lite: Efficient and Reliable Large Language Model Evaluation via Tournament-Based Direct Comparisons
cs.AI updates on arXiv.org 2025-08-05T11:29:22.000000Z