Evaluation Framework_Fishai

Trending

Articles related to "Evaluation Framework"

RAGalyst: Automated Human-Aligned Agentic Evaluation for Domain-Specific RAG

cs.AI updates on arXiv.org 2025-11-07T05:51:16.000000Z

Opus: A Quantitative Framework for Workflow Evaluation

cs.AI updates on arXiv.org 2025-11-07T05:44:27.000000Z

Scalable Evaluation and Neural Models for Compositional Generalization

cs.AI updates on arXiv.org 2025-11-06T05:22:45.000000Z

Zero-shot data citation function classification using transformer-based large language models (LLMs)

cs.AI updates on arXiv.org 2025-11-06T05:09:06.000000Z

LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory

cs.AI updates on arXiv.org 2025-11-05T05:31:16.000000Z

Speech-DRAME: A Framework for Human-Aligned Benchmarks in Speech Role-Play

cs.AI updates on arXiv.org 2025-11-05T05:30:15.000000Z

PreferThinker: Reasoning-based Personalized Image Preference Assessment

cs.AI updates on arXiv.org 2025-11-05T05:14:16.000000Z

Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models

cs.AI updates on arXiv.org 2025-11-03T05:19:53.000000Z

Vintage Code, Modern Judges: Meta-Validation in Low Data Regimes

cs.AI updates on arXiv.org 2025-11-03T05:19:23.000000Z

LLM-based Multi-class Attack Analysis and Mitigation Framework in IoT/IIoT Networks

cs.AI updates on arXiv.org 2025-11-03T05:18:55.000000Z

CATArena: Evaluation of LLM Agents through Iterative Tournament Competitions

cs.AI updates on arXiv.org 2025-11-03T05:17:15.000000Z

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

cs.AI updates on arXiv.org 2025-10-30T04:23:19.000000Z

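[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T04:16:00.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T03:56:35.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T03:36:44.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T03:17:24.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T02:58:05.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T02:38:11.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T01:57:50.000000Z

[Workplace Topics] Are the requirements for AI product managers really this high now?

V2EX 2025-10-30T01:38:00.000000Z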

Copyright © 2019 FISHAI. All Rights Reserved