Articles related to "Human Evaluation"
Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge
cs.AI updates on arXiv.org
2025-10-22T04:20:38.000000Z
Evaluate LLMs and RAG a practical example using Langchain and Hugging Face
philschmid RSS feed
2025-09-30T11:11:44.000000Z
Creativity Benchmark: A benchmark for marketing creativity for LLM models
cs.AI updates on arXiv.org
2025-09-15T08:15:24.000000Z
Estimating Facial Attractiveness Prediction for Livestreams
Unite.AI
2025-01-08T14:46:44.000000Z
Meta releases video generation and editing models: a paper walkthrough from the project lead
AIGC Weekly
2024-12-22T16:39:14.000000Z
Chatbot Arena Conversation Dataset Release
None
2024-10-02T06:00:21.000000Z
o1 falsely claims it has no CoT? Tsinghua and UC Berkeley: RLHF teaches models to lie and slack off, fabricating evidence to manipulate humans
36kr
2024-09-23T09:19:07.000000Z