"
RLHF
" 相关文章
Design and Implementation of Reinforcement-Learning AI Systems and Their Future Development
36氪 - Tech Channel
2025-11-04T13:25:04.000000Z
小互 💬: To obtain "high-quality" corpora, AI companies fed their models large amounts of late-19th- and early-20th-century literature. The AI faithfully learned that era's writing style, including its fondness for the em dash.
小互 on Twitter
2025-11-03T05:55:10.000000Z
Beyond Scale: Why RLHF Is the Future of Specialized AI
Cogito Tech
2025-10-30T11:48:01.000000Z
Why do AI models use so many em-dashes?
https://www.seangoedecke.com/rss.xml
2025-10-30T11:33:30.000000Z
What can we learn from parent-child-alignment for AI?
少点错误
2025-10-29T08:13:02.000000Z
Greedy Sampling Is Provably Efficient for RLHF
cs.AI updates on arXiv.org
2025-10-29T04:31:05.000000Z
PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling
cs.AI updates on arXiv.org
2025-10-29T04:27:00.000000Z
When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning
cs.AI updates on arXiv.org
2025-10-28T04:14:38.000000Z
Anthropic and Thinking Machines Lab Paper Revealed: 300,000 Stress Tests Expose Flaws in AI Specifications
机器之心
2025-10-27T09:42:21.000000Z
A Small Trick to Significantly Improve the Quality of AI Answers!
掘金 AI - Hottest This Month
2025-10-23T21:49:23.000000Z
LLM Training Data Optimization: Fine-Tuning, RLHF & Red Teaming
Cogito Tech
2025-10-23T05:35:13.000000Z
Direct Preference Optimization with Unobserved Preference Heterogeneity: The Necessity of Ternary Preferences
cs.AI updates on arXiv.org
2025-10-20T04:09:37.000000Z
GPT-5 Core Team Members Explain RL in Depth: Only by Combining Pre-training with RL Can We Reach AGI
海外独角兽
2025-10-18T16:26:25.000000Z
Is GPT Getting More Conservative? Stanford's Manning Team Proposes Verbalized Sampling to Make Models "Think a Little More" Again
PaperWeekly
2025-10-17T14:33:46.000000Z