"
RLHF
" 相关文章
Design and Implementation of Reinforcement-Learning AI Systems and Their Future Development
36氪 - Tech Channel
2025-11-04T13:25:04.000000Z
小互 💬: To obtain "high-quality" corpora, AI companies fed their models large amounts of late-19th- and early-20th-century literature. The AI faithfully learned that era's writing style, including its fondness for the em dash.
小互 on Twitter
2025-11-03T05:55:10.000000Z
Beyond Scale: Why RLHF Is the Future of Specialized AI
Cogito Tech
2025-10-30T11:48:01.000000Z
Why do AI models use so many em-dashes?
https://www.seangoedecke.com/rss.xml
2025-10-30T11:33:30.000000Z
What can we learn from parent-child-alignment for AI?
少点错误
2025-10-29T08:13:02.000000Z
Greedy Sampling Is Provably Efficient for RLHF
cs.AI updates on arXiv.org
2025-10-29T04:31:05.000000Z
PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling
cs.AI updates on arXiv.org
2025-10-29T04:27:00.000000Z
When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning
cs.AI updates on arXiv.org
2025-10-28T04:14:38.000000Z
Anthropic and Thinking Machines Lab Paper Revealed: 300,000 Stress Tests Expose Flaws in AI Specifications
机器之心
2025-10-27T09:42:21.000000Z
A Small Trick to Significantly Improve the Quality of AI Answers!
掘金 AI - Hottest This Month
2025-10-23T21:49:23.000000Z
LLM Training Data Optimization: Fine-Tuning, RLHF & Red Teaming
Cogito Tech
2025-10-23T05:35:13.000000Z
Direct Preference Optimization with Unobserved Preference Heterogeneity: The Necessity of Ternary Preferences
cs.AI updates on arXiv.org
2025-10-20T04:09:37.000000Z
GPT-5 Core Team Members Explain RL in Depth: Only by Combining Pre-training with RL Can We Reach AGI
海外独角兽
2025-10-18T16:26:25.000000Z
Is GPT Getting More Conservative? Stanford's Manning Team Proposes Verbalized Sampling to Make Models "Think a Little More" Again
PaperWeekly
2025-10-17T14:33:46.000000Z