Trending
Articles related to "human feedback"
Greedy Sampling Is Provably Efficient for RLHF
cs.AI updates on arXiv.org 2025-10-29T04:31:05.000000Z
Improving Video Generation with Human Feedback
cs.AI updates on arXiv.org 2025-10-28T04:14:38.000000Z
RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation
cs.AI updates on arXiv.org 2025-10-28T04:14:36.000000Z
Rectifying Shortcut Behaviors in Preference-based Reward Learning
cs.AI updates on arXiv.org 2025-10-23T04:09:39.000000Z
Offline and Online KL-Regularized RLHF under Differential Privacy
cs.AI updates on arXiv.org 2025-10-16T04:27:51.000000Z
Provably Mitigating Corruption, Overoptimization, and Verbosity Simultaneously in Offline and Online RLHF/DPO Alignment
cs.AI updates on arXiv.org 2025-10-08T04:12:10.000000Z
General Exploratory Bonus for Optimistic Exploration in RLHF
cs.AI updates on arXiv.org 2025-10-07T04:12:56.000000Z
Rethinking KL Regularization in RLHF: From Value Estimation to Gradient Optimization
cs.AI updates on arXiv.org 2025-10-03T04:15:24.000000Z
Feedback Forensics: A Toolkit to Measure AI Personality
cs.AI updates on arXiv.org 2025-10-01T06:01:44.000000Z
Avoiding $\mathbf{exp(R_{max})}$ scaling in RLHF through Preference-based Exploration
cs.AI updates on arXiv.org 2025-09-29T04:17:17.000000Z
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards
cs.AI updates on arXiv.org 2025-09-26T04:23:10.000000Z
AI Alignment Is Trivial
Zeroth Principles of AI 2025-09-25T10:02:09.000000Z
Opal: An Operator Algebra View of RLHF
cs.AI updates on arXiv.org 2025-09-16T05:32:50.000000Z
Researchers Propose a New AI Alignment Method That Improves the Human Feedback Process Through "Interactive Decomposition"
MIT Technology Review - This Week's Top Stories 2025-09-11T15:46:44.000000Z
Fine-tuning GPT-2 from human preferences
OpenAI blog 2025-09-06T09:45:28.000000Z
From Principles to Practice: The Complete RLHF (Reinforcement Learning from Human Feedback) Workflow
Juejin - Artificial Intelligence 2025-09-03T06:10:41.000000Z
Ambiguity Resolution with Human Feedback for Code Writing Tasks
cs.AI updates on arXiv.org 2025-08-21T04:04:14.000000Z
Inclusion Arena: An Open Platform for Evaluating Large Foundation Models with Real-World Apps
cs.AI updates on arXiv.org 2025-08-18T04:21:21.000000Z
Forest vs Tree: The $(N, K)$ Trade-off in Reproducible ML Evaluation
cs.AI updates on arXiv.org 2025-08-06T04:02:14.000000Z
Augmented Reinforcement Learning Framework For Enhancing Decision-Making In Machine Learning Models Using External Agents
cs.AI updates on arXiv.org 2025-08-05T11:29:32.000000Z