Optimization Methods_Fishai

Articles related to "Optimization Methods"

Linear Causal Discovery with Interventional Constraints

cs.AI updates on arXiv.org 2025-10-31T04:07:28.000000Z

Metadata-Driven Retrieval-Augmented Generation for Financial Question Answering

cs.AI updates on arXiv.org 2025-10-29T04:28:05.000000Z

Discovering the curriculum with AI: A proof-of-concept demonstration with an intelligent tutoring system for teaching project selection

cs.AI updates on arXiv.org 2025-10-22T04:26:16.000000Z

Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning

cs.AI updates on arXiv.org 2025-10-16T04:26:51.000000Z

Dual Perspectives on Non-Contrastive Self-Supervised Learning

cs.AI updates on arXiv.org 2025-10-15T05:13:46.000000Z

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

cs.AI updates on arXiv.org 2025-10-08T04:10:49.000000Z

Improving Consistency in Retrieval-Augmented Systems with Group Similarity Rewards

cs.AI updates on arXiv.org 2025-10-07T04:17:00.000000Z

SPOGW: a Score-based Preference Optimization method via Group-Wise comparison for workflows

cs.AI updates on arXiv.org 2025-10-07T04:07:40.000000Z

DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning

cs.AI updates on arXiv.org 2025-10-06T04:23:06.000000Z

Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models

cs.AI updates on arXiv.org 2025-10-06T04:20:23.000000Z

Rethinking KL Regularization in RLHF: From Value Estimation to Gradient Optimization

cs.AI updates on arXiv.org 2025-10-03T04:15:24.000000Z

Rethinking RoPE Scaling in Quantized LLM: Theory, Outlier, and Channel-Band Analysis with Weight Rescaling

cs.AI updates on arXiv.org 2025-10-02T04:16:43.000000Z

Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation

cs.AI updates on arXiv.org 2025-10-01T05:59:43.000000Z

[macOS] How to bring down DingTalk's CPU usage on Mac

V2EX 2025-09-30T08:33:08.000000Z

Thinking Machines Lab proposes a "modular manifolds" method for optimizing weight matrices

oschina.net 2025-09-30T03:57:23.000000Z

POLO: Preference-Guided Multi-Turn Reinforcement Learning for Lead Optimization

cs.AI updates on arXiv.org 2025-09-29T04:14:58.000000Z

Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective

cs.AI updates on arXiv.org 2025-09-29T04:14:42.000000Z

Reinforcement Learning on Pre-Training Data

cs.AI updates on arXiv.org 2025-09-26T04:24:09.000000Z

Are We Scaling the Right Thing? A System Perspective on Test-Time Scaling

cs.AI updates on arXiv.org 2025-09-25T05:45:57.000000Z

Analysis of approximate linear programming solution to Markov decision problem with log barrier function

cs.AI updates on arXiv.org 2025-09-25T05:06:50.000000Z

Copyright © 2019 FISHAI. All Rights Reserved