Trending
Articles related to "KL divergence"
Resampling Conserves Redundancy & Mediation (Approximately) Under the Jensen-Shannon Divergence
少点错误 2025-10-31T01:16:12.000000Z
Sculpting Latent Spaces With MMD: Disentanglement With Programmable Priors
cs.AI updates on arXiv.org 2025-10-15T04:53:35.000000Z
Deceptive Exploration in Multi-armed Bandits
cs.AI updates on arXiv.org 2025-10-13T04:13:21.000000Z
Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
cs.AI updates on arXiv.org 2025-10-07T04:15:51.000000Z
Cross-Entropy: The Most Commonly Used Loss Function in Deep Learning
掘金 人工智能 2025-09-17T09:55:57.000000Z
Researchers Propose a New Representation Learning Framework to Fill the Gap in Causal Characterization of Deep Learning Models
DeepTech深科技 2025-09-15T14:09:06.000000Z
Xi'an Jiaotong-Liverpool University | Goal-Oriented Generative Prompt Injection Attacks on Large Language Models
安全学术圈 2025-09-11T20:14:02.000000Z
Is SFT Really Worse Than RL? MIT Team Proposes "RL's Razor": Cut Away Forgetting, Straight to Lifelong Learning
PaperWeekly 2025-09-11T19:36:22.000000Z
Is SFT Far Worse Than RL? A Timeless Razor Principle Opens the Door to "Lifelong Learning" in Large Model Training
机器之心 2025-09-11T04:10:33.000000Z
Selective Generalization: Improving Capabilities While Maintaining Alignment
少点错误 2025-07-16T21:37:00.000000Z
Off-Policy Reinforcement Learning (RL) with KL Divergence Yields Superior Reasoning in Large Language Models
MarkTechPost@AI 2025-06-02T04:56:04.000000Z
$500 + $500 Bounty Problem: An (Approximately) Deterministic Maximal Redund Always Exists
少点错误 2025-05-06T23:07:26.000000Z
[Read With Me] A Guide to the "Flower Book" Deep Learning, Chapter 3: Probability and Information Theory (Part 2)
虎扑-热帖 2024-11-24T19:35:16.000000Z
How to Accurately and Interpretably Evaluate the Quantization Quality of Large Models?
智源社区 2024-08-10T08:07:28.000000Z
Beyond Accuracy: Evaluating LLM Compression with Distance Metrics
MarkTechPost@AI 2024-07-18T11:03:46.000000Z