cs.AI updates on arXiv.org, Sep 25, 13:50
Sobolev Training Accelerates Convergence of ReLU Networks

This paper presents the first rigorous theoretical framework for Sobolev training and proves that it accelerates the convergence of ReLU networks. By deriving exact population gradient and Hessian formulas, the analysis quantifies the improvement in the conditioning of the loss landscape and in gradient-flow convergence rates; experiments indicate that Sobolev training also benefits modern deep learning tasks.

arXiv:2509.19773v1 Announce Type: cross Abstract: Sobolev training, which integrates target derivatives into the loss functions, has been shown to accelerate convergence and improve generalization compared to conventional $L^2$ training. However, the underlying mechanisms of this training method remain only partially understood. In this work, we present the first rigorous theoretical framework proving that Sobolev training accelerates the convergence of Rectified Linear Unit (ReLU) networks. Under a student-teacher framework with Gaussian inputs and shallow architectures, we derive exact formulas for population gradients and Hessians, and quantify the improvements in conditioning of the loss landscape and gradient-flow convergence rates. Extensive numerical experiments validate our theoretical findings and show that the benefits of Sobolev training extend to modern deep learning tasks.
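
For a concrete picture of the loss the abstract describes, below is a minimal PyTorch sketch of Sobolev training in a student-teacher setting with Gaussian inputs and shallow ReLU networks: the student is fit by penalizing both the value mismatch and the input-gradient mismatch against the teacher. The network width, batch size, learning rate, and derivative weight `lam` are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

d, width, lam = 10, 32, 1.0          # input dim, hidden width, derivative weight (assumed values)

def shallow_relu(d, width):
    # One-hidden-layer ReLU network, matching the shallow architectures in the student-teacher setup.
    return nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))

teacher = shallow_relu(d, width)      # fixed target network
student = shallow_relu(d, width)      # trained network
opt = torch.optim.SGD(student.parameters(), lr=1e-2)

def value_and_grad(net, x, create_graph):
    # Return f(x) and the input-gradient grad_x f(x) for a batch x.
    x = x.detach().requires_grad_(True)
    y = net(x)
    (g,) = torch.autograd.grad(y.sum(), x, create_graph=create_graph)
    return y, g

for step in range(1000):
    x = torch.randn(256, d)                                     # Gaussian inputs
    y_t, g_t = value_and_grad(teacher, x, create_graph=False)   # teacher targets
    y_s, g_s = value_and_grad(student, x, create_graph=True)    # keep graph for backprop
    value_loss = ((y_s - y_t.detach()) ** 2).mean()
    grad_loss = ((g_s - g_t.detach()) ** 2).sum(dim=1).mean()
    loss = value_loss + lam * grad_loss                         # Sobolev (derivative-augmented) loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Setting `lam = 0` recovers conventional $L^2$ training, the baseline against which the paper measures the acceleration.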


Related tags

Sobolev training, ReLU networks, convergence acceleration, theoretical framework, deep learning