cs.AI updates on arXiv.org (Sep 25, 14:09)
A New Quantization and Sparsification Method Improves Neural Network Training

This paper introduces a new denoising dequantization transform that resolves the gradient obstacles introduced by quantization and sparsification, enabling stable neural network training in ultra-low-precision and sparse regimes and achieving state-of-the-art results.

arXiv:2409.09245v2 Announce Type: replace-cross Abstract: The discontinuous operations inherent in quantization and sparsification introduce a long-standing obstacle to backpropagation, particularly in ultra-low precision and sparse regimes. The standard Straight-Through Estimator (STE) is widely used to address this, but the well-understood mismatch between its quantization-aware forward pass and quantization-oblivious backward pass leads to unmanaged error that can corrupt the learning process. We solve this by introducing a denoising dequantization transform derived from a principled ridge regression objective. This transform makes the entire learning process aware of and robust to the quantization error that STE's surrogate gradient bypasses, by creating an explicit, corrective gradient path. We extend this principle to sparsification by viewing it as a special form of quantization that maps insignificant values to zero. Our unified framework allows existing models to be trained at a wide spectrum of precisions and sparsity levels with off-the-shelf recipes, achieving stable training of fully binary (A1W1) and sparse sub-1-bit networks where other methods falter. This approach yields state-of-the-art results and provides a theoretically grounded path to hyper-efficient neural networks.
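To make the abstract's ideas concrete, here is a minimal PyTorch sketch, not the paper's implementation: the function names (`ridge_dequant_binarize`, `sparsify_as_quantization`), the per-tensor scale, and the regularizer `lam` are all assumptions. The illustrated idea is that a dequantization scale s = ⟨q, x⟩ / (⟨q, q⟩ + λ), which minimizes ‖x − s·q‖² + λs², is itself a differentiable function of the full-precision tensor x, so it opens an explicit gradient path that "sees" the quantization error the plain STE surrogate bypasses. Sparsification is handled with the same STE pattern, treated as a quantizer that maps small values to zero, as the abstract describes.

```python
import torch

def ridge_dequant_binarize(x: torch.Tensor, lam: float = 1e-3) -> torch.Tensor:
    """Binarize x with STE, then rescale with a ridge-regression scale.

    Hypothetical sketch: s = <q, x> / (<q, q> + lam) solves
    min_s ||x - s*q||^2 + lam*s^2. Because s is a differentiable
    function of x, the dequantized output carries an error-aware
    gradient in addition to the STE surrogate.
    """
    q = torch.sign(x)                        # hard {-1, 0, +1} codes
    q_ste = x + (q - x).detach()             # forward: q, backward: identity
    s = (q * x).sum() / ((q * q).sum() + lam)
    return s * q_ste

def sparsify_as_quantization(x: torch.Tensor, thresh: float) -> torch.Tensor:
    """Sparsification viewed as a quantizer mapping small values to zero.

    Same STE pattern: the hard mask is applied in the forward pass
    while gradients flow through x unchanged.
    """
    mask = (x.abs() > thresh).to(x.dtype)
    return x + (x * mask - x).detach()

# Toy usage: gradients reach the full-precision weights through both
# the STE path and the data-dependent scale s.
w = torch.randn(64, requires_grad=True)
y = ridge_dequant_binarize(sparsify_as_quantization(w, thresh=0.5))
y.pow(2).sum().backward()
print(w.grad.shape)  # torch.Size([64])
```

How the corrective path actually enters the paper's training objective is not specified in the abstract; the sketch only shows the closed-form ridge scale and the standard detach-based STE idiom.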

Related tags

Neural network training, quantization and sparsification, gradient descent, denoising dequantization, efficient neural networks