cs.AI updates on arXiv.org, October 14
A Study of Training Optimization for Quantized Neural Networks

This article examines the influence of the straight-through estimator (STE) on learning dynamics in quantized neural network training, finding that in the high-dimensional limit these dynamics converge to a deterministic ordinary differential equation, and characterizes the behavior of the resulting training process.

arXiv:2510.10693v1 Announce Type: cross Abstract: Quantized neural network training optimizes a discrete, non-differentiable objective. The straight-through estimator (STE) enables backpropagation through surrogate gradients and is widely used. While previous studies have primarily focused on the properties of surrogate gradients and their convergence, the influence of quantization hyperparameters, such as bit width and quantization range, on learning dynamics remains largely unexplored. We theoretically show that in the high-dimensional limit, STE dynamics converge to a deterministic ordinary differential equation. This reveals that STE training exhibits a plateau followed by a sharp drop in generalization error, with plateau length depending on the quantization range. A fixed-point analysis quantifies the asymptotic deviation from the unquantized linear model. We also extend analytical techniques for stochastic gradient descent to nonlinear transformations of weights and inputs.
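The abstract refers to the straight-through estimator: values are quantized in the forward pass, while the backward pass treats the non-differentiable rounding step as if it were the identity, letting gradients pass straight through. The sketch below is a minimal PyTorch illustration of that idea, not code from the paper; the bit width, quantization range, and the clipped-gradient variant shown are assumptions chosen for illustration.

```python
import torch

class STEQuantize(torch.autograd.Function):
    """Uniform quantizer whose backward pass is the straight-through estimator."""

    @staticmethod
    def forward(ctx, w, num_bits=4, q_range=1.0):
        # Forward: clip to [-q_range, q_range] and round to 2**num_bits - 1 levels.
        ctx.save_for_backward(w)
        ctx.q_range = q_range
        scale = 2.0 * q_range / (2 ** num_bits - 1)
        w_clipped = torch.clamp(w, -q_range, q_range)
        return torch.round(w_clipped / scale) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Backward: pass the incoming gradient straight through, zeroing it
        # outside the quantization range (a common clipped-STE variant).
        (w,) = ctx.saved_tensors
        mask = (w.abs() <= ctx.q_range).to(grad_output.dtype)
        return grad_output * mask, None, None


# Hypothetical usage: train a quantized linear model on synthetic data.
w = torch.randn(32, requires_grad=True)
x = torch.randn(128, 32)
y = x @ torch.randn(32)                  # synthetic targets
for _ in range(10):
    w_q = STEQuantize.apply(w, 4, 1.0)   # quantized forward pass
    loss = ((x @ w_q - y) ** 2).mean()
    loss.backward()                      # gradient flows via the STE
    with torch.no_grad():
        w -= 0.01 * w.grad
        w.grad.zero_()
```

In the abstract's terms, the `num_bits` and `q_range` arguments correspond to the bit width and quantization range whose influence on the plateau length and the asymptotic deviation from the unquantized linear model the ODE and fixed-point analysis quantify.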


Related tags

Quantized neural networks, Straight-through estimator, Learning dynamics, Ordinary differential equations, Optimization